From ea70134ed3891934b262b3fd140354b1d9098c31 Mon Sep 17 00:00:00 2001
From: k8s-ci-robot

Summary ¶

Teach ingress-nginx about the availability zones in which its endpoints are running, so that an ingress-nginx pod does its best to proxy to a zone-local endpoint.

Motivation ¶

When users run their services across multiple availability zones they usually pay for egress traffic between zones. Providers such as GCP and Amazon EC2 charge extra for this traffic.
When picking an endpoint to route a request to, ingress-nginx does not consider whether that endpoint is in a different zone or the same one. That means it is at least equally likely
to pick an endpoint from another zone and proxy the request to it. In this situation the response from the endpoint to the ingress-nginx pod is considered
inter-zone traffic and usually costs extra money. At the time of this writing, GCP charges $0.01 per GB of inter-zone egress traffic according to https://cloud.google.com/compute/network-pricing.
According to https://datapath.io/resources/blog/what-are-aws-data-transfer-costs-and-how-to-minimize-them/, Amazon charges the same amount as GCP for cross-zone egress traffic. Depending on one's traffic volume, this can add up to a significant cost. By teaching ingress-nginx about zones we can eliminate, or at least decrease, this cost. Arguably, zone-local network latency should also be better than cross-zone latency.

Goals ¶

Non-Goals ¶

Proposal ¶
The idea here is to have the controller part of ingress-nginx
(1) detect what zone its current pod is running in, and
(2) detect the zone for every endpoint it knows about.
After that, it will post that data as part of the endpoints to Lua land.
When picking an endpoint, the Lua balancer will first try to pick a zone-local endpoint;
if there is no zone-local endpoint it will fall back to the current behaviour. At least initially, this feature should be optional, since it makes the load balancing harder to reason about and not everyone may want that.
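To make the shape of the data posted to Lua land concrete, here is a minimal Go sketch of an endpoint record carrying a zone. The Endpoint and Backend types and the Zone field name are illustrative assumptions, not the actual ingress-nginx structures.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Endpoint is a hypothetical, simplified endpoint record; the Zone field is
// the only addition compared to what the controller serializes today.
type Endpoint struct {
	Address string `json:"address"`
	Port    string `json:"port"`
	Zone    string `json:"zone,omitempty"` // e.g. "us-central1-a"
}

// Backend groups the endpoints the Lua balancer chooses from.
type Backend struct {
	Name      string     `json:"name"`
	Endpoints []Endpoint `json:"endpoints"`
}

func main() {
	b := Backend{
		Name: "default-my-svc-80",
		Endpoints: []Endpoint{
			{Address: "10.0.1.12", Port: "8080", Zone: "us-central1-a"},
			{Address: "10.0.2.7", Port: "8080", Zone: "us-central1-b"},
		},
	}
	payload, _ := json.Marshal(b)
	fmt.Println(string(payload)) // this JSON would be posted to the Lua side
}
```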
How does the controller know what zone it runs in?

We can have the pod spec pass the node name to the controller using the downward API as an environment variable.
On startup, the controller can then get the node details from the API based on that node name. Once the node details are obtained,
we can extract the zone from the failure-domain.beta.kubernetes.io/zone label. Then we can pass that value to Lua land through the Nginx configuration when loading the lua_ingress.lua module in the init_by_lua phase.
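A minimal client-go sketch of this flow follows, assuming the pod spec injects spec.nodeName into a NODE_NAME environment variable via the downward API; the variable name and the detectZone helper are illustrative, not part of the existing code base.

```go
package main

import (
	"context"
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// detectZone reads the node name injected by the downward API, fetches the
// node object and returns the value of its zone label.
func detectZone() (string, error) {
	// Assumed pod spec snippet (illustrative):
	//   env:
	//   - name: NODE_NAME
	//     valueFrom: {fieldRef: {fieldPath: spec.nodeName}}
	nodeName := os.Getenv("NODE_NAME")
	if nodeName == "" {
		return "", fmt.Errorf("NODE_NAME is not set")
	}

	cfg, err := rest.InClusterConfig()
	if err != nil {
		return "", err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return "", err
	}

	node, err := client.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// failure-domain.beta.kubernetes.io/zone is a well-known node label
	// (topology.kubernetes.io/zone in newer clusters).
	return node.Labels["failure-domain.beta.kubernetes.io/zone"], nil
}

func main() {
	zone, err := detectZone()
	if err != nil {
		fmt.Println("zone detection failed:", err)
		return
	}
	fmt.Println("running in zone:", zone)
}
```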
How do we extract zones for endpoints?

We can have the controller watch create and update events on nodes in the entire cluster and, based on that, keep a map of node names to zones in memory.
Then, when we generate the endpoints list, we can read the node name from .subsets.addresses[i].nodeName, look up the zone in the in-memory map, and store it as a field on the endpoint.
This solution assumes the failure-domain.beta.kubernetes.io/zone label does not change until the end of the node's life; otherwise we would have to watch node update events as well, which adds even more overhead.
Alternatively, we can fetch the list of nodes only when there is no entry in memory for a given node name. This is probably the better solution,
because it avoids watching for API changes on node resources: we eagerly fetch all the nodes and build the node-name-to-zone mapping on start,
and from there on we sync it during endpoint building in the main event loop only if there is no existing entry for the node of an endpoint.
This means an extra API call in case the cluster has expanded.
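A sketch of that cache, with an eager initial fill and a lazy re-fetch on a miss, could look as follows; the zoneCache type and its method names are illustrative, not the actual ingress-nginx code.

```go
package zoneaware

import (
	"context"
	"sync"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

const zoneLabel = "failure-domain.beta.kubernetes.io/zone"

// zoneCache maps node names to zones so that endpoint building does not need
// a node watch; the API is only hit once on start and then on cache misses.
type zoneCache struct {
	mu     sync.Mutex
	client kubernetes.Interface
	zones  map[string]string
}

// fill eagerly builds the node-name-to-zone mapping on controller start.
func (c *zoneCache) fill(ctx context.Context) error {
	nodes, err := c.client.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.zones = make(map[string]string, len(nodes.Items))
	for _, n := range nodes.Items {
		c.zones[n.Name] = n.Labels[zoneLabel]
	}
	return nil
}

// zoneFor returns the zone for nodeName, re-fetching the node on a miss
// (the extra API call mentioned above, e.g. after the cluster has expanded).
func (c *zoneCache) zoneFor(ctx context.Context, nodeName string) string {
	c.mu.Lock()
	zone, ok := c.zones[nodeName]
	c.mu.Unlock()
	if ok {
		return zone
	}
	node, err := c.client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return "" // unknown zone; this endpoint simply won't count as zone-local
	}
	zone = node.Labels[zoneLabel]
	c.mu.Lock()
	c.zones[nodeName] = zone
	c.mu.Unlock()
	return zone
}
```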
How do we make sure we do our best to choose a zone-local endpoint?

This will be done on the Lua side. For every backend we will initialize two balancer instances: (1) one with all endpoints and
(2) one with only the endpoints in the current zone for that backend. Then, once we have chosen which backend
should serve a request, we will first try to use the zonal balancer for that backend. If the zonal balancer does not exist (i.e. there is no zonal endpoint),
we will use the general balancer. In case of a zonal outage we assume that the readiness probes will fail, the controller will
see no endpoints for the backend, and therefore the general balancer will be used.
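The real selection logic would live in the Lua balancer; the following Go sketch only illustrates the decision flow with two balancer instances per backend. All names here are illustrative.

```go
package main

import "fmt"

// balancer stands in for a load-balancer implementation (round robin, EWMA, ...).
type balancer interface {
	pick() (endpoint string, ok bool)
}

// roundRobin is a trivial balancer used only to make the sketch runnable.
type roundRobin struct {
	endpoints []string
	next      int
}

func (r *roundRobin) pick() (string, bool) {
	if len(r.endpoints) == 0 {
		return "", false
	}
	ep := r.endpoints[r.next%len(r.endpoints)]
	r.next++
	return ep, true
}

// backendBalancers holds the two instances kept per backend: one with every
// endpoint and one with only the endpoints in the controller's own zone.
type backendBalancers struct {
	all   balancer
	zonal balancer // nil when the backend has no endpoint in this zone
}

// pickEndpoint prefers the zone-local balancer and falls back to the general
// one. Zonal outages are covered implicitly: failing readiness probes empty
// the zonal endpoint list, so only the general balancer remains.
func (b *backendBalancers) pickEndpoint() (string, bool) {
	if b.zonal != nil {
		if ep, ok := b.zonal.pick(); ok {
			return ep, true
		}
	}
	return b.all.pick()
}

func main() {
	b := backendBalancers{
		all:   &roundRobin{endpoints: []string{"10.0.1.12:8080", "10.0.2.7:8080"}},
		zonal: &roundRobin{endpoints: []string{"10.0.1.12:8080"}}, // same zone as this pod
	}
	ep, _ := b.pickEndpoint()
	fmt.Println("picked:", ep) // always the zone-local endpoint while one exists
}
```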
We can enable the feature using a ConfigMap setting. Doing it this way makes it easier to roll back in case of a problem.
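As a sketch of that gate, the controller could read a boolean flag from its ConfigMap data and keep today's behaviour whenever the flag is absent or invalid; the zone-aware-routing key name is hypothetical, not an existing ingress-nginx option.

```go
package main

import (
	"fmt"
	"strconv"
)

// zoneAwareRoutingEnabled reads a hypothetical boolean key from the
// controller's ConfigMap data; a missing or malformed value disables the
// feature so behaviour stays unchanged by default.
func zoneAwareRoutingEnabled(configMapData map[string]string) bool {
	raw, ok := configMapData["zone-aware-routing"]
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return false
	}
	return enabled
}

func main() {
	data := map[string]string{"zone-aware-routing": "true"}
	fmt.Println(zoneAwareRoutingEnabled(data)) // true
}
```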
One drawback of this approach is more load on the Kubernetes API server.