Teach ingress-nginx about the availability zones in which endpoints are running. This way the ingress-nginx pod will do its best to proxy to a zone-local endpoint.
When users run their services across multiple availability zones, they usually pay for egress traffic between zones. Providers such as GCP and Amazon EC2 typically charge extra for cross-zone traffic.
When picking an endpoint to route a request to, ingress-nginx does not consider whether the endpoint is in the same zone or a different one. That means it is at least equally likely
to pick an endpoint from another zone and proxy the request to it. In this situation, the traffic between that endpoint and the ingress-nginx pod is considered inter-zone traffic and usually costs extra.
At the time of this writing, GCP charges $0.01 per GB of inter-zone egress traffic according to https://cloud.google.com/compute/network-pricing.
According to https://datapath.io/resources/blog/what-are-aws-data-transfer-costs-and-how-to-minimize-them/, Amazon charges the same amount as GCP for cross-zone egress traffic.
* This feature inherently assumes that endpoints are distributed across zones in a way that allows them to handle all the traffic from the ingress-nginx pod(s) in that zone
* This feature relies on the `failure-domain.beta.kubernetes.io/zone` label (https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone); it is not this KEP's goal to support other cases
The controller's own zone can then be passed to the Lua side when loading the `lua_ingress.lua` module in the `init_by_lua` phase.
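As a rough illustration (not part of the KEP itself), the controller-side lookup could look something like the Go sketch below. The `NODE_NAME` environment variable and the `ownZone` helper are hypothetical names; the sketch assumes the pod already knows which node it is scheduled on and uses the zone label referenced above.

```go
package zoneaware

import (
	"context"
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// zoneLabel is the well-known label this KEP relies on.
const zoneLabel = "failure-domain.beta.kubernetes.io/zone"

// ownZone resolves the zone of the node the controller pod runs on.
// NODE_NAME is a hypothetical environment variable assumed to carry the
// name of that node (populated, for example, from the pod spec).
func ownZone(client kubernetes.Interface) (string, error) {
	nodeName := os.Getenv("NODE_NAME")
	if nodeName == "" {
		return "", fmt.Errorf("NODE_NAME is not set")
	}
	node, err := client.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	return node.Labels[zoneLabel], nil
}
```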
**How do we extract zones for endpoints?**
We can have the controller watch create and update events on nodes in the entire cluster and, based on those events, keep a map of node names to zones in memory.
Then, when we generate the endpoints list, we can read the node name from `.subsets.addresses[i].nodeName`,
look up the zone in the in-memory map, and store it as a field on the endpoint.
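A minimal sketch of this watch-based approach, assuming client-go shared informers; the `zoneTracker` type and its methods are hypothetical names used only for illustration.

```go
package zoneaware

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/tools/cache"
)

// zoneTracker keeps an in-memory node name -> zone map, updated from
// node add/update events delivered by a shared informer.
type zoneTracker struct {
	mu    sync.RWMutex
	zones map[string]string
}

func newZoneTracker(factory informers.SharedInformerFactory) *zoneTracker {
	t := &zoneTracker{zones: map[string]string{}}
	factory.Core().V1().Nodes().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { t.record(obj) },
		UpdateFunc: func(_, obj interface{}) { t.record(obj) },
	})
	return t
}

func (t *zoneTracker) record(obj interface{}) {
	node, ok := obj.(*corev1.Node)
	if !ok {
		return
	}
	t.mu.Lock()
	t.zones[node.Name] = node.Labels[zoneLabel]
	t.mu.Unlock()
}

// zoneFor resolves the zone for an endpoint address via its NodeName field
// (".subsets.addresses[i].nodeName" on the Endpoints object).
func (t *zoneTracker) zoneFor(addr corev1.EndpointAddress) string {
	if addr.NodeName == nil {
		return ""
	}
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.zones[*addr.NodeName]
}
```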
Alternatively, we can fetch node details only when there is no entry in memory for the given node name. This is probably a better solution
because it avoids watching the API for changes on node resources. We can eagerly fetch all the nodes on startup and build the node-name-to-zone mapping then.
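A sketch of this pull-based alternative under the same assumptions; `primeZones` and `zoneForNode` are hypothetical helpers, and the plain map is not safe for concurrent use without additional locking.

```go
package zoneaware

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// primeZones lists all nodes once at startup and builds the
// node name -> zone map.
func primeZones(client kubernetes.Interface) (map[string]string, error) {
	zones := map[string]string{}
	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		return nil, err
	}
	for _, n := range nodes.Items {
		zones[n.Name] = n.Labels[zoneLabel]
	}
	return zones, nil
}

// zoneForNode returns the cached zone, falling back to a single API Get
// when the node name has not been seen before.
func zoneForNode(client kubernetes.Interface, zones map[string]string, nodeName string) (string, error) {
	if zone, ok := zones[nodeName]; ok {
		return zone, nil
	}
	node, err := client.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	zones[nodeName] = node.Labels[zoneLabel]
	return zones[nodeName], nil
}
```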