diff --git a/deploy/index.html b/deploy/index.html
index 07dc7dbf6..d6283f98d 100644
--- a/deploy/index.html
+++ b/deploy/index.html
@@ -24,7 +24,7 @@
  1. Edit the file and change the VPC CIDR in use for the Kubernetes cluster:

    proxy-real-ip-cidr: XXX.XXX.XXX/XX
     

  2. Change the AWS Certificate Manager (ACM) ID as well:

    arn:aws:acm:us-west-2:XXXXXXXX:certificate/XXXXXX-XXXXXXX-XXXXXXX-XXXXXXXX
     

  3. Deploy the manifest:

    kubectl apply -f deploy.yaml
NLB Idle Timeouts

The idle timeout value for TCP flows is 350 seconds and cannot be modified.

For this reason, you need to ensure the keepalive_timeout value is configured to less than 350 seconds to work as expected.

By default, the NGINX keepalive_timeout is set to 75s.

More information on these timeouts can be found in the official AWS documentation.

GCE-GKE

First, your user needs to have cluster-admin permissions on the cluster. This can be done with the following command:

kubectl create clusterrolebinding cluster-admin-binding \
   --clusterrole cluster-admin \
   --user $(gcloud config get-value account)
 

Then, the ingress controller can be installed like this:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.3/deploy/static/provider/cloud/deploy.yaml
diff --git a/developer-guide/code-overview/index.html b/developer-guide/code-overview/index.html
index e3f6b3af6..4c9831433 100644
--- a/developer-guide/code-overview/index.html
+++ b/developer-guide/code-overview/index.html
@@ -1,4 +1,4 @@
- Code Overview - NGINX Ingress Controller     

Ingress NGINX - Code Overview

This document provides an overview of Ingress NGINX code.

Core Golang code

This part of the code is responsible for the main logic of Ingress NGINX. It contains all the logic that parses Ingress objects and annotations, watches Endpoints, and turns them into a usable nginx.conf configuration.

Core Sync Logic:

Ingress-nginx has an internal model of the ingresses, secrets and endpoints in a given cluster. It maintains two copies of that model: (1) the currently running configuration model and (2) the one generated in response to some changes in the cluster.

The sync logic diffs the two models and, if there's a change, it tries to converge the running configuration to the new one.

There are static and dynamic configuration changes.

All endpoints and certificate changes are handled dynamically by posting the payload to an internal NGINX endpoint that is handled by Lua.
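
The sketch below only illustrates this idea; the controller's real model types and update paths are richer, and Model, postToLuaBackend and writeConfigAndReload are placeholders:

    // Hedged sketch of the core sync idea: compare the running model with the
    // freshly built one, apply endpoint changes dynamically via the Lua endpoint,
    // and fall back to rendering nginx.conf and reloading for structural changes.
    package main

    import "reflect"

    // Model is a placeholder for the controller's internal configuration model.
    type Model struct {
        Servers   []string // rendered servers/locations (structural configuration)
        Endpoints []string // upstream endpoints
    }

    // Stubs standing in for the dynamic-update and reload paths.
    func postToLuaBackend(endpoints []string) {}
    func writeConfigAndReload(m Model)        {}

    func sync(running, desired Model) {
        if reflect.DeepEqual(running, desired) {
            return // nothing to do
        }
        if reflect.DeepEqual(running.Servers, desired.Servers) {
            // Only endpoints (or certificates) changed: POST the payload to the
            // internal NGINX endpoint handled by Lua, no reload required.
            postToLuaBackend(desired.Endpoints)
            return
        }
        // Structural change: render a new nginx.conf and reload NGINX.
        writeConfigAndReload(desired)
    }

    func main() {
        running := Model{Servers: []string{"example.com"}, Endpoints: []string{"10.0.0.1:80"}}
        desired := Model{Servers: []string{"example.com"}, Endpoints: []string{"10.0.0.1:80", "10.0.0.2:80"}}
        sync(running, desired)
    }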


The following parts of the code are described below:

Entrypoint

This is the main package, responsible for starting the ingress-nginx program.

It can be found in the cmd/nginx directory.

Version

This package is responsible for adding the version subcommand and can be found in the version directory.

Internal code

This part of the code contains the internal logic that composes the Ingress NGINX Controller, and it is split into:

Admission Controller

Contains the code of the Kubernetes Admission Controller, which validates the syntax of Ingress objects before accepting them.

This code can be found in the internal/admission/controller directory.
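
As an illustration of the flow only (not the actual handler in internal/admission/controller), a validating webhook roughly decodes the AdmissionReview, unmarshals the Ingress and rejects it when validation fails; validateIngress and the /validate path below are hypothetical stand-ins:

    package main

    import (
        "encoding/json"
        "net/http"

        admissionv1 "k8s.io/api/admission/v1"
        networkingv1 "k8s.io/api/networking/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // validateIngress is a hypothetical stand-in for the real syntax checks.
    func validateIngress(ing *networkingv1.Ingress) error { return nil }

    func serveValidate(w http.ResponseWriter, r *http.Request) {
        var review admissionv1.AdmissionReview
        if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
            http.Error(w, "invalid admission review", http.StatusBadRequest)
            return
        }

        resp := &admissionv1.AdmissionResponse{UID: review.Request.UID, Allowed: true}
        ing := &networkingv1.Ingress{}
        if err := json.Unmarshal(review.Request.Object.Raw, ing); err != nil {
            resp.Allowed = false
            resp.Result = &metav1.Status{Message: err.Error()}
        } else if err := validateIngress(ing); err != nil {
            resp.Allowed = false
            resp.Result = &metav1.Status{Message: err.Error()}
        }

        review.Response = resp
        _ = json.NewEncoder(w).Encode(review)
    }

    func main() {
        // The real webhook serves TLS; plain HTTP keeps this sketch short.
        http.HandleFunc("/validate", serveValidate)
        _ = http.ListenAndServe(":8080", nil)
    }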

File functions

Contains auxiliary code that deals with files, such as generating the SHA1 checksum of a file or creating required directories.

This code can be found in the internal/file directory.
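
For example, a SHA1 checksum helper of the kind this package provides could look like the following minimal sketch (not the actual internal/file API):

    package main

    import (
        "crypto/sha1"
        "encoding/hex"
        "fmt"
        "io"
        "os"
    )

    // sha1Checksum streams the file through a SHA1 hash and returns the hex digest.
    func sha1Checksum(path string) (string, error) {
        f, err := os.Open(path)
        if err != nil {
            return "", err
        }
        defer f.Close()

        h := sha1.New()
        if _, err := io.Copy(h, f); err != nil {
            return "", err
        }
        return hex.EncodeToString(h.Sum(nil)), nil
    }

    func main() {
        sum, err := sha1Checksum("/etc/nginx/nginx.conf")
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(sum)
    }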

Ingress functions

Contains all the core logic of the NGINX Ingress Controller. Other parts of the code will be documented here in the future.

K8s functions

Contains helper functions for parsing Kubernetes objects.

This part of the code can be found in the internal/k8s directory.

Networking functions

Contains helper functions for networking, such as IPv4 and IPv6 parsing, SSL certificate parsing, etc.

This part of the code can be found in the internal/net directory.
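
A minimal illustration of such a helper (not the actual internal/net API) that distinguishes IPv4 from IPv6 addresses:

    package main

    import (
        "fmt"
        "net"
    )

    // ipVersion reports whether a textual address is IPv4, IPv6 or invalid.
    func ipVersion(s string) string {
        ip := net.ParseIP(s)
        switch {
        case ip == nil:
            return "invalid"
        case ip.To4() != nil:
            return "IPv4"
        default:
            return "IPv6"
        }
    }

    func main() {
        fmt.Println(ipVersion("10.0.0.1"))     // IPv4
        fmt.Println(ipVersion("2001:db8::1"))  // IPv6
    }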

NGINX functions

Contains helper functions to deal with NGINX, such as verifying whether it is running and reading parts of its configuration file.

This part of the code can be found in the internal/nginx directory.
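
A hedged sketch of the "is NGINX running" idea, assuming a PID file at a conventional path; the real helper in internal/nginx may work differently:

    package main

    import (
        "bytes"
        "fmt"
        "os"
        "strconv"
        "syscall"
    )

    // nginxIsRunning reads the master PID file and sends signal 0 to the process.
    func nginxIsRunning(pidFile string) bool {
        data, err := os.ReadFile(pidFile)
        if err != nil {
            return false
        }
        pid, err := strconv.Atoi(string(bytes.TrimSpace(data)))
        if err != nil {
            return false
        }
        proc, err := os.FindProcess(pid)
        if err != nil {
            return false
        }
        // Signal 0 performs error checking only; it does not deliver a signal.
        return proc.Signal(syscall.Signal(0)) == nil
    }

    func main() {
        // /run/nginx.pid is an assumed conventional location, not a documented one.
        fmt.Println(nginxIsRunning("/run/nginx.pid"))
    }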

Tasks / Queue

Contains the functions responsible for the sync queue part of the controller.

This part of the code can be found in the internal/task directory.
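
A very simplified sketch of the sync-queue idea: keys are enqueued and a single worker drains them, so sync operations never run concurrently. The Queue type below is illustrative, not the real internal/task implementation:

    package main

    import (
        "fmt"
        "time"
    )

    // Queue serializes sync work: every enqueued key is handled by one worker.
    type Queue struct {
        items chan string
        sync  func(key string)
    }

    func NewQueue(sync func(string)) *Queue {
        return &Queue{items: make(chan string, 1024), sync: sync}
    }

    func (q *Queue) Enqueue(key string) { q.items <- key }

    // Run processes queued keys one at a time until the channel is closed.
    func (q *Queue) Run() {
        for key := range q.items {
            q.sync(key)
        }
    }

    func main() {
        q := NewQueue(func(key string) { fmt.Println("syncing", key) })
        go q.Run()
        q.Enqueue("default/my-ingress")
        time.Sleep(100 * time.Millisecond)
    }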

Other parts of internal

Other parts of the internal code, such as runtime and watch, might not be covered here, but they can be added in the future.

E2E Test

The e2e test code is in the test directory.

Other programs

Other programs in this repository include the kubectl plugin, dbg, waitshutdown and the hack scripts.

kubectl plugin

It contains the kubectl plugin for inspecting your ingress-nginx deployments. This part of the code can be found in the cmd/plugin directory. Details of the functions and available flows can be found in kubectl-plugin.

Deploy files

This directory contains the YAML deploy files used as examples or references in the docs to deploy Ingress NGINX and other components.

Those files are in the deploy directory.

Helm Chart

Used to generate the published Helm chart.

Code is in charts/ingress-nginx.

Documentation/Website

The documentation used to generate the website https://kubernetes.github.io/ingress-nginx/.

This code is available in docs and its main "language" is Markdown, used by the mkdocs file to generate static pages.

Container Images

Container images used to run ingress-nginx, or to build the final image.

Base Images

Contains the Dockerfiles and scripts used to build base images that are used in other parts of the repo. They are present in the images repo. Some examples:

  • nginx - The base NGINX image ingress-nginx uses is not a vanilla NGINX. It bundles many libraries together, and it is a job in itself to maintain that and keep things up to date.
  • custom-error-pages - Used in the custom error page examples.

There are other images inside this directory.

Ingress Controller Image

The image used to build the final ingress controller, used in deploy scripts and Helm charts.

This is NGINX with some Lua enhancements. Dynamic certificate handling, endpoint updates, canary traffic splitting, custom load balancing, etc. happen in this component. One can also add new functionality using the Lua plugin system.

The files are in the rootfs directory and include:

Ingress NGINX Lua Scripts

Ingress NGINX uses Lua Scripts to enable features like hot reloading, rate limiting and monitoring. Some are written using the OpenResty helper.

The directory containing Lua scripts is rootfs/etc/nginx/lua.

Nginx Go template file

One of the functions of Ingress NGINX is to turn Ingress objects into an nginx.conf file.

To do so, the final step is to apply those configurations to nginx.tmpl, turning it into the final nginx.conf file.
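
A tiny illustration of that template step using Go's text/template; the real nginx.tmpl is far larger and is fed the full controller configuration:

    package main

    import (
        "os"
        "text/template"
    )

    // A toy stand-in for nginx.tmpl; the real template covers the whole configuration.
    const tmpl = `server {
        listen 80;
        server_name {{ .Host }};
        location / {
            proxy_pass http://{{ .Upstream }};
        }
    }
    `

    func main() {
        t := template.Must(template.New("nginx.tmpl").Parse(tmpl))
        data := struct{ Host, Upstream string }{"example.com", "upstream-default-backend"}
        // The controller writes the rendered output to nginx.conf; here it goes to stdout.
        _ = t.Execute(os.Stdout, data)
    }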


e2e test suite for Ingress NGINX Controller

[Default Backend] change default settings

[Default Backend]

[Default Backend] custom service

[Default Backend] SSL

[TCP] tcp-services

auth-*

affinitymode

proxy-*

mirror-*

canary-*

limit-rate

force-ssl-redirect

http2-push-preload

proxy-ssl-*

modsecurity owasp

backend-protocol - GRPC

cors-*

influxdb-*

Annotation - limit-connections

client-body-buffer-size

default-backend

connection-proxy-header

upstream-vhost

custom-http-errors

disable-access-log disable-http-access-log disable-stream-access-log

server-snippet

rewrite-target use-regex enable-rewrite-log

app-root

whitelist-source-range

enable-access-log enable-rewrite-log

x-forwarded-prefix

configuration-snippet

backend-protocol - FastCGI

from-to-www-redirect

permanent-redirect permanent-redirect-code

upstream-hash-by-*

annotation-global-rate-limit

backend-protocol

satisfy

server-alias

ssl-ciphers

auth-tls-*

[Status] status update

Debug CLI

[Memory Leak] Dynamic Certificates

[Ingress] [PathType] mix Exact and Prefix paths

[Ingress] definition without host

single ingress - multiple hosts

[Ingress] [PathType] exact

[Ingress] [PathType] prefix checks

[Security] request smuggling

[SSL] [Flag] default-ssl-certificate

enable-real-ip

access-log

[Lua] lua-shared-dicts

server-tokens

use-proxy-protocol

[Flag] custom HTTP and HTTPS ports

[Security] no-auth-locations

Dynamic $proxy_host

proxy-connect-timeout

[Security] Pod Security Policies

Geoip2

[Security] Pod Security Policies with volumes

enable-multi-accept

log-format-*

[Flag] ingress-class

ssl-ciphers

proxy-next-upstream

[Security] global-auth-url

[Security] block-*

plugins

Configmap - limit-rate

Configure OpenTracing

use-forwarded-headers

proxy-send-timeout

Add no tls redirect locations

settings-global-rate-limit

add-headers

hash size

keep-alive keep-alive-requests

[Flag] disable-catch-all

main-snippet

[SSL] TLS protocols, ciphers and headers

Configmap change

proxy-read-timeout

[Security] modsecurity-snippet

OCSP

reuse-port

[Shutdown] Graceful shutdown with pending request

[Shutdown] ingress controller

[Service] backend status code 503

[Service] Type ExternalName


Availability zone aware routing

Summary

Teach ingress-nginx about the availability zones where endpoints are running. This way the ingress-nginx pod will do its best to proxy to a zone-local endpoint.

Motivation

When users run their services across multiple availability zones, they usually pay for egress traffic between zones. Providers such as GCP and Amazon EC2 usually charge extra for this. When picking an endpoint to route a request to, ingress-nginx does not consider whether the endpoint is in a different zone or the same one. That means it is at least equally likely to pick an endpoint from another zone and proxy the request to it. In this situation the response from the endpoint to the ingress-nginx pod is considered inter-zone traffic and usually costs extra money.

At the time of this writing, GCP charges $0.01 per GB of inter-zone egress traffic according to https://cloud.google.com/compute/network-pricing. According to https://datapath.io/resources/blog/what-are-aws-data-transfer-costs-and-how-to-minimize-them/, Amazon charges the same amount as GCP for cross-zone egress traffic.

This can be a lot of money depending on one's traffic. By teaching ingress-nginx about zones we can eliminate, or at least decrease, this cost.

Arguably, zone-local network latency should also be better than cross-zone latency.

Goals

  • Given a regional cluster running ingress-nginx, ingress-nginx should make a best effort to pick a zone-local endpoint when proxying
  • This should not impact the canary feature
  • ingress-nginx should be able to operate successfully if there are no zonal endpoints

Non-Goals

  • This feature inherently assumes that endpoints are distributed across zones in a way that they can handle all the traffic from ingress-nginx pod(s) in that zone
  • This feature will rely on https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone; it is not this KEP's goal to support other cases

Proposal

The idea here is to have the controller part of ingress-nginx (1) detect what zone its current pod is running in and (2) detect the zone for every endpoint it knows about. After that, it will post that data as part of the endpoints to Lua land. When picking an endpoint, the Lua balancer will try to pick a zone-local endpoint first, and if there is no zone-local endpoint it will fall back to the current behavior.

Initially, this feature should be optional since it is going to make it harder to reason about the load balancing and not everyone might want that.

How does the controller know what zone it runs in? We can have the pod spec pass the node name as an environment variable using the downward API. Upon startup, the controller can get the node details from the API based on the node name. Once the node details are obtained, we can extract the zone from the failure-domain.beta.kubernetes.io/zone annotation. Then we can pass that value to Lua land through the NGINX configuration when loading the lua_ingress.lua module in the init_by_lua phase.
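
A hedged sketch of that startup flow, with getNodeZone standing in for the real Kubernetes API call (the NODE_NAME variable is assumed to be injected via the downward API):

    package main

    import (
        "fmt"
        "os"
    )

    // getNodeZone would fetch the Node object from the API server and return the
    // value stored under the failure-domain.beta.kubernetes.io/zone key.
    func getNodeZone(nodeName string) (string, error) {
        // ... call the Kubernetes API here ...
        return "us-west-2a", nil // placeholder value
    }

    func main() {
        nodeName := os.Getenv("NODE_NAME") // assumed to be injected via the downward API
        zone, err := getNodeZone(nodeName)
        if err != nil {
            panic(err)
        }
        // The controller would pass this value to Lua via the NGINX configuration.
        fmt.Println("running in zone:", zone)
    }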

How do we extract zones for endpoints? We can have the controller watch create and update events on nodes in the entire cluster and, based on that, keep a map of nodes to zones in memory. When we generate the endpoints list, we can access the node name using .subsets.addresses[i].nodeName, fetch the zone from the in-memory map, and store it as a field on the endpoint. This solution assumes the failure-domain.beta.kubernetes.io/zone annotation does not change until the end of the node's life. Otherwise, we would have to watch update events on the nodes as well, and that would add even more overhead.

Alternatively, we can get the list of nodes only when there is no entry in memory for the given node name. This is probably a better solution because then we would avoid watching for API changes on node resources. We can eagerly fetch all the nodes and build the node-name-to-zone mapping on start. From there on, it will sync during endpoint building in the main event loop whenever there is no existing entry for the node of an endpoint. This means an extra API call when the cluster has expanded.
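
A sketch of this lazy node-to-zone cache; lookupNodeZone is a hypothetical stand-in for the API call:

    package main

    import "fmt"

    type zoneCache struct {
        zones map[string]string // nodeName -> zone
    }

    // lookupNodeZone stands in for a call to the Kubernetes API.
    func lookupNodeZone(nodeName string) string { return "us-west-2a" }

    // zoneFor returns the cached zone, fetching it once on a cache miss.
    func (c *zoneCache) zoneFor(nodeName string) string {
        if z, ok := c.zones[nodeName]; ok {
            return z
        }
        // Cache miss: the cluster probably expanded, so do one extra API call.
        z := lookupNodeZone(nodeName)
        c.zones[nodeName] = z
        return z
    }

    func main() {
        c := &zoneCache{zones: map[string]string{"node-a": "us-west-2a"}}
        fmt.Println(c.zoneFor("node-b")) // triggers the lazy lookup
    }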

How do we make sure we do our best to choose a zone-local endpoint? This will be done on the Lua side. For every backend, we will initialize two balancer instances: (1) one with all endpoints and (2) one with only the endpoints corresponding to the current zone for that backend. Then, once we choose which backend needs to serve a given request, we will first try to use the zonal balancer for that backend. If a zonal balancer does not exist (i.e. there is no zonal endpoint), we will use the general balancer. In case of zonal outages, we assume that the readiness probe will fail, the controller will see no endpoints for the backend, and therefore we will use the general balancer.
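
The selection logic lives on the Lua side; the Go sketch below only illustrates the fallback rule with placeholder types:

    package main

    import "fmt"

    type balancer struct{ endpoints []string }

    // pick returns one endpoint; a real balancer would apply round robin, EWMA, etc.
    func (b *balancer) pick() string { return b.endpoints[0] }

    type backend struct {
        all   *balancer // balancer over all endpoints
        zonal *balancer // balancer over zone-local endpoints, nil if there are none
    }

    // balance prefers the zonal balancer and falls back to the general one.
    func (b *backend) balance() string {
        if b.zonal != nil && len(b.zonal.endpoints) > 0 {
            return b.zonal.pick() // prefer a zone-local endpoint
        }
        return b.all.pick() // fall back to the general balancer
    }

    func main() {
        be := &backend{
            all:   &balancer{endpoints: []string{"10.0.1.5:80", "10.0.2.7:80"}},
            zonal: &balancer{endpoints: []string{"10.0.1.5:80"}},
        }
        fmt.Println(be.balance())
    }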

We can enable the feature using a ConfigMap setting. Doing it this way makes it easier to roll back in case of a problem.

Implementation History

  • initial version of KEP is shipped
  • proposal and implementation details are done

Drawbacks [optional]

More load on the Kubernetes API server.
