---
title: Http Routing
weight: 100
---

### Routing switch

The idpbuilder supports creating platforms using either path-based or
subdomain-based routing. The first command below uses subdomain-based routing;
the second switches to path-based routing via `--use-path-routing`:

```shell
idpbuilder create --log-level debug --package https://github.com/cnoe-io/stacks//ref-implementation
```

```shell
idpbuilder create --use-path-routing --log-level debug --package https://github.com/cnoe-io/stacks//ref-implementation
```

However, even though Argo CD eventually reports all deployments as green, not
all of the demo is actually functional (verification?). This is due to
hardcoded values that, for example, point to the path-routed location of Gitea
to access Git repos. Thus, Backstage might not be able to access them.

Within the demo / ref-implementation, a simple search & replace is suggested to
change URLs to fit the given environment. Proper scripting or templating could
take care of that, as the hostnames and necessary properties should be
available. This is, however, a tedious and repetitive task one has to keep in
mind throughout the entire system, which might lead to an explosion of config
options in the future. Code that addresses correct routing is located in both
the stack templates and the idpbuilder code.

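The search & replace step mentioned above could be scripted along these lines.
This is a minimal sketch, not the ref-implementation's actual tooling:
`OLD_URL`, `NEW_URL`, and the function name `rewrite_urls` are hypothetical
placeholders that have to be adapted to the environment at hand.

```shell
# Hypothetical values for illustration only.
OLD_URL="https://cnoe.localtest.me/gitea"   # assumed path-routed location
NEW_URL="https://gitea.cnoe.localtest.me"   # assumed subdomain-routed location

# Reads text on stdin and replaces every occurrence of OLD_URL with NEW_URL.
# '|' serves as the sed delimiter because URLs contain '/'. The dots in
# OLD_URL are treated as regex wildcards, which is good enough for a sketch
# but not strictly exact.
rewrite_urls() {
  sed "s|${OLD_URL}|${NEW_URL}|g"
}

printf 'repoURL: %s/giteaAdmin/example.git\n' "$OLD_URL" | rewrite_urls
```

The same function could be run over each manifest of a checkout. A templating
approach would avoid the rewrite altogether by deriving all URLs from a single
host setting.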
### Cluster internal routing

For the most part, components communicate either with the cluster API using the
default DNS, or with each other via HTTP(S) using the public DNS name/hostname
(plus the path-routing scheme). The latter is necessary because some configs
are visible to and modifiable by users. This includes, for example, the Argo CD
config for components that have to sync to a Gitea Git repo. Using the same URL
for internal and external resolution is imperative.

The idpbuilder achieves transparent internal DNS resolution by overriding the
public DNS name in the cluster's internal DNS server (CoreDNS). Consequently,
requests to the public hostnames from within the cluster resolve to the IP of
the internal ingress controller service. Thus, internal and external requests
take a similar path and run through proper routing (rewrites, SSL/TLS, etc.).

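To illustrate the kind of override described above, the following sketch prints
a CoreDNS `rewrite` rule that maps a public hostname to a cluster-internal
service name. It is not taken from the idpbuilder code; the hostname, the
ingress service name, and the helper `coredns_rewrite_rule` are assumptions
made for this example.

```shell
# Prints a CoreDNS 'rewrite' plugin rule that answers queries for a public
# hostname with the records of a cluster-internal service.
#   $1: public hostname (assumed value below)
#   $2: internal DNS name of the ingress controller service (assumed)
coredns_rewrite_rule() {
  printf 'rewrite name exact %s %s\n' "$1" "$2"
}

coredns_rewrite_rule \
  "gitea.cnoe.localtest.me" \
  "ingress-nginx-controller.ingress-nginx.svc.cluster.local"
```

A line like this would go into the server block of the cluster's Corefile. How
the idpbuilder wires this up in detail is best checked in its source.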
### Conclusion

One has to keep in mind that some specific app features might not work
properly, or only with hacks, when using path-based routing (e.g. the Docker
registry in Gitea). Furthermore, supporting multiple setup strategies will
become cumbersome as the platform grows. We should probably only support one
type of setup to keep the system as simple as possible, but allow modification
if necessary.

DNS solutions like `nip.io` or the already used `localtest.me` mitigate the
need for path-based routing.

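As a small illustration of how such wildcard DNS names are put together:
`nip.io` resolves any name of the form `<anything>.<IPv4>.nip.io` to the
embedded IP address, so a per-service hostname can be derived from the IP
alone. The helper name `nip_host` and the values below are made up for this
sketch.

```shell
# Builds a nip.io hostname for a service.
#   $1: service name (hypothetical)
#   $2: IPv4 address the name should resolve to
nip_host() {
  printf '%s.%s.nip.io\n' "$1" "$2"
}

nip_host gitea 192.168.1.10    # gitea.192.168.1.10.nip.io
nip_host argocd 192.168.1.10   # argocd.192.168.1.10.nip.io
```

`localtest.me` works similarly, except that it and all of its subdomains always
resolve to `127.0.0.1`.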
## Excerpt

HTTP is a cornerstone of the internet due to its high flexibility. Since
HTTP/1.1, each request carries, among other things, a path and a hostname (the
`Host` header). While an HTTP request is sent to a single IP address / server,
these two pieces of data allow (distributed) systems to handle requests in
various ways.

```shell
$ curl -v http://google.com/something > /dev/null

* Connected to google.com (2a00:1450:4001:82f::200e) port 80
* using HTTP/1.x
> GET /something HTTP/1.1
> Host: google.com
> User-Agent: curl/8.10.1
> Accept: */*
...
```

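The two pieces of routing data can be pulled out of a raw request with a few
lines of shell. The following is a minimal sketch; the function name
`parse_request` is made up, and real proxies of course use a proper HTTP
parser.

```shell
# Reads the head of a raw HTTP/1.1 request on stdin and prints
# "<host> <path>", the two values routing decisions are based on.
parse_request() {
  awk '
    NR == 1                { path = $2 }   # request line: METHOD PATH VERSION
    tolower($1) == "host:" { host = $2 }   # header names are case-insensitive
    END {
      sub(/\r$/, "", host)                 # drop the CR of the CRLF line ending
      print host, path
    }
  '
}

printf 'GET /something HTTP/1.1\r\nHost: google.com\r\n\r\n' | parse_request
```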
### Path-Routing

Imagine requesting `http://myhost.foo/some/file.html`. In a simple setup, the
web server that `myhost.foo` resolves to would serve static files from some
directory, e.g. `/<some_dir>/some/file.html`.

In more complex systems, one might have multiple services that fulfill various
roles, for example a service that generates HTML pages of articles from a CMS
and a service that converts images into various formats. Using path-routing,
both services are available on the same host from a user's point of view.

An article served from `http://myhost.foo/articles/news1.html` would be
generated by the article service and point to an image,
`http://myhost.foo/images/pic.jpg`, which in turn is generated by the image
converter service. When a user sends an HTTP request to `myhost.foo`, they hit
a reverse proxy, which forwards the request to some other system based on the
requested path, waits for a response, and subsequently returns that response to
the user.
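The routing decision of such a reverse proxy boils down to matching the request
path against a list of prefixes. A minimal sketch, with hypothetical backend
names (`article-service`, `image-converter`, `default-backend`) standing in for
real upstream addresses:

```shell
# Maps a request path to the name of the downstream service that should
# handle it. The first matching prefix wins.
route_by_path() {
  case "$1" in
    /articles/*) echo "article-service" ;;
    /images/*)   echo "image-converter" ;;
    *)           echo "default-backend" ;;   # fallback, e.g. a 404 page
  esac
}

route_by_path /articles/news1.html   # article-service
route_by_path /images/pic.jpg        # image-converter
```

Real reverse proxies express the same idea declaratively, typically as a list
of path prefixes with one upstream each.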

Such a setup hides the complexity from the user and allows the creation of
large, distributed, scalable systems that act as a unified entity from the
outside. Since everything is served from the same host, the browser is inclined
to trust all downstream services. This allows for easier 'communication'
between services through the browser. For example, cookies could be valid for
the entire host, and thus authentication data could be forwarded to requested
downstream services without the user having to explicitly re-authenticate.

Furthermore, services 'know' their user-facing location by knowing their own
path and the paths to other services, as paths are usually set by convention
and/or hard-coded. In practice, this makes configuration of the entire system
somewhat easier, especially if you have various environments for testing,
development, and production. The hostname of the system does not matter, as one
can use hostname-relative URLs, e.g. `/some/service`.

Load balancing is also easily achievable by running multiple instances of a
service. Most reverse proxy systems are able to apply various load balancing
strategies to forward traffic to downstream systems.

Problems might arise if downstream systems are not built with path-routing in
mind. Some systems must be served from the root of a domain; see, for example,
the container registry spec.

### Hostname-Routing

Each downstream service in a distributed system is served from a different
host, typically a subdomain, e.g. `serviceA.myhost.foo` and
`serviceB.myhost.foo`. This gives services full control over their respective
host, and even allows them to do path-routing within each system. Moreover,
hostname-routing allows the entire system to create more flexible and powerful
routing schemes in terms of scalability. Intra-system communication becomes
somewhat harder, as the browser treats each subdomain as a separate host,
shielding cookies, for example, from one another.

Each host that serves some services requires a DNS entry that has to be
published to the clients (from some DNS server). Depending on the environment,
this can become quite tedious, as DNS resolution on the internet and on
intranets might have to deviate. This applies to intra-cluster communication as
well, as seen with the idpbuilder's platform. In this case, external DNS
resolution has to be replicated within the cluster to be able to use the same
URLs to address, for example, Gitea.

The following example depicts DNS-only routing. By defining separate DNS
entries for each service / subdomain, requests are resolved to the respective
servers. In theory, no additional infrastructure is necessary to route user
traffic to each service. However, as services are completely separated, other
infrastructure like authentication possibly has to be duplicated.

When using hostname-based routing, one does not have to set different IPs for
each hostname. Instead, having multiple DNS entries pointing to the same set of
IPs allows re-using existing infrastructure. As shown below, a reverse proxy is
able to forward requests to downstream services based on the `Host` request
header. This way, a specific hostname can be forwarded to a defined service.

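In code, `Host`-based forwarding is a lookup from hostname to backend. The
following sketch uses the two example hostnames from this section; the backend
names (`service-a`, `service-b`, `default-backend`) and the function name
`route_by_host` are hypothetical.

```shell
# Maps the value of the Host header to a downstream service name.
route_by_host() {
  # Strip an optional ":port" suffix, then lowercase the name, because
  # hostnames are case-insensitive.
  host="${1%%:*}"
  host="$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')"
  case "$host" in
    servicea.myhost.foo) echo "service-a" ;;
    serviceb.myhost.foo) echo "service-b" ;;
    *)                   echo "default-backend" ;;
  esac
}

route_by_host serviceA.myhost.foo        # service-a
route_by_host serviceB.myhost.foo:8443   # service-b
```

Since all hostnames resolve to the same proxy, adding a service only requires
one more DNS entry and one more case in the lookup.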
At the same time, one could imagine a multi-tenant system that differentiates
customer systems by name, e.g. `tenant-1.cool.system` and
`tenant-2.cool.system`. Configured as a wildcard-style domain, `*.cool.system`
could point to a reverse proxy that forwards requests to a tenant's instance of
a system, allowing re-use of central infrastructure while still hosting
separate systems per tenant.

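A proxy behind such a wildcard entry has to derive the tenant from the
hostname. A minimal sketch, assuming that the tenant name is exactly the first
label in front of `cool.system`; the function name `tenant_for_host` is made up
for this example.

```shell
# Prints the tenant name encoded in a hostname of the form
# <tenant>.cool.system, or returns a non-zero status for any other host.
tenant_for_host() {
  host="${1%%:*}"   # strip an optional ":port" suffix
  case "$host" in
    # This sketch accepts exactly one non-empty label as tenant name.
    *.*.cool.system|.cool.system) return 1 ;;
    *.cool.system) printf '%s\n' "${host%.cool.system}" ;;
    *) return 1 ;;
  esac
}

tenant_for_host tenant-1.cool.system   # tenant-1
tenant_for_host tenant-2.cool.system   # tenant-2
```

The returned name could then be used to select the tenant's backend, for
example by looking it up in a table of running instances.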
The implicit dependency on DNS resolution generally makes this kind of routing
more complex and error-prone, as changes to DNS server entries are not always
possible and not everyone is able to make them. Also, local changes to your
`/etc/hosts` file are a constant pain and should be seen as a dirty hack. As
mentioned above, wildcard DNS services like `nip.io` are often helpful in this
case.

### Conclusion

Path-based and hostname-based routing are the two most common methods of HTTP
traffic routing. They can be used separately, but more often they are used in
conjunction. Due to HTTP's versatility, other forms of HTTP routing, for
example based on the `Content-Type` header, are also very common.