feat(docker-registry-caching-concept): added output from ipceicis-1766 into edp-doc

Stephan Lo 2025-03-10 14:45:51 +01:00
parent 9a3f17ec02
commit e0688a01c5
18 changed files with 660 additions and 0 deletions


@ -0,0 +1,25 @@
# Localdev Registry
This [repository](https://forgejo.edf-bootstrap.cx.fg1.ffm.osc.live/DevFW/localdev-registry) provides documentation and code for using caching mechanisms for Docker image pulls.
## Overview: Usage scenarios
There are two mechanisms: registry mirroring and registry proxying.
While the first redirects image pull requests to a real registry clone that serves OCI requests and thus acts as a stand-in for the original registry, the second just caches HTTP responses and acts as an object cache that speeds up traffic.
As we need these two mechanisms in two settings, 'local kind' and 'local docker', this results in four scenarios that need to be implemented:
1. kind-mirror
1. kind-proxy-cache
1. docker-mirror
1. docker-proxy-cache
There is information about how to deploy and use these scenarios manually on the local box, and there is code for automating the deployment either by shell commands or in edpbuilder provisioning code.
## How to use this documentation
As of sprint 1, you can
1. read the [conceptual overview](./1-registry-mirror-and-cache-proxy-theory.md)
1. [set up one or a combination of the usage scenarios manually on your box](./2-registry-mirror-and-cache-proxy-manual-installation.md)
Additionally, there is documentation about ['hacking' a local mirror](./3-registry-mirror-and-cache-proxy-hacks.md), in case you want to test the mirroring scenarios independently of external mirror registries.


@ -0,0 +1,200 @@
# Introduction
This documentation describes how docker/OCI image pulls on a local linux box can be configured to connect to mirrors or pull through cache proxies.
The audience is developers who want to have faster or more reliable pulls, and want to avoid rate limits from external registries.
## Overview
This documentation has three parts:
1. Background and (Docker) image basics - this file
1. Installation of mirrors and caches
1. Some hacks, e.g. a local mirror
## It's all about Processes!
We talk about 'docker images' and the way we 'pull' them from 'registries'.
So let's first take a step back and think about the context in which we use these terms, and why.
The reason is that we want to run containers as processes, and those processes somehow need to come to life; images are the low-level artifacts that running containers are built from.
> This is our developer's intention! We want processes running our application code!
![alt text](./img/1-linux-processes.png)
### Spawn processes by different stacks
The funny thing is that, however you spawn a process, in the end you see it as a normal Linux process:
#### Run by a shell
```bash
bash -c 'exec -a my-job-bash sleep infinity' &
```
#### Run by Docker
```bash
docker run --restart=always -d ubuntu bash -c 'exec -a my-job-docker sleep infinity'
```
#### Run by Kubernetes
```bash
kubectl run k8sjob --image=ubuntu -- bash -c 'exec -a my-job-k8s sleep infinity'
```
#### Outcome
Your process list should look something like this:
```bash
~ $ ps ax | grep my-job
22529 pts/1 S 0:00 my-job-bash infinity
23154 ? Ss+ 0:00 my-job-docker infinity
24163 ? Ss 0:00 my-job-k8s infinity
```
### Extra
Try to kill the jobs and see what happens!
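For the shell-spawned job, the experiment can be sketched as follows. This is a minimal illustration that assumes a Linux host with `/proc`; the variable names `job_pid` and `job_status` are only chosen for this example. For the Docker and Kubernetes variants, the restart policy (`--restart=always`, or the default restart policy of a pod created with `kubectl run`) would be expected to bring the process back under a new PID instead.

```bash
#!/usr/bin/env bash
# Start the shell variant of the job and remember its PID.
bash -c 'exec -a my-job-bash sleep infinity' &
job_pid=$!
sleep 1                                   # give exec a moment to replace the shell

# Linux only: print the command line as the kernel reports it.
tr '\0' ' ' < "/proc/$job_pid/cmdline"
echo

# Send SIGTERM and collect the exit status of the job.
kill "$job_pid"
job_status=0
wait "$job_pid" || job_status=$?          # 128 + signal number for a signalled child
echo "job exited with status $job_status"
```

Nothing restarts this process: it was started by a plain shell, so once it is killed it is gone.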
## Container Images
With the orchestration overhead of Docker and Kubernetes, we need 'images' as the base for the runtime.
> Hint: You can analyze a local image with https://github.com/containers/skopeo
* https://earthly.dev/blog/docker-image-storage-on-host/
* https://www.freecodecamp.org/news/where-are-docker-images-stored-docker-container-paths-explained/
```bash
~ $ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu latest a04dc4851cbc 3 weeks ago 78.1MB
kindest/node latest 2d9b4b74084a 2 months ago 1.05GB
ghcr.io/catthehacker/ubuntu act-latest 0fbcdbe238bf 2 months ago 1.47GB
registry 2 282bd1664cf1 16 months ago 25.4MB
ubuntu 18.04 f9a80a55f492 21 months ago 63.2MB
act-actions-action1-dockeraction latest f9a80a55f492 21 months ago 63.2MB
rpardini/docker-registry-proxy 0.6.2 6bbe4e47a504 4 years ago 12.3MB
registry.k8s.io/pause latest 350b164e7ae1 10 years ago 240kB
~ $ docker exec -it cluster-with-registry-mirror-control-plane crictl image ls
IMAGE TAG IMAGE ID SIZE
docker.io/kindest/kindnetd v20241212-9f82dd49 d300845f67aeb 39MB
docker.io/kindest/local-path-helper v20241212-8ac705d0 baa0d31514ee5 3.08MB
docker.io/kindest/local-path-provisioner v20241212-8ac705d0 04b7d0b91e7e5 22.5MB
docker.io/library/ubuntu latest a04dc4851cbcb 29.8MB
registry.k8s.io/coredns/coredns v1.11.3 c69fa2e9cbf5f 18.6MB
registry.k8s.io/etcd 3.5.16-0 a9e7e6b294baf 57.7MB
registry.k8s.io/kube-apiserver-amd64 v1.32.0 73afaf82c9cc3 98MB
registry.k8s.io/kube-apiserver v1.32.0 73afaf82c9cc3 98MB
registry.k8s.io/kube-controller-manager-amd64 v1.32.0 f3548c6ff8a1e 90.8MB
registry.k8s.io/kube-controller-manager v1.32.0 f3548c6ff8a1e 90.8MB
registry.k8s.io/kube-proxy-amd64 v1.32.0 aa194712e698a 95.3MB
registry.k8s.io/kube-proxy v1.32.0 aa194712e698a 95.3MB
registry.k8s.io/kube-scheduler-amd64 v1.32.0 faaacead470c4 70.6MB
registry.k8s.io/kube-scheduler v1.32.0 faaacead470c4 70.6MB
registry.k8s.io/pause 3.10 873ed75102791 320kB
```
### Distributable Images are stored in Registries
* Images that have not been built into the local host image store come from registries.
* They are complex compounds whose parts are stored in an `image repository`.
* For now it's not important how they are composed; we just want to understand how we pull them.
* **The parts we pull are what we want to cache!**
![alt text](img/1-images.png)
### Image Name Syntax
The naming conventions are a bit fuzzy. The best definition is this one:
*`registry/namespace/repo:tag`*
![alt text](./img/1-image-naming-convention.png)
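To make the `registry/namespace/repo:tag` convention concrete, here is a minimal sketch that expands a short image name into its full form. It assumes the Docker Hub defaults (`docker.io` as registry, `library` as namespace, `latest` as tag) and does not handle digests (`@sha256:...`). The function name `normalize_image_ref` is hypothetical and only used for illustration.

```bash
#!/usr/bin/env bash
# Expand a short image name into registry/namespace/repo:tag.
# Assumptions: Docker Hub defaults apply; digest references are not handled.
normalize_image_ref() {
  local ref=$1 registry path tag first
  local last=${ref##*/}

  # The tag is whatever follows a ':' in the last path component.
  if [[ $last == *:* ]]; then
    tag=${last##*:}
    ref=${ref%:*}
  else
    tag=latest
  fi

  # The first component counts as a registry host only if it looks like one.
  first=${ref%%/*}
  if [[ $ref == */* && ( $first == *.* || $first == *:* || $first == localhost ) ]]; then
    registry=$first
    path=${ref#*/}
  else
    registry=docker.io
    path=$ref
  fi

  # Official Docker Hub images live in the 'library' namespace.
  if [[ $registry == docker.io && $path != */* ]]; then
    path=library/$path
  fi

  printf '%s/%s:%s\n' "$registry" "$path" "$tag"
}

normalize_image_ref ubuntu
normalize_image_ref kindest/node
normalize_image_ref registry.k8s.io/pause:3.10
```

This matters for caching because the registry part decides which mirror or proxy rule applies to a pull.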
## Engine
Next let's find out which components do the pull, so that we know what needs to be configured to use our mirror and caching components.
Images are prepared and processed by so-called 'container engines', so that we end up with a running container. These engines comply with both the OCI specification (for the underlying runtime) and the CRI specification (for Kubernetes, to abstract away the container engine details).
### Different possibilities for the engine
<!--https://docs.google.com/presentation/d/1S-JqLQ4jatHwEBRUQRiA5WOuCwpTUnxl2d1qRUoTz5g p.37 -->
![alt text](./img/2-differnt-engines.png)
## OCI/CRI stack
As we want to run kind and docker we focus on engines which are both Docker and Kubernetes compliant.
<!-- https://www.kreyman.de/index.php/others/linux-kubernetes/232-unterschiede-zwischen-docker-containerd-cri-o-und-runc -->
![alt text](./img/docker-containerd-cri-o-und-runc.png)
This is possible with `containerd`, which gives us these two engine stacks for Docker and Kubernetes:
> Originally containerd came from Docker, but it has a Plugin called 'cri-shim'
### Docker - container engine stack
* dockerd - container daemon
* containerd - high-level container runtime
* runc - low-level container runtime
### Kubernetes - container engine stack
* option 1 CRI-O-based:
  * CRI-O - high-level container runtime
  * runc - low-level container runtime
* option 2 containerd-based:
  * containerd - high-level container runtime
  * runc - low-level container runtime
The central role of containerd is also shown [in this even broader picture of tools and runtimes](https://sarusso.github.io/blog/container-engines-runtimes-orchestrators.html):
![alt text](./img/container-engines-runtimes-orchestrators.png)
### Kind uses containerd
The CRI part of the stack is implemented in our Kind and Docker case by containerd (and runc)
```bash
~ $ k get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
cluster-with-registry-mirror-control-plane Ready control-plane 7h4m v1.32.0 172.18.0.4 <none> Debian GNU/Linux 12 (bookworm) 6.8.0-53-generic containerd://1.7.24
```
### How this relates to images
As a side note: how the engine operates on images before passing them to the low-level runtime `runc` is nicely shown here:
<!--https://docs.google.com/presentation/d/1S-JqLQ4jatHwEBRUQRiA5WOuCwpTUnxl2d1qRUoTz5g p.36 -->
![alt text](./img/1-container-engine.png)
## Outcome
In a local Linux host setup with kind (Kubernetes) and Docker as container engines, we can focus on dockerd and containerd when we want to handle image pulling.
We must also focus on dockerd (not only containerd, although it is also included in the Docker engine), because dockerd overrides the containerd mirror config.
## References
* https://collabnix.com/monitoring-containerd
* https://github.com/containerd/containerd/blob/main/docs/cri/registry.md#configure-registry-credentials-example---gcr-with-service-account-key-authentication
* https://medium.com/@charled.breteche/caching-docker-images-for-local-kind-clusters-252fac5434aa
* https://maelvls.dev/docker-proxy-registry-kind/


@ -0,0 +1,256 @@
# Installation
This documentation describes how docker/OCI image pulls on a local linux box can be configured to connect to mirrors or pull through cache proxies.
The audience is developers who want to have faster or more reliable pulls, and want to avoid rate limits from external registries.
## Introduction
There are four different scenarios, which can be combined arbitrarily:
| Registry type | kind | docker |
| --- | --- | --- |
| mirror | scenario 1 | scenario 3 |
| cache | scenario 2 | scenario 4 |
The scenarios can be combined arbitrarily, but when you use a mirror, you shouldn't forget to also use it in kind (scenario 1):
| combination | s2 (kind + cache) | s3 (docker + mirror) | s4 (docker + cache) |
| --- | --- | --- | --- |
| s1 (kind + mirror) | Mirror and cache only for kind, you probably don't use Docker too much | Both Docker and kind are mirrored and don't have a cache | |
| s2 (kind + cache) | | doesn't make sense without s1 | Both kind and Docker are cached and don't need a mirror, you probably have unrestricted internet access |
| s3 (docker + mirror) | | | doesn't make sense without s1 |
## Preliminaries
We will need two container images in our local image store before we change the pull configuration.
So make sure that you have already pulled them successfully, or do it right now:
```bash
# precondition: your current docker config is able to pull images from docker.io
docker pull registry:2
docker pull rpardini/docker-registry-proxy:0.6.2
```
## Scenario 1: Registry Mirror on Kind for MMS company network
### What you need
You need to know the registries you want to mirror, and the address of the mirror, which here will be the MMS Artifactory mirror. (Remark: see the last chapter for running a test registry as a mirror on your box.)
> Hint: Typically only 'docker.io' needs to be mirrored.
### Install
The installation is done by setting the `containerd` configuration with a registry mirror entry during kind setup.
```bash
# in MMS:
MIRROR_NAME=common-docker.artifacts.mms-at-work.de
KIND_CLUSTER_NAME=cluster-with-registry-mirror
```
In the following kind config we only mirror `docker.io`. You can append [as many mirror entries as you want](https://github.com/containerd/containerd/blob/main/docs/cri/registry.md#configure-registry-credentials-example---gcr-with-service-account-key-authentication).
```bash
cat <<EOF | kind create cluster --name "$KIND_CLUSTER_NAME" --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          # the endpoint is given as a URL including the scheme
          endpoint = ["https://${MIRROR_NAME}"]
EOF
```
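If more than one registry should be mirrored, the kind config can be generated instead of written by hand. The following is a minimal sketch, not part of the repository's code: the function name `kind_mirror_config` and the mirror hosts in the example call are hypothetical, and each argument is assumed to have the form `<registry>=<mirror endpoint URL>`.

```bash
#!/usr/bin/env bash
# Print a kind cluster config that mirrors several registries.
# Each argument has the form <registry>=<mirror-endpoint-url>.
kind_mirror_config() {
  cat <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
EOF
  local pair
  for pair in "$@"; do
    # Text before the first '=' is the registry, the rest is the endpoint.
    printf '    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."%s"]\n' "${pair%%=*}"
    printf '      endpoint = ["%s"]\n' "${pair#*=}"
  done
}

# Example call with hypothetical mirror hosts:
kind_mirror_config \
  "docker.io=https://mirror.example.internal" \
  "quay.io=https://quay-mirror.example.internal"
```

The output can be piped into `kind create cluster --name "$KIND_CLUSTER_NAME" --config=-` in the same way as the here-document above.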
> Hint: To get logs at info or debug level, see section 'hacks'.
> Hint: You can also change the `containerd` config in a running node. Then just restart containerd with `systemctl restart containerd`.
## Outcome
Now a typical company network blocker is removed: every containerd (one per cluster node) knows how to bypass the Docker Hub rate limit by requesting `docker.io` images from the mirror first:
![alt text](./img/2-scenario-1-kind-mirror.png)
The drawback is that you still need the bandwidth to the mirror. So if you work from home, you will still pull each new image over the network on every new kind cluster creation.
## Scenario 2: Registry Cache Proxy on Kind
Now it gets trickier, or let's say 'even more local': we will install a [cache proxy](https://github.com/rpardini/docker-registry-proxy) as a Docker container process on our host.
### What you need
You need the [caching proxy](https://github.com/rpardini/docker-registry-proxy). We will install it on your host.
### Install
```bash
CACHE_PROXY_NAME=docker_registry_proxy
DOCKER_KIND_NETWORK=kind
```
#### Caching Proxy
Run the caching proxy `https://github.com/rpardini/docker-registry-proxy`:
```bash
# we start the caching proxy. It will also cache the mirror, if used
# tip: this container is very long running and stores the cached data in a subfolder from your pwd.
# So it's recommended to start it in a higher level folder.
docker run -itd \
--restart always \
--name $CACHE_PROXY_NAME \
--network $DOCKER_KIND_NETWORK \
--hostname $CACHE_PROXY_NAME \
-p 0.0.0.0:${HOST_PORT:-3128}:3128 \
-e ENABLE_MANIFEST_CACHE=true \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
-e REGISTRIES="$MIRROR_NAME k8s.gcr.io gcr.io quay.io docker.elastic.co" \
rpardini/docker-registry-proxy:0.6.2
```
#### Kind cluster
Next start your kind cluster if not running yet. You also can use the cluster from above with the mirror set.
> When you run a test registry mirror (see section 'hacks'), you need either to provide the mirror's certificate inside the proxy server or to set up the proxy server without TLS verification (`-e VERIFY_SSL=false`). In the former case everything will work as expected; in the latter the mirror will complain about missing TLS connectivity to the proxy.
#### Configure nodes of the Kind cluster
Now each node's `containerd` needs to be configured to use the Cache proxy on the host:
```bash
#!/bin/sh
# https://github.com/rpardini/docker-registry-proxy#kind-cluster
SETUP_URL=http://$CACHE_PROXY_NAME:3128/setup/systemd
pids=""
for NODE in $(kind get nodes --name "$KIND_CLUSTER_NAME"); do
  docker exec "$NODE" sh -c "\
    curl $SETUP_URL \
    | sed s/docker\.service/containerd\.service/g \
    | sed '/Environment/ s/$/ \"NO_PROXY=127.0.0.0\/8,10.0.0.0\/8,172.16.0.0\/12,192.168.0.0\/16\"/' \
    | bash" & pids="$pids $!" # Configure every node in background
done
wait $pids # Wait for all configurations to end
```
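The two `sed` calls in the loop are easier to follow outside the nested quoting. The sketch below applies the same rewrites to a small stand-in script. The input in `sample_setup_script` is a hypothetical example, not the real content served by the proxy at `/setup/systemd`, and both function names are made up for this illustration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the script served at /setup/systemd;
# the real content may differ.
sample_setup_script() {
  cat <<'EOF'
mkdir -p /etc/systemd/system/docker.service.d
Environment="HTTP_PROXY=http://docker_registry_proxy:3128/"
systemctl restart docker.service
EOF
}

# The same two rewrites as in the node loop:
# 1. target containerd instead of docker
# 2. append a NO_PROXY entry to every Environment line
rewrite_for_containerd() {
  sed 's/docker\.service/containerd.service/g' \
    | sed '/Environment/ s|$| "NO_PROXY=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"|'
}

sample_setup_script | rewrite_for_containerd
```

The `NO_PROXY` entry keeps traffic to private address ranges, such as cluster-internal traffic, away from the proxy.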
## Outcome
With the proxy configured, all requests, whether addressed to a mirror or to a registry, go through the proxy.
If an image is not in the cache or needs to be updated, the proxy first checks whether a mirror should be contacted.
![alt text](./img/2-scenario-2-kind-cache.png)
## Scenario 3: Registry Mirror on Docker
> Be aware that Docker only enables a mirror for `docker.io`
### What you need
You need to know the address of the mirror. Here we reuse the ${MIRROR_NAME} from scenario 1.
Only `docker.io` will be mirrored; this is a restriction of Docker.
### Run
```bash
# as root
cat << EOD > /etc/docker/daemon.json
{
  "metrics-addr" : "127.0.0.1:9323",
  "experimental" : true,
  "features": { "buildkit": true },
  "registry-mirrors": ["https://${MIRROR_NAME}"],
  "insecure-registries" : []
}
EOD
```
```bash
# as root
systemctl restart docker
```
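Note that `cat << EOD > /etc/docker/daemon.json` overwrites an existing daemon configuration. A more careful variant could keep a backup first. The following is a minimal sketch under that assumption: the function name `write_daemon_json` is hypothetical, it writes only the `registry-mirrors` key, and it is demonstrated against a scratch file rather than `/etc/docker/daemon.json`.

```bash
#!/usr/bin/env bash
# Write a daemon.json with a registry mirror, keeping a backup of any
# existing file. Both arguments are supplied by the caller; nothing here
# restarts Docker.
write_daemon_json() {
  local target=$1 mirror_url=$2
  if [ -f "$target" ]; then
    cp "$target" "$target.bak"   # keep the previous configuration
  fi
  cat > "$target" <<EOF
{
  "registry-mirrors": ["${mirror_url}"]
}
EOF
}

# Dry run against a scratch file instead of /etc/docker/daemon.json:
scratch=$(mktemp -d)
write_daemon_json "$scratch/daemon.json" "https://${MIRROR_NAME:-mirror.example.internal}"
cat "$scratch/daemon.json"
```

As in the steps above, the Docker daemon still has to be restarted before a changed `daemon.json` takes effect.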
## Outcome
![alt text](./img/2-scenario-3-docker-mirror.png)
## Scenario 4: Registry Cache Proxy on Docker
Last but not least, we connect our host's Docker engine to the proxy cache.
### What you need
You need the Caching proxy. We will reuse it from scenario 2.
### Run
We apply the same systemd settings to dockerd on the host as we did to containerd in the kind nodes.
```bash
# as root
mkdir -p /etc/systemd/system/docker.service.d
cat << EOD > /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://localhost:3128/"
Environment="HTTPS_PROXY=http://localhost:3128/"
Environment="NO_PROXY=localhost,127.0.0.1,gitea.poc.edp.localtest.me"
EOD
curl http://localhost:3128/ca.crt > /usr/share/ca-certificates/docker_registry_proxy.crt
if grep -qF "docker_registry_proxy.crt" /etc/ca-certificates.conf; then
  echo "certificate refreshed"
else
  echo "docker_registry_proxy.crt" >> /etc/ca-certificates.conf
fi
update-ca-certificates --fresh
```
Now reload and restart:
```bash
# as root
systemctl daemon-reload
systemctl restart docker
```
## Outcome
With the proxy configured, all requests, whether addressed to a mirror or to a registry, go through the proxy.
If an image is not in the cache or needs to be updated, the proxy first checks whether a mirror should be contacted.
![alt text](./img/2-scenario-4-docker-cache.png)
In the `docker info` output you can see the proxy and mirror settings:
```bash
~ $ docker info
Client: Docker Engine - Community
Version: 27.0.2
Server:
...
HTTP Proxy: http://localhost:3128/
HTTPS Proxy: http://localhost:3128/
No Proxy: localhost,127.0.0.1,gitea.poc.edp.localtest.me
Experimental: true
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://registry-1.docker.io.mirror.test/
```


@ -0,0 +1,179 @@
# Hacks
This documentation describes how docker/OCI image pulls on a local linux box can be configured to connect to mirrors or pull through cache proxies.
The audience is developers who want to have faster or more reliable pulls, and want to avoid rate limits from external registries.
This part is called 'hacks' and describes some more hands-on components and investigations on the command line.
## Create your own registry mirror to test a kind mirror setting
Maybe you don't have or need a mirror, but you would like to run all scenarios of part 2 and thus need a local mirror.
Or you would like to investigate the handshaking between mirror and cache and thus need the logs of the mirror.
```bash
# the name of our mirror
MIRROR_NAME=registry.docker.io.mirror.test
# the mirror will be accessible by its host name in the kind network
DOCKER_KIND_NETWORK=kind
```
## The registry needs TLS
```bash
# create a temporary directory
mkdir registry-certs
```
```bash
# cert config
cat <<EOF>openssl-${MIRROR_NAME}.cnf
[req]
default_bits = 2048
default_keyfile = domain.key
distinguished_name = req_distinguished_name
x509_extensions = v3_ca
req_extensions = v3_ca
prompt = no
[req_distinguished_name]
countryName = DE
stateOrProvinceName = SomeState
localityName = SomeCity
organizationName = MyCompany
organizationalUnitName = IT
commonName = ${MIRROR_NAME}
[v3_ca]
subjectAltName = @alt_names
[alt_names]
DNS.1 = ${MIRROR_NAME}
EOF
```
```bash
# create self signed cert
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout registry-certs/${MIRROR_NAME}.key -out registry-certs/${MIRROR_NAME}.crt -config openssl-${MIRROR_NAME}.cnf
```
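Clients written in Go, such as containerd, match the host name against the certificate's subjectAltName rather than the common name, so it can be worth checking that the SAN entry actually ended up in the certificate. The sketch below is self-contained: it creates a throwaway certificate in a temporary directory with a reduced config and prints its SAN entries. The variable names `workdir`, `name` and `cert_san` are illustrative only; to inspect the real certificate, point the `openssl x509` call at `registry-certs/${MIRROR_NAME}.crt` instead.

```bash
#!/usr/bin/env bash
# Create a throwaway self-signed certificate and show its SAN entries.
workdir=$(mktemp -d)
name=${MIRROR_NAME:-registry.docker.io.mirror.test}

# Reduced version of the cert config used above.
cat > "$workdir/openssl.cnf" <<EOF
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
commonName = ${name}
[v3_ca]
subjectAltName = DNS:${name}
EOF

openssl req -x509 -nodes -days 1 -newkey rsa:2048 \
  -keyout "$workdir/demo.key" -out "$workdir/demo.crt" \
  -config "$workdir/openssl.cnf" 2>/dev/null

# The line after the 'Subject Alternative Name' header lists the entries.
cert_san=$(openssl x509 -in "$workdir/demo.crt" -noout -text \
  | grep -A1 'Subject Alternative Name' | tail -n 1)
echo "SAN entries: $cert_san"
```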
### Now run the registry
```bash
# run registry as mirror
docker run -d \
--name ${MIRROR_NAME} \
--network $DOCKER_KIND_NETWORK \
-p 443:443 \
-v $(pwd)/registry-certs:/certs \
-e REGISTRY_HTTP_ADDR=0.0.0.0:443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/${MIRROR_NAME}.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/${MIRROR_NAME}.key \
-e REGISTRY_PROXY_REMOTEURL="https://registry-1.docker.io" \
registry:2
```
### Next run the kind cluster
```bash
# create kind cluster
cat <<EOF | kind create cluster --name cluster-with-registry-mirror --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          # the endpoint is given as a URL including the scheme
          endpoint = ["https://${MIRROR_NAME}"]
      [plugins."io.containerd.grpc.v1.cri".registry.configs]
        [plugins."io.containerd.grpc.v1.cri".registry.configs."${MIRROR_NAME}".tls]
          insecure_skip_verify = true
EOF
```
### Log the registry and do a deployment
```bash
# in another terminal
docker logs -f ${MIRROR_NAME}
```
```bash
# check images in the cluster before deployment
docker exec -it cluster-with-registry-mirror-control-plane crictl image ls
# do deployment
kubectl run busybox --image=busybox -- /bin/sh -c "sleep 3600"
# check images in the cluster again, you should see busybox
docker exec -it cluster-with-registry-mirror-control-plane crictl image ls
```
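The image may not show up immediately after `kubectl run`, because the pull takes a moment. A small polling helper can cover that gap. This is a minimal sketch: the function names `retry` and `node_has_image` are hypothetical helpers for this example, and the node name is the one used in this section.

```bash
#!/usr/bin/env bash
# Re-run a command until it succeeds or the attempts are used up.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Hypothetical check: is an image with the given name present in the node?
node_has_image() {
  docker exec cluster-with-registry-mirror-control-plane crictl image ls | grep -q "$1"
}

# Example (requires the running kind cluster from above):
# retry 30 2 node_has_image busybox
```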
## journalctl
You can also check the containerd logs:
```bash
docker exec -it cluster-with-registry-mirror-control-plane journalctl -u containerd
```
See also:
* Logging variants: https://www.baeldung.com/ops/containerd-check-logs
* Monitoring containerd: https://collabnix.com/monitoring-containerd/
### debug journalctl
* https://gvisor.dev/docs/
* https://gvisor.dev/docs/user_guide/containerd/configuration/
```bash
cat <<EOF | kind create cluster --name cluster-with-registry-mirror --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [debug]
    level="debug"
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          # the endpoint is given as a URL including the scheme
          endpoint = ["https://${MIRROR_NAME}"]
EOF
```
## Integrate registry proxy
If you have a running registry proxy which also proxies the mirror, e.g. started like this
```bash
CACHE_PROXY_NAME=docker_registry_proxy
docker run -itd \
--restart always \
--name $CACHE_PROXY_NAME \
--network $DOCKER_KIND_NETWORK \
--hostname $CACHE_PROXY_NAME \
-p 0.0.0.0:${HOST_PORT:-3128}:3128 \
-e ENABLE_MANIFEST_CACHE=true \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
-e REGISTRIES="$MIRROR_NAME k8s.gcr.io gcr.io quay.io docker.elastic.co" \
rpardini/docker-registry-proxy:0.6.2
```
then you need to make the proxy aware of the mirror's certificate.
### Set the mirror CA in the proxy
The proxy verifies the TLS certificates of its upstreams, so we need to add the mirror's CA certificate to the proxy's trust store.
```bash
# uses MIRROR_NAME and CACHE_PROXY_NAME as set above
docker cp "registry-certs/${MIRROR_NAME}.crt" "${CACHE_PROXY_NAME}:/"
docker exec -it "$CACHE_PROXY_NAME" bash -c "cat /${MIRROR_NAME}.crt >> /etc/ssl/certs/ca-certificates.crt"
docker exec -it "$CACHE_PROXY_NAME" bash -c 'kill -SIGHUP $(cat /run/nginx.pid)'
```
