Caching in MVP1
Requirements
-
We want to cache artifacts from upstream repositories in order to
- Avoid rate limiting (docker hub)
- Improve on download speed
- Improve on availability
-
We want to cache container images
- Docker Hub
- GCR
- Quay
-
We want to cache common software dependency artifacts of various programming languages
- Maven/Ivy Java
- Go
- NPM
- Rust
- PyPI
-
Must be easily configurable / manageable
- Static config
- API config (REST)
-
Must store artifacts permanently
- Resetting the cache (delete everything) should be easy, though
-
Currently out of scope
- Auth: Cache provides data to anyone who can reach it
-
Nice to have
- Repo Cache: Can store uploaded artifacts
Architectural Solutions
File System-Based Caching
-
Re-using artifacts stored on the local file system
- e.g. backup and restore the node_modules directory
- Setup within pipelines
-
Important: proper cache key selection
-
Performance depends on the cache's storage location
- on node: fast but localized to node
- network storage: still has to download cache archive
-
Pro: Artifacts are downloaded directly from upstream, no further config needed
- Con: Does not address rate limiting concerns for initial cache warm up
-
Pro: No extra config needed in tooling apart from the pipeline cache config
-
Has to be stored somewhere?
- GitHub Actions / GitLab typically manage this
- similar to local dev env
-
Con: State management
- Update the cache if new dependencies are used/requested
- Dirty state (looking at you, maven)
- Impure behaviour possible, creating side effects
- Integrity checks of package managers might be bypassed at this point
-
Con: Duplicate content
-
Con: Invalidation needed at some point
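
The point about proper cache key selection can be illustrated with a minimal sketch. It assumes the key should change whenever any dependency manifest changes, and only then. The function name `cache_key`, the prefix and the glob patterns are hypothetical and not taken from any specific pipeline tool.

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, root: Path, patterns: list[str]) -> str:
    """Derive a cache key from the content of all matching manifest files."""
    digest = hashlib.sha256()
    # Sort the files so the key does not depend on file system ordering.
    files = sorted({p for pattern in patterns for p in root.glob(pattern) if p.is_file()})
    for path in files:
        # Include the relative path so that moving a file also changes the key.
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return f"{prefix}-{digest.hexdigest()[:16]}"


if __name__ == "__main__":
    # Hypothetical usage: one key for all pom.xml files below the current directory.
    print(cache_key("maven", Path("."), ["**/pom.xml"]))
```

A key that is too coarse (e.g. only the branch name) would keep serving a stale cache, while a key that is too fine (e.g. the commit hash) would rarely produce a cache hit.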
Pull-Through Cache
-
Mirror/Proxy repo for upstream repo
- Downloads artifacts transparently from upstream when requested
- Downloaded artifacts are stored locally in the mirror for faster access
-
Pro: Can be re-used in pipelines, dev machines, cloud/prod environments
-
Pro: Little state management necessary if any
-
Con: Requires extra config in tooling, build tools, containerd, etc.
-
Using only the pull-through cache should be fast enough for builds in CI
- Reproducible builds ftw
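
The pull-through behaviour described above (fetch from upstream on the first request, serve from local storage afterwards) can be sketched as follows. This is a simplified illustration under assumptions, not how any of the candidates below implement it. `PullThroughCache` and `fetch_upstream` are hypothetical names, and the upstream is passed in as a plain function so the sketch stays independent of any HTTP client.

```python
import hashlib
from pathlib import Path
from typing import Callable


class PullThroughCache:
    """Serve artifacts from local storage and fetch them from upstream on a miss."""

    def __init__(self, storage_dir: Path, fetch_upstream: Callable[[str], bytes]):
        self.storage_dir = storage_dir
        self.fetch_upstream = fetch_upstream  # hypothetical upstream client

    def _local_path(self, artifact: str) -> Path:
        # Hash the artifact name so arbitrary names cannot escape storage_dir.
        return self.storage_dir / hashlib.sha256(artifact.encode()).hexdigest()

    def get(self, artifact: str) -> bytes:
        path = self._local_path(artifact)
        if path.exists():
            return path.read_bytes()  # cache hit: no upstream request
        data = self.fetch_upstream(artifact)  # cache miss: ask upstream once
        path.parent.mkdir(parents=True, exist_ok=True)
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)  # rename so readers never see a partial file
        return data
```

Clients only ever talk to the cache, which is why the tooling needs the extra config mentioned above but no state management of its own.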
Solution Candidates
Forgejo Runner Cache
- Common actions like setup-java do a good job as they create dependencies on all build config files (e.g. all pom.xml)
- Invalidation if there is any change to dependencies etc.
Nexus
-
Open source / free version
- EPL License allows commercial distribution
-
OSS version only has an extremely limited feature set of supported repository types.
- basically only maven support
- does not suffice for our use case
-
Community Edition has more features but is limited in sizing. Upgrade to Pro edition necessary if those limits are exceeded.
Artifactory
-
Open source / free version
- Limited feature set
- Separate distributions per repo type java / container / etc
-
Inconvenient and insufficient for our use case
-
Full feature set requires paid license
- License evaluation needed (EULA)
Artipie
-
Self-hosted and upstream artifact caching
-
MIT License
-
might be abandoned / low dev activity / needs new maintainer
- However, technically it looks extremely promising
- Initial setup does not run out of the box correctly, needs some love
-
Mostly headless
- Brings a limited web interface (repo creation, artifact viewing)
-
Buggy default config
- config changes require restart, which seems to be a bug?
-
Easy to set up, once bugs and buggy config are mitigated/worked around
-
File system and object storage supported
-
No databases required
-
Pro: Config in yaml file
-
Due to its simplicity it might be a good candidate for a first upstream caching solution
Pulp
-
Self-hosted and upstream artifact caching
-
GPL 2.0 License
-
Pull-Through Caches are only technical previews and might not work correctly
- Pull-through cache does not fit into the concept of how artifacts are stored and tracked
- Intended workflow is to sync dedicated artifacts with some upstream repo, not the entire repo
-
Setup and config are quite complex
- Built for high availability
-
File system and object storage supported
-
Requires SQL Db (Postgres) and possibly Redis
kube-image-keeper
-
Creates a DaemonSet, installing a service on each worker node
-
Works within the cluster and rewrites image coordinates on the fly
-
Pro: fine grained caching control
- select/exempt images / namespaces
- cache invalidation
-
Pro: config within k8s or as k8s objects
-
Con: Invasive
-
Con: Rewrites image coordinates using a mutating webhook
-
Con: Must be hosted within each (workload) cluster
-
Con (BLOCKER): Cannot handle image digests due to manifest rewrites
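
To illustrate what rewriting image coordinates means in general, here is a minimal sketch. It is not kube-image-keeper's actual code or naming scheme: the function `rewrite_image` and the cache host `cache.example.internal:5000` are hypothetical. The only convention relied on is Docker's usual reference normalization (implicit `docker.io` registry and `library/` namespace).

```python
def rewrite_image(image: str, cache_host: str = "cache.example.internal:5000") -> str:
    """Prefix an image reference with a cache host, making the upstream registry explicit."""
    first, _, rest = image.partition("/")
    # The first path component is a registry host if it contains "." or ":"
    # or equals "localhost"; otherwise the image lives on Docker Hub.
    has_registry = bool(rest) and ("." in first or ":" in first or first == "localhost")
    if not has_registry:
        # Docker Hub defaults: implicit registry, "library/" for official images.
        image = f"docker.io/{image}" if "/" in image else f"docker.io/library/{image}"
    return f"{cache_host}/{image}"


if __name__ == "__main__":
    print(rewrite_image("nginx:1.27"))
    # cache.example.internal:5000/docker.io/library/nginx:1.27
    print(rewrite_image("quay.io/prometheus/node-exporter:v1.8.0"))
    # cache.example.internal:5000/quay.io/prometheus/node-exporter:v1.8.0
```

The pod spec ends up referencing a different image name than the one the author wrote, which is the invasive part listed above.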
'Simple' Squid proxy (or similar)
- Caching of arbitrary resources via HTTP
- "Stupid" caching
- Invalidation becomes a problem rather quickly
Harbor
-
Apache 2.0 License
-
The go-to container registry
-
Allows self-hosting artifacts and caching upstream ones
-
Pro: Image Signing
-
Pro: Multi Tenant
-
Pro: Quotas
-
Pro: Vulnerability Scans
-
Pro: SBOM creation
-
Pro: P2P distribution of artifacts
-
Pro: fully fledged web interface
-
Con: Only Container / OCI related artifacts
Recommendation
-
File system cache
- Easy solution as it is offered within most pipelines
- Reduces build times significantly if dependencies have to be downloaded from outside networks
- Avoid using fs cache, i.e. forgejo runner cache, long term or at all
- Unless you can handle proper cache invalidation
- Promotes immutable infra and reproducible builds without side effects
- Use as additional layer if there is no local cache repo
-
Repo caches
- Can replace file system cache if network and repo are fast enough
- Optimal solution would be a Nexus/Artifactory-like unified solution
- FOSS solutions like Artipie and Pulp have severe problems
- Requires us to add features/fixes/maintenance
- Due to the scarce landscape of proper FOSS solutions, we might have to opt for multiple dedicated solutions
- If we opt for a dedicated container cache, we should re-evaluate Harbor or Quay
-
Try to use Artipie as a first, simple solution and use Forgejo Runner caches in conjunction for even better performance
- If Artipie does not work correctly or does not fit for some reason, we will not have wasted too much time on it
- If Artipie is abandoned but the concept works for us, we should consider maintaining it and continuing its development