## Caching in MVP1

### Requirements

- We want to cache artifacts from upstream repositories in order to
  - Avoid rate limiting (Docker Hub)
  - Improve download speed
  - Improve availability
- We want to cache container images
  - Docker Hub
  - GCR
  - Quay
- We want to cache common software dependency artifacts of various programming languages
  - Maven/Ivy (Java)
  - Go
  - NPM
  - Rust
  - PyPI
- Must be easily configurable / manageable
  - Static config
  - API config (REST)
- Must store artifacts permanently
  - Resetting the cache (deleting everything) should be easy, though
- Currently out of scope
  - Auth: the cache provides data to anyone who can reach it
- Nice to have
  - Repo cache: can store uploaded artifacts

### Architectural Solutions

#### File System-Based Caching

- Re-uses artifacts stored on the local file system
  - e.g. backing up and restoring the `node_modules` directory
  - Set up within pipelines
  - Important: proper cache key selection
- Performance depends on the cache's storage location
  - On node: fast, but localized to that node
  - Network storage: still has to download the cache archive
- Pro: Artifacts are downloaded directly from upstream, no further config needed
  - Con: Does not address rate-limiting concerns for the initial cache warm-up
- Pro: No extra config needed in tooling apart from the pipeline cache config
- Has to be stored somewhere?
  - GitHub Actions / GitLab typically manage this
    - Similar to a local dev environment
- Con: State management
  - The cache must be updated when new dependencies are used/requested
  - Dirty state (looking at you, Maven)
    - Impure behaviour possible, creating side effects
    - Integrity checks of package managers might be bypassed at this point
- Con: Duplicate content
- Con: Invalidation needed at some point

#### Pull-Through Cache

- Mirror/proxy repo for an upstream repo
  - Downloads artifacts transparently from upstream when requested
  - Downloaded artifacts are stored locally in the mirror for faster access
- Pro: Can be re-used in pipelines, dev machines, and cloud/prod environments
- Pro: Little state management necessary, if any
- Con: Requires extra config in tooling, build tools, `containerd`, etc.
- Using only the pull-through cache should be fast enough for builds in CI
  - Reproducible builds ftw

### Solution Candidates

#### Forgejo Runner Cache

- Common actions like `setup-java` do a good job, as they create cache dependencies on all build config files (e.g. all `pom.xml` files)
  - Invalidation if there is any change to dependencies etc.

#### Nexus

[Nexus OSS GH](https://github.com/sonatype/nexus-public)

- Open source / free version
  - The EPL license allows commercial distribution
  - The OSS version only has an extremely limited feature set of supported repository types
    - Basically only Maven support
    - Does not suffice for our use case
  - The Community Edition has more features but is limited in sizing; an upgrade to the Pro edition is necessary if those limits are exceeded.
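To illustrate the "extra config in tooling" cost of the pull-through approach: each build tool has to be pointed at the mirror explicitly. A minimal sketch for Maven, assuming a hypothetical cache instance at `cache.example.internal` (the hostname and repo path are placeholders, not part of this document):

```xml
<!-- ~/.m2/settings.xml — redirect requests for Maven Central to the cache -->
<settings>
  <mirrors>
    <mirror>
      <id>internal-cache</id>
      <name>Pull-through cache for Maven Central</name>
      <!-- placeholder URL; point this at the actual cache instance -->
      <url>https://cache.example.internal/maven-central</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

Equivalent one-time configuration is needed per ecosystem (`.npmrc` for NPM, `containerd` registry mirror config for images, and so on), which is the per-tool overhead the Con above refers to.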
#### Artifactory

- Open source / free version
  - Limited feature set
  - Separate distributions per repo type (Java / container / etc.)
    - Inconvenient and insufficient for our use case
- Full feature set requires a paid license
  - License evaluation needed: [EULA](https://jfrog.com/artifactory/eula/)

#### Artipie

[GH](https://github.com/artipie/artipie) [Wiki](https://github.com/artipie/artipie/wiki)

- Self-hosted and upstream artifact caching
- MIT License
- Might be abandoned / low dev activity / needs a new maintainer
  - However, technically it looks extremely promising
- Initial setup does not run correctly out of the box, needs some love
- Mostly headless
  - Brings a limited web interface
    - Repo creation, artifact viewing
- Buggy default config
  - Config changes require a restart, which seems to be a bug?
- Easy to set up, once bugs and buggy config are mitigated/worked around
- File system and object storage supported
  - No databases required
- Pro: Config in a YAML file
- Due to its simplicity it might be a good candidate for a first upstream caching solution

#### Pulp

[Website](https://pulpproject.org/) [GH](https://github.com/pulp/pulpcore)

- Self-hosted and upstream artifact caching
- GPL 2.0 License
- Pull-through caches are only technical previews and might not work correctly
  - The pull-through cache does not fit into the concept of how artifacts are stored and tracked
  - The intended workflow is to sync dedicated artifacts with some upstream repo, not to mirror the entire repo
- Setup and config are quite complex
- Built for high availability
- File system and object storage supported
- Requires an SQL DB (Postgres) and possibly Redis

#### kube-image-keeper

[GH](https://github.com/enix/kube-image-keeper)

- Creates a DaemonSet, installing a service on each worker node
- Works within the cluster and rewrites image coordinates on the fly
- Pro: Fine-grained caching control
  - Select/exempt images / namespaces
  - Cache invalidation
- Pro: Config within k8s or as k8s objects
- Con: Invasive
- Con: Rewrites image coordinates using a mutating webhook
- Con: Must be hosted within each (workload) cluster
- Con (BLOCKER): Cannot handle image digests due to manifest rewrites

#### 'Simple' Squid Proxy (or similar)

- Caching of arbitrary resources via HTTP
- "Stupid" caching
  - Invalidation becomes a problem rather quickly

#### Harbor

[Website](https://goharbor.io/) [GH](https://github.com/goharbor/harbor)

- Apache 2.0 License
- The go-to container registry
- Allows self-hosting artifacts and caching upstream ones
- Pro: Image signing
- Pro: Multi-tenant
- Pro: Quotas
- Pro: Vulnerability scans
- Pro: SBOM creation
- Pro: P2P distribution of artifacts
- Pro: Fully fledged web interface
- Con: Only container / OCI-related artifacts

### Recommendation

- File system cache
  - Easy solution, as it is offered within most pipelines
  - Reduces build times significantly if dependencies have to be downloaded from outside networks
  - Avoid using the fs cache (i.e. the Forgejo runner cache) long term, or at all
    - Unless you can handle proper cache invalidation
    - Avoiding it promotes immutable infra and reproducible builds without side effects
  - Use as an additional layer if there is no local cache repo
- Repo caches
  - Can replace the file system cache if network and repo are fast enough
  - The optimal solution would be a Nexus/Artifactory-like unified solution
    - FOSS solutions like Artipie and Pulp have severe problems
      - Would require us to add features/fixes/maintenance
  - Due to the scarce landscape of proper FOSS solutions, we might have to opt for multiple dedicated solutions
    - If we opt for a dedicated container cache, we should re-evaluate Harbor or Quay
- Try to use Artipie as a first, simple solution, and use Forgejo Runner caches in conjunction for even better performance
  - If Artipie does not work correctly or does not fit for some reason, we haven't wasted too much time on it
  - If Artipie is abandoned but the concept works for us, we should consider maintaining it and continuing its development
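To make the "Forgejo Runner caches in conjunction" part concrete, a pipeline cache step keyed on all `pom.xml` files could look roughly like the sketch below. Forgejo Actions is largely compatible with the GitHub `actions/cache` syntax; the step names and paths here are illustrative, not taken from an existing pipeline:

```yaml
# Illustrative workflow fragment (assumed setup, not from this document).
# Hashing every pom.xml into the key means any dependency change
# invalidates the cache — the "proper cache key selection" point above.
steps:
  - uses: actions/checkout@v4
  - name: Cache local Maven repository
    uses: actions/cache@v4
    with:
      path: ~/.m2/repository
      key: maven-${{ runner.os }}-${{ hashFiles('**/pom.xml') }}
      restore-keys: |
        maven-${{ runner.os }}-
  - name: Build
    run: mvn -B package
```

On a key miss with a `restore-keys` prefix hit, Maven only downloads the delta from the upstream (or from the Artipie cache, if configured), which is where the two layers complement each other.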