This week, I wanted to explore one of the most overlooked aspects of building a proper GitHub Actions Runner image - caching. The base Runner image is very small, containing only a minimal subset of the files it needs for typical tasks. If you've built an image before, this is not unusual. In fact, you're probably used to needing to add files to the image to make everything work correctly. Some of those need to be available globally, such as tar. Others, such as language SDKs, should be installed dynamically using tools.
In Actions, tools are the files that are configured by steps such as actions/setup-java and actions/setup-python. In fact, if it starts with actions/setup-, it's a tool. What makes tools special is how they work. Essentially, running a step with one of those Actions will do a few things:
- Look to see if that version of the tool exists in the directory pointed to by the environment variable RUNNER_TOOL_CACHE (or the _tools directory in the Runner's _work folder).
- If the version doesn't exist, find the URL for the tool and download it. Unpack it into the tool cache.
- Add the path to the tool's binary to the $GITHUB_PATH, configure caching, and update any required settings.
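If you want to see that resolution in action, a throwaway diagnostic step like the sketch below shows what a setup action leaves behind. The Node version is just an example, and the layout noted in the comments (tool name, version, architecture, plus a completion marker) reflects how the official setup actions typically organize things, so treat it as an illustration rather than a contract.

```yaml
steps:
  ## Resolve a tool first (the Node version here is only an example)
  - uses: actions/setup-node@v4
    with:
      node-version: 20.x
  ## Purely diagnostic: show what the setup action left in the cache
  - name: Inspect the tool cache
    run: |
      echo "Tool cache root: $RUNNER_TOOL_CACHE"
      # Tools generally land as <tool>/<version>/<arch>, alongside an
      # <arch>.complete marker that signals the unpack finished
      find "$RUNNER_TOOL_CACHE" -maxdepth 3 | sort
```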
This process makes it easy to switch between versions of the tool on the same runner. It also avoids the risks of using the ambient version of a programming language. I've seen far too many teams relying on node or dotnet that were shocked to find their builds breaking because the global version of the tool changed. Using a tool avoids this problem!
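To make that concrete, here's a hedged sketch of two steps in the same job that each resolve a specific major version, independent of whatever node binary ships globally on the runner (the version numbers are arbitrary examples):

```yaml
steps:
  - uses: actions/setup-node@v4
    with:
      node-version: 18.x
  ## Later steps see the cached Node 18 build, not the runner's global install
  - run: node --version

  - uses: actions/setup-node@v4
    with:
      node-version: 20.x
  ## The PATH now points at Node 20; the 18.x copy stays in the cache
  - run: node --version
```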
There’s a downside to this approach, however. Each time a version of the tool is needed, multiple requests are made to resolve the version and then download the package. If you run 100 builds, that’s 100 sets of downloads. Not very efficient! The naive approach is to directly install the tool on the image. To avoid the risks of a global tool, an image is built for each variant. Unfortunately, that means that you’re building and maintaining multiple images.
A better approach is often to pre-cache the tools. While this can create a slightly larger image, it provides flexibility while minimizing download costs. In truth, the total image size is often the same or smaller than having multiple images with different versions of the tools installed globally. Actions will use the cached copy if it’s available, avoiding the download step.
So how does it work?
The Dockerfile
For this cache, the Dockerfile can be quite basic. We can start with a version of actions-runner (or the latest copy), then copy a local copy of the tools to the image:
```dockerfile
FROM ghcr.io/actions/actions-runner:latest
ENV ACTIONS_TOOL_CACHE=/home/runner/actions-tool-cache
COPY --link --chown=1001:123 tools $ACTIONS_TOOL_CACHE
```
The --link flag tells Docker that the layer isn't actually dependent on details from a previous layer to build the final image. The --chown flag sets the ownership to a specific user (runner) and group (docker), matching the permissions on the image. Why not use --chown=runner:docker? Resolving those names relies on details from the base image, which leads to an error at build time unless you remove --link.
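If you'd rather not take 1001:123 on faith, you can check the IDs against whichever base tag you're pinning with a scratch step like this (or the same docker run command locally). The specific numbers are an assumption about the current actions-runner image and are worth re-verifying when you bump the base version.

```yaml
## One-off check: print the uid and gid of the image's default user
## (assumes the base image still defaults to the runner account)
- name: Check runner uid and gid
  run: docker run --rm --entrypoint id ghcr.io/actions/actions-runner:latest
```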
Next up … populate the cache!
Building a tool cache
You could examine each tool, understand how the files are structured for that tool, and recreate it yourself. I’m not a fan of doing work that isn’t necessary, so let’s look at something easier. If you’ve ever worked with GitHub Enterprise Server, you may have come across this trick for building the tool cache folder using an Actions workflow. We’ll extend the process just a bit to be image-friendly.
Essentially, you need to create a job that does a few tasks:
- Replace the existing tool cache folder to clear out the pre-installed content
- Run each actions/setup- action that you want to cache
- Archive the tool cache folder
- Upload the archive as an artifact
Make sure to do this on a runner that’s compatible with your image architecture so that you get the right binaries. There are a few reasons to eliminate the existing cache:
- The initial cache is huge, typically over 2GB compressed (and 8GB uncompressed)
- You want to minimize the Docker image
- You want to control what’s being cached and included
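If you'd like to see those numbers for yourself, a quick diagnostic step at the top of the job (before the clear) reports the size of the pre-installed cache; it's safe to delete once you've had a look.

```yaml
## Diagnostic only: report how large the pre-installed tool cache is
- name: Report existing tool cache size
  run: du -sh "${{ runner.tool_cache }}"
```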
The job definition looks like this:
```yaml
on:
  # Your triggers here

jobs:
  create-tool-cache:
    runs-on: ubuntu-latest
    steps:

      ## Remove any existing cached content
      - name: Clear any existing tool cache
        run: |
          mv "${{ runner.tool_cache }}" "${{ runner.tool_cache }}.old"
          mkdir -p "${{ runner.tool_cache }}"

      ## Run the setup tasks to download and cache the required tools
      - name: Setup Node 16
        uses: actions/setup-node@v4
        with:
          node-version: 16.x
      - name: Setup Node 18
        uses: actions/setup-node@v4
        with:
          node-version: 18.x
      - name: Setup Java
        uses: actions/setup-java@v4
        with:
          distribution: 'temurin'
          java-version: '21'

      ## Compress the tool cache folder for faster upload
      - name: Archive tool cache
        working-directory: ${{ runner.tool_cache }}
        run: |
          tar -czf tool_cache.tar.gz *

      ## Upload the archive as an artifact
      - name: Upload tool cache artifact
        uses: actions/upload-artifact@v4
        with:
          name: tools
          retention-days: 1
          path: ${{ runner.tool_cache }}/tool_cache.tar.gz
```
Why do I set retention-days to 1? If everything works in the next steps, then I don't need to retain that artifact. I could delete it, but I'm giving myself 1 day in case I want to review the contents. If you need artifacts for multiple architectures, simply use a matrix to run jobs on the required hardware (a sketch follows below).
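A matrix variant of the cache job might look roughly like this; the runner labels, architecture names, and per-arch artifact naming are all assumptions you'd adapt to the runners you actually have available.

```yaml
jobs:
  create-tool-cache:
    strategy:
      matrix:
        ## Illustrative label/architecture pairs; substitute your own runner labels
        include:
          - runner: ubuntu-latest
            arch: amd64
          - runner: ubuntu-24.04-arm
            arch: arm64
    runs-on: ${{ matrix.runner }}
    steps:
      ## ... the same clear/setup/archive steps as above ...
      - name: Upload tool cache artifact
        uses: actions/upload-artifact@v4
        with:
          name: tools-${{ matrix.arch }}   # one artifact per architecture
          retention-days: 1
          path: ${{ runner.tool_cache }}/tool_cache.tar.gz
```

The download step in the build job would then reference the matching tools-arch artifact name.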
Build the image
At this point, we can now use those files to build an image. For that, I’ll use a separate job. Since I was modifying the runner state a bit in the last job, I like to start fresh in a new environment to build the image. This also ensures that any pre-job work that might be done by the setup jobs doesn’t affect my image creation. To be clear, you could run all of this on a single runner and skip the artifact upload. To create the image, we need to do a few things:
- Checkout the repo (to get the Dockerfile)
- Download the tools artifact (so we have the tools)
- Unpack the tools where the Dockerfile expects to find them (in the tools folder in the workspace)
- Build the image with the Dockerfile we created earlier, copying the files into the image
The job would look something like this:
```yaml
build-with-tool-cache:
  runs-on: ubuntu-latest

  ## We need the tools archive to have been created
  needs: create-tool-cache
  env:
    # Setup some variables for naming the image automatically
    REGISTRY: ghcr.io
    IMAGE_NAME: ${{ github.repository }}

  steps:

    ## Checkout the repo to get the Dockerfile
    - name: Checkout repository
      uses: actions/checkout@v4

    ## Download the tools artifact created in the last job
    - name: Download artifacts
      uses: actions/download-artifact@v4
      with:
        name: tools
        path: ${{ github.workspace }}/tools

    ## Expand the tools into the expected folder
    - name: Unpack tools
      run: |
        tar -xzf ${{ github.workspace }}/tools/tool_cache.tar.gz -C ${{ github.workspace }}/tools/
        rm ${{ github.workspace }}/tools/tool_cache.tar.gz

    ## Build the image

    ## Set up BuildKit Docker container builder
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    ## Automatically create metadata for the image
    - name: Extract Docker metadata
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

    ## Build the image
    - name: Build and push Docker image
      id: build
      uses: docker/build-push-action@v5
      with:
        context: .
        push: false
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
```
You can modify this to automatically push your image to make it available for later.
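One common way to do that is to log in to the registry and flip push on. The sketch below assumes GHCR with the built-in GITHUB_TOKEN and a job that has been granted packages: write permission.

```yaml
    ## Authenticate so the build step can push to GHCR
    - name: Log in to the Container registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    ## Same build step as before, but pushing the result
    - name: Build and push Docker image
      uses: docker/build-push-action@v5
      with:
        context: .
        push: ${{ github.event_name != 'pull_request' }}   # skip pushing for PR validation builds
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
```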
And there you have it – the automation for including the tools cache in your image. It's worth mentioning that by layering the image this way, the base image layer will only change when a new runner version is published. This can help maximize layer caching (more on that here).
Happy DevOp’ing!