If you can code it, someone can exploit it. That may seem like an obvious statement to some, but it’s a rule we often forget when we step outside of day-to-day programming. For example, consider GitHub Actions. Most people would not consider CI/CD programming to have security or quality considerations. The truth is that anything you can code requires thought. While I’m using GitHub Actions for the walkthrough, these issues apply across CI/CD tools.
Expressions
To have the necessary flexibility for builds, GitHub provides support for expressions. Expressions are a short syntax that enables you to apply logic or use inputs as part of your automation. They follow this format:
${{ Your code here }}
An expression can provide support for both logic (if) and content. For example, consider:
on:
  workflow_dispatch:
    inputs:
      message:
        type: string
        description: Text message to echo
⋮
steps:
  - name: Write a message to the console
    if: ${{ inputs.message }}
    run: echo ${{ inputs.message }}
⋮
This creates a workflow that requires the user to manually start it. The user provides a string value, message, that gets written to the console. The if condition ensures that the step only runs if the user provides a value.
The implementation
Think of a workflow as being a compiled application. Blocks of code in the run key are turned into scripts that can be executed. As the job and its steps are prepared, the expressions are evaluated and directly replaced with the values they represent. If my message is Hello world, the compiled step becomes:
- name: Write a message to the console
  if: true
  run: echo Hello world
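To make the “compile” step concrete, here is a minimal shell sketch of the idea. The render_step function is a hypothetical stand-in I made up for illustration; it is not how the runner is actually implemented. It only shows that the expression is swapped for the raw value as plain text before any shell sees the script.

```shell
#!/bin/sh
# Hypothetical stand-in for the runner's expression expansion: a plain text
# replacement of the placeholder with the raw input value.
render_step() {
  # $1 = run template, $2 = value for the inputs.message expression
  template=$1
  value=$2
  placeholder='${{ inputs.message }}'
  case $template in
    *"$placeholder"*)
      # Split the template around the placeholder and splice the value in verbatim.
      prefix=${template%%"$placeholder"*}
      suffix=${template#*"$placeholder"}
      printf '%s%s%s\n' "$prefix" "$value" "$suffix"
      ;;
    *)
      # No expression present: the template is already the final script.
      printf '%s\n' "$template"
      ;;
  esac
}

script=$(render_step 'echo ${{ inputs.message }}' 'Hello world')
printf 'Generated script: %s\n' "$script"
```

Nothing in this replacement escapes or quotes the value, which is the property the rest of the walkthrough depends on.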
This direct replacement can lead to unexpected results in some cases. For example, what if message is a formatted JSON object:
{
  "hello": "world"
}
Because the code is directly injected, the YAML step now becomes:
- name: Write a message to the console
  if: true
  run: "echo {
    \"hello\": \"world\"
    }"
And this in turn becomes a short script for the step to execute:
echo {
  "hello": "world"
}
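One way to check what a shell makes of that text is to hand it to one directly. This is a sketch that assumes a POSIX sh is available; the exact error messages vary by shell, so it only looks at the exit status.

```shell
#!/bin/sh
# The script text exactly as it looks after the JSON value was pasted in.
script='echo {
  "hello": "world"
}'

# Run it in a child shell and capture the exit status without aborting this script.
if sh -c "$script" >/dev/null 2>&1; then
  status=0
else
  status=$?
fi

# A non-zero status means the child shell rejected part of the generated script.
printf 'Exit status: %s\n' "$status"
```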
The command fails because the script is invalid. This can happen any time you inject a value using an expression. The workaround is to take advantage of how environment variables are handled. If the variable is declared as part of the step, the complete string for the expression is assigned to it. The variable is then injected into the environment where the step is executed. Environment variables are evaluated when the process runs, and each one provides a single string value. The variable can be read using $VARIABLE in Linux shells such as Bash, %VARIABLE% in Windows CMD, or $env:VARIABLE in PowerShell. The corrected step becomes:
- name: Write a message to the console
  if: ${{ inputs.message }}
  run: echo "$MESSAGE"
  env:
    MESSAGE: ${{ inputs.message }}
Now the step will behave as expected. The echo command will receive an environment variable that is interpreted and expanded when the command runs. (Yes, I know … I’m really over-simplifying what’s happening!)
Notice that I don’t use the expression ${{ env.MESSAGE }}. Because it’s an expression, it is expanded at “compile time”. Instead of referencing the environment variable, the runner replaces the expression with the variable’s value, which brings back the original problem.
Exploiting the feature
Taking the example a step further, what happens if the user’s message is hello && rm -rf /? If we’re just using an expression, the generated script is effectively:
echo hello && rm -rf /
This writes “hello” to the console and then attempts to erase every file the runner can reach. That’s probably not what you expected! Because the expression is directly replaced, a knowledgeable user can use && (or other approaches) to run multiple processes. There’s now a way to exploit the process (or at least do something more than what was expected).
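The effect is easy to reproduce safely. In this sketch I swap the destructive command for a harmless echo INJECTED marker and simulate the direct replacement by building the script text myself; the variable names are my own.

```shell
#!/bin/sh
# Harmless stand-in payload: the second command only prints a marker.
message='hello && echo INJECTED'

# Simulate direct expression replacement: the value becomes part of the script text.
script="echo $message"
printf 'Generated script: %s\n' "$script"

# The child shell parses && as a command separator, so two commands run.
injected_output=$(sh -c "$script")
printf '%s\n' "$injected_output"
```

The marker shows up on its own line, which means the second command really ran.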
Using the environment variable approach above, the generated script would have instead been:
echo "$MESSAGE"
The environment variable MESSAGE would be configured as a string with the value hello && rm -rf /. As a result, the user would see the expected result – the value is echoed to the console. The approach which helps to keep the code from breaking also helps prevent unexpected injections.
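The same experiment with the environment variable approach, again using a harmless echo INJECTED marker in place of the destructive command, can be sketched like this. The script text stays fixed and the value only travels through the environment of the child shell.

```shell
#!/bin/sh
# Same harmless payload as a user-supplied message.
message='hello && echo INJECTED'

# The script text never contains the value; it only refers to the variable.
script='echo "$MESSAGE"'

# Pass the value through the environment of the child shell.
safe_output=$(MESSAGE=$message sh -c "$script")
printf '%s\n' "$safe_output"
```

Here the && arrives as ordinary characters inside a string, so the shell never treats it as a command separator. Quoting the variable also keeps the shell from splitting the value on whitespace.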
Self-hosted runners
If you’re using self-hosted runners, this issue can be especially important to understand. If the environment is ephemeral, the concerns are generally the same as a hosted runner. However, if the environment is not ephemeral – such as a dedicated VM – the risks can be higher. This is because a self-hosted runner typically runs as a service. It usually has access to other executables on the box and, depending on the account it runs under, may have elevated system privileges. If you’re running in this configuration, then executing code may have super-user access to anything on the box. This means that mistakes in the automation code can have broader consequences.
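One quick way to see what a given runner would let a script do is to have a step print the account it runs under. This is only a sketch using the standard id utility; what counts as an acceptable account depends on your setup.

```shell
#!/bin/sh
# Report the effective user ID that this script, and any injected command, runs as.
uid=$(id -u)

if [ "$uid" -eq 0 ]; then
  printf 'Running as root: injected commands would have full access\n'
else
  printf 'Running as uid %s\n' "$uid"
fi
```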
Steps and Job Parameters
To be clear, this approach does not apply to all aspects of steps and jobs. It’s specific to executing commands and interacting with the shell or OS (either directly or through an Action). For step/job parameters (such as if, name, or matrix), direct expressions are the only way to provide the values. As an example, the scenarios above use env to assign a user-supplied value to an environment variable. In each of these cases, the Actions runtime ensures that the content is properly handled for you.
Conclusion
While there can be times where it’s useful (or necessary) to take advantage of direct expressions, it’s often advisable to consider using environment variables to ensure that commands and parameters are treated as string values. This helps keep the scripts from breaking and prevents running commands unintentionally. This approach is also very valuable for passing multiline variables between steps or jobs (for example, serialized JSON) in more advanced workflows.
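As a rough sketch of the multiline case, a step can write a value to the file named by GITHUB_OUTPUT using the delimiter syntax that GitHub documents for multiline strings. The output name payload and the delimiter EOF_PAYLOAD are hypothetical names I picked, and the temp file fallback is only there so the snippet also runs outside a runner.

```shell
#!/bin/sh
# On a runner GITHUB_OUTPUT points at a file; fall back to a temp file for local runs.
output_file=${GITHUB_OUTPUT:-$(mktemp)}

json='{
  "hello": "world"
}'

# The delimiter must not appear in the value. A fixed one keeps the sketch simple;
# a randomly generated delimiter is a safer choice for untrusted content.
delimiter='EOF_PAYLOAD'
{
  printf 'payload<<%s\n' "$delimiter"
  printf '%s\n' "$json"
  printf '%s\n' "$delimiter"
} >> "$output_file"

cat "$output_file"
```

A later step could then map that output to an environment variable with env, in the same way as the message examples, instead of placing the expression directly in run.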