One of the more important parts of DevOps is security. Whenever we’re developing code, we need to consider the potential security risks. One surprising place that developers often forget about is CI/CD workflows. This is an interesting area of vulnerability, and it’s one that seems to be getting a lot of focus recently.
Exploits in the CI/CD pipeline can have broad impacts. For teams using self-hosted pipelines, it might expose access to internal resources. If the runners are long-lived and not ephemeral, then exploits can also impact other builds on the same machine. I’ve previously discussed supply chain attacks with dependencies.
One of the most common exploits in any coded solution is the injection attack. Injection attacks rely on a basic mechanism. If an unchecked input can be written to an output without modification or escaping, some part of the system can be exploited. It may allow attackers to execute code, create unintended side effects, or misdirect users.
Script injection 101
It’s important to understand how the attack works if you want to know how to avoid it. As a starting point, let’s look at a shell script injection. Consider this simple shell script which writes a message to the console:
echo "hello $user"
If a value for $user is provided as an environment variable, the value is expanded as the script is run. Bash will treat the variable value as a single string and render it as part of the echo output. It doesn’t matter if the value is a multiple-line JSON object, a quoted string, or a simple name. It will be displayed as-is.
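A quick way to check this behavior is to give the variable a hostile-looking value and run the same echo. This is a minimal sketch: the marker file and temporary directory are hypothetical stand-ins, and the payload uses a harmless touch instead of anything destructive.

```shell
# Throwaway directory; the marker file shows whether the payload ever ran.
workdir=$(mktemp -d)
marker="$workdir/injected"

# A value that tries to close the quote and chain extra commands.
user="user!\" && touch \"$marker\" && echo \"Goodbye!"

# The shell expands $user at run time and hands it to echo as one string.
greeting=$(echo "hello $user")
echo "$greeting"

# The embedded command was only text, so the marker was never created.
if [ -e "$marker" ]; then injected=yes; else injected=no; fi
echo "payload executed: $injected"
rm -rf "$workdir"
```

The whole value, quotes and ampersands included, is printed as part of the greeting, and nothing inside it is executed.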
Now, imagine that instead of this, we’re using a template to dynamically write a script file. Users will provide an input, and the script file will be written to a secure area of the file system and executed. The template could look like this:
echo "hello ${{user}}"
In this case, ${{user}} will get replaced dynamically with a value from the user. The resulting content will then be written to the file system as a script and executed. Because the literal value is included in the script file, we no longer have Bash ensuring the value is treated as a string. If an attacker passes malicious code as the value for ${{user}}, it will be included in the file that is created. This is one form of an injection attack.
For example, consider what happens if ${{user}} is set to a string with this value:
user!" && rm -rf / && echo "Goodbye!
It’s almost as bad as “Little Bobby Tables”. If this were an environment variable, it would have been treated as a string. However, since ${{user}} is being replaced with a value as part of a template, the generated script file becomes:
echo "hello user!" && rm -rf / && echo "Goodbye!"
Notice that when a string is directly written without any escapes or special encoding, it can create (“inject”) additional logic into the script. In this case, it displays two messages and erases the hard drive in between. Clearly, this is not safe to use as-is.
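The template scenario can be reproduced safely. The following is only a sketch: the render function is a hypothetical stand-in for the templating step, and the payload replaces rm -rf / with a harmless touch of a marker file inside a temporary directory.

```shell
# Throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
marker="$workdir/injected"

# Hypothetical templating step: paste the value into the script text verbatim.
render() {
  printf 'echo "hello %s"\n' "$1"
}

# Harmless stand-in for the malicious value: closes the quote, chains a command.
payload="user!\" && touch \"$marker\" && echo \"Goodbye!"

# Write the generated script, show it, then execute it.
render "$payload" > "$workdir/generated.sh"
cat "$workdir/generated.sh"
output=$(bash "$workdir/generated.sh")
echo "$output"

# The marker exists because the injected touch command actually ran.
if [ -e "$marker" ]; then injected=yes; else injected=no; fi
echo "payload executed: $injected"
rm -rf "$workdir"
```

Here the generated file contains three chained commands rather than one echo, which is exactly the shape of the destructive example above.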
Now, let’s explore how the same problem can occur with Actions.
Actions injections
When you use the run keyword in a workflow, Actions creates a temporary script file. It then puts the contents of the command into the file and executes the resulting script. In many ways, the process is similar to the templating example above. How can users inject values? Actions workflows allow you to use contexts and expressions. Expressions are evaluated as part of the step: each expression is replaced with its result, and the value is included without any special encoding.
To help make this clearer, consider this script from a workflow:
run: |
  echo "Processing ${{ github.event.issue.title }}"
  ./process.sh --title "${{ github.event.issue.title }}"
The expression is replaced with the text contents of the issue’s title. Next, the Action writes a script file with this content. Finally, it executes the script. If the issue’s title is a malicious string (such as the one in our earlier example), then the script may contain malicious steps.
The second line shows another way an injection can occur. A parameter is being passed, and the verbatim contents of the issue title will be appended after the parameter. Using our sample malicious string from before, the run step becomes the following (which is then written into a file and executed):
run: |
  echo "Processing user!" && rm -rf / && echo "Goodbye!"
  ./process.sh --title "user!" && rm -rf / && echo "Goodbye!"
Any time an expression is used, it is written without any special handling. Values in those contexts, variables, or secrets can become vectors for exploits. This is why it is important to program defensively and prevent these kinds of risks. It can be even riskier with open source projects, allowing an attacker to add code, kick off a release process, or perform similar exploits through simple things like PR and issue titles (or comments).
To be clear, not all injection attacks are deliberate. Sometimes, they occur because someone includes details in the title which then cause a command to fail. For example, users might report an issue with running a command, such as “Running `rm -rf /` doesn’t delete any files”. This title includes backticks. When used in a script, backticks are used for command substitution. The expression is executed in a subshell, and the console output from that process replaces the backtick content. In short, echoing that title to the console from a script file could have some very unintended consequences!
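The accidental case is easy to reproduce as well. This is a sketch under assumptions: the title is a made-up example, the backticks wrap a harmless touch, and printf stands in for the expression replacement that Actions performs.

```shell
# Throwaway directory; the marker shows whether the backtick command ran.
workdir=$(mktemp -d)
marker="$workdir/ran"

# A well-meaning title that happens to contain backticks (hypothetical).
title="Running \`touch $marker\` does nothing"

# Paste the title into the script text verbatim, as expression replacement would.
printf 'echo "Issue: %s"\n' "$title" > "$workdir/step.sh"

# Executing the generated script performs the command substitution.
output=$(bash "$workdir/step.sh")
echo "$output"

if [ -e "$marker" ]; then substituted=yes; else substituted=no; fi
echo "command inside backticks ran: $substituted"
rm -rf "$workdir"
```

The command inside the backticks runs, and its (empty) output replaces that part of the title, so the echoed text no longer matches what the user wrote.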
Now that we understand the problem, let’s discuss a solution.
The fix
Coming back to our original example, the issue was avoided by using an environment variable. Because the variable is expanded when the command executes, its value is treated as a single string and not as verbatim text to execute. The same approach also works in this case.
By making a minor change to the script, it becomes more secure:
run: |
  echo "Processing $ISSUE"
  ./process.sh --title "$ISSUE"
env:
  ISSUE: ${{ github.event.issue.title }}
This approach creates an environment variable, $ISSUE, which will contain the untrusted value. The expression will be resolved and then used as the value for that environment variable. Nothing is injected into the generated file, and the value of $ISSUE is treated as a string.
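The effect of the fix can be checked locally with a small sketch. It assumes a hypothetical run_step helper that mimics the env block by passing the untrusted title through the environment, and it uses harmless touch payloads in place of real attacks.

```shell
# Throwaway directory; the marker shows whether any payload executed.
workdir=$(mktemp -d)
marker="$workdir/injected"
script="$workdir/step.sh"

# The script text is fixed: it refers to $ISSUE and never contains the title.
cat > "$script" <<'EOF'
echo "Processing $ISSUE"
EOF

# Hypothetical helper: supply the untrusted title as an environment variable.
run_step() {
  ISSUE="$1" bash "$script"
}

# One payload breaks out of quotes, the other relies on backticks.
quote_payload="user!\" && touch \"$marker\" && echo \"Goodbye!"
backtick_payload="Running \`touch $marker\` does nothing"

out1=$(run_step "$quote_payload")
out2=$(run_step "$backtick_payload")
printf '%s\n%s\n' "$out1" "$out2"

# Neither payload ran, so the marker file does not exist.
if [ -e "$marker" ]; then injected=yes; else injected=no; fi
echo "payload executed: $injected"
rm -rf "$workdir"
```

Both titles are printed exactly as supplied because the shell expands the variable once and does not re-parse the result as script text.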
This same approach is recommended for working with secrets. It prevents secrets from being written into the file or becoming part of a command line (where they could be discovered by other processes on the same machine). When secrets are placed in environment variables, they only exist in memory. This is why the recommended way to authenticate with the gh CLI in an Action is to provide a token in the environment variable GH_TOKEN. This prevents the secret from needing to be passed through a pipe or included on the command line.
This approach is also important if you want to write a JSON object to the console. By storing that value as an environment variable, you don’t end up with the quotes and newlines from “pretty printed” JSON being written into the script itself. For example, to output the full event data from the github context:
run: |
  echo "Event data: $JSON"
env:
  JSON: ${{ toJSON(github.event) }}
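To see why this works for multi-line values, here is a small sketch. The JSON document is a made-up stand-in for the output of toJSON, not real event data.

```shell
# Made-up, pretty-printed JSON standing in for the event payload.
JSON='{
  "action": "opened",
  "issue": {
    "title": "Example issue"
  }
}'
export JSON

# The script text contains only $JSON; quotes and newlines stay inside the value.
output=$(bash -c 'echo "Event data: $JSON"')
echo "$output"
```

The double quotes inside the JSON never reach the script text, so they cannot end the quoted string early, and the newlines are printed as part of the value.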
There are lots of avenues for exploiting code. As developers, the more we know, the better protected our systems are from exploits. This should give you a better understanding of injection attacks and of how to avoid these types of issues in your CI/CD processes.
Happy DevOp’ing!