Hey there DevOps wizards, code gurus, and IT maestros! Ever found yourself buried in a mountain of code dependencies? Worried about staying on the right side of open-source licenses? We've got you covered. Let's talk about how you can automate license checks in your CI pipelines. 🚀
Why Should You Care?
Like many of you, we're big fans of open-source components. But hey, we also want to play nice and respect everyone's rights. So we decided to auto-approve certain “safe” licenses like Apache 2.0 and MIT, and review any other license types we bump into “on the go,” approving or rejecting particular packages case by case. Sound like you? Then read on!
The Setup: GitHub Actions
We're using GitHub Actions as our go-to CI tool. Don't worry, the process is straightforward: just a bit of Python and some GitHub magic. We're building a reusable workflow for our many microservices, and we're storing it in our GHA-Store repo.
Your First Step: Reusable Workflow File
Here's a snippet for initiating the reusable workflow:
Nothing too crazy, right?
This sets up the workflow to be reusable and passes inputs down from the parent workflow to the child. Since a reusable workflow can't receive all of the parent's inputs by default, we pack them into JSON and pass that string as a single argument to this workflow.
CHECKOUT_REF shows how we get a value back out. I’ll show you the packing side at the end of this post, where we call this workflow.
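To make the pack-and-unpack idea concrete, here is a minimal Python sketch of the same round trip. It is an illustration only, not the workflow itself: the key name `checkout_ref` and both function names are assumptions made up for this example.

```python
import json


def pack_inputs(inputs: dict) -> str:
    """Serialize every parent input into one JSON string argument."""
    return json.dumps(inputs, sort_keys=True)


def unpack_checkout_ref(packed: str, default: str = "") -> str:
    """Read a single value back out of the packed string on the child side."""
    # "checkout_ref" is a hypothetical key name used for illustration.
    return json.loads(packed).get("checkout_ref", default)


packed = pack_inputs({"checkout_ref": "feature/licenses", "environment": "staging"})
print(unpack_checkout_ref(packed))
```

The child only needs to know the keys it cares about, so the parent can add new inputs without changing the child's signature.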
Let's Get to Work: The Main Job
1. Fetch the Source Code and License Config
Your job starts with fetching your repo and a separate repo containing your license configuration. Here's the code:
"Checkout current repo" is simple, but pay attention to "ref". We set it so that the job checks out the actual commit from the PR, because the default behavior on a pull request is to check out an ephemeral merge commit of the PR into the target branch.
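The choice of "ref" can be sketched in Python as follows. This is a hedged illustration, not our actual step: the function name and the fallback argument are hypothetical, and it assumes the standard pull request event payload, where the PR's head commit lives under `pull_request.head.sha`.

```python
def resolve_checkout_ref(event_name: str, event: dict, fallback_ref: str) -> str:
    """Pick the ref to check out for a given trigger."""
    if event_name == "pull_request":
        # Use the PR's own head commit instead of the ephemeral merge commit.
        return event["pull_request"]["head"]["sha"]
    # For other triggers, use whatever ref the caller asked for.
    return fallback_ref


sample_event = {"pull_request": {"head": {"sha": "abc123"}}}
print(resolve_checkout_ref("pull_request", sample_event, "main"))
```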
To access the configuration, we need to check out another repo (inside our org). To do so, we created a GitHub App with a specific set of permissions.
As a result, we end up with the following working directory structure:
2. Setting Up Go and Vendor
If you're dealing with a Golang repo, you'll need to set up Go and Govendor like so:
Here we set up Go and again reuse the token from the "Generate token" step.
If you don't have private dependencies, you can omit the "Setup vendor" step.
3. Generate Software Bill of Materials (SBOM)
Why SBOM? It makes it easier to list all your dependencies.
For Golang, we use the "CycloneDX/gh-gomod-generate-sbom@v2" action from CycloneDX.
Note that we chose OWASP CycloneDX because it is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. It is managed by the CycloneDX Core Working Group and backed by the OWASP Foundation.
As an output, it will generate a JSON file with all our dependencies.
Take a look:
Because we rely on the SBOM as a standard source of dependency information, we can support multiple languages without further refactoring and keep our code simple.
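For readers who have not seen a CycloneDX document before, here is a small sketch of pulling package names and licenses out of the JSON. The sample SBOM is invented for illustration (the package names are hypothetical), and the helper names are ours. It relies on the CycloneDX JSON layout, in which each component may carry a `licenses` list whose entries hold either a `license` object (with `id` or `name`) or an `expression` string.

```python
import json

# Hypothetical, trimmed-down SBOM used only for this example.
SAMPLE_SBOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "components": [
    {"type": "library", "name": "github.com/example/alpha", "version": "v1.2.0",
     "licenses": [{"license": {"id": "MIT"}}]},
    {"type": "library", "name": "github.com/example/beta", "version": "v0.3.1",
     "licenses": [{"expression": "Apache-2.0 OR MIT"}]},
    {"type": "library", "name": "github.com/example/gamma", "version": "v2.0.0"}
  ]
}
"""


def component_licenses(component: dict) -> list[str]:
    """Collect license ids, names, or expressions declared on one component."""
    found = []
    for entry in component.get("licenses", []):
        if "expression" in entry:
            found.append(entry["expression"])
            continue
        lic = entry.get("license", {})
        value = lic.get("id") or lic.get("name")
        if value:
            found.append(value)
    return found


def list_dependencies(sbom: dict) -> dict[str, list[str]]:
    """Map each component name to the licenses it declares."""
    return {c["name"]: component_licenses(c) for c in sbom.get("components", [])}


deps = list_dependencies(json.loads(SAMPLE_SBOM))
for name, licenses in deps.items():
    print(name, licenses or "no license declared")
```

Components with no license information come back with an empty list, so the caller decides how strict to be about them.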
The only thing left is to parse this SBOM and compare the dependencies' licenses with the allowed list.
To do that, we run a simple Python script directly in a GitHub Actions step using "shell: python".
But before we start, let's look at our "allowed-licenses.yaml" (or "configuration") file:
Here we have 2 arrays:
- allowed - array (string) of allowed licenses
- ignore - array (string) of packages which we ignore and allow any license for them
We store this configuration file in the "helpers" folder of our gha-store repo, so we can update it as often as we want. Every new workflow run picks up the latest configuration.
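Once the YAML file is parsed, the configuration is just a dictionary with the two arrays described above. The sketch below shows one way to validate and normalize it. The sample values and the function name are assumptions for illustration, and the YAML parsing itself is left out to keep the example dependency-free.

```python
def normalize_config(raw: dict) -> dict:
    """Validate the two arrays and turn them into sets for fast lookups."""
    config = {}
    for key in ("allowed", "ignore"):
        values = raw.get(key) or []
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"'{key}' must be a list of strings")
        config[key] = set(values)
    return config


# Hypothetical parsed contents of allowed-licenses.yaml.
raw_config = {
    "allowed": ["Apache-2.0", "MIT"],
    "ignore": ["github.com/example/internal-lib"],
}
config = normalize_config(raw_config)
print(sorted(config["allowed"]))
```

Failing fast on a malformed file is a design choice here: a typo in the config should stop the check rather than silently allow everything.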
4. Python Magic: Analyzing Licenses
Now, the real fun begins! We've written a Python script that does all the heavy lifting.
Here we compare each dependency's license from the SBOM with the allowed licenses and log every package that has a license outside the list and is not explicitly ignored.
We also generate a markdown table of the "failed" packages.
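A minimal sketch of that comparison and the table generation might look like this. It is not our exact script: the function names and sample data are hypothetical, and treating a package with no declared license as a failure is an assumption you may want to change.

```python
def find_violations(deps: dict, allowed: set, ignore: set) -> list:
    """Return (package, license) pairs that are neither allowed nor ignored."""
    violations = []
    for name, licenses in sorted(deps.items()):
        if name in ignore:
            continue
        # Assumption: a package without any declared license also fails.
        if not licenses or any(lic not in allowed for lic in licenses):
            violations.append((name, ", ".join(licenses) or "unknown"))
    return violations


def markdown_table(violations: list) -> str:
    """Render the failed packages as a markdown table."""
    lines = ["| Package | License |", "| --- | --- |"]
    lines += [f"| {name} | {lic} |" for name, lic in violations]
    return "\n".join(lines)


# Hypothetical dependency data for illustration.
deps = {
    "github.com/example/alpha": ["MIT"],
    "github.com/example/beta": ["GPL-3.0-only"],
    "github.com/example/gamma": [],
    "github.com/example/internal-lib": ["GPL-3.0-only"],
}
violations = find_violations(
    deps,
    allowed={"MIT", "Apache-2.0"},
    ignore={"github.com/example/internal-lib"},
)
print(markdown_table(violations))
```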
5. Keeping It User-Friendly: GitHub Actions Summary
We want the output to be as developer-friendly as possible, so we use GitHub Actions Summary:
No magic, just GitHub Actions.
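If you would rather write the summary from Python, the sketch below appends markdown to the file that GitHub Actions exposes through the GITHUB_STEP_SUMMARY environment variable. The function name and the print fallback for local runs are assumptions for illustration.

```python
import os


def write_summary(markdown: str, summary_path: str = None) -> bool:
    """Append markdown to the job summary; print it when no file is available."""
    path = summary_path or os.environ.get("GITHUB_STEP_SUMMARY")
    if not path:
        # Outside GitHub Actions there is no summary file, so fall back to stdout.
        print(markdown)
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(markdown + "\n")
    return True
```

Appending rather than overwriting keeps anything that earlier code in the same step already wrote to the summary.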
6. Don't Let It Slide: Slack Notifications
If anything goes south, you'll get a Slack notification. Trust me, you want to set this up!
To set it up, create a new Slack application and add it to your workspace. Then add a new incoming webhook and save its URL as a global secret, for example "GLOBAL_SLACK_DEPENDENCY_LICENSE", as we do.
You can read more about it in Slack documentation: https://api.slack.com/messaging/webhooks
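As a rough idea of what the notification boils down to, here is a sketch that builds the webhook request with the standard library. Slack incoming webhooks accept a JSON body with a "text" field. The message wording, function names, repo name, and placeholder URL are assumptions for illustration, and the send call is kept in a separate function so nothing is posted when the example is loaded.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, repo: str, violations: list) -> urllib.request.Request:
    """Build a POST request carrying a plain-text failure message."""
    lines = [f"License check failed for {repo}:"]
    lines += [f"- {name}: {lic}" for name, lic in violations]
    body = json.dumps({"text": "\n".join(lines)}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Placeholder URL; in the workflow this would come from the secret.
request = build_slack_request(
    "https://hooks.slack.com/services/PLACEHOLDER",
    "example-org/example-service",
    [("github.com/example/beta", "GPL-3.0-only")],
)
print(request.data.decode("utf-8"))
```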
Call me ...
Our reusable workflow is finished. Now we need to call it somehow:
As promised, this is an example of how to pass the parent's inputs to the child workflow. We simply call the reusable workflow to check our licenses.
Bringing It All Together
And there you have it, folks! Automating license checks is easier than you thought. With just a bit of Python and GitHub Actions, you're all set to ensure you're not stepping on any legal landmines.
So, go ahead, give it a try, and let us know how it goes. Happy coding! 🎉
Full reusable workflow:
Common Open-Source Software (OSS) Library Licenses
Open source software (OSS) libraries typically use a variety of licenses, but some are more common than others. Always read and understand the terms of a license before using or contributing to a project.