NCZOnline - The Official Web Site of Nicholas C. Zakas


Creating a JavaScript promise from scratch, Part 2: Resolving to a promise

Mon, 09/28/2020 - 20:00

In my first post of this series, I explained how the Promise constructor works by recreating it as the Pledge constructor. I noted in that post that there is nothing asynchronous about the constructor, and that all of the asynchronous operations happen later. In this post, I’ll cover how to resolve one promise to another promise, which will trigger asynchronous operations.

As a reminder, this series is based on my promise library, Pledge. You can view and download all of the source code from GitHub.

Jobs and microtasks

Before getting into the implementation, it’s helpful to talk about the mechanics of asynchronous operations in promises. Asynchronous promise operations are defined in ECMA-262 as jobs [1]:

A Job is an abstract closure with no parameters that initiates an ECMAScript computation when no other ECMAScript computation is currently in progress.

Put in simpler language, the specification says that a job is a function that executes when no other function is executing. But it’s the specifics of this process that are interesting. Here’s what the specification says [1]:

  • At some future point in time, when there is no running execution context and the execution context stack is empty, the implementation must:
    1. Push an execution context onto the execution context stack.
    2. Perform any implementation-defined preparation steps.
    3. Call the abstract closure.
    4. Perform any implementation-defined cleanup steps.
    5. Pop the previously-pushed execution context from the execution context stack.
  • Only one Job may be actively undergoing evaluation at any point in time.
  • Once evaluation of a Job starts, it must run to completion before evaluation of any other Job starts.
  • The abstract closure must return a normal completion, implementing its own handling of errors.

It’s easiest to think through this process by using an example. Suppose you have set up an onclick event handler on a button in a web page. When you click the button, a new execution context is pushed onto the execution context stack in order to run the event handler. Once the event handler has finished executing, the execution context is popped off the stack and the stack is now empty. This is the time when jobs are executed, before yielding back to the event loop that is waiting for more JavaScript to run.

In JavaScript engines, the button’s event handler is considered a task while a job is considered a microtask. Any microtasks that are queued during a task are executed in the order in which they were queued immediately after the task completes. Fortunately for you and me, browsers, Node.js, and Deno have the queueMicrotask() function that implements the queueing of microtasks.

The queueMicrotask() function is defined in the HTML specification [2] and accepts a single argument, which is the function to call as a microtask. For example:

queueMicrotask(() => { console.log("Hi"); });

This example will output "Hi" to the console once the current task has completed. Keep in mind that microtasks always execute before timers, which are created using either setTimeout() or setInterval(). Timer callbacks are queued as tasks, not microtasks, so any pending microtasks run before the event loop gets around to the timer callbacks.
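You can see this ordering for yourself with a small experiment (a minimal sketch; it assumes any environment that provides both queueMicrotask() and setTimeout(), such as a browser, Node.js, or Deno):

setTimeout(() => {
    console.log("timer");
}, 0);

queueMicrotask(() => {
    console.log("microtask");
});

console.log("sync");

// Output:
// sync
// microtask
// timer

The synchronous code finishes first, the queued microtask runs before yielding back to the event loop, and the timer task runs last.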

To make the code in Pledge look more like the specification, I’ve defined a hostEnqueuePledgeJob() function that simply calls queueMicrotask():

export function hostEnqueuePledgeJob(job) {
    queueMicrotask(job);
}

The NewPromiseResolveThenableJob job

In my previous post, I stopped short of showing how to resolve a promise when another promise was passed to resolve. Unlike with non-thenable values, calling resolve with another promise means the first promise cannot be resolved until the second promise has been resolved, and to do that, you need NewPromiseResolveThenableJob().

The NewPromiseResolveThenableJob() operation accepts three arguments: the promise to resolve, the thenable that was passed to resolve, and the then() function to call. The job attaches the resolve and reject functions for the promise being resolved to the thenable’s then() method while catching any potential errors that might occur.

To implement NewPromiseResolveThenableJob(), I decided to use a class with a constructor that returns a function. This looks a little strange but will allow the code to look like you are creating a new job using the new operator instead of creating a function whose name begins with new (which I find strange). Here’s my implementation:

export class PledgeResolveThenableJob {
    constructor(pledgeToResolve, thenable, then) {
        return () => {
            const { resolve, reject } = createResolvingFunctions(pledgeToResolve);

            try {
                // same as thenable.then(resolve, reject)
                then.apply(thenable, [resolve, reject]);
            } catch (thenError) {
                // same as reject(thenError)
                reject.apply(undefined, [thenError]);
            }
        };
    }
}

You’ll note the use of createResolvingFunctions(), which was also used in the Pledge constructor. The call here creates a new set of resolve and reject functions that are separate from the original ones used inside of the constructor. Then, an attempt is made to attach those functions as fulfillment and rejection handlers on the thenable. The code looks a bit weird because I tried to make it look as close to the spec as possible, but really all it’s doing is thenable.then(resolve, reject). That code is wrapped in a try-catch just in case there’s an error that needs to be caught and passed to the reject function. Once again, the code looks a bit more complicated as I tried to capture the spirit of the specification, but ultimately all it’s doing is reject(thenError).
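To see why the try-catch matters, consider a hypothetical thenable whose then() method throws. Resolving a pledge with it ends with the pledge rejected instead of an error escaping the job:

const badThenable = {
    then() {
        throw new Error("broken thenable");
    }
};

// inside the job, then.apply(thenable, [resolve, reject]) throws,
// so the catch block calls reject(thenError) and the pledge is
// rejected with the "broken thenable" error
const pledge = new Pledge(resolve => {
    resolve(badThenable);
});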

Now you can go back and complete the definition of the resolve function inside of createResolvingFunctions() to trigger a PledgeResolveThenableJob as the last step:

export function createResolvingFunctions(pledge) {

    const alreadyResolved = { value: false };

    const resolve = resolution => {

        if (alreadyResolved.value) {
            return;
        }

        alreadyResolved.value = true;

        // can't resolve to the same pledge
        if (Object.is(resolution, pledge)) {
            const selfResolutionError = new TypeError("Cannot resolve to self.");
            return rejectPledge(pledge, selfResolutionError);
        }

        // non-objects fulfill immediately
        if (!isObject(resolution)) {
            return fulfillPledge(pledge, resolution);
        }

        let thenAction;

        try {
            thenAction = resolution.then;
        } catch (thenError) {
            return rejectPledge(pledge, thenError);
        }

        // if the thenAction isn't callable then fulfill the pledge
        if (!isCallable(thenAction)) {
            return fulfillPledge(pledge, resolution);
        }

        /*
         * If `thenAction` is callable, then we need to wait for the thenable
         * to resolve before we can resolve this pledge.
         */
        const job = new PledgeResolveThenableJob(pledge, resolution, thenAction);
        hostEnqueuePledgeJob(job);
    };

    // attach the record of resolution and the original pledge
    resolve.alreadyResolved = alreadyResolved;
    resolve.pledge = pledge;

    // reject function omitted for ease of reading

    return {
        resolve,
        reject
    };
}

If resolution is a thenable, then a PledgeResolveThenableJob is created and queued. That’s important, because anytime a thenable is passed to resolve, it means that the promise isn’t resolved synchronously and you must wait for at least one microtask to complete.
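Native promises behave the same way, so you can observe this delay without any of the Pledge code (a quick sketch):

const inner = Promise.resolve(42);
const outer = new Promise(resolve => {
    // resolving to another promise defers settlement to a job
    resolve(inner);
});

outer.then(value => {
    console.log(value);     // 42, but only after at least one microtask
});

Even though inner is already fulfilled, outer is not fulfilled synchronously; the resolution is routed through a job just like PledgeResolveThenableJob.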

Wrapping Up

The most important concept to grasp in this post is how jobs work and how they relate to microtasks in JavaScript runtimes. Jobs are a central part of promise functionality and in this post you learned how to use a job to resolve a promise to another promise. With that background, you’re ready to move into implementing then(), catch(), and finally(), all of which rely on the same type of job to trigger their handlers. That’s coming up in the next post in this series.

Remember: All of this code is available in the Pledge repository on GitHub. I hope you’ll download it and try it out to get a better understanding of promises.

References
  1. Jobs and Host Operations to Enqueue Jobs

  2. Microtask queueing 


Creating a JavaScript promise from scratch, Part 1: Constructor

Mon, 09/21/2020 - 20:00

Early on in my career, I learned a lot by trying to recreate functionality I saw on websites. I found it helpful to investigate why something worked the way that it worked, and that lesson has stuck with me for decades. The best way to know if you really understand something is to take it apart and put it back together again. That’s why, when I decided to deepen my understanding of promises, I started thinking about creating promises from scratch.

Yes, I wrote a book on ECMAScript 6 in which I covered promises, but at that time, promises were still very new and not yet implemented everywhere. I made my best guess as to how certain things worked, but I never felt truly comfortable with my understanding. So, I decided to take ECMA-262’s description of promises [1] and implement that functionality from scratch.

In this series of posts, I’ll be digging into the internals of my promise library, Pledge. My hope is that exploring this code will help everyone understand how JavaScript promises work.

An Introduction to Pledge

Pledge is a standalone JavaScript library that implements the ECMA-262 promises specification. I chose the name “Pledge” instead of using “Promise” so that I could make it clear whether something was part of native promise functionality or something in the library. As such, wherever the spec uses the term “promise,” I’ve replaced it with the word “pledge” in the library.

If I’ve implemented it correctly, the Pledge class should work the same as the native Promise class. Here’s an example:

import { Pledge } from "https://unpkg.com/@humanwhocodes/pledge/dist/pledge.js";

const pledge = new Pledge((resolve, reject) => {
    resolve(42);
    // or: reject(42);
});

pledge.then(value => {
    console.log(value);
}).catch(reason => {
    console.error(reason);
}).finally(() => {
    console.log("done");
});

// create resolved pledges
const fulfilled = Pledge.resolve(42);
const rejected = Pledge.reject(new Error("Uh oh!"));

Being able to see behind each code example has helped me understand promises a lot better, and I hope it will do the same for you.

Note: This library is not intended for use in production. It’s intended only as an educational tool. There’s no reason not to use the native Promise functionality.

Internal properties of a promise

ECMA-262 [2] specifies the following internal properties (called slots in the spec) for instances of Promise:

  • [[PromiseState]] - One of pending, fulfilled, or rejected. Governs how a promise will react to incoming calls to its then method.
  • [[PromiseResult]] - The value with which the promise has been fulfilled or rejected, if any. Only meaningful if [[PromiseState]] is not pending.
  • [[PromiseFulfillReactions]] - A List of PromiseReaction records to be processed when/if the promise transitions from the pending state to the fulfilled state.
  • [[PromiseRejectReactions]] - A List of PromiseReaction records to be processed when/if the promise transitions from the pending state to the rejected state.
  • [[PromiseIsHandled]] - A boolean indicating whether the promise has ever had a fulfillment or rejection handler; used in unhandled rejection tracking.

Because these properties are not supposed to be visible to developers but need to exist on the instances themselves for easy tracking and manipulation, I chose to use symbols for their identifiers and created the PledgeSymbol object as an easy way to reference them in various files:

export const PledgeSymbol = Object.freeze({
    state: Symbol("PledgeState"),
    result: Symbol("PledgeResult"),
    isHandled: Symbol("PledgeIsHandled"),
    fulfillReactions: Symbol("PledgeFulfillReactions"),
    rejectReactions: Symbol("PledgeRejectReactions")
});
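As a quick illustration of why symbols fit here, symbol-keyed properties stay out of the most common enumeration mechanisms (a minimal sketch using PledgeSymbol):

const pledgeLike = {};
pledgeLike[PledgeSymbol.state] = "pending";

console.log(Object.keys(pledgeLike));           // []
console.log(JSON.stringify(pledgeLike));        // {}

// the value is still there if you have the symbol
console.log(pledgeLike[PledgeSymbol.state]);    // "pending"

They aren’t truly private (Object.getOwnPropertySymbols() can still find them), but they keep the internal slots from colliding with or showing up alongside ordinary properties.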

With PledgeSymbol now defined, it’s time to move on to creating the Pledge constructor.

How does the Promise constructor work?

The Promise constructor is used to create a new promise in JavaScript. You pass in a function (called the executor) that receives two arguments, resolve and reject, which are functions that bring the promise’s lifecycle to completion. The resolve() function resolves the promise to some value (or no value) and the reject() function rejects the promise with a given reason (or no reason). For example:

const promise = new Promise((resolve, reject) => {
    resolve(42);
});

promise.then(value => {
    console.log(value);     // 42
});

The executor is run immediately, so the variable promise in this example is already fulfilled with the value 42 (the internal [[PromiseState]] property is fulfilled). (If you had used reject() instead of resolve(), then promise would be in a rejected state.)
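You can verify that the executor runs synchronously with a couple of log statements (a small sketch):

console.log("before");

new Promise(resolve => {
    console.log("inside executor");
    resolve(42);
});

console.log("after");

// Output:
// before
// inside executor
// after

Only the handlers attached with then(), catch(), or finally() are deferred; the executor itself runs before the constructor returns.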

Additionally, if the executor throws an error, then that error is caught and the promise is rejected, as in this example:

const promise = new Promise((resolve, reject) => {
    throw new Error("Oops!");
});

promise.catch(reason => {
    console.log(reason.message);     // "Oops!"
});

A couple of other notes about how the constructor works:

  1. If the executor is missing then an error is thrown
  2. If the executor is not a function then an error is thrown

In both cases, the error is thrown as usual and does not result in a rejected promise.
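Given the error messages used in the implementation below, both failure cases look like this (a quick sketch; the native Promise constructor throws TypeErrors with its own messages):

new Pledge();       // TypeError: Executor missing.
new Pledge(42);     // TypeError: Executor must be a function.

// neither call produces a rejected pledge; the error propagates
// to the calling code like any other thrown error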

With all of this background information, here’s what the code to implement these behaviors looks like:

export class Pledge {
    constructor(executor) {

        if (typeof executor === "undefined") {
            throw new TypeError("Executor missing.");
        }

        if (!isCallable(executor)) {
            throw new TypeError("Executor must be a function.");
        }

        // initialize properties
        this[PledgeSymbol.state] = "pending";
        this[PledgeSymbol.result] = undefined;
        this[PledgeSymbol.isHandled] = false;
        this[PledgeSymbol.fulfillReactions] = [];
        this[PledgeSymbol.rejectReactions] = [];

        const { resolve, reject } = createResolvingFunctions(this);

        /*
         * The executor is executed immediately. If it throws an error, then
         * that is a rejection. The error should not be allowed to bubble
         * out of this function.
         */
        try {
            executor(resolve, reject);
        } catch (error) {
            reject(error);
        }
    }
}

After checking the validity of the executor argument, the constructor next initializes all of the internal properties by using PledgeSymbol. These properties are close approximations of what the specification describes: a string is used for the state instead of an enum, and the fulfill and reject reactions are instances of Array because there is no List class in JavaScript.

Next, the resolve and reject functions used in the executor are created using the createResolvingFunctions() function. (I’ll go into detail about this function later in this post.) Last, the executor is run, passing in resolve and reject. It’s important to run the executor inside of a try-catch statement to ensure that any error results in a promise rejection rather than a thrown error.

The isCallable() function is just a helper function I created to make the code read more like the specification. Here’s the implementation:

export function isCallable(argument) {
    return typeof argument === "function";
}
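The resolve function shown later also depends on an isObject() helper that mirrors the spec’s check for Object types. A minimal sketch of how that helper might be implemented (my assumption, as it isn’t shown in this post; note that functions count as objects in ECMA-262 terms):

export function isObject(argument) {
    const type = typeof argument;

    // functions are objects in ECMA-262, so include them here
    return (type === "object" && argument !== null) || (type === "function");
}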

I think you’ll agree that the Pledge constructor itself is not very complicated and follows a fairly standard process of validating the input, initializing instance properties, and then performing some operations. The real work is done inside of createResolvingFunctions().

Creating the resolving functions

The specification defines a CreateResolvingFunctions abstract operation [3], which is a fancy way of saying that it’s a series of steps to perform as part of some other function or method. To make it easy to go back and forth between the specification and the Pledge library, I’ve opted to use the same name for an actual function. The details in the specification aren’t all relevant to implementing the code in JavaScript, so I’ve omitted or changed some parts. I’ve also kept some parts that may seem nonsensical within the context of JavaScript; I’ve done that intentionally, once again, for ease of going back and forth with the specification.

The createResolvingFunctions() function is responsible for creating the resolve and reject functions that are passed into the executor. However, this function is actually used elsewhere as well, allowing other parts of the library to retrieve these functions in order to manipulate existing Pledge instances.

To start, the basic structure of the function is as follows:

export function createResolvingFunctions(pledge) {

    // this "record" is used to track whether a Pledge is already resolved
    const alreadyResolved = { value: false };

    const resolve = resolution => {
        // TODO
    };

    // attach the record of resolution and the original pledge
    resolve.alreadyResolved = alreadyResolved;
    resolve.pledge = pledge;

    const reject = reason => {
        // TODO
    };

    // attach the record of resolution and the original pledge
    reject.alreadyResolved = alreadyResolved;
    reject.pledge = pledge;

    return {
        resolve,
        reject
    };
}

The first oddity of this function is the alreadyResolved object. The specification states that it’s a record, so I’ve chosen to implement it using an object. Doing so ensures the same value is being read and modified regardless of location; a simple boolean would not have worked, because a primitive written to the resolve and reject properties would be copied rather than shared.
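Here’s a small sketch of the difference: a primitive flag stored on each function is an independent copy, while a shared record object is referenced, so both functions observe every update:

const resolveFn = () => {};
const rejectFn = () => {};

// primitives stored as properties are independent copies
resolveFn.done = false;
rejectFn.done = false;
resolveFn.done = true;
console.log(rejectFn.done);                     // false -- out of sync

// a shared record object keeps both in sync
const alreadyResolved = { value: false };
resolveFn.alreadyResolved = alreadyResolved;
rejectFn.alreadyResolved = alreadyResolved;
resolveFn.alreadyResolved.value = true;
console.log(rejectFn.alreadyResolved.value);    // true -- one shared record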

The specification also indicates that the resolve and reject functions should have properties containing alreadyResolved and the original promise (pledge). This is done so that the resolve and reject functions can access those values while executing. However, that’s not necessary in JavaScript because both functions are closures and can access those same values directly. I’ve opted to keep this detail in the code for completeness with the specification but they won’t actually be used.

As mentioned previously, the contents of each function are where most of the work is done. However, the functions vary in how complex they are. I’ll start by describing the reject function, as it is a great deal simpler than resolve.

Creating the reject function

The reject function accepts a single argument, the reason for the rejection, and places the promise in a rejected state. That means any rejection handlers added using then() or catch() will be executed. The first step in that process is to ensure that the promise hasn’t already been resolved, so you check the value of alreadyResolved.value and, if true, just return without doing anything. If alreadyResolved.value is false, then you set it to true and continue on to change the internal state of the promise. This ensures that a given set of resolve and reject handlers can only be called once. Here’s what that function looks like in the Pledge library:

export function createResolvingFunctions(pledge) {

    const alreadyResolved = { value: false };

    // resolve function omitted for ease of reading

    const reject = reason => {

        if (alreadyResolved.value) {
            return;
        }

        alreadyResolved.value = true;

        return rejectPledge(pledge, reason);
    };

    reject.pledge = pledge;
    reject.alreadyResolved = alreadyResolved;

    return {
        resolve,
        reject
    };
}

The rejectPledge() function is another abstract operation from the specification [4] that is used in multiple places and is responsible for changing the internal state of a promise. Here are the steps directly from the specification:

  1. Assert: The value of promise.[[PromiseState]] is pending.
  2. Let reactions be promise.[[PromiseRejectReactions]].
  3. Set promise.[[PromiseResult]] to reason.
  4. Set promise.[[PromiseFulfillReactions]] to undefined.
  5. Set promise.[[PromiseRejectReactions]] to undefined.
  6. Set promise.[[PromiseState]] to rejected.
  7. If promise.[[PromiseIsHandled]] is false, perform HostPromiseRejectionTracker(promise, "reject").
  8. Return TriggerPromiseReactions(reactions, reason).

For the time being, I’m going to skip steps 7 and 8, as those are concepts I’ll cover later in this series of blog posts. The rest can be almost directly translated into JavaScript code like this:

export function rejectPledge(pledge, reason) {

    if (pledge[PledgeSymbol.state] !== "pending") {
        throw new Error("Pledge is already settled.");
    }

    const reactions = pledge[PledgeSymbol.rejectReactions];

    pledge[PledgeSymbol.result] = reason;
    pledge[PledgeSymbol.fulfillReactions] = undefined;
    pledge[PledgeSymbol.rejectReactions] = undefined;
    pledge[PledgeSymbol.state] = "rejected";

    if (!pledge[PledgeSymbol.isHandled]) {
        // TODO: perform HostPromiseRejectionTracker(promise, "reject").
    }

    // TODO: Return `TriggerPromiseReactions(reactions, reason)`.
}

All rejectPledge() is really doing is setting the various internal properties to the appropriate values for a rejection and then triggering the reject reactions. Once you understand that promises are being ruled by their internal properties, they become a lot less mysterious.

The next step is to implement the resolve function, which is quite a bit more involved than reject but fundamentally is still modifying internal state.

Creating the resolve function

I’ve saved the resolve function for last due to the number of steps involved. If you’re unfamiliar with promises, you may wonder why it’s more complicated than reject, as they should be doing most of the same steps but with different values. The complexity comes due to the different ways resolve handles different types of values:

  1. If the resolution value is the promise itself, then throw an error.
  2. If the resolution value is a non-object, then fulfill the promise with the resolution value.
  3. If the resolution value is an object with a then property:
    1. If the then property is not a method, then fulfill the promise with the resolution value.
    2. If the then property is a method (that makes the object a thenable), then call then with both a fulfillment and a rejection handler that will resolve or reject the promise.

So the resolve function only fulfills a promise immediately in the case of a non-object resolution value or a resolution value that is an object but doesn’t have a callable then property. If a second promise is passed to resolve then the original promise can’t be settled (either fulfilled or rejected) until the second promise is settled. Here’s what the code looks like:

export function createResolvingFunctions(pledge) {

    const alreadyResolved = { value: false };

    const resolve = resolution => {

        if (alreadyResolved.value) {
            return;
        }

        alreadyResolved.value = true;

        // can't resolve to the same pledge
        if (Object.is(resolution, pledge)) {
            const selfResolutionError = new TypeError("Cannot resolve to self.");
            return rejectPledge(pledge, selfResolutionError);
        }

        // non-objects fulfill immediately
        if (!isObject(resolution)) {
            return fulfillPledge(pledge, resolution);
        }

        let thenAction;

        /*
         * At this point, we know `resolution` is an object. If the object
         * is a thenable, then we need to wait until the thenable is resolved
         * before resolving the original pledge.
         *
         * The `try-catch` is because retrieving the `then` property may cause
         * an error if it has a getter and any errors must be caught and used
         * to reject the pledge.
         */
        try {
            thenAction = resolution.then;
        } catch (thenError) {
            return rejectPledge(pledge, thenError);
        }

        // if the thenAction isn't callable then fulfill the pledge
        if (!isCallable(thenAction)) {
            return fulfillPledge(pledge, resolution);
        }

        /*
         * If `thenAction` is callable, then we need to wait for the thenable
         * to resolve before we can resolve this pledge.
         */

        // TODO: Let job be NewPromiseResolveThenableJob(promise, resolution, thenAction).
        // TODO: Perform HostEnqueuePromiseJob(job.[[Job]], job.[[Realm]]).
    };

    // attach the record of resolution and the original pledge
    resolve.alreadyResolved = alreadyResolved;
    resolve.pledge = pledge;

    // reject function omitted for ease of reading

    return {
        resolve,
        reject
    };
}

As with the reject function, the first step in the resolve function is to check the value of alreadyResolved.value and either return immediately if it’s true or set it to true. After that, the resolution value needs to be checked to see what action to take. The last step in the resolve function (marked with TODO comments) is for the case of a thenable that needs handlers attached. This will be discussed in my next post.

The fulfillPledge() function referenced in the resolve function looks a lot like the rejectPledge() function referenced in the reject function and simply sets the internal state:

export function fulfillPledge(pledge, value) {

    if (pledge[PledgeSymbol.state] !== "pending") {
        throw new Error("Pledge is already settled.");
    }

    const reactions = pledge[PledgeSymbol.fulfillReactions];

    pledge[PledgeSymbol.result] = value;
    pledge[PledgeSymbol.fulfillReactions] = undefined;
    pledge[PledgeSymbol.rejectReactions] = undefined;
    pledge[PledgeSymbol.state] = "fulfilled";

    // TODO: Return `TriggerPromiseReactions(reactions, value)`.
}

As with rejectPledge(), I’m leaving off the TriggerPromiseReactions operations for discussion in the next post.

Wrapping Up

At this point, you should have a good understanding of how a Promise constructor works. The most important thing to remember is that every operation so far is synchronous; there is no asynchronous operation until we start dealing with then(), catch(), and finally(), which will be covered in the next post. When you create a new instance of Promise and pass in an executor, that executor is run immediately, and if either resolve or reject is called synchronously, then the newly created promise is already fulfilled or rejected, respectively. It’s only what happens after that point where you get into asynchronous operations.

All of this code is available in the Pledge repository on GitHub. I hope you’ll download it and try it out to get a better understanding of promises.

References
  1. Promise Objects 

  2. Properties of Promise instances 

  3. CreateResolvingFunctions(promise) 

  4. RejectPromise(promise, reason) 


How to safely use GitHub Actions in organizations

Mon, 07/20/2020 - 20:00

GitHub Actions [1] are programs designed to run inside of workflows [2], triggered by specific events inside a GitHub repository. To date, people use GitHub Actions to do things like run continuous integration (CI) tests, publish releases, respond to issues, and more. Because the workflows are executed inside a fresh virtual machine that is deleted after the workflow completes, there isn’t much risk of abuse inside of the system. There is a risk, however, to your data.

This post is aimed at those who are using GitHub organizations to manage their projects, which is to say, there is more than one maintainer. In that situation, you may not always be aware of who is accessing your repository, whether that be another coworker or a collaborator you’ve never met. If you are the only maintainer of a project then your risk is limited to people who steal your credentials and the other recommendations in this post aren’t as necessary.

Credential stealing risk

The primary risk for your workflows is credential stealing, where you provide some sensitive information inside of the workflow and somehow that information is stolen. This credential stealing generally takes two forms:

  1. Opportunistic - sensitive information is accidentally output to the log and an attacker finds it and uses it
  2. Intentional - an attacker is able to insert a program into your workflow that steals credentials and sends them to the attacker

GitHub, to its credit, is aware of this possibility and allows you to store sensitive information in secrets [3]. You can store secrets either on a single repository or on an organization, where they can be shared across multiple repositories. You can store things like API tokens or deploy keys securely and then reference them directly inside of a workflow.

By default, there are some important security features built in to GitHub secrets:

  1. Once a secret is created, you can never view the value inside of the GitHub interface or retrieve it using the API; you can only rename the secret, change the value, or delete the secret.
  2. Secrets are automatically masked from log output when GitHub Actions execute. You’ll never accidentally configure a secret to show in the log.
  3. Only administrators can create, modify, or delete secrets. For individuals that means you must be the owner of the repository; for organizations that means you must be an administrator.

These measures are a good default starting place for securing sensitive information, but that doesn’t mean this data is completely safe by default.

Showing secrets in the log

Workflow logs are displayed on each repository under the “Actions” tab and are visible to the public. GitHub Actions tend to hide a lot of their own output for security purposes, but not every command inside of a workflow is implemented with a GitHub Action. Luckily, workflows are designed to hide secrets by default, so it’s unlikely that you’ll accidentally output a secret in plain text. When you access a secret directly in a workflow, the value is masked in the log. For example, suppose this is part of your workflow (SOME_SECRET is a placeholder for whatever secret name you’ve defined):

steps:
  - name: Try to output a secret
    run: echo 'SECRET:${{ secrets.SOME_SECRET }}'

Accessing data off of the secrets object automatically masks the value in the log, so you’ll end up seeing something like this in the log:

SECRET:***

You’re safe so long as your secrets stay within the confines of a workflow, where GitHub will mask the values for you. The more dangerous situation is what happens with the commands executed as part of your workflow. If they make use of a secret, they could potentially reveal it in the log.

For example, suppose you have a Node.js file named echo.js containing the following:

console.log(process.argv[2]);

This file will output the first argument passed to the Node.js process. If you configure it in a workflow, you could very easily display a secret accidentally, such as:

steps:
  - name: Try to output a secret
    run: node ./echo.js ${{ secrets.SOME_SECRET }}

While the command line itself will be masked in the log, there is no accounting for the output of the command, which will output whatever is passed in.
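In that case the log would contain something like the following (hypothetical output; the command line is masked, but the program’s own output is not):

Run node ./echo.js ***
my-actual-secret-value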

Key points about this scenario:

  • This is most likely an accident rather than an attack. An attacker would most likely want to hide the fact that they were able to get access to your secret. By outputting it into the log, it’s there for anyone to see and trace back to the source.
  • An accident like this can open the door for opportunistic credential stealing [4] by someone who notices the secrets were exposed.

Although accidentally outputting secrets to the log is a bad situation, remote credential stealing is worse.

Remote credential stealing

This scenario is more likely an attack than an accident. The way this happens is that a rogue command has made it into your workflow file and is able to read your secrets and then transmit them to a different server. There isn’t any overt indication that this has happened in the log so it may go unnoticed for a long time (or forever).

There are a number of ways for these rogue utilities to be introduced because GitHub workflows rely on installing external dependencies to execute. Whether you need to execute a third-party GitHub action or install something using a package manager, you are assuming that you’re not using malicious software.

The most important question to ask is how might a malicious utility make it into your workflow files? There are two answers: accidentally or intentionally. However, there are several ways each can play out:

  • As with outputting secrets to the log, a well-intentioned developer might have copy-pasted something from another workflow file and introduced it into your codebase. Maybe it was committed directly to the development branch without review because it’s a small project. This scenario plays out every day as attackers try to trick developers into installing malicious software that otherwise looks harmless.
  • An attacker might have gained control of a package that already has a reputation as reliable and updated it to contain malicious code. (I’m painfully aware of how this can happen. [5]) Your workflow may blindly pull in the package and use it, expecting it to be safe.
  • An attacker might submit a pull request to your repository containing a workflow change, hoping no one will look at it too closely before merging.
  • An attacker might have stolen someone’s credentials and used them to modify a workflow to contain a malicious command.

In any case, there are enough ways for attackers to introduce malicious software into your workflow. Fortunately, there are a number of ways to protect yourself.

Protection strategies

Generally speaking, the strategies to further protect your GitHub workflows fall into the following categories:

  1. Protect yourself
  2. Protect your development branch
  3. Limit scopes
  4. Workflow best practices
Protect yourself

The easiest way to steal credentials is for an attacker to pretend that they’re you. Once they have control of your GitHub or package manager account, they have all the access they need to not only harm you but also harm others. The advice here is timeless, but worth repeating:

  • Use a password manager and generate a strong, unique password for each site you use. Your GitHub password should not be the same as your npm password, for example.
  • Enable two-factor authentication (2FA) on GitHub [6] and any other sites you use. Prefer to use an authentication app or a security key instead of text messages whenever possible.
  • If you are a GitHub organization administrator, require all organization members to enable 2FA. [7]

By protecting your own login information, you make it a lot harder for attackers to use your projects to attack you or others.

Protect your branches

At a minimum, you should protect your development branch with rules about what is allowed to be merged. Your development branch is the branch where pull requests are sent and where your releases are cut from. In many cases that will be the master branch, but some teams also use dev, trunk, or any number of other names. Once code makes it into your development branch, it is effectively “live” (for workflows) and highly likely to make it into a release (where it could negatively affect others). That’s why protecting your development branch is important.

GitHub allows you to protect any branch in a number of ways. [8] To set up a protected branch, go to your repository settings, click “Branches” in the menu, then under “Branch Protection Rules” click the “Add Rule” button. Then, you can specify the branches to protect and exactly how to protect them.

There are a lot of options, but here are the ones I recommend as a starting point for your development branch:

  1. Require pull requests before merging - this prevents you from pushing directly to the development branch. All changes must go through a pull request, even from admins (though you can allow specific people to bypass this protection, that’s not advisable). This is important to ensure that there’s some notification of any changes made to the development branch and someone has the opportunity to review them before merging.
  2. Required approval reviews - by default this is set to one. Ideally, you should require approvals from at least two people to avoid the case where a malicious actor has secured the login of one team member and can therefore self-approve a pull request.
  3. Dismiss stale pull request approvals when new commits are pushed - by default this is off, and you should turn it on. This prevents an attack where a malicious actor submits an appropriate pull request, waits for approval, and then adds new commits to the pull request before merging. With this option enabled, new commits pushed to the pull request will invalidate previous approvals.
  4. Require review from Code Owners - it’s a good idea to set up code owners [8] for workflow files and other sensitive files (a sample CODEOWNERS sketch follows this list). Once you do, you can enable this option to require the code owners to approve any pull requests related to the code they own. This ensures that those who are most knowledgeable about GitHub Actions are required to approve any pull requests.
  5. Require status checks to pass before merging - assuming you have status checks running on pull requests (such as automated testing or linting), enable this option to ensure pull requests can’t be merged that have failing status checks. This is another layer of security to prevent malicious code from making it into your repository.
  6. Include administrators - this option ensures that even administrators must adhere to the rules you’ve set up for the branch. While a compromised administrator account can turn this setting off, turning it on ensures administrators don’t accidentally merge or push changes.
  7. Allow force pushes - this is off by default and should remain off. Force pushes allow someone to completely overwrite the remote branch, which opens you up to all kinds of bad situations. Force pushes to the development branch should never be allowed in an organization.
  8. Allow deletions - this is also off by default and should remain off. You don’t want to accidentally delete your development branch.
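For item 4, code owners are defined in a CODEOWNERS file in the repository root (or in the .github/ directory), mapping path patterns to required reviewers. A minimal sketch, with a hypothetical team name:

# any change to workflow files requires approval from the security team
/.github/workflows/ @your-org/security-team

# protect the CODEOWNERS file itself from unreviewed changes
/.github/CODEOWNERS @your-org/security-team

With this file in place and the “Require review from Code Owners” rule enabled, pull requests touching those paths cannot merge without an approval from the listed team.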

While these settings won’t prevent all attacks, they certainly make a number of common attacks a lot more difficult. You can, of course, create rules that are more strict if you have other needs.

Because GitHub Actions and workflows are executed in every branch of your repository, it’s important to consider whether or not you need to protect all of your remote branches. If your team doesn’t use remote branches for feature development then I would recommend protecting all of your branches.

Limit scopes

One of the classic pieces of computer security advice is to always limit the scope of changes allowed at one time. For protecting your secrets, here are a number of ways you can limit scope:

  • Favor repository-only secrets - if you only have one repository that needs access to a secret, then create the secret only on the repository instead of on the organization. This further limits the attack surface.
  • Limit organization secret scope - organization secrets can be scoped to only public, only private, or just specific repositories. Limiting the number of repositories with access to the secrets also decreases the attack surface. Your credentials are only as secure as your least secure repository with access to your secrets.
  • Limit the number of admins - keep the number of repository or organization administrators small. Only admins can manage GitHub secrets, so keeping this group small will also minimize the risk.
  • Minimize credentials - ensure that any credentials generated to use in secrets have the minimal required permissions to be useful. If an app needs write permission and not read permission, then generate a credential that only allows writes. This way you minimize the damage if a credential is stolen.

Even if you don’t follow any of the other advice in this article, limiting the scope of your secrets is really the minimum you should do to protect them.

Never store a GitHub token with administrator privileges as a secret. This would allow any workflow in any branch (even unprotected branches) to modify your repository in any way it wants, including pushing to protected branches. [9]

Workflow best practices

The last step is to ensure your workflows are as safe as possible. The concern here is that you pass secrets into a utility that will either log that data unmasked or steal the credentials silently. Naturally, the first step is to verify the actions and utilities you are using are safe to use.

Disabling Actions

If you don’t intend to use GitHub Actions in your organization, you can disable them for the entire organization. On the organization Settings page, go to “Actions” and then select “Disable actions for this organization.” [10] This ensures that no repositories can use GitHub Actions and is the safest setting if you don’t intend to use them.

Use only local Actions

Another option is to allow the organization to use workflows but only with actions that are contained inside the same repository. This effectively forces repositories to install their own copies of actions, giving you control over exactly which actions may be executed.

To enable this setting, go to the organization Settings page, go to “Actions”, and then select “Enable local Actions only for this organization.” [10]
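A workflow references a local action by a path within the repository rather than by owner/name. A short sketch (the local path is hypothetical):

steps:
  # check out the repository first so the action's files exist on the runner
  - uses: actions/checkout@v2

  # run the action from its in-repository path
  - uses: ./.github/actions/my-local-action

Keep in mind that under a strict local-only policy, even a stock action like actions/checkout would need to be copied into the repository before it could be used.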

Identifying safe Actions

There are a couple ways you can know that a published GitHub Action is safe:

  1. It begins with actions/, such as actions/checkout. These are published by GitHub itself and are therefore safe to use.
  2. The action is published in the GitHub Actions Marketplace [11] and has a “verified creator” badge next to the author. This indicates that the creator is a verified partner of GitHub and therefore the action is safe.

If an action doesn’t fall into one of these two categories, that doesn’t mean it’s not safe, just that you need to do more research into the action.

All actions in the GitHub Actions Marketplace link back to the source code repository they are published from. You should always look at the source code to ensure that it is performing the operations it claims to be performing (and doing nothing else). Of course, if you happen to know and trust the publisher of the action, you may decide to trust that the action does what it says.

Provide secrets one command at a time

When configuring a workflow, ensure that you are limiting the number of commands with access to your secrets. For example, you might configure a secret as an environment variable to run a command, such as this:

steps:
  - name: Run a command
    run: some-command
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Here, the GITHUB_TOKEN environment variable is set with the secrets.GITHUB_TOKEN secret value. The some-command utility has access to that environment variable. Assuming that some-command is a trusted utility, there is no problem. The problem occurs when you run multiple commands inside of a run statement, such as:

steps:
  - name: Run a command
    run: |
      some-command
      some-other-command
      yet-another-command
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

In this case, the run statement is running multiple commands at once. The env statement now applies to all of those commands and will be available whether they need access to GITHUB_TOKEN or not. If the only utility that needs GITHUB_TOKEN is some-command, then limit the use of env to just that command, such as:

steps:
  - name: Run a command
    run: some-command
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  - run: |
      some-other-command
      yet-another-command

With this rewritten example, only some-command has access to GITHUB_TOKEN while the other commands are run separately without GITHUB_TOKEN. Limiting which commands have access to your secrets is another important step in preventing credential stealing.

Conclusion

While GitHub Actions are a great addition to the GitHub development ecosystem, it’s still important to take security into account when using them. The security considerations are quite a bit different when you’re dealing with a GitHub organization maintaining projects rather than a single maintainer. The more people who can commit directly to your development branch, the more chances there are for security breaches.

The most important takeaway from this post is that you need to have protections, both automated and manual, in order to safely use GitHub Actions in organizations. Whether you decide to only allow local actions or to assign someone as a code owner who must approve all workflow changes, it’s better to have some protections in place than to have none. That is especially true when you have credentials stored as GitHub secrets that would allow people to interact with outside systems on your behalf.

Remember, you are only as secure as your least secure user, branch, or repository.

References
  1. GitHub: GitHub Actions

  2. GitHub: Configuring and managing workflow files and runs 

  3. GitHub: Creating and storing encrypted secrets 

  4. Credential Stealing as an Attack Vector 

  5. ESLint postmortem for malicious package publishes (https://eslint.org/blog/2018/07/postmortem-for-malicious-package-publishes)

  6. GitHub: Securing your account with two-factor authentication (2FA) 

  7. GitHub: Requiring two-factor authentication in your organization 

  8. GitHub: Configuring protected branches

  9. Allowing github-actions(bot) to push to protected branch 

  10. GitHub: Disabling or limiting GitHub Actions for your organization

  11. GitHub Actions Marketplace 
