6.031 — Software Construction
Fall 2021

Reading 22: Promises

Software in 6.031

  • Safe from bugs. Correct today and correct in the unknown future.
  • Easy to understand. Communicating clearly with future programmers, including future you.
  • Ready for change. Designed to accommodate change without rewriting.

Objectives

This reading discusses concurrent computation using promises.

We start at the highest level, with the promise abstraction, and the await operator and async function declaration that allow concurrent computations to happen in TypeScript in a way that closely resembles familiar synchronous programming.

Then we will dig below the covers to understand more about what is really happening with Promise, await, and async.

Promises

A promise represents a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet. The name comes from the notion that it is a promise to finish the computation and provide a result sometime in the future.

The Promise type in TypeScript is generic: a Promise<T> represents a concurrent computation that at some point should produce a value of type T.

Here are examples of concurrent computations using promises. (Some of these are actual functions provided by Node packages; others don’t currently exist as library functions but could be readily implemented.)

  • readFile(pathname:string, ...) returning a Promise<string>. The file is loaded concurrently, and the value it promises to eventually produce is the content of the file.
  • diskSpace(folder:string) returning a Promise<number>. A folder tree in the filesystem is traversed in the background, the sizes of all of its files are added up, and the resulting number of bytes is the promised result.
  • fetch(url:string) returning a Promise<Response>. The URL is opened in the background, and the promised result is an HTTP Response object.
  • timeout(milliseconds:number) returning a Promise<void>. A timer runs for the given number of milliseconds, and the promised result has type void (the same as a function that returns no result). This may seem like an empty promise (ha), but we’ll see shortly that even empty promises are useful, because we can trigger additional computation when the computation completes.

Importantly, each of these functions returns its promise immediately. The function starts a long-running computation, like loading a web page or scanning the filesystem, but does not wait for that computation to complete. In lieu of returning a computed value, it returns a promise associated with the concurrent computation that it started. When the computation finishes, its result will be accessible through the promise.

What does this let us do? One benefit we can get from promises is parallelism: by starting multiple computations and collecting a promise for each one, the computations can proceed concurrently with each other:

import fetch from 'node-fetch';
const promise1 = fetch('http://www.mit.edu/');
const promise2 = fetch('http://www.harvard.edu/');
const promise3 = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers in parallel

More concretely, a Promise<T> is a mutable value with three states:

  • pending: the computation associated with the promise has not finished yet.
  • fulfilled: the computation has finished, and the promise now holds the value of type T that the computation produced.
  • rejected: the computation failed for some reason, and the promise now holds an Error object describing the failure.

A promise starts in the pending state, and eventually transitions into either fulfilled (if the computation completes) or rejected (if the computation throws an exception). It may also stay in the pending state forever (for example, if the computation is waiting for an event that never happens). Once a promise is fulfilled or rejected, it remains in that state; there is no way to reset a promise back to the pending state.
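The one-way nature of these transitions can be seen in a small sketch. It uses the standard Promise constructor, which this reading does not otherwise cover, to play the role of the computation; the names settleOnce and observeSettled are made up for illustration.

```typescript
// Hypothetical helper: a promise whose computation tries to settle it more than once.
function settleOnce(): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    resolve('first');               // pending -> fulfilled with 'first'
    resolve('second');              // ignored: the promise is already fulfilled
    reject(new Error('too late'));  // ignored for the same reason
  });
}

// Hypothetical consumer: reports the value the promise ended up holding.
async function observeSettled(): Promise<string> {
  return await settleOnce();        // produces 'first'
}
```

Only the first transition has any effect, so the promise stays fulfilled with 'first'.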

If you print a promise for debugging purposes, you will see its state and the result stored inside it (if it is no longer pending). For example, at the ts-node TypeScript prompt, you can print a promise immediately to see it in the pending state:

> import fs from 'fs/promises';
> const promise = fs.readFile('account', { encoding: 'utf-8' }); console.log(promise);
Promise { <pending> }

Assuming the file account contains 200 (representing a bank account with $200 in it), then your next command should see that the promise has been fulfilled (since it takes very little time to finish loading the file, relative to your typing speed):

> console.log(promise);
Promise { '200' }

If on the other hand the file account does not exist, you will see that the promise has been rejected. The exception object that would normally have been thrown to describe the error is instead stored in the promise:

> console.log(promise);
Promise {
  <rejected> [Error: ENOENT: no such file or directory, open 'account'] { ... }
}

Await

We’ve seen how to start up concurrent computations and obtain Promise values associated with them. Now how can we interact with the final values produced by those computations? One simple way is to use the await keyword:

const promise: Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
const data: string = await promise;
// data is now '200'

The await keyword is a built-in operator that converts a value of type Promise<T> into a value of type T. It waits until the promise has been fulfilled, and then unpacks the promise to extract the promised value. If the promise is rejected, then await throws an exception instead, using the Error object that the computation stored in the promise.
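As a rough illustration of the rejected case, the sketch below awaits a promise that is already rejected and catches the resulting exception. The function names are hypothetical, and the await is wrapped in an async function, a construct explained later in this reading.

```typescript
// Hypothetical stand-in for a computation that fails: an already-rejected promise.
function failingComputation(): Promise<string> {
  return Promise.reject(new Error('no such account'));
}

async function describeOutcome(): Promise<string> {
  try {
    const data: string = await failingComputation();
    return 'fulfilled with ' + data;
  } catch (err) {
    // the Error stored in the rejected promise is what await throws
    return 'rejected: ' + (err as Error).message;
  }
}
```

Here describeOutcome() produces 'rejected: no such account', because the await throws rather than returning a value.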

Note that await is not triggering the computation that the promise represents. The computation is already underway! It was set in motion by the original function that produced the promise. The computation may have made some progress, or even run to completion, by the time execution arrives at the await.

A better mental model for await is that it is handling a deferred return from the function that produced the promise. The await waits for the computation to finish, and then provides its return value, or else a thrown exception, just like the result of a normal function call.
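To make the timing concrete, here is a minimal sketch that records when each step happens. The names startWork, runDemo, and progressLog are invented for this example, and the timer built with the standard Promise constructor merely simulates a long-running computation.

```typescript
const progressLog: string[] = [];

// Hypothetical asynchronous function: records when its work starts and finishes.
function startWork(): Promise<number> {
  progressLog.push('work started');
  return new Promise<number>((resolve) => {
    setTimeout(() => {
      progressLog.push('work finished');
      resolve(42);
    }, 10);
  });
}

async function runDemo(): Promise<string[]> {
  const promise = startWork();           // the work begins here...
  progressLog.push('caller continues');  // ...while the caller keeps going...
  const result = await promise;          // ...and await only collects the result
  progressLog.push('got ' + result);
  return progressLog;
}
```

The log ends up as 'work started', 'caller continues', 'work finished', 'got 42': the work was underway before the await was reached.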

Using await, now we can wrap up the parallel computations that we started:

const promise1: Promise<Response> = fetch('http://www.mit.edu/');
const promise2: Promise<Response> = fetch('http://www.harvard.edu/');
const promise3: Promise<Response> = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers in parallel

const response1: Response = await promise1;
const response2: Response = await promise2;
const response3: Response = await promise3;
// we have now received initial responses from all three servers
// (unless one failed and threw an exception)

Note that there is a better way to handle waiting for multiple promises like this, using Promise.all(), which we will discuss later in the reading.

And we can see now why even an empty promise can be useful, just like a function with a void return value:

const promise: Promise<void> = timeout(2000);
// started up a timer for 2000 milliseconds

await promise;
// no useful return value, but -- now we know that 2000 milliseconds have indeed passed

Note that void is a different type from undefined, which we’ve been using in other contexts to denote the idea of “no useful value”. The void type is designed as the return type of functions that don’t return anything. Conceptually, it has an empty set of values – there is no value that belongs to the void type. It’s simply empty. That’s why we use Promise<void> for computations that will not produce a value, and are just running for the sake of some side-effect (like a timer delay). By contrast, undefined is a type with exactly one value in it, undefined, which is a first-class value that can be passed around, assigned to variables, and stored in data structures.

reading exercises

Timing

Consider these two functions, both designed to wait for a certain amount of time:

// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds:number):Promise<void>;

// sleeps for `milliseconds` milliseconds before returning
function sleep(milliseconds:number):void;

Note that timeout uses a promise, but sleep does not.

Approximately how long does the following code take to reach the point where it prints done?

const promise1 = timeout(1000);
const promise2 = timeout(2000);
await promise1;
await promise2;
console.log('done');

(missing explanation)

sleep(1000);
sleep(2000);
console.log('done');

(missing explanation)

const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
await promise2;
console.log('done');

(missing explanation)

const promise1 = timeout(1000);
const promise2 = timeout(2000);
console.log('done');

(missing explanation)

const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
console.log('done');

(missing explanation)

const promise = timeout(1000);
sleep(2000);
await promise;
console.log('done');

(missing explanation)

const promise = timeout(1000);
await promise;
await promise;
await promise;
console.log('done');

(missing explanation)

Asynchronous functions

The functions we have been using that return promises – readFile, fetch, timeout – are examples of asynchronous functions. An asynchronous function is a function that returns control to its caller before its computation is done.

Contrast that with the more familiar synchronous function, which returns control after its computation is finished, and its return value is already known. Almost every function we’ve used to this point in the course has been synchronous in this way.

Returning a promise is one way to implement an asynchronous function, but it’s not the only way. Callback functions are another way to do it, which we saw in a previous reading. Prior to the introduction of promises in JavaScript, callback functions were the most common way to implement asynchronous function behavior, and there are still many uses of callbacks in the JavaScript library ecosystem.

If you are using promises instead of callbacks, then an asynchronous function can be declared using the async keyword, and it must be declared with return type Promise:

async function getBalance():Promise<number> {
    const data: string = await fs.readFile('account', { encoding: 'utf-8' });
    const balance: number = parseInt(data);
    if ( isNaN(balance) ) throw new Error('account does not contain a number');
    return balance;
}

Note that the body of an async function can use return and throw statements in the same way that a synchronous function would. These are automatically converted into effects on the function’s promise. The return balance statement fulfills the promise with the value of balance, and the throw statement rejects the promise with the given Error object.
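A small sketch of both effects, using a hypothetical parseBalanceText function that takes the file content as an argument so that no file is needed. Note that the call which fails does not throw at the call site; the error only surfaces when the promise is awaited.

```typescript
// Hypothetical variant of getBalance that parses text it is given.
async function parseBalanceText(data: string): Promise<number> {
  const balance: number = parseInt(data);
  if (isNaN(balance)) throw new Error('not a number');  // rejects the promise
  return balance;                                       // fulfills the promise
}

async function checkBoth(): Promise<string[]> {
  const results: string[] = [];
  results.push('good: ' + await parseBalanceText('200'));

  const bad: Promise<number> = parseBalanceText('oops'); // no exception here
  try {
    await bad;                                           // the exception appears here
  } catch (err) {
    results.push('bad: ' + (err as Error).message);
  }
  return results;
}
```

Under these assumptions checkBoth() produces ['good: 200', 'bad: not a number'].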

Since getBalance() is an asynchronous function returning a promise, a caller of getBalance() needs to interact with the promise appropriately, i.e. using await:

const myBalance = await getBalance();

Only async functions are allowed to await

It turns out that the await operator has an important limitation. Because of the way JavaScript handles concurrency, await can only be used inside an async function.

Here is a brief explanation; we’ll return to this again when we unpack how await and async are actually implemented in terms of lower-level Promise operations, later in this reading. Recall that an asynchronous function is designed to give up control to its caller before a computation is done. Whenever an await needs to wait for a pending promise, it doesn’t just sit there spinning its wheels. Instead, it returns control back up to higher levels of the program. For the first await encountered in the function body, this actually does return control to the original caller of the function. When the promise that the await is waiting for finally changes state (either fulfilling or rejecting), then the await regains control by being called from that promise, and continues running the function body. Subsequent awaits in the function body return control to the promise-processing system. So you should think of await as being like a temporary return, which will reactivate when the promise it is waiting for finally fires.
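The "temporary return" behavior can be observed directly. In this sketch (all names are hypothetical), the function body runs up to its first await, the caller then regains control, and only later does the body resume.

```typescript
const trace: string[] = [];

async function loadSomething(): Promise<void> {
  trace.push('1: function body starts');
  await Promise.resolve();   // temporary return to the caller
  trace.push('3: function body resumes after await');
}

function callerOfLoad(): Promise<void> {
  const done: Promise<void> = loadSomething();
  trace.push('2: caller regains control');  // runs before the body resumes
  return done;
}
```

Once the promise returned by callerOfLoad() fulfills, trace holds the three entries in the numbered order.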

TypeScript statically checks this requirement. It will produce a static error if await occurs anywhere except inside an async function.

reading exercises

Windfall

Here are two versions of the bank-account-reading function. The first one is asynchronous:

async function getBalance():Promise<number> {
    const data: string = await fs.readFile('account', { encoding: 'utf-8' });
    return parseInt(data);
}

The second is synchronous:

function getBalanceSync():number {
    const data: string = fs.readFileSync('account', { encoding: 'utf-8' });
    return parseInt(data);
}

Which of the following are correct ways to use getBalance() in an expression?

(missing explanation)

Promises promises

Using getBalance() and getBalanceSync() as defined in the previous exercise, what is the type of each of the following expressions?

[ getBalance(), getBalance() ]

(missing explanation)

[ getBalanceSync(), getBalanceSync() ]

(missing explanation)

[ await getBalance(), await getBalance() ]

(missing explanation)

Tick tock

Recall that we have defined timeout(milliseconds) as an asynchronous function that creates a timer – a promise that will fulfill milliseconds after timeout() was called.

Now consider this asynchronous function:

async function clock(milliseconds:number):________ {
  while (true) {
    await timeout(milliseconds);
    console.log('tick');
  }
}

What is the return type of clock() (the blank in the code above)?

(missing explanation)

Assume the blank is filled in with the correct return type.

What would clock(1000) do?

(missing explanation)

What would await clock(1000) do?

(missing explanation)

It’s awaits all the way down

Consider this simple program, which first calls main, which calls f, which calls g:

function main():void { console.log(f()); }
function f():number { return g()+50;  }
function g():number { return 0; }
main();

Now suppose g() is changed to use await:

function g():number { return await getBalance(); }

The edited program is not compiling yet; there are static errors.

Which additional changes need to be made to g()?

(missing explanation)

What changes need to be made to f()?

(missing explanation)

What changes need to be made to main()?

(missing explanation)

What changes need to be made to the last line of the code, the call to main()?

(missing explanation)

then()

Let’s dig into the Promise abstract data type, to understand more about how it works and how it interacts with await and async function.

An abstract data type is defined by its operations, and it turns out that a Promise<T> has one key operation, called then. Here is the simplest version of then:

interface Promise<T> {
  then<U>(callback: (value: T) => U): Promise<U>;
}

The then operation is a Promise producer. It is called on an existing promise expected to result in a value of type T, and attaches a callback function to that promise. The callback function consumes the T value the promise eventually produces, and generates a value of some other type U (possibly the same as T, of course).

Here’s an example of then in action:

const promise : Promise<string> = fs.readFile('account', { encoding: 'utf-8' });

const biggerPromise : Promise<number> = promise.then(function(data:string) {
  return parseInt(data);
});

Here, the original promise, which is reading a file into a string, is composed by then with a callback function that parses that string into a number. The resulting biggerPromise represents the entire computation – both reading the file and parsing it into a number – and promises to eventually produce that number.

It may help to think about then as a composition operation for asynchronous computations. Conventional function composition takes two functions f and g and composes them into a new function g ⚬ f, such that (g ⚬ f)(x) = g(f(x)). If f: S → T and g: T → U, then the type of the resulting composition is g ⚬ f: S → U.

In much the same way, then composes one computation f, represented by a promise of its return value Promise<T>, with another computation g that expects to consume that T value and produce a U value. The result of the composition is a combined computation that promises to eventually produce that U value, so its type is Promise<U>.

It may also help to think about Promise<T> like a one-element list that (eventually) has a T value in it. From that point of view, then(g) is like map(g) – once the T element arrives, the g callback is called on it, to produce a U value that ends up as the single element of the resulting Promise<U>. (One place this analogy breaks down, however, is with Promise<void>, since such a promise never actually has any element value. Using then(g) on a void promise still calls g when the promise fulfills, but doesn’t pass any value to g.)

We need to expand this notion of then() in one important way. The callback function doesn’t have to return an already-computed U value. It can instead return a promise of its own, of type Promise<U>:

interface Promise<T> {
  then<U>(callback: (value: T) => U|Promise<U>): Promise<U>;
}

The callback might need to do this if computing the U also requires another long-running computation that needs to run in the background. Web page retrieval offers a good example of this, because getting the Response from a fetch() call just means you’ve managed to connect to the web server – actually downloading the full page may take a while longer. Here’s how that promise chain might look:

const promisedResponse: Promise<Response> = fetch('http://www.mit.edu/');

const promisedText: Promise<string> = promisedResponse.then(function(response:Response) {
  // we've made a connection, which is why the `response` object exists
  const downloadingPromise: Promise<string> = response.text();
  return downloadingPromise;
});

Note the several promises shown in this example code:

  • promisedResponse represents the computation that is making the initial connection to www.mit.edu
  • downloadingPromise represents the computation that downloads the MIT homepage after the connection has already been made
  • promisedText represents the composition of both computations: the initial connection followed by the download. The final value of promisedText comes from the value of downloadingPromise, the text of the webpage.

Going back to the function composition analogy, if promisedResponse and downloadingPromise are like the functions f and g, respectively, then promisedText is like the composition g ⚬ f.
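The same chaining pattern can be tried without a network connection. The sketch below uses two made-up asynchronous lookups, simulated with timers and the standard Promise constructor; the point is only that a callback returning a Promise<string> yields a Promise<string>, not a promise of a promise.

```typescript
// Hypothetical first step: look up a numeric id for a username.
function lookUpUserId(username: string): Promise<number> {
  return new Promise<number>((resolve) => {
    setTimeout(() => resolve(username.length), 5);
  });
}

// Hypothetical second step: look up a balance string for an id.
function lookUpBalance(userId: number): Promise<string> {
  return new Promise<string>((resolve) => {
    setTimeout(() => resolve('$' + userId * 100), 5);
  });
}

// Composition of both steps: the callback returns a promise of its own.
function balanceFor(username: string): Promise<string> {
  return lookUpUserId(username).then((userId: number) => {
    return lookUpBalance(userId);
  });
}
```

With these stand-ins, balanceFor('ben') eventually fulfills with '$300'.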

You can call then as many times as you want on a promise, to attach different callbacks that might do different things with the promise’s value.

You can also call then regardless of what state the promise is currently in. Most often, the promise is still pending, so the then callback will run sometime in the future. But if the promise is already fulfilled, the then callback is scheduled immediately, and runs as soon as the code that is currently executing gives up control.
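A short sketch of both points: two callbacks attached to the same promise, and the timing for a promise that is already fulfilled. The names are hypothetical. With the standard Promise implementation, the callbacks do not run in the middle of the then() call itself; they run just after the currently executing code finishes, in the order they were attached.

```typescript
const callbackOrder: string[] = [];

function attachCallbacks(): Promise<void> {
  const ready: Promise<number> = Promise.resolve(200);  // already fulfilled
  ready.then((value) => {
    callbackOrder.push('first callback saw ' + value);
  });
  const last: Promise<void> = ready.then((value) => {
    callbackOrder.push('second callback saw ' + value);
  });
  callbackOrder.push('synchronous code after then()');
  return last;  // fulfills once the second callback has run
}
```

After the returned promise fulfills, callbackOrder starts with the synchronous entry, followed by the first and second callbacks.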

The then operation is the fundamental operation of a promise. then() is the only way to access the value that a promise computes. (await actually uses then, as we’ll see in a moment.) This is partly a safety property, because it ensures that client code can never look inside a promise and see a missing value. But more important, then() is a feature of concurrency design. It allows a concurrent computation to be built up by a sequence of composed computations – a sequence of then() callbacks, which can be interleaved in controlled, predictable ways. We will discuss this more in the next reading.

reading exercises

Then more timing

Let’s return to our friend timeout, which returns a timer promise:

// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds:number):Promise<void>;

Approximately how long does the following code take to reach the point where it prints done?

timeout(1000).then(function() {
  console.log('done');
});
timeout(2000);

(missing explanation)

timeout(1000).then(function() {
  return timeout(2000);
});
console.log('done');

(missing explanation)

timeout(1000).then(function() {
  return timeout(2000).then(function() {
    console.log('done');
  });
});

(missing explanation)

Unpacking an asynchronous function

The then operation now gives us enough power to see what await and async function mean, and how they use promises to construct a computation.

Here’s an async function containing a line that awaits a promise, and then does some computation on the result of the promise:

async function getBalance():Promise<number> {
    const promise: Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
    const data: string = await promise;
    const balance: number = parseInt(data);
    return balance;
}

When await is encountered in the execution of a function like this, TypeScript effectively takes the remainder of the computation that the function would do, and wraps it up into a then callback. So this code:

const data: string = await promise;
const balance: number = parseInt(data);
return balance;

becomes something like this:

promise.then(function(data: string) {
  const balance: number = parseInt(data);
  return balance;
});

Instead of spinning its wheels waiting for the file-loading promise to fulfill, await creates a composite promise, which combines the file-loading computation with the parseInt computation. It then immediately returns control to the caller, along with this composite promise. So the equivalent code, without await or async, looks something like this:

function getBalance():Promise<number> {
    const promise: Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
    return promise.then(function(data: string) {
      const balance: number = parseInt(data);
      return balance;
    });
}

Note that this version of getBalance is fairly simple because it’s just straight-line code. When await is used inside a loop, then the “rest of the function’s computation” includes the remaining iterations of the loop, which can’t be easily shown by a syntactic transformation like this, because it chops the loop in pieces. But this example shows semantically what await means.
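For intuition only, one way to hand-write a loop containing await is to turn each iteration into a then callback that starts the next iteration. This is a sketch of the idea, not what the TypeScript compiler actually emits; pause, countWithAwait, and countWithThen are hypothetical names.

```typescript
// Hypothetical timer helper built on the standard Promise constructor.
function pause(milliseconds: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(() => resolve(), milliseconds);
  });
}

// Loop version using await.
async function countWithAwait(n: number): Promise<number[]> {
  const seen: number[] = [];
  for (let i = 0; i < n; i++) {
    await pause(1);
    seen.push(i);
  }
  return seen;
}

// Roughly equivalent version using only then(): each callback does one
// iteration's work, then returns a promise for the rest of the loop.
function countWithThen(n: number): Promise<number[]> {
  const seen: number[] = [];
  function iterate(i: number): Promise<number[]> {
    if (i >= n) return Promise.resolve(seen);
    return pause(1).then(() => {
      seen.push(i);
      return iterate(i + 1);
    });
  }
  return iterate(0);
}
```

Both versions promise the same array, for example [0, 1, 2] when n is 3.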

Note also that this version of getBalance doesn’t have the async keyword any more, because we’re showing what the original async function means in terms of lower-level TypeScript code. But this version of getBalance is indeed still asynchronous, because it returns a promise representing a concurrent computation that has yet to complete.

Aggregating promises

When running multiple parallel computations, it’s often useful to combine their promises using operations similar to AND and OR.

Promise.all() is like a logical AND – it combines an iterable collection of promises into a single promise that waits for all the promises to fulfill, and returns an array of their results:

const allResponses:Array<Response> = 
  await Promise.all( [ fetch('http://www.mit.edu/'),
                       fetch('http://www.harvard.edu/'),
                       fetch('http://www.tufts.edu/') ] );

But if any of the individual promises fail, then the entire Promise.all() also fails.
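Both behaviors can be tried without a network, using hypothetical helpers laterValue and laterFailure that simulate computations with timers and the standard Promise constructor.

```typescript
// Hypothetical computation that fulfills with `value` after `ms` milliseconds.
function laterValue(value: string, ms: number): Promise<string> {
  return new Promise<string>((resolve) => {
    setTimeout(() => resolve(value), ms);
  });
}

// Hypothetical computation that rejects after `ms` milliseconds.
function laterFailure(message: string, ms: number): Promise<string> {
  return new Promise<string>((_resolve, reject) => {
    setTimeout(() => reject(new Error(message)), ms);
  });
}

async function allSucceed(): Promise<string[]> {
  // results come back in the order of the input array, not completion order
  return await Promise.all([laterValue('a', 20), laterValue('b', 10)]);
}

async function oneFails(): Promise<string> {
  try {
    await Promise.all([laterValue('a', 20), laterFailure('b is down', 10)]);
    return 'all fulfilled';
  } catch (err) {
    return 'rejected: ' + (err as Error).message;
  }
}
```

Here allSucceed() produces ['a', 'b'], while oneFails() produces 'rejected: b is down'.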

Promise.race() is like a logical OR – it waits for any individual promise to either fulfill or reject, and immediately fulfills or rejects in the same way. One use of Promise.race() is putting a timeout on an operation:

const responseOrTimeout: Response|void =
  await Promise.race( [ fetch('http://www.mit.edu/'), 
                        timeout(5000) ] );
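A caller then has to work out which promise won the race. The sketch below shows one possible way, assuming the operation produces a string so that a typeof check can tell the two cases apart; slowAnswer and expiry are hypothetical timer-based stand-ins for the real operation and for timeout().

```typescript
// Hypothetical operation that fulfills with 'answer' after `ms` milliseconds.
function slowAnswer(ms: number): Promise<string> {
  return new Promise<string>((resolve) => {
    setTimeout(() => resolve('answer'), ms);
  });
}

// Hypothetical stand-in for timeout(): fulfills with no value after `ms` milliseconds.
function expiry(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(() => resolve(), ms);
  });
}

async function answerOrTimeout(answerDelay: number, limit: number): Promise<string> {
  const winner: string | void = await Promise.race([slowAnswer(answerDelay), expiry(limit)]);
  // a string means the operation finished first; otherwise the timer did
  return typeof winner === 'string' ? winner : 'timed out';
}
```

Note that the losing computation is not cancelled; it keeps running, and its result is simply ignored.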

Making and keeping a promise

We noted above that the key operation of the Promise type is then(), which allows a consumer of the promise’s value to specify what should happen once the promise is fulfilled. The await operator translates into a call to then().

But there’s clearly something missing in the Promise operations we’ve discussed. A promise is a mutable thing – where are its mutator operations? How does its state change from pending to fulfilled, or rejected?

The answer to this question lies in the fact that Promise has two different clients: the consumer of the promise, who uses then(); and the promiser, the long-running computation that is computing the value for the promise. Promises are designed so that only the promiser should have access to the resolve() and reject() mutators that change the promise’s state.

One way to do that uses a convenient helper type, Deferred<T>:

import Q from 'q';

const deferred : Q.Deferred<T> = Q.defer();

This Deferred value has three components:

  • deferred.promise is a fresh Promise<T> associated with the Deferred value
  • deferred.resolve(t:T) is a mutator that fulfills the associated promise with the value t
  • deferred.reject(err:Error) is a mutator that rejects the associated promise with the given error

So Deferred has the necessary mutators, needed by the promiser. The promiser holds onto this so that it can mutate the promise later. The promiser hands the associated Promise value back to the consumer of the promise, for use in await or then().

Here is an example of how timeout() might be written using Deferred:

/**
 * @param milliseconds duration to wait
 * @returns a promise that fulfills no less than `milliseconds` after timeout() was called
 */
async function timeout(milliseconds:number):Promise<void> {
    const deferred: Q.Deferred<void> = Q.defer();
    setTimeout(function() {
        deferred.resolve(); // this mutates deferred.promise into the fulfilled state
        // or we could call deferred.reject(new Error(...)) to reject the promise instead
    }, milliseconds);
    return deferred.promise;
}

You can see other ways to implement timeout(), plus a larger TinyQueue example, that demonstrate how to create and fulfill your own promises.
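To show that nothing about this pattern is specific to the Q library, here is a minimal sketch of a Deferred-like helper built on the standard Promise constructor. The class name SimpleDeferred and the function timeoutWithDeferred are invented for this example, and this is not how Q itself is implemented.

```typescript
// Hypothetical minimal Deferred: captures the mutators that the
// Promise constructor hands to the promiser.
class SimpleDeferred<T> {
  public readonly promise: Promise<T>;
  private resolveFn!: (value: T) => void;
  private rejectFn!: (err: Error) => void;

  public constructor() {
    this.promise = new Promise<T>((resolve, reject) => {
      this.resolveFn = resolve;
      this.rejectFn = reject;
    });
  }

  // fulfills the associated promise with `value`
  public resolve(value: T): void { this.resolveFn(value); }

  // rejects the associated promise with `err`
  public reject(err: Error): void { this.rejectFn(err); }
}

// timeout() rewritten against the sketch above
function timeoutWithDeferred(milliseconds: number): Promise<void> {
  const deferred = new SimpleDeferred<void>();
  setTimeout(() => { deferred.resolve(); }, milliseconds);
  return deferred.promise;  // the consumer sees only the promise
}
```

The promiser keeps the SimpleDeferred object to itself and hands out only its promise field, so consumers have no access to the mutators.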

Never busy-wait

We’ve seen that the Promise type has one producer operation, then, and two (hidden) mutator operations, resolve and reject. The mutator operations are only available to the original promiser, not to the downstream consumer of the promise.

Notice that a Promise has no direct observer operations. There is no method you can call to interrogate a promise for its state (“are you still pending?”), or to extract its value.

There is a good reason why those operations are not provided. If they were, clients of the promise might be tempted to busy-wait, waiting for the promise to fulfill. Busy-waiting means sitting in a tight loop waiting for some event to occur, without giving up control. For example, here’s a busy-waiting timer:

async function sleep(milliseconds: number): Promise<void> {
  const now = new Date(); // returns the current system clock time 
  const deadline = new Date(now.valueOf() + milliseconds);
  while (new Date() < deadline) {
      // do nothing, just busy-wait until the system clock time reaches deadline
  }
}

This code works in the sense that calling sleep(500) will indeed wait for 500 milliseconds, and its promise will fulfill after that time elapses. But because sleep never releases control during that time period (notice that its loop body contains no await whatsoever), no other asynchronous functions in the program will have a chance to run during that time. The program will simply freeze until the entire body of sleep() finishes and returns.
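The freeze can be observed in a small experiment. In this sketch (hypothetical names, with Date.now() used in place of new Date() for brevity), a timer is due after 10 milliseconds, but a 100-millisecond busy-wait prevents its callback from running until the busy-wait is over.

```typescript
const timeline: string[] = [];

// Busy-waits for the given duration without ever giving up control.
function busyWait(milliseconds: number): void {
  const deadline = Date.now() + milliseconds;
  while (Date.now() < deadline) {
    // do nothing
  }
}

function demonstrateFreeze(): Promise<string[]> {
  return new Promise<string[]>((resolve) => {
    setTimeout(() => {
      timeline.push('timer callback runs');
      resolve(timeline);
    }, 10);                               // the timer is due after 10 ms...
    busyWait(100);                        // ...but we hog control for 100 ms
    timeline.push('busy-wait finished');
  });
}
```

The timeline comes out as 'busy-wait finished' followed by 'timer callback runs', even though the timer was due long before the busy-wait ended.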

If a Promise provided observers, then programmers would be tempted to write busy-waiting code like this:

const promise = readFile('account', { encoding: 'utf-8' });
while ( promise.isPending() ) { // just hypothetical, isPending() doesn't exist
  // busy-wait! bad idea
}
const data: string = promise.get() // just hypothetical, get() doesn't exist

This code not only freezes the program, but it doesn’t even do what the programmer wants. The busy-waiting loop never gives up control, which means that we never return to the top-level event-handling loop that is processing promise events. When the input data that readFile() is waiting for finally arrives, it results in an event on the event queue. But because we never return to the event-handling loop, the callback that readFile() registered to handle that event is never called. So readFile() never marks its promise as fulfilled, the putative promise.isPending() operation never returns false, and we never exit the busy-waiting loop.

Busy-waiting is generally a bad idea in concurrent programming (with some rare exceptions like spinlocks), but it is particularly deadly for promises and async/await code. Always use await or then to operate on the value of a promise, or to react to its state transition from pending to fulfilled.

Summary

This reading talked about asynchronous functions returning promises.

These ideas connect to our three key properties of good software as follows:

  • Safe from bugs. Static checking of promise types, and the requirement that a promise be turned into a value with await or then(), ensure that code that depends on an asynchronous computation cannot continue until the values it needs are ready.

  • Easy to understand. Using await to transform a promise into a value makes asynchronous code read very much like straight-line synchronous code. But like all concurrency, promises and asynchronous functions can be subtle and produce surprising effects.

  • Ready for change. Promises can be composed and combined in ways that are not as straightforward with other concurrency techniques (like threads and workers).