6.031TS — Software Construction
TypeScript Pilot — Spring 2021

Reading 21: Promises

Software in 6.031

  • Safe from bugs: correct today and correct in the unknown future.
  • Easy to understand: communicating clearly with future programmers, including future you.
  • Ready for change: designed to accommodate change without rewriting.

Objectives

This reading discusses concurrent computation using promises.

We start at the highest level, with the promise abstraction, and the await operator and async function declaration that allow concurrent computations to happen in TypeScript in a way that closely resembles familiar synchronous programming.

Then we will dig below the covers to understand more about what is really happening with Promise, await, and async.

Promises

A promise represents a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet. The name comes from the notion that it is a promise to finish the computation and provide a result sometime in the future.

The Promise type in TypeScript is generic: a Promise<T> represents a concurrent computation that at some point should produce a value of type T. Here are some examples of concurrent computations using promises:

  • readFile(pathname:string) returns a Promise<string>. The file is loaded concurrently, and the value it promises to eventually produce is the content of the file.
  • diskSpace(folder:string) returns a Promise<number>. A folder tree in the filesystem is traversed in the background, the sizes of all of its files are added up, and the resulting number of bytes is the promised result.
  • fetch(url:string) returns a Promise<Response>. The URL is opened in the background, and the promised result is an HTTP Response object.
  • timeout(milliseconds:number) returns a Promise<void>. A timer runs for the given number of milliseconds, and the promised result has type void (the same as a function that returns no result). This may seem like an empty promise (ha), but we’ll see shortly that even empty promises are useful, because we can trigger additional computation when the computation completes.

Importantly, each of these functions returns its promise immediately. The function starts a long-running computation, like loading a web page or scanning the filesystem, but does not wait for that computation to complete. In lieu of returning a computed value, it returns a promise associated with the concurrent computation that it started. When the computation finishes, its result will be accessible through the promise.

What does this let us do? One benefit we can get from promises is parallelism: by starting multiple computations and collecting a promise for each one, the computations can proceed concurrently with each other:

const promise1 = fetch('http://www.mit.edu/');
const promise2 = fetch('http://www.harvard.edu/');
const promise3 = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers in parallel

More concretely, a Promise<T> is a mutable value with three states:

  • pending: the computation associated with the promise has not finished yet.
  • fulfilled: the computation has finished, and the promise now holds the value of type T that the computation produced.
  • rejected: the computation failed for some reason, and the promise now holds an Error object describing the failure.

A promise starts in the pending state, and eventually transitions into either fulfilled (if the computation completes) or rejected (if the computation throws an exception). It may also stay in the pending state forever (for example, if the computation is waiting for an event that never happens). Once a promise is fulfilled or rejected, it remains in that state; there is no way to reset a promise back to the pending state.
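
The one-way nature of these transitions can be checked directly. The following is a minimal sketch, assuming the standard Promise constructor (which this reading has not introduced at this point); settleTwice is a hypothetical name chosen only for illustration.

```typescript
// Hypothetical helper for illustration: its computation tries to settle
// the same promise three times.
function settleTwice(): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    resolve('first');               // pending -> fulfilled
    resolve('second');              // ignored: the promise is already fulfilled
    reject(new Error('too late'));  // also ignored
  });
}

settleTwice().then(value => console.log(value)); // prints "first"
```

Only the first call has any effect; the later calls neither change the stored value nor move the promise into the rejected state.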

If you print a promise for debugging purposes, you will see its state and the result stored inside it (if it is no longer pending). For example, at the ts-node TypeScript prompt, you can print a promise immediately to see it in the pending state:

> const promise = fs.readFile('account'); console.log(promise);
Promise { <pending> }

Assuming the file account contains 200 (representing a bank account with $200 in it), then your next command should see that the promise has been fulfilled (since it takes very little time to finish loading the file, relative to your typing speed):

> console.log(promise);
Promise { '200' }

If on the other hand the file account does not exist, you will see that the promise has been rejected. The exception object that would normally have been thrown to describe the error is instead stored in the promise:

> console.log(promise);
Promise {
  <rejected> [Error: ENOENT: no such file or directory, open 'account'] { ... }
}

Await

We’ve seen how to start up concurrent computations and obtain Promise values associated with them. Now how can we interact with the final values produced by those computations? One simple way is to use the await keyword:

const promise: Promise<string> = fs.readFile('account');
const data: string = await promise;
// data is now '200'

The await keyword is a built-in operator that converts a value of type Promise<T> into a value of type T. It waits until the promise has been fulfilled, and then unpacks the promise to extract the promised value. If the promise is rejected, then await throws an exception instead, using the Error object that the computation stored in the promise.

Note that await is not triggering the computation that the promise represents. The computation is already underway! It was set in motion by the original function that produced the promise. The computation may have made some progress, or even run to completion, by the time execution arrives at the await.

A better mental model for await is that it is handling a deferred return from the function that produced the promise. The await waits for the computation to finish, and then provides its return value, or else a thrown exception, just like the result of a normal function call.
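
To make the "deferred return or deferred throw" model concrete, here is a small sketch of the rejection case. The function failAfter is a hypothetical stand-in for any asynchronous function whose computation fails, and the code is wrapped in an async function, a construct explained later in this reading.

```typescript
// Hypothetical asynchronous function: returns a promise that rejects
// after `milliseconds` milliseconds.
function failAfter(milliseconds: number): Promise<never> {
  return new Promise<never>((_resolve, reject) => {
    setTimeout(() => reject(new Error('computation failed')), milliseconds);
  });
}

// await turns the rejection into an ordinary thrown exception,
// so try/catch works just as it would around a synchronous call.
async function tryIt(): Promise<string> {
  try {
    await failAfter(10);
    return 'fulfilled';
  } catch (e) {
    return e instanceof Error ? 'caught: ' + e.message : 'caught something else';
  }
}

tryIt().then(result => console.log(result)); // prints "caught: computation failed"
```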

Using await, now we can wrap up the parallel computations that we started:

const promise1: Promise<Response> = fetch('http://www.mit.edu/');
const promise2: Promise<Response> = fetch('http://www.harvard.edu/');
const promise3: Promise<Response> = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers in parallel

const response1: Response = await promise1;
const response2: Response = await promise2;
const response3: Response = await promise3;
// we have now received initial responses from all three servers
// (unless one failed and threw an exception)

Note that there is a better way to handle waiting for multiple promises like this, using Promise.all(), which we will discuss later in the reading.

And we can see now why even an empty promise can be useful, just like a function with a void return value:

const promise: Promise<void> = timeout(2000);
// started up a timer for 2000 milliseconds

await promise;
// no useful return value, but -- now we know that 2000 milliseconds have indeed passed

reading exercises

Timing

Consider these two functions, both designed to wait for a certain amount of time:

// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds:number):Promise<void>

// sleeps for `milliseconds` milliseconds before returning
function sleep(milliseconds:number):void

Note that timeout uses a promise, but sleep does not.

Approximately how long does the following code take to reach the point where it prints done?

const promise1 = timeout(1000);
const promise2 = timeout(2000);
await promise1;
await promise2;
console.log('done');

sleep(1000);
sleep(2000);
console.log('done');

const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
await promise2;
console.log('done');

const promise1 = timeout(1000);
const promise2 = timeout(2000);
console.log('done');

const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
console.log('done');

const promise = timeout(1000);
sleep(2000);
await promise;
console.log('done');

const promise = timeout(1000);
await promise;
await promise;
await promise;
console.log('done');

Asynchronous functions

The functions we have been using that return promises – readFile, fetch, timeout – are examples of asynchronous functions. An asynchronous function is a function that returns control to its caller before its computation is done.

Contrast that with the more familiar synchronous function, which returns control after its computation is finished, and its return value is already known. Almost every function we’ve used to this point in the course has been synchronous in this way.

Returning a promise is one way to implement an asynchronous function, but it’s not the only way. Callback functions are another way to do it, which we will see in a future reading. Prior to the introduction of promises in JavaScript, callback functions were the most common way to implement asynchronous function behavior, and there are still many uses of callbacks in the JavaScript library ecosystem.

If you are using promises instead of callbacks, then an asynchronous function can be declared using the async keyword, and it must return a Promise:

async function getBalance():Promise<number> {
    const data: string = await fs.readFile('account');
    const balance: number = parseInt(data);
    if ( isNaN(balance) ) throw new Error('account does not contain a number');
    return balance;
}

Note that the body of an async function can use return and throw statements in the same way that a synchronous function would. These are automatically converted into effects on the function’s promise. The return balance statement fulfills the promise with the value of balance, and the throw statement rejects the promise with the given Error object.
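
A short sketch can confirm this conversion. Both parseBalance and outcomeOf are hypothetical names for illustration; the second argument to then() used here (a handler for rejection) is a standard Promise feature that this reading does not otherwise cover.

```typescript
// Same return/throw logic as getBalance, but without the file read.
async function parseBalance(data: string): Promise<number> {
  const balance: number = parseInt(data, 10);
  if (isNaN(balance)) throw new Error('not a number: ' + data);
  return balance;
}

// Describes how a promise eventually settles.
function outcomeOf(promise: Promise<number>): Promise<string> {
  return promise.then(
    value => 'fulfilled with ' + value,
    (err: Error) => 'rejected with "' + err.message + '"'
  );
}

// Neither call throws synchronously; each simply returns a promise.
outcomeOf(parseBalance('200')).then(s => console.log(s));  // prints: fulfilled with 200
outcomeOf(parseBalance('oops')).then(s => console.log(s)); // prints: rejected with "not a number: oops"
```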

Since getBalance() is an asynchronous function returning a promise, a caller of getBalance() needs to interact with the promise appropriately, i.e. using await:

const myBalance = await getBalance();

Only async functions are allowed to await

It turns out that the await operator has an important limitation. Because of the way JavaScript handles concurrency, await can only be used inside an async function.

Here is a brief explanation; we’ll return to this again when we unpack how await and async are actually implemented in terms of lower-level Promise operations, later in this reading. Recall that an asynchronous function is designed to give up control to its caller before a computation is done. Whenever an await needs to wait for a pending promise, it doesn’t just sit there spinning its wheels. Instead, it returns control back up to higher levels of the program. For the first await encountered in the function body, this actually does return control to the original caller of the function. When the promise that the await is waiting for finally changes state (either fulfilling or rejecting), then the await regains control by being called from that promise, and continues running the function body. Subsequent awaits in the function body return control to the promise-processing system. So you should think of await as being like a temporary return, which will reactivate when the promise it is waiting for finally fires.

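TypeScript statically checks this requirement. It will produce a static error if await occurs inside a function that is not declared async. (The one exception is await at the top level of a module, outside any function, which newer versions of JavaScript and TypeScript allow.)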
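
The "temporary return" behavior can be observed by logging the order in which statements run. This is a minimal sketch with hypothetical function names (worker, trace); it awaits an already-fulfilled promise to show that even then, control goes back to the caller first.

```typescript
async function worker(log: string[]): Promise<void> {
  log.push('worker: start');
  await Promise.resolve();      // first await: control returns to the caller
  log.push('worker: resumed');  // runs later, once the promise system resumes us
}

async function trace(): Promise<string[]> {
  const log: string[] = [];
  log.push('caller: before call');
  const promise: Promise<void> = worker(log);
  log.push('caller: after call'); // runs before worker has finished
  await promise;
  return log;
}

trace().then(log => console.log(log));
// prints [ 'caller: before call', 'worker: start', 'caller: after call', 'worker: resumed' ]
```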

reading exercises

Windfall

Here are two versions of the bank-account-reading function. The first one is asynchronous:

async function getBalance():Promise<number> {
    const data: string = await fs.readFile('account');
    return parseInt(data);
}

The second is synchronous:

function getBalanceSync():number {
    const data: string = fs.readFileSync('account');
    return parseInt(data);
}

Which of the following are correct ways to use getBalance() in an expression?

Promises promises

Using getBalance() and getBalanceSync() as defined in the previous exercise, what is the type of each of the following expressions?

[ getBalance(), getBalance() ]

[ getBalanceSync(), getBalanceSync() ]

[ await getBalance(), await getBalance() ]

Tick tock

Recall that we have defined timeout(milliseconds) as an asynchronous function that creates a timer – a promise that will fulfill milliseconds after timeout() was called.

Now consider this asynchronous function:

async function clock(milliseconds:number) {
  while (true) {
    await timeout(milliseconds);
    console.log('tick');
  }
}

What is the return type of clock()?

What would clock(1000) do?

What would await clock(1000) do?

then()

Let’s dig into the Promise abstract data type, to understand more about how it works and how it interacts with await and async function.

An abstract data type is defined by its operations, and it turns out that a Promise<T> has one key operation, called then. Here is the simplest version of then:

interface Promise<T> {
  then<U>(handler: (value: T) => U): Promise<U>;
}

The then operation is a Promise producer. It is called on an existing promise expected to result in a value of type T, and attaches a handler function to that promise. The handler function consumes the T value the promise eventually produces, and generates a value of some other type U (which might be the same as T, of course).

The result of then is a new promise representing the entire computation – the original computation represented by the Promise<T>, followed by the execution of the handler function. The type of this new promise is Promise<U>, because that is what the handler returns.

Here’s an example of then in action:

const promise : Promise<string> = fs.readFile('account');

const biggerPromise : Promise<number> = promise.then(function(data:string) {
  return parseInt(data);
});

We need to expand this notion of then() in one important way. The handler function doesn’t have to return an already-computed U value. It can instead return a promise of its own, of type Promise<U>:

interface Promise<T> {
  then<U>(handler: (value: T) => U|Promise<U>): Promise<U>;
}

The handler might need to do this if computing the U also requires another long-running computation that needs to run in the background. Web page retrieval offers a good example of this, because getting the Response from a fetch() call just means you’ve managed to connect to the web server – actually downloading the full page may take a while longer. Here’s how that promise chain might look:

const promisedResponse: Promise<Response> = fetch('http://www.mit.edu/');

const promisedText: Promise<string> = promisedResponse.then(function(response:Response) {
  // we've made a connection, which is why the `response` object exists
  const downloadingPromise: Promise<string> = response.text();
  return downloadingPromise;
});

Note carefully that there are three promises shown in this example code:

  • promisedResponse represents the computation that is making the initial connection to www.mit.edu
  • downloadingPromise represents the computation that downloads the MIT homepage after the connection has already been made
  • promisedText represents the composition of those two computations: first the connection, and then the download. The final value of promisedText comes from the value of downloadingPromise, the text of the webpage.

One way to think about it is like synchronous function composition: we might have a function f(x), another function g(y), and we’re composing them to make a function h(x) = g(f(x)). Here f, g, and h are analogous to promisedResponse, downloadingPromise, and promisedText, respectively.
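
The same composition can be tried without a network connection. In this sketch, connect and download are hypothetical stand-ins for the two stages of a web request, and timeout is one assumed implementation of the timer function used throughout this reading.

```typescript
// Assumed implementation of the reading's timeout(), for self-containment.
function timeout(milliseconds: number): Promise<void> {
  return new Promise<void>(resolve => setTimeout(() => resolve(), milliseconds));
}

// Hypothetical first stage: "connecting" takes a little while.
function connect(host: string): Promise<string> {
  return timeout(10).then(() => 'connection to ' + host);
}

// Hypothetical second stage: "downloading" takes a little while longer.
function download(connection: string): Promise<string> {
  return timeout(10).then(() => 'page via ' + connection);
}

// The handler returns a Promise<string>, yet then() produces a
// Promise<string>, not a Promise<Promise<string>>.
function fetchPage(host: string): Promise<string> {
  return connect(host).then(connection => download(connection));
}

fetchPage('example.edu').then(page => console.log(page));
// prints "page via connection to example.edu"
```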

You can call then as many times as you want on a promise, to attach different handlers that might do different things with the promise’s value.

You can also call then regardless of what state the promise is currently in. Most often, the promise is still pending, so the then handler will run sometime in the future. But if the promise is already fulfilled, the handler is scheduled to run as soon as possible: not synchronously inside the call to then() itself, but shortly after the currently running code finishes.
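
Both points can be seen in a small experiment. This sketch (handlerOrder is a hypothetical name) attaches two handlers to a promise that is already fulfilled, and records when each piece of code runs. Note that the handlers run only after the code that called then() has moved on.

```typescript
async function handlerOrder(): Promise<string[]> {
  const log: string[] = [];
  const promise: Promise<number> = Promise.resolve(200); // already fulfilled

  // Two independent handlers attached to the same promise.
  const doubled: Promise<number> = promise.then(n => { log.push('handler 1'); return n * 2; });
  const labeled: Promise<string> = promise.then(n => { log.push('handler 2'); return '$' + n; });
  log.push('then() calls returned');

  const a: number = await doubled;
  const b: string = await labeled;
  log.push('results: ' + a + ', ' + b);
  return log;
}

handlerOrder().then(log => console.log(log));
// prints [ 'then() calls returned', 'handler 1', 'handler 2', 'results: 400, $200' ]
```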

The then operation is the fundamental operation of a promise. then() is the only way to access the value that a promise computes (await, as we will see, is built on top of it). This is partly a safety property, because it ensures that client code can never look inside a promise and see a missing value. But more importantly, then() is a feature of concurrency design. It allows a concurrent computation to be built up by a sequence of composed computations – a sequence of then() handlers, which can be interleaved in controlled, predictable ways. We will discuss this more in the next reading.

reading exercises

Then more timing

Let’s return to our friend timeout, which returns a timer promise:

// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds:number):Promise<void>

Approximately how long does the following code take to reach the point where it prints done?

timeout(1000).then(function() {
  console.log('done');
});
timeout(2000);

timeout(1000).then(function() {
  return timeout(2000);
});
console.log('done');

timeout(1000).then(function() {
  return timeout(2000).then(function() {
    console.log('done');
  });
});

Unpacking an asynchronous function

The then operation now gives us enough power to see what await and async function mean, and how they use promises to construct a computation.

Here’s an async function containing a line that awaits a promise, and then does some computation on the result of the promise:

async function getBalance():Promise<number> {
    const promise: Promise<string> = fs.readFile('account');
    const data: string = await promise;
    const balance: number = parseInt(data);
    return balance;
}

When await is encountered in the execution of a function like this, TypeScript effectively takes the remainder of the computation that the function would do, and wraps it up into a then handler. So this code:

const data: string = await promise;
const balance: number = parseInt(data);
return balance;

becomes something like this:

promise.then(function(data: string) {
  const balance: number = parseInt(data);
  return balance;
});

Instead of spinning its wheels waiting for the file-loading promise to fulfill, await creates a composite promise, which combines the file-loading computation with the parseInt computation. It then immediately returns control to the caller, along with this composite promise. So the equivalent code, without await or async, looks something like this:

function getBalance():Promise<number> {
    const promise: Promise<string> = fs.readFile('account');
    return promise.then(function(data: string) {
      const balance: number = parseInt(data);
      return balance;
    });
}

(Note that straight-line code like this is a special case – await can also be used inside loops, and then the “rest of the function’s computation” includes the remaining iterations of the loop, which can’t be easily shown by a syntactic transformation like this, because it chops the loop in pieces. But this example shows semantically what await means.)
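
For the loop case, one way to express the same meaning with then() is recursion: the handler performs one iteration and then returns a promise for the remaining iterations. The sketch below is only an illustration of that idea, not a description of what the compiler actually generates; tickWithAwait and tickWithThen are hypothetical names, and timeout is an assumed implementation of the reading's timer function.

```typescript
// Assumed implementation of the reading's timeout(), for self-containment.
function timeout(milliseconds: number): Promise<void> {
  return new Promise<void>(resolve => setTimeout(() => resolve(), milliseconds));
}

// With await: an ordinary loop.
async function tickWithAwait(count: number, ms: number, log: string[]): Promise<void> {
  for (let i = 0; i < count; i++) {
    await timeout(ms);
    log.push('tick ' + i);
  }
}

// With then(): "the rest of the loop" becomes a recursive call
// made from inside the handler.
function tickWithThen(count: number, ms: number, log: string[], i: number = 0): Promise<void> {
  if (i >= count) return Promise.resolve();
  return timeout(ms).then(() => {
    log.push('tick ' + i);
    return tickWithThen(count, ms, log, i + 1);
  });
}

const viaAwait: string[] = [];
const viaThen: string[] = [];
tickWithAwait(3, 10, viaAwait).then(() => console.log(viaAwait)); // prints [ 'tick 0', 'tick 1', 'tick 2' ]
tickWithThen(3, 10, viaThen).then(() => console.log(viaThen));    // prints [ 'tick 0', 'tick 1', 'tick 2' ]
```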

Aggregating promises

When running multiple parallel computations, it’s often useful to combine their promises using operations similar to AND and OR.

Promise.all() is like a logical AND – it combines an iterable collection of promises into a single promise that waits for all the promises to fulfill, and returns an array of their results:

const allResponses:Array<Response> = 
  await Promise.all( [ fetch('http://www.mit.edu/'),
                       fetch('http://www.harvard.edu/'),
                       fetch('http://www.tufts.edu/') ] );

But if any of the individual promises fail, then the entire Promise.all() also fails.
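
Here is a sketch of both behaviors that runs without a network connection. The helpers delayed and failing are hypothetical, standing in for computations that succeed or fail after a delay. Note that the result array follows the order of the input promises, not the order in which they finish.

```typescript
// Hypothetical helper: fulfills with `value` after `ms` milliseconds.
function delayed<T>(value: T, ms: number): Promise<T> {
  return new Promise<T>(resolve => setTimeout(() => resolve(value), ms));
}

// Hypothetical helper: rejects after `ms` milliseconds.
function failing(ms: number): Promise<never> {
  return new Promise<never>((_resolve, reject) => {
    setTimeout(() => reject(new Error('one computation failed')), ms);
  });
}

async function allDemo(): Promise<string[]> {
  const report: string[] = [];

  // The second promise finishes first, but results keep input order.
  const values: number[] = await Promise.all([delayed(1, 30), delayed(2, 10), delayed(3, 20)]);
  report.push(values.join(','));

  // A single rejection rejects the combined promise.
  try {
    await Promise.all([delayed(1, 10), failing(5)]);
    report.push('unexpectedly fulfilled');
  } catch (e) {
    report.push(e instanceof Error ? e.message : 'unknown failure');
  }
  return report;
}

allDemo().then(report => console.log(report)); // prints [ '1,2,3', 'one computation failed' ]
```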

For logical OR, Promise.any() produces a promise that waits for any of the individual promises to successfully fulfill, and only fails if all the promises fail. You might use it for running redundant computations that might fail independently:

const firstResponse: Response = 
  await Promise.any( [ fetch('http://www.mit1.edu/'),
                       fetch('http://www.mit2.edu/'),
                       fetch('http://www.mit3.edu/') ] );

Finally, Promise.race() is a logical-OR that waits for any individual promise to either fulfill or reject, and immediately fulfills or rejects in the same way. One use of Promise.race() is putting a timeout on an operation:

const responseOrTimeout: Response|void =
  await Promise.race( [ fetch('http://www.mit.edu/'), 
                        timeout(5000) ] );
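
In the example above, a timeout shows up as a void result that the caller has to check for. A common variation is to race against a promise that rejects when time runs out, so that a timeout surfaces as an exception instead. The following is a sketch under that assumption; delayedText and orTimeout are hypothetical names, and the timer is not cancelled when the operation wins the race.

```typescript
// Hypothetical slow operation: fulfills with `text` after `ms` milliseconds.
function delayedText(text: string, ms: number): Promise<string> {
  return new Promise<string>(resolve => setTimeout(() => resolve(text), ms));
}

// Rejects if `promise` has not settled within `ms` milliseconds;
// otherwise settles the same way `promise` does.
function orTimeout(promise: Promise<string>, ms: number): Promise<string> {
  const expiry = new Promise<never>((_resolve, reject) => {
    setTimeout(() => reject(new Error('timed out after ' + ms + ' ms')), ms);
  });
  return Promise.race([promise, expiry]);
}

orTimeout(delayedText('fast', 10), 50)
  .then(text => console.log(text));                     // prints "fast"
orTimeout(delayedText('slow', 50), 10)
  .catch((err: Error) => console.log(err.message));     // prints "timed out after 10 ms"
```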

Making and keeping a promise

We noted above that the key operation of the Promise type is then(), which allows a consumer of the promise’s value to specify what should happen once the promise is fulfilled. The await operator translates into a call to then().

But there’s clearly something missing in the Promise operations we’ve discussed. A promise is a mutable thing – where are its mutator operations? How does its state change from pending to fulfilled, or rejected?

The answer to this question lies in the fact that Promise has two different clients: the consumer of the promise, who uses then(); and the promiser, the long-running computation that is computing the value for the promise. Promises are designed so that only the promiser should have access to the resolve() and reject() mutators that change the promise’s state.

One way to do that uses a convenient helper type, Deferred<T>:

import Q from 'q';

const deferred: Q.Deferred<T> = Q.defer<T>();

This Deferred value has three components:

  • deferred.promise is a fresh Promise<T> associated with the Deferred value
  • deferred.resolve(t:T) is a mutator that fulfills the associated promise with the value t
  • deferred.reject(err:Error) is a mutator that rejects the associated promise with the given error

So Deferred has the necessary mutators, needed by the promiser. The promiser holds onto this so that it can mutate the promise later. The promiser hands the associated Promise value back to the consumer of the promise, for use in await or then().

Here is an example of how timeout() might be written using Deferred:

/**
 * @param milliseconds duration to wait
 * @returns a promise that fulfills no less than `milliseconds` after timeout() was called
 */
async function timeout(milliseconds:number):Promise<void> {
    const deferred: Q.Deferred<void> = Q.defer();
    setTimeout(function() {
        deferred.resolve(); // this mutates deferred.promise into the fulfilled state
        // or we could call deferred.reject(new Error(...)) to reject the promise instead
    }, milliseconds);
    return deferred.promise;
}

You can see other ways to implement timeout(), plus a larger TinyQueue example, that demonstrate how to create and fulfill your own promises.
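
As one illustration of creating a promise without the q library, here is a sketch of a Deferred-like helper built on the standard Promise constructor. SimpleDeferred and timeoutWithoutQ are hypothetical names, and this is not how Q itself is implemented; it only shows that the resolve and reject mutators can be captured by the promiser while the consumer sees just the promise.

```typescript
// A minimal Deferred-like helper (hypothetical, for illustration only).
class SimpleDeferred<T> {
  public readonly promise: Promise<T>;
  public resolve!: (value: T) => void;
  public reject!: (err: Error) => void;

  public constructor() {
    // The Promise constructor runs this function immediately, so both
    // mutators are captured before the constructor returns.
    this.promise = new Promise<T>((resolve, reject) => {
      this.resolve = resolve;
      this.reject = reject;
    });
  }
}

// The promiser keeps the SimpleDeferred; the consumer only gets the promise.
function timeoutWithoutQ(milliseconds: number): Promise<void> {
  const deferred = new SimpleDeferred<void>();
  setTimeout(() => deferred.resolve(), milliseconds);
  return deferred.promise;
}

timeoutWithoutQ(20).then(() => console.log('timer fired'));
```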

Summary

This reading talked about asynchronous functions returning promises.

These ideas connect to our three key properties of good software as follows:

  • Safe from bugs. Static checking of promise types, and the requirement that a promise be turned into a value with await or then(), ensure that code that depends on an asynchronous computation cannot continue until the values it needs are ready.

  • Easy to understand. Using await to transform a promise into a value makes asynchronous code read very much like straight-line synchronous code. But like all concurrency, promises and asynchronous functions can be subtle and produce surprising effects.

  • Ready for change. Promises can be composed and combined in ways that are not as straightforward with other concurrency techniques (like threads and workers).