6.102 — Software Construction
Spring 2024

Reading 15: Promises

Software in 6.102

  • Safe from bugs: Correct today and correct in the unknown future.
  • Easy to understand: Communicating clearly with future programmers, including future you.
  • Ready for change: Designed to accommodate change without rewriting.

Objectives

This reading discusses concurrent computation using promises.

We start at the highest level, with the promise abstraction, and the await operator and async function declaration that allow concurrent computations to happen in TypeScript in a way that closely resembles familiar synchronous programming.

Then we will dig below the covers to understand more about what is happening with Promise, await, and async.

Promises

A promise represents a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet. The name comes from the notion that it is a promise to finish the computation and provide a result sometime in the future.

The Promise type in TypeScript is generic: a Promise<T> represents a concurrent computation that at some point should produce a value of type T.

Here are examples of concurrent computations using promises. (Some of these are actual functions provided by Node packages; others don’t currently exist as library functions but could be readily implemented.)

  • readFile(pathname: string, ...) returning a Promise<string>. The file is loaded concurrently, and the value it promises to eventually produce is the content of the file.
  • diskSpace(folder: string) returning a Promise<number>. A folder tree in the filesystem is traversed in the background, the sizes of all of its files are added up, and the resulting number of bytes is the promised result.
  • fetch(url: string) returning a Promise<Response>. The URL is opened in the background, and the promised result is an HTTP Response object.
  • timeout(milliseconds: number) returning a Promise<void>. A timer runs for the given number of milliseconds, and the promised result has type void (the same as a function that returns no result). This may seem like an empty promise (ha), but we’ll see shortly that even empty promises are useful, because we can trigger additional computation when the timeout computation completes.

Importantly, each of these functions returns its promise immediately. The function starts a long-running computation, like loading a web page or scanning the filesystem, but does not wait for that computation to complete. In lieu of returning a computed value, it returns a promise associated with the concurrent computation that it started. When the computation finishes, its result will be accessible through the promise.

What does this let us do? One benefit we can get from promises is concurrency: by starting multiple computations and collecting a promise for each one, the computations can proceed concurrently with each other:

const promise1 = fetch('http://www.mit.edu/');
const promise2 = fetch('http://www.harvard.edu/');
const promise3 = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers concurrently

More concretely, a Promise<T> is a mutable value with three states:

  • pending: the computation associated with the promise has not finished yet.
  • fulfilled: the computation has finished, and the promise now holds the value of type T that the computation produced.
  • rejected: the computation failed for some reason, and the promise now holds an Error object describing the failure.

A promise starts in the pending state, and eventually transitions into either fulfilled (if the computation completes) or rejected (if the computation throws an exception). It may also stay in the pending state forever (for example, if the computation is waiting for an event that never happens). Once a promise is fulfilled or rejected, it remains in that state; there is no way to reset a promise back to the pending state.

If you print a promise for debugging purposes, you will see its state and the result stored inside it (if it is no longer pending). For example, at the npx ts-node TypeScript prompt, you can print a promise immediately to see it in the pending state:

> const promise = fs.promises.readFile('account', { encoding: 'utf-8' }); console.log(promise);
Promise { <pending> }

Assuming the file account contains 200 (representing a bank account with $200 in it), then your next command should see that the promise has been fulfilled (since it takes very little time to finish loading the file, relative to your typing speed):

> console.log(promise);
Promise { '200' }

If on the other hand the file account does not exist, you will see that the promise has been rejected. The exception object that would normally have been thrown to describe the error is instead stored in the promise:

> console.log(promise);
Promise {
  <rejected> [Error: ENOENT: no such file or directory, open 'account'] { ... }
}

Await

We’ve seen how to start up concurrent computations and obtain Promise values associated with them. Now how can we interact with the final values produced by those computations? One simple way is to use the await keyword:

const promise: Promise<string> = fs.promises.readFile('account', { encoding: 'utf-8' });
const data: string = await promise;
// data is now '200'

The await keyword is a built-in operator that converts a value of type Promise<T> into a value of type T. It waits until the promise has been fulfilled, and then unpacks the promise to extract the promised value. If the promise is rejected, then await throws an exception instead, using the Error object that the computation stored in the promise.

Note that await is not triggering the computation that the promise represents. The computation is already underway! It was set in motion by the original function that produced the promise. The computation may have made some progress, or even run to completion, by the time execution arrives at the await.

A better mental model for await is that it is handling a deferred return from the function that produced the promise. The await waits for the computation to finish, and then provides its return value, or else a thrown exception, just like the result of a normal function call.
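As a small illustration of this model, the sketch below awaits one promise that is already fulfilled and one that is already rejected. Promise.resolve() and Promise.reject() are standard ways to make such promises. The function name awaitBoth is hypothetical, and the code is wrapped in an async function (a construct introduced later in this reading) only so that the snippet is self-contained.

```typescript
// Hypothetical demo: await either produces the promised value or throws the stored error.
async function awaitBoth(): Promise<Array<string>> {
    const results: Array<string> = [];

    const fulfilled: Promise<string> = Promise.resolve('200');
    const data: string = await fulfilled; // unpacks the promised value
    results.push(`value: ${data}`);

    const rejected: Promise<string> = Promise.reject(new Error('no such file'));
    try {
        await rejected; // throws the Error stored in the promise
        results.push('not reached');
    } catch (e) {
        results.push(`caught: ${e instanceof Error ? e.message : String(e)}`);
    }
    return results;
}
```

Awaiting the rejected promise behaves like calling a synchronous function that throws, which is why an ordinary try/catch is enough to handle it.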

Using await, now we can wrap up the concurrent computations that we started:

const promise1: Promise<Response> = fetch('http://www.mit.edu/');
const promise2: Promise<Response> = fetch('http://www.harvard.edu/');
const promise3: Promise<Response> = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers concurrently

const response1: Response = await promise1;
const response2: Response = await promise2;
const response3: Response = await promise3;
// we have now received initial responses from all three servers
// (unless one failed and threw an exception)

Note that there is a better way to handle waiting for multiple promises like this, using Promise.all(), which we will discuss later in the reading.

And we can see now why even an empty promise can be useful, just like a function with a void return value:

const promise: Promise<void> = timeout(2000);
// started up a timer for 2000 milliseconds

await promise;
// no useful return value, but -- now we know that 2000 milliseconds have indeed passed
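timeout() is one of the functions that is not provided as a library function. The following is a minimal sketch of how it might be implemented, assuming the standard setTimeout() timer and the Promise constructor, neither of which is otherwise covered in this part of the reading. The helper elapsedAfter is hypothetical and exists only to show the wait being observed.

```typescript
// Sketch of timeout(): returns a promise that fulfills `milliseconds` ms after the call.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>((resolve) => {
        setTimeout(() => resolve(), milliseconds); // fulfill the promise when the timer fires
    });
}

// Hypothetical helper: waits on a timer and reports roughly how long the wait took.
async function elapsedAfter(milliseconds: number): Promise<number> {
    const start = Date.now();
    await timeout(milliseconds);
    return Date.now() - start;
}
```

Timers are not exact, so the measured time should be close to the requested delay, not equal to it.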

Note that void is a different type from undefined, which we’ve been using in other contexts to denote the idea of “no useful value”. The void type is designed as the return type of functions that don’t return anything. That’s why we use Promise<void> for computations that will not produce a value, and are just running for the sake of some side-effect (like a timer delay). By contrast, undefined represents the lack of a value. The difference is one of intent: void tells other programmers that a function shouldn’t be used for its return value, while undefined tells them that the lack of a value is important.

reading exercises

Timing

Consider these two functions, both designed to wait for a certain amount of time:

// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds: number): Promise<void>;

// waits for `milliseconds` milliseconds before returning
function busyWait(milliseconds: number): void;

Note that timeout uses a promise, but busyWait does not.

Approximately how long does the following code take to reach the point where it prints done?

const promise1 = timeout(1000);
const promise2 = timeout(2000);
await promise1;
await promise2;
console.log('done');


busyWait(1000);
busyWait(2000);
console.log('done');


const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
await promise2;
console.log('done');


const promise1 = timeout(1000);
const promise2 = timeout(2000);
console.log('done');


const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
console.log('done');


const promise = timeout(1000);
busyWait(2000);
await promise;
console.log('done');


const promise = timeout(1000);
await promise;
await promise;
await promise;
console.log('done');


Asynchronous functions

The functions we have been using that return promises – readFile, fetch, timeout – are examples of asynchronous functions. An asynchronous function is a function that returns control to its caller before its computation is done.

Contrast that with the more familiar synchronous function, which returns control after its computation is finished, and its return value is already known. Almost every function we’ve used to this point in the course has been synchronous in this way.

In modern TS/JS programming, using promises, an asynchronous function can be declared using the async keyword, and must have return type Promise:

async function getBalance(): Promise<number> {
    const data: string = await fs.promises.readFile('account', { encoding: 'utf-8' });
    const balance: number = parseInt(data);
    if ( isNaN(balance) ) throw new Error('account does not contain a number');
    return balance;
}

Note that the body of an async function can use return and throw statements in the same way that a synchronous function would. These are automatically converted into effects on the function’s promise. The return balance statement fulfills the promise with the value of balance, and the throw statement rejects the promise with the given Error object.
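To see this conversion in isolation, here is a minimal sketch using a hypothetical variant of getBalance() called parseBalance, which parses a string instead of reading a file. The helper describeBalance is also an assumption, added only to report what happened to the promise.

```typescript
// Hypothetical variant of getBalance() that skips the file read.
async function parseBalance(data: string): Promise<number> {
    const balance: number = parseInt(data, 10);
    if (isNaN(balance)) throw new Error('account does not contain a number'); // rejects the promise
    return balance; // fulfills the promise
}

// Hypothetical helper: reports how the promise returned by parseBalance() settled.
async function describeBalance(data: string): Promise<string> {
    // the call itself does not throw, even for bad input; the error is stored in the promise
    const promise: Promise<number> = parseBalance(data);
    try {
        return `fulfilled with ${await promise}`;
    } catch (e) {
        return `rejected with ${e instanceof Error ? e.message : String(e)}`;
    }
}
```

With these definitions, describeBalance('200') fulfills with 'fulfilled with 200', and describeBalance('abc') fulfills with 'rejected with account does not contain a number'.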

Since getBalance() is an asynchronous function returning a promise, a caller of getBalance() needs to interact with the promise appropriately, i.e. using await:

const myBalance = await getBalance();

Note that await cannot be used inside a synchronous function (i.e., a function not declared async). TypeScript statically checks this requirement. It will produce a static error if await occurs inside a non-async function.

reading exercises

Windfall

Which of the following are correct ways to use getBalance() in an expression?


Double balance

What is the type of each of the following expressions?

[ getBalance(), getBalance() ]


[ await getBalance(), await getBalance() ]


It’s awaits all the way down

Consider this simple program, in which main calls f which calls g:

function main(): void { console.log(f()); }
function f(): number { return g()+50;  }
function g(): number { return 0; }

Now suppose g() is changed to use await:

function g(): number { return await getBalance(); }

The edited program is not compiling yet; there are static errors.

Which additional changes need to be made to g()?


What changes need to be made to f()?


What changes need to be made to main()?


Top-level await

We saw above that await cannot be used in a synchronous function; a function must be declared async in order to use await.

But what about top-level code, outside of any function? Consider the previous exercise, in which we had to change synchronous functions into async functions starting from g to f all the way up to the entry-point function main. But somewhere in the top-level code, we will need to have a call to main to kick things off:

async function main(): Promise<void> { ... }
...
main();  // what do we do with the promise returned by main()?

Can we change this line to await main(), when we don’t have a surrounding function to mark async?

The answer unfortunately depends on what kind of JavaScript context the code will run in. In code which is compiled and run as a modern JavaScript module (also called “ECMAScript 6 module” or “ES6 module” after the version of the JavaScript standard that introduced modules), you can use await in top-level code, and it behaves just as it would inside an async function:

async function main(): Promise<void> { ... }
...
await main();  // waits for main() to finish before moving on -- works in modern ES6 module contexts

Our code in 6.102 is using these new modules as much as possible, so that await works at the top level.

In older JavaScript contexts, unfortunately, the top-level script runs synchronously and is not allowed to use await, so await main() would produce an error. If you find yourself in that kind of context, the best thing to do is to put all your top-level code inside an asynchronous main function (which we’ve already done here), and call it without waiting for its promise. Does this mean that the program will exit right away, before the promise is fulfilled? No. Node keeps the process running as long as there is outstanding work, such as the file operations and timers that main() is waiting on, so in typical programs the process will not exit until the promise returned by main() has either been fulfilled or rejected. So await is not necessary if you are just calling one function at the top level.

Depending on your configuration, you may still get a warning at compile-time that you’re not handling the promise return value, so you can declare that you don’t intend to by preceding the call with the void operator:

async function main(): Promise<void> { ... }
...
void main();  // calls main() but doesn't wait for it to finish -- works in all JavaScript contexts
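A minimal sketch of this entry-point pattern follows. The function doWork and the output array are hypothetical stand-ins for a real program’s work and its visible effects. Because nothing awaits the promise returned by main(), this sketch assumes that main() should catch and report its own failures.

```typescript
const output: Array<string> = [];

// Hypothetical stand-in for the program's real work.
async function doWork(): Promise<void> {
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    output.push('work finished');
}

async function main(): Promise<void> {
    try {
        await doWork();
    } catch (e) {
        // nothing awaits main()'s promise, so failures are reported here
        output.push(`main failed: ${String(e)}`);
    }
}

void main();                   // starts main() but does not wait for it
output.push('main() called');  // runs before doWork() finishes
```

In this sketch, 'main() called' is recorded first, and 'work finished' is recorded once the timer inside doWork() has fired.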

Concurrency model

JavaScript has only one thread of control per global environment. (As we saw in a previous reading, Workers can create additional threads, but each worker gets a fresh global environment, and generally communicates with other workers or the main program by message passing, not by directly sharing mutable objects in memory.)

This raises some questions: if there is only one thread in a JavaScript environment, then what does it mean to have multiple asynchronous functions in progress at the same time? After an asynchronous function returns a promise to its caller, before having finished its computation, when and how does it get control back, so that it can compute some more and eventually fulfill or reject the promise?

Let’s see how this works with an example that gets the balances from two bank accounts and adds them together:

async function totalBalance(): Promise<number> {
    const checkingPromise = getBalance('checking');
    const savingsPromise = getBalance('savings');
    return (await checkingPromise) + (await savingsPromise);
}

async function getBalance(account: string): Promise<number> {
    const data: string = await fs.promises.readFile(account, { encoding: 'utf-8' });
    return parseInt(data);
}

Suppose a client calls totalBalance(). Here’s what happens.

  1. totalBalance calls getBalance('checking'), which calls readFile, which starts reading the file and immediately returns a Promise<string> for its contents.
  2. getBalance('checking') reaches an await for that file promise, which means it must give up control. It constructs a new Promise<number> for its own result, and returns that promise back to its caller, totalBalance.
  3. totalBalance doesn’t wait for that promise yet, but stores it away in checkingPromise, and proceeds to call getBalance('savings').
  4. getBalance('savings') evolves in the same way, reaching its await and returning a promise for the savings-account balance, which totalBalance stores in savingsPromise.
    • Note that totalBalance has now created two concurrent computations, one loading the checking account and the other the savings account. By holding onto their promises and reserving the await for later, it allows the computations to run independently, rather than forcing one to finish before even starting the other. This is the essence of promise-based concurrency.
  5. totalBalance must await each of the promises before it can compute the final sum. TypeScript/JavaScript evaluates expressions from left to right (not all languages do…), so await checkingPromise will happen first. Since checkingPromise is still pending, this means that totalBalance constructs a new Promise<number> for its own eventual result, and returns that promise to the original caller.
  6. The original caller (not shown) continues executing. In order for JavaScript’s concurrency model to work correctly, we depend on the caller to eventually return control back to the JavaScript runtime system (the original caller of the starting point of our code). The runtime system handles low-level processing, like file loading.
  7. Suppose the savings file finishes loading first, and its promise fulfills with the string "200". The JavaScript runtime system then gives control to getBalance('savings'), which was waiting on that promise, and execution resumes with the await evaluating to "200". getBalance('savings') finishes its execution, fulfills savingsPromise with the number 200, and returns control back to the runtime system. But totalBalance is not awaiting savingsPromise (yet), so nothing further happens.
  8. The checking file finishes loading, and fulfills with the string "50". The runtime system gives control to getBalance('checking'), which fulfills checkingPromise with the number 50, and returns control back to the runtime system.
  9. Since totalBalance was waiting for checkingPromise, the runtime system gives control to totalBalance with the value 50. totalBalance then does await savingsPromise, so it gives up control to the runtime system again. But because savingsPromise is already fulfilled with the value 200, the runtime system will very shortly (possibly immediately) give control back again. totalBalance proceeds with its own computation, finally fulfilling its own promise with the sum 250.
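The steps above can be traced with a small simulation. This is only a sketch: fakeReadFile is a hypothetical stand-in for fs.promises.readFile that uses timers in place of real files, with delays chosen so that the savings file loads first, and the trace array is an assumption added to record the order of events.

```typescript
const trace: Array<string> = [];

// Hypothetical stand-in for readFile: "loads" a file after a fixed delay.
function fakeReadFile(account: string): Promise<string> {
    const content = account === 'checking' ? '50' : '200';
    const delay = account === 'checking' ? 40 : 10; // savings finishes loading first
    return new Promise<string>((resolve) => {
        setTimeout(() => {
            trace.push(`${account} file loaded`);
            resolve(content);
        }, delay);
    });
}

async function getBalance(account: string): Promise<number> {
    trace.push(`getBalance(${account}) started`);
    const data: string = await fakeReadFile(account); // gives up control here
    trace.push(`getBalance(${account}) resumed`);
    return parseInt(data, 10);
}

async function totalBalance(): Promise<number> {
    const checkingPromise = getBalance('checking');
    const savingsPromise = getBalance('savings');
    trace.push('totalBalance reached its first await');
    const total = (await checkingPromise) + (await savingsPromise);
    trace.push(`totalBalance computed ${total}`);
    return total;
}
```

After awaiting totalBalance(), the trace should show both getBalance calls starting before anything loads, then the savings file loading and getBalance('savings') resuming, then the same for checking, and finally totalBalance computing 250.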

This example reveals some high-level points about the flow of control:

  • Every await is a place where an asynchronous function gives up control.
  • For the first await in an asynchronous function, “giving up control” means returning the function’s own promise to its caller. Subsequent awaits give up control directly back to the runtime system.
  • When the await resumes, control comes back from the runtime system.
  • An asynchronous function is effectively divided into pieces of computation between awaits, and those pieces can interleave with pieces of other asynchronous function calls.

This model of concurrency is called cooperative or non-preemptive. There is only one thread of control, and concurrent computations must cooperate to release control to each other at well-defined points (such as await and return).

Note that await always gives up control, even if the promise it would wait for has already been fulfilled. It may gain control back almost immediately if that’s the case, but first it gives other concurrent computations a chance to continue their execution.
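A short sketch can make this visible. The names worker, runWorkers, and order are hypothetical. Each worker awaits a promise that is already fulfilled (made with the standard Promise.resolve()), yet the second half of each worker still runs only after the caller has had a chance to continue.

```typescript
const order: Array<string> = [];

async function worker(label: string): Promise<void> {
    order.push(`${label} start`);
    await Promise.resolve();   // already fulfilled, but the function still gives up control here
    order.push(`${label} end`);
}

async function runWorkers(): Promise<Array<string>> {
    const a = worker('A');     // runs up to A's await, then returns A's promise
    const b = worker('B');     // runs up to B's await, then returns B's promise
    order.push('both started');
    await a;
    await b;
    return order;
    // order is now: A start, B start, both started, A end, B end
}
```

If await did not give up control for an already-fulfilled promise, 'A end' would appear before 'B start'.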

reading exercises

Promises promises

At step 6 in the totalBalance example above, when the computation first returns control back to the JavaScript runtime system…

How many Promise<number> promises are pending? (Just consider the code shown.)


How many Promise<string> promises are pending?


Checking carefully

Let’s consider some other ways to write the body of totalBalance, and how they might affect the concurrency of the computation.

Note that types are omitted in the variable declarations below. Some variables may be number, and others Promise<number>. Some examples may even fail to compile because of a static type error. Look carefully at the awaits, and remember that they convert a promise into a value.

What is the concurrency behavior of each of the following?

const savings = getBalance('savings');
const checking = getBalance('checking');
return (await savings) + (await checking);


const checking = await getBalance('checking');
const savings = await getBalance('savings');
return checking + savings;


const checking = getBalance('checking');
const savings = getBalance('savings');
return checking + savings;


const checking = getBalance('checking');
return (await getBalance('savings')) + (await checking);


Tick tock

Recall that we have defined timeout(milliseconds) as an asynchronous function that creates a timer — a promise that will fulfill milliseconds after timeout() was called.

Now consider this asynchronous function:

async function clock(milliseconds: number): ________ {
  while (true) {
    await timeout(milliseconds);
    console.log('tick');
  }
}

What is the return type of clock() (the blank in the code above)?


Assume the blank is filled in with the correct return type.

What would clock(1000) do?


What would await clock(1000) do?


Aggregating promises

When running multiple concurrent computations, it’s often useful to combine their promises using operations similar to AND and OR.

Promise.all() is like a logical AND – it combines an iterable collection of promises into a single promise that waits for all the promises to fulfill, and returns an array of their results:

const allResponses: Array<Response> = 
  await Promise.all( [ fetch('http://www.mit.edu/'),
                       fetch('http://www.harvard.edu/'),
                       fetch('http://www.tufts.edu/') ] );

But if any of the individual promises fail, then the entire Promise.all() also fails.
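Both behaviors can be tried without a network connection. In the sketch below, delayed and failing are hypothetical stand-ins for slow computations, built on setTimeout() and the Promise constructor.

```typescript
// Hypothetical stand-in for a slow computation: fulfills with `value` after `milliseconds`.
function delayed<T>(value: T, milliseconds: number): Promise<T> {
    return new Promise<T>((resolve) => setTimeout(() => resolve(value), milliseconds));
}

// Hypothetical stand-in for a failing computation: rejects after `milliseconds`.
function failing(message: string, milliseconds: number): Promise<never> {
    return new Promise<never>((_, reject) => setTimeout(() => reject(new Error(message)), milliseconds));
}

async function allDemo(): Promise<Array<string>> {
    // results are in the order of the input array, not the order of completion
    return await Promise.all([ delayed('slow', 30), delayed('fast', 10) ]);
}

async function allFailureDemo(): Promise<string> {
    try {
        await Promise.all([ delayed('ok', 10), failing('server down', 20) ]);
        return 'all fulfilled';
    } catch (e) {
        return `rejected: ${e instanceof Error ? e.message : String(e)}`;
    }
}
```

Here allDemo() fulfills with ['slow', 'fast'] even though the second computation finishes first, and allFailureDemo() fulfills with 'rejected: server down' because one failure rejects the combined promise.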

For logical OR, Promise.any() produces a promise that waits for any of the individual promises to successfully fulfill (and returns that promise’s result), and only fails if all the promises fail. You might use it for running redundant computations that might fail independently:

const firstResponse: Response = 
  await Promise.any( [ fetch('http://www.mit1.edu/'),
                       fetch('http://www.mit2.edu/'),
                       fetch('http://www.mit3.edu/') ] );

Promise.race() is a logical OR that waits for any individual promise to either fulfill or reject, and immediately fulfills or rejects in the same way. One use of Promise.race() is putting a timeout on an operation:

const responseOrTimeout: Response|void =
  await Promise.race( [ fetch('http://www.mit.edu/'), 
                        timeout(5000) ] );
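With the type Response|void, the caller has to check for a missing value to find out that the timer won. One possible variation, sketched here under assumptions, makes the timer promise fulfill with a recognizable value instead. The names delayed, timedOutAfter, and withTimeout are hypothetical, and a number stands in for the Response so that the sketch needs no network.

```typescript
// Hypothetical stand-in for a slow operation: fulfills with `value` after `milliseconds`.
function delayed(value: number, milliseconds: number): Promise<number> {
    return new Promise<number>((resolve) => setTimeout(() => resolve(value), milliseconds));
}

// Like timeout(), but fulfills with the string 'timed out' so the caller can tell who won.
async function timedOutAfter(milliseconds: number): Promise<'timed out'> {
    await new Promise<void>((resolve) => setTimeout(resolve, milliseconds));
    return 'timed out';
}

// Fulfills with the operation's result, or with 'timed out' if the timer finishes first.
async function withTimeout(operation: Promise<number>, milliseconds: number): Promise<number | 'timed out'> {
    return await Promise.race([ operation, timedOutAfter(milliseconds) ]);
}
```

For example, withTimeout(delayed(42, 10), 50) fulfills with 42, while withTimeout(delayed(42, 60), 10) fulfills with 'timed out'. Note that Promise.race() does not cancel the losing computation; it simply stops waiting for it.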

Never busy-wait

We’ve seen that a Promise has one observer operation, await, which lets a client obtain the promise’s value (always giving up control, and waiting if necessary for the promise to fulfill).

But there is no method you can call to interrogate a promise directly for its state (“are you still pending?”), or to extract its value without giving up control.

There is a good reason why those operations are not provided. If they were, clients of the promise might be tempted to busy-wait, waiting for the promise to fulfill. Busy-waiting means sitting in a tight loop waiting for some event to occur, without giving up control. For example, here’s a busy-waiting timer:

async function busyWait(milliseconds: number): Promise<void> {
    const now = new Date().getTime(); // new Date().getTime() is always the current clock time in milliseconds
    const deadline = now + milliseconds;
    while (new Date().getTime() < deadline) {
        // do nothing, just busy-wait until the system clock time reaches deadline
    }
}

This code works in the narrow sense that calling busyWait(500) will indeed wait for 500 milliseconds, and its promise will fulfill after that time elapses. But because busyWait never releases control during that time period — notice that its loop body contains no await whatsoever — no other asynchronous functions in the program will have a chance to run during that time. We will never get back to the JavaScript runtime system to give control to them. The program will simply freeze until the entire body of busyWait() finishes and returns. So busyWait() is quite useless as an asynchronous function — it can’t run concurrently with any other asynchronous code.
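The freeze can be measured. The sketch below uses a synchronous variant of busyWait and a hypothetical timeout implementation built on setTimeout(). The function timerDelayedByBusyWait is an assumption added for illustration: it starts a 10-millisecond timer, busy-waits for 100 milliseconds, and reports when the code waiting on the timer actually regained control.

```typescript
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>((resolve) => setTimeout(() => resolve(), milliseconds));
}

// Synchronous busy-wait: holds the thread until the deadline passes.
function busyWait(milliseconds: number): void {
    const deadline = Date.now() + milliseconds;
    while (Date.now() < deadline) {
        // do nothing, and never give up control
    }
}

// Hypothetical demo: how late does a 10 ms timer get noticed if the thread is busy for 100 ms?
async function timerDelayedByBusyWait(): Promise<number> {
    const start = Date.now();
    const timerDone: Promise<number> = (async () => {
        await timeout(10);
        return Date.now() - start;  // when this function actually got control back
    })();
    busyWait(100);                  // the runtime system cannot run anything else meanwhile
    return await timerDone;         // at least 100, even though the timer was set for 10
}
```

The timer itself is not slow. The code awaiting it simply cannot resume until busyWait() returns and control goes back to the runtime system.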

If a Promise provided observers other than await, then programmers would be tempted to write busy-waiting code like this:

const promise = readFile('account', { encoding: 'utf-8' });
while ( promise.isPending() ) { // just hypothetical, isPending() doesn't exist
  // busy-wait! bad idea
}
const data: string = promise.get(); // just hypothetical, get() doesn't exist

This code not only freezes the program, but it doesn’t even do what the programmer wants. The busy-waiting loop never gives up control, which means that we never return to the top-level runtime system that is processing file input/output. So readFile() never marks its promise as fulfilled, the putative promise.isPending() operation never returns false, and we never exit the busy-waiting loop.

Busy-waiting is generally a bad idea in concurrent programming (with some rare exceptions like spinlocks), but it is particularly deadly for promises and async/await code. Instead, use await to operate on the value of a promise, or to react to its state transition from pending to fulfilled.

Summary

This reading talked about asynchronous functions returning promises.

  • A promise represents a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet.
    • Promises allow us to write concurrent code that never busy-waits.
  • await waits for and accesses the value that a promise computes.
  • Promise.all(), Promise.any(), Promise.race() are ways to aggregate multiple promises together.
  • Functions that return a promise are asynchronous.
    • These functions give up control to their caller or to the JavaScript runtime system.
    • A normal function can be transformed to an asynchronous one by using the async and await keywords and returning a promise type.

These ideas connect to our three key properties of good software as follows:

  • Safe from bugs. Static checking of promise types, and the requirement that a promise be turned into a value with await, ensure that code that depends on an asynchronous computation cannot continue until the values it needs are ready.

  • Easy to understand. Using await to transform a promise into a value makes asynchronous code read very much like straight-line synchronous code. But like all concurrency, promises and asynchronous functions can be subtle and produce surprising effects.

  • Ready for change. Promises can be composed and combined in ways that other concurrency techniques (like threads and workers) do not easily support.