Reading 22: Promises
Software in 6.031
Objectives
This reading discusses concurrent computation using promises.
We start at the highest level, with the promise abstraction, and the await operator and async function declaration that allow concurrent computations to happen in TypeScript in a way that closely resembles familiar synchronous programming.
Then we will dig below the covers to understand more about what is really happening with Promise, await, and async.
Promises
A promise represents a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet. The name comes from the notion that it is a promise to finish the computation and provide a result sometime in the future.
The Promise type in TypeScript is generic: a Promise<T> represents a concurrent computation that at some point should produce a value of type T.
Here are examples of concurrent computations using promises. (Some of these are actual functions provided by Node packages; others don’t currently exist as library functions but could be readily implemented.)
- readFile(pathname:string, ...) returning a Promise<string>. The file is loaded concurrently, and the value it promises to eventually produce is the content of the file.
- diskSpace(folder:string) returning a Promise<number>. A folder tree in the filesystem is traversed in the background, the sizes of all of its files are added up, and the resulting number of bytes is the promised result.
- fetch(url:string) returning a Promise<Response>. The URL is opened in the background, and the promised result is an HTTP Response object.
- timeout(milliseconds:number) returning a Promise<void>. A timer runs for the given number of milliseconds, and the promised result has type void (the same as a function that returns no result). This may seem like an empty promise (ha), but we’ll see shortly that even empty promises are useful, because we can trigger additional computation when the timer completes.
Importantly, each of these functions returns its promise immediately. The function starts a long-running computation, like loading a web page or scanning the filesystem, but does not wait for that computation to complete. In lieu of returning a computed value, it returns a promise associated with the concurrent computation that it started. When the computation finishes, its result will be accessible through the promise.
What does this let us do? One benefit we can get from promises is parallelism: by starting multiple computations and collecting a promise for each one, the computations can proceed concurrently with each other:
import fetch from 'node-fetch';
const promise1 = fetch('http://www.mit.edu/');
const promise2 = fetch('http://www.harvard.edu/');
const promise3 = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers in parallel
More concretely, a Promise<T> is a mutable value with three states:
- pending: the computation associated with the promise has not finished yet.
- fulfilled: the computation has finished, and the promise now holds the value of type T that the computation produced.
- rejected: the computation failed for some reason, and the promise now holds an Error object describing the failure.
A promise starts in the pending state, and eventually transitions into either fulfilled (if the computation completes) or rejected (if the computation throws an exception). It may also stay in the pending state forever (for example, if the computation is waiting for an event that never happens). Once a promise is fulfilled or rejected, it remains in that state; there is no way to reset a promise back to the pending state.
If you print a promise for debugging purposes, you will see its state and the result stored inside it (if it is no longer pending).
For example, at the ts-node TypeScript prompt, you can print a promise immediately to see it in the pending state:
> import fs from 'fs/promises';
> const promise = fs.readFile('account', { encoding: 'utf-8' }); console.log(promise);
Promise { <pending> }
Assuming the file account contains 200 (representing a bank account with $200 in it), then your next command should see that the promise has been fulfilled (since it takes very little time to finish loading the file, relative to your typing speed):
> console.log(promise);
Promise { '200' }
If on the other hand the file account does not exist, you will see that the promise has been rejected. The exception object that would normally have been thrown to describe the error is instead stored in the promise:
> console.log(promise);
Promise {
<rejected> [Error: ENOENT: no such file or directory, open 'account'] { ... }
}
Await
We’ve seen how to start up concurrent computations and obtain Promise values associated with them.
Now how can we interact with the final values produced by those computations?
One simple way is to use the await keyword:
const promise: Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
const data: string = await promise;
// data is now '200'
The await keyword is a built-in operator that converts a value of type Promise<T> into a value of type T.
It waits until the promise has been fulfilled, and then unpacks the promise to extract the promised value.
If the promise is rejected, then await throws an exception instead, using the Error object that the computation stored in the promise.
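As a minimal sketch of the rejection case, the following code awaits a promise that has already been rejected. The failingComputation() function is hypothetical, invented for illustration, and the code is wrapped in async functions, which this reading introduces later.

```typescript
// Hypothetical computation, assumed for illustration: it always fails.
function failingComputation(): Promise<string> {
    return Promise.reject(new Error('computation failed'));
}

// Awaits the promise and reports whether it fulfilled or rejected.
async function describeOutcome(): Promise<string> {
    try {
        const value: string = await failingComputation();
        return 'fulfilled with ' + value;
    } catch (e) {
        // await rethrows the Error object stored in the rejected promise
        const message: string = e instanceof Error ? e.message : String(e);
        return 'rejected with ' + message;
    }
}

async function main(): Promise<void> {
    console.log(await describeOutcome());   // rejected with computation failed
}
main();
```

The try/catch around the await looks exactly like the error handling for an ordinary synchronous call.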
Note that await is not triggering the computation that the promise represents.
The computation is already underway!
It was set in motion by the original function that produced the promise.
The computation may have made some progress, or even run to completion, by the time execution arrives at the await.
A better mental model for await is that it is handling a deferred return from the function that produced the promise.
The await waits for the computation to finish, and then provides its return value, or else a thrown exception, just like the result of a normal function call.
Using await, now we can wrap up the parallel computations that we started:
const promise1: Promise<Response> = fetch('http://www.mit.edu/');
const promise2: Promise<Response> = fetch('http://www.harvard.edu/');
const promise3: Promise<Response> = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers in parallel
const response1: Response = await promise1;
const response2: Response = await promise2;
const response3: Response = await promise3;
// we have now received initial responses from all three servers
// (unless one failed and threw an exception)
Note that there is a better way to handle waiting for multiple promises like this, using Promise.all(), which we will discuss later in the reading.
And we can see now why even an empty promise can be useful, just like a function with a void return value:
const promise: Promise<void> = timeout(2000);
// started up a timer for 2000 milliseconds
await promise;
// no useful return value, but -- now we know that 2000 milliseconds have indeed passed
Note that void is a different type from undefined, which we’ve been using in other contexts to denote the idea of “no useful value”. The void type is designed as the return type of functions that don’t return anything. Conceptually, it has an empty set of values – there is no value that belongs to the void type. It’s simply empty. That’s why we use Promise<void> for computations that will not produce a value, and are just running for the sake of some side-effect (like a timer delay). By contrast, undefined is a type with exactly one value in it, undefined, which is a first-class value that can be passed around, assigned to variables, and stored in data structures.
reading exercises
Consider these two functions, both designed to wait for a certain amount of time:
// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds:number):Promise<void>;
// sleeps for `milliseconds` milliseconds before returning
function sleep(milliseconds:number):void;
Note that timeout uses a promise, but sleep does not.
Approximately how long does the following code take to reach the point where it prints done?
const promise1 = timeout(1000);
const promise2 = timeout(2000);
await promise1;
await promise2;
console.log('done');
(missing explanation)
sleep(1000);
sleep(2000);
console.log('done');
(missing explanation)
const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
await promise2;
console.log('done');
(missing explanation)
const promise1 = timeout(1000);
const promise2 = timeout(2000);
console.log('done');
(missing explanation)
const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
console.log('done');
(missing explanation)
const promise = timeout(1000);
sleep(2000);
await promise;
console.log('done');
(missing explanation)
const promise = timeout(1000);
await promise;
await promise;
await promise;
console.log('done');
(missing explanation)
Asynchronous functions
The functions we have been using that return promises – readFile, fetch, timeout – are examples of asynchronous functions.
An asynchronous function is a function that returns control to its caller before its computation is done.
Contrast that with the more familiar synchronous function, which returns control after its computation is finished, and its return value is already known. Almost every function we’ve used to this point in the course has been synchronous in this way.
Returning a promise is one way to implement an asynchronous function, but it’s not the only way. Callback functions are another way to do it, which we saw in a previous reading. Prior to the introduction of promises in JavaScript, callback functions were the most common way to implement asynchronous function behavior, and there are still many uses of callbacks in the JavaScript library ecosystem.
If you are using promises instead of callbacks, then an asynchronous function can be declared using the async keyword, and it must be declared with return type Promise:
async function getBalance():Promise<number> {
    const data: string = await fs.readFile('account', { encoding: 'utf-8' });
    const balance: number = parseInt(data);
    if ( isNaN(balance) ) throw new Error('account does not contain a number');
    return balance;
}
Note that the body of an async function can use return and throw statements in the same way that a synchronous function would.
These are automatically converted into effects on the function’s promise.
The return balance statement fulfills the promise with the value of balance, and the throw statement rejects the promise with the given Error object.
Since getBalance() is an asynchronous function returning a promise, a caller of getBalance() needs to interact with the promise appropriately, i.e. using await:
const myBalance = await getBalance();
Only async functions are allowed to await
It turns out that the await operator has an important limitation.
Because of the way JavaScript handles concurrency, await can only be used inside an async function.
Here is a brief explanation; we’ll return to this again when we unpack how await and async are actually implemented in terms of lower-level Promise operations, later in this reading.
Recall that an asynchronous function is designed to give up control to its caller before a computation is done.
Whenever an await needs to wait for a pending promise, it doesn’t just sit there spinning its wheels.
Instead, it returns control back up to higher levels of the program.
For the first await encountered in the function body, this actually does return control to the original caller of the function.
When the promise that the await is waiting for finally changes state (either fulfilling or rejecting), then the await regains control by being called from that promise, and continues running the function body.
Subsequent awaits in the function body return control to the promise-processing system.
So you should think of await as being like a temporary return, which will reactivate when the promise it is waiting for finally fires.
TypeScript statically checks this requirement.
It will produce a static error if await occurs anywhere except inside an async function.
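The “temporary return” behavior described above can be observed with a short trace. This is a sketch under stated assumptions: timeout() is a hypothetical helper built here on the standard Promise constructor, and worker() and caller() are names invented for illustration.

```typescript
// Hypothetical timer helper, assumed for illustration: fulfills after `milliseconds`.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() { resolve(); }, milliseconds);
    });
}

async function worker(trace: Array<string>): Promise<void> {
    trace.push('worker: running up to its first await');
    await timeout(10);   // temporary return: control goes back to the caller
    trace.push('worker: resumed after the timer fired');
}

async function caller(): Promise<Array<string>> {
    const trace: Array<string> = [];
    const promise: Promise<void> = worker(trace);
    trace.push('caller: regained control while worker is still pending');
    await promise;
    return trace;
}

async function main(): Promise<void> {
    console.log(await caller());
    // [ 'worker: running up to its first await',
    //   'caller: regained control while worker is still pending',
    //   'worker: resumed after the timer fired' ]
}
main();
```

The caller's line lands in the middle of the trace because worker() handed control back at its first await, before its own body was finished.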
reading exercises
Here are two versions of the bank-account-reading function. The first one is asynchronous:
async function getBalance():Promise<number> {
const data: string = await fs.readFile('account', { encoding: 'utf-8' });
return parseInt(data);
}
function getBalanceSync():number {
const data: string = fs.readFileSync('account', { encoding: 'utf-8' });
return parseInt(data);
}
Which of the following are correct ways to use getBalance() in an expression?
Using getBalance() and getBalanceSync() as defined in the previous exercise, what is the type of each of the following expressions?
[ getBalance(), getBalance() ]
(missing explanation)
[ getBalanceSync(), getBalanceSync() ]
(missing explanation)
[ await getBalance(), await getBalance() ]
(missing explanation)
Recall that we have defined timeout(milliseconds) as an asynchronous function that creates a timer – a promise that will fulfill milliseconds after timeout() was called.
Now consider this asynchronous function:
async function clock(milliseconds:number):________ {
while (true) {
await timeout(milliseconds);
console.log('tick');
}
}
What is the return type of clock() (the blank in the code above)?
Assume the blank is filled in with the correct return type.
What would await clock(1000) do?
Consider this simple program, which first calls main, which calls f, which calls g:
function main():void { console.log(f()); }
function f():number { return g()+50; }
function g():number { return 0; }
main();
Now suppose g() is changed to use await:
function g():number { return await getBalance(); }
The edited program is not compiling yet; there are static errors.
Which additional changes need to be made to g()?
What changes need to be made to f()?
What changes need to be made to main()?
What changes need to be made to the last line of the code, the call to main()?
then()
Let’s dig into the Promise abstract data type, to understand more about how it works and how it interacts with await and async function.
An abstract data type is defined by its operations, and it turns out that a Promise<T> has one key operation, called then.
Here is the simplest version of then:
interface Promise<T> {
    then<U>(callback: (value: T) => U): Promise<U>;
}
The then operation is a Promise producer.
It is called on an existing promise expected to result in a value of type T, and attaches a callback function to that promise.
The callback function consumes the T value the promise eventually produces, and generates a value of some other type U (possibly the same as T, of course).
Here’s an example of then in action:
const promise : Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
const biggerPromise : Promise<number> = promise.then(function(data:string) {
return parseInt(data);
});
Here, the original promise, which is reading a file into a string, is composed by then with a callback function that parses that string into a number. The resulting biggerPromise represents the entire computation – both reading the file and parsing it into a number – and promises to eventually produce that number.
It may help to think about then as a composition operation for asynchronous computations. Conventional function composition takes two functions f and g and composes them into a new function g ⚬ f, such that (g ⚬ f)(x) = g(f(x)). If f: S → T and g: T → U, then the type of the resulting composition is g ⚬ f: S → U.
In much the same way, then composes one computation f, represented by a promise of its return value Promise<T>, with another computation g that expects to consume that T value and produce a U value.
The result of the composition is a combined computation that promises to eventually produce that U value, so its type is Promise<U>.
It may also help to think about Promise<T> like a one-element list that (eventually) has a T value in it. From that point of view, then(g) is like map(g) – once the T element arrives, the g callback is called on it, to produce a U value that ends up as the single element of the resulting Promise<U>. (One place this analogy breaks down, however, is with Promise<void>, since such a promise never actually has any element value. Using then(g) on a void promise still calls g when the promise fulfills, but doesn’t pass any value to g.)
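The map analogy can be made concrete with a small side-by-side sketch. The variable names mappedList and mappedPromise are invented for illustration, and Promise.resolve() is used only to obtain a promise that is already fulfilled.

```typescript
// The callback g: parses a string into a number.
function g(data: string): number {
    return parseInt(data, 10);
}

// A one-element list of strings, transformed with map(g).
const mappedList: Array<number> = ['200'].map(g);

// A promise of a string, transformed with then(g).
const mappedPromise: Promise<number> = Promise.resolve('200').then(g);

console.log(mappedList);   // [ 200 ]
mappedPromise.then(function(balance: number) {
    console.log(balance);  // 200
});
```

In both cases g is applied to the single element, and the container type changes from holding a string to holding a number.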
We need to expand this notion of then() in one important way.
The callback function doesn’t have to return an already-computed U value.
It can instead return a promise of its own, of type Promise<U>:
interface Promise<T> {
    then<U>(callback: (value: T) => U|Promise<U>): Promise<U>;
}
The callback might need to do this if computing the U also requires another long-running computation that needs to run in the background.
Web page retrieval offers a good example of this, because getting the Response from a fetch() call just means you’ve managed to connect to the web server – actually downloading the full page may take a while longer.
Here’s how that promise chain might look:
const promisedResponse: Promise<Response> = fetch('http://www.mit.edu/');
const promisedText: Promise<string> = promisedResponse.then(function(response:Response) {
// we've made a connection, which is why the `response` object exists
const downloadingPromise: Promise<string> = response.text();
return downloadingPromise;
});
Note the several promises shown in this example code:
- promisedResponse represents the computation that is making the initial connection to www.mit.edu
- downloadingPromise represents the computation that downloads the MIT homepage after the connection has already been made
- promisedText represents the composition of both computations: the initial connection followed by the download. The final value of promisedText comes from the value of downloadingPromise, the text of the webpage.
Going back to the function composition analogy, if promisedResponse and downloadingPromise are like the functions f and g, respectively, then promisedText is like the composition g ⚬ f.
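The same two-stage composition can be tried without a network connection. In this sketch, findAccount() and loadBalance() are hypothetical stand-ins for the connecting and downloading stages, and timeout() is an assumed helper built on the standard Promise constructor.

```typescript
// Hypothetical timer helper, assumed for illustration: fulfills after `milliseconds`.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() { resolve(); }, milliseconds);
    });
}

// First stage (like f): eventually produces an account name.
function findAccount(): Promise<string> {
    return timeout(10).then(function() { return 'savings'; });
}

// Second stage (like g): eventually produces the balance of that account.
function loadBalance(account: string): Promise<number> {
    return timeout(10).then(function() { return account === 'savings' ? 200 : 0; });
}

// The callback returns a Promise<number>, yet the composed promise has type
// Promise<number>, not Promise<Promise<number>>.
const promisedBalance: Promise<number> = findAccount().then(function(account: string) {
    return loadBalance(account);
});

promisedBalance.then(function(balance: number) {
    console.log(balance);   // 200
});
```

The composed promise fulfills only after both stages have finished, and it carries the value produced by the second stage.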
You can call then as many times as you want on a promise, to attach different callbacks that might do different things with the promise’s value.
You can also call then regardless of what state the promise is currently in.
Most often, the promise is still pending, so the then callback will run sometime in the future.
But if the promise is already fulfilled, then the then callback will run as soon as the currently running code gives up control.
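Here is a small sketch of attaching two callbacks to a promise that is already fulfilled. The function name attachTwice is invented for illustration. Note that a then callback is never run inside the call to then itself, even when the value is already available; it runs after the code that is currently executing has finished.

```typescript
// Attaches two callbacks to an already-fulfilled promise and records
// the order in which things happen.
function attachTwice(): Promise<Array<string>> {
    const order: Array<string> = [];
    const promise: Promise<number> = Promise.resolve(200);   // already fulfilled
    promise.then(function(balance: number) {
        order.push('first callback saw ' + balance);
    });
    const second: Promise<void> = promise.then(function(balance: number) {
        order.push('second callback saw ' + balance);
    });
    order.push('finished attaching callbacks');
    return second.then(function() { return order; });
}

attachTwice().then(function(order: Array<string>) {
    console.log(order);
    // [ 'finished attaching callbacks',
    //   'first callback saw 200',
    //   'second callback saw 200' ]
});
```

Both callbacks receive the same value, and they run in the order in which they were attached.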
The then operation is the fundamental operation of a promise. then() is the only way to access the value that a promise computes. (await actually uses then, as we’ll see in a moment.) This is partly a safety property, because it ensures that client code can never look inside a promise and see a missing value. But more importantly, then() is a feature of concurrency design. It allows a concurrent computation to be built up by a sequence of composed computations – a sequence of then() callbacks, which can be interleaved in controlled, predictable ways. We will discuss this more in the next reading.
reading exercises
Let’s return to our friend timeout, which returns a timer promise:
// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds:number):Promise<void>
Approximately how long does the following code take to reach the point where it prints done?
timeout(1000).then(function() {
console.log('done');
});
timeout(2000);
(missing explanation)
timeout(1000).then(function() {
return timeout(2000);
});
console.log('done');
(missing explanation)
timeout(1000).then(function() {
return timeout(2000).then(function() {
console.log('done');
});
});
(missing explanation)
Unpacking an asynchronous function
The then operation now gives us enough power to see what await and async function mean, and how they use promises to construct a computation.
Here’s an async function containing a line that awaits a promise, and then does some computation on the result of the promise:
async function getBalance():Promise<number> {
    const promise: Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
    const data: string = await promise;
    const balance: number = parseInt(data);
    return balance;
}
When await is encountered in the execution of a function like this, TypeScript effectively takes the remainder of the computation that the function would do, and wraps it up into a then callback.
So this code:
const data: string = await promise;
const balance: number = parseInt(data);
return balance;
is effectively transformed into this code:
promise.then(function(data: string) {
    const balance: number = parseInt(data);
    return balance;
});
Instead of spinning its wheels waiting for the file-loading promise to fulfill, await creates a composite promise, which combines the file-loading computation with the parseInt computation.
It then immediately returns control to the caller, along with this composite promise.
So the equivalent code, without await or async, looks something like this:
function getBalance():Promise<number> {
    const promise: Promise<string> = fs.readFile('account', { encoding: 'utf-8' });
    return promise.then(function(data: string) {
        const balance: number = parseInt(data);
        return balance;
    });
}
Note that this version of getBalance is fairly simple because it’s just straight-line code. When await is used inside a loop, then the “rest of the function’s computation” includes the remaining iterations of the loop, which can’t be easily shown by a syntactic transformation like this, because it chops the loop in pieces. But this example shows semantically what await means.
Note also that this version of getBalance doesn’t have the async keyword any more, because we’re showing what the async keyword means in terms of lower-level TypeScript code.
But this version of getBalance is indeed still asynchronous, because it returns a promise representing a concurrent computation that has yet to complete.
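For the loop case mentioned above, one way to picture “the remaining iterations of the loop” is as a recursive call inside the then callback. The sketch below is only an illustration of the idea, not what the compiler literally generates; timeout(), countdownAwait(), and countdownThen() are hypothetical names assumed for this example.

```typescript
// Hypothetical timer helper, assumed for illustration: fulfills after `milliseconds`.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() { resolve(); }, milliseconds);
    });
}

// Version with await inside a loop.
async function countdownAwait(n: number, log: Array<string>): Promise<void> {
    for (let i = n; i > 0; i--) {
        await timeout(10);
        log.push('tick ' + i);
    }
}

// Version with then: the rest of the loop becomes a recursive call.
function countdownThen(n: number, log: Array<string>): Promise<void> {
    if (n <= 0) { return Promise.resolve(); }
    return timeout(10).then(function() {
        log.push('tick ' + n);
        return countdownThen(n - 1, log);   // the remaining iterations
    });
}

const viaAwait: Array<string> = [];
const viaThen: Array<string> = [];
countdownAwait(3, viaAwait)
    .then(function() { return countdownThen(3, viaThen); })
    .then(function() {
        console.log(viaAwait);   // [ 'tick 3', 'tick 2', 'tick 1' ]
        console.log(viaThen);    // [ 'tick 3', 'tick 2', 'tick 1' ]
    });
```

Each then callback performs one iteration and then returns a promise for all the iterations still to come, which is why the loop cannot be shown as a single flat rewrite.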
Aggregating promises
When running multiple parallel computations, it’s often useful to combine their promises using operations similar to AND and OR.
Promise.all() is like a logical AND – it combines an iterable collection of promises into a single promise that waits for all the promises to fulfill, and returns an array of their results:
const allResponses:Array<Response> =
await Promise.all( [ fetch('http://www.mit.edu/'),
fetch('http://www.harvard.edu/'),
fetch('http://www.tufts.edu/') ] );
But if any of the individual promises is rejected, then the promise returned by Promise.all() is also rejected.
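Both outcomes of Promise.all() can be seen in a small sketch that avoids the network. The loadBalance() and totalBalance() functions and the account names are hypothetical, assumed for illustration, as is the timeout() helper.

```typescript
// Hypothetical timer helper, assumed for illustration: fulfills after `milliseconds`.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() { resolve(); }, milliseconds);
    });
}

// Hypothetical loader: every account holds 100, except that 'missing' fails.
async function loadBalance(account: string): Promise<number> {
    await timeout(10);
    if (account === 'missing') { throw new Error('no such account: ' + account); }
    return 100;
}

async function totalBalance(accounts: Array<string>): Promise<number> {
    // all the loads are started first; Promise.all then waits for every one of them
    const balances: Array<number> = await Promise.all(accounts.map(loadBalance));
    return balances.reduce(function(sum, balance) { return sum + balance; }, 0);
}

async function main(): Promise<void> {
    console.log(await totalBalance(['checking', 'savings']));   // 200
    try {
        await totalBalance(['checking', 'missing']);
    } catch (e) {
        // one rejected promise rejects the whole Promise.all()
        console.log(e instanceof Error ? e.message : e);         // no such account: missing
    }
}
main();
```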
Promise.race() is a logical-OR that waits for any individual promise to either fulfill or reject, and immediately fulfills or rejects in the same way.
One use of Promise.race() is putting a timeout on an operation:
const responseOrTimeout: Response|void =
await Promise.race( [ fetch('http://www.mit.edu/'),
timeout(5000) ] );
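The same pattern can be exercised locally, as a sketch. Here slowAnswer() and answerOrTimeout() are hypothetical names standing in for the fetch, and timeout() is an assumed helper built on the standard Promise constructor.

```typescript
// Hypothetical timer helper, assumed for illustration: fulfills after `milliseconds`.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() { resolve(); }, milliseconds);
    });
}

// Hypothetical slow operation: produces 'answer' after `milliseconds`.
function slowAnswer(milliseconds: number): Promise<string> {
    return timeout(milliseconds).then(function() { return 'answer'; });
}

// Races the operation against a time limit.
async function answerOrTimeout(answerDelay: number, limit: number): Promise<string|void> {
    return Promise.race([ slowAnswer(answerDelay), timeout(limit) ]);
}

async function main(): Promise<void> {
    console.log(await answerOrTimeout(10, 100));   // answer
    console.log(await answerOrTimeout(100, 10));   // undefined
}
main();
```

When the timer wins, the result is undefined, so the caller has to check for that case. The losing computation is not cancelled; it keeps running, and its result is simply ignored.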
Making and keeping a promise
We noted above that the key operation of the Promise type is then(), which allows a consumer of the promise’s value to specify what should happen once the promise is fulfilled.
The await operator translates into a call to then().
But there’s clearly something missing in the Promise operations we’ve discussed.
A promise is a mutable thing – where are its mutator operations?
How does its state change from pending to fulfilled, or rejected?
The answer to this question lies in the fact that Promise has two different clients: the consumer of the promise, who uses then(); and the promiser, the long-running computation that is computing the value for the promise. Promises are designed so that only the promiser should have access to the resolve() and reject() mutators that change the promise’s state.
One way to do that uses a convenient helper type, Deferred<T>:
import Q from 'q';
const deferred : Deferred<T> = Q.defer();
This Deferred value has three components:
- deferred.promise is a fresh Promise<T> associated with the Deferred value
- deferred.resolve(t:T) is a mutator that fulfills the associated promise with the value t
- deferred.reject(err:Error) is a mutator that rejects the associated promise with the given error
So Deferred has the necessary mutators, needed by the promiser.
The promiser holds onto this so that it can mutate the promise later.
The promiser hands the associated Promise value back to the consumer of the promise, for use in await or then().
Here is an example of how timeout() might be written using Deferred:
/**
 * @param milliseconds duration to wait
 * @returns a promise that fulfills no less than `milliseconds` after timeout() was called
 */
async function timeout(milliseconds:number):Promise<void> {
    const deferred: Q.Deferred<void> = Q.defer();
    setTimeout(function() {
        deferred.resolve(); // this mutates deferred.promise into the fulfilled state
        // or we could call deferred.reject(new Error(...)) to reject the promise instead
    }, milliseconds);
    return deferred.promise;
}
You can see other ways to implement timeout(), plus a larger TinyQueue example, that demonstrate how to create and fulfill your own promises.
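One alternative, sketched here as an assumption rather than as this reading's own code, uses the built-in Promise constructor instead of the Q library. The constructor passes the resolve and reject mutators only to the function it is given, so they stay in the hands of the promiser. The names timeoutWithConstructor and failAfter are hypothetical.

```typescript
// Fulfills after `milliseconds`, using the standard Promise constructor.
function timeoutWithConstructor(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() {
            resolve();   // mutates the promise into the fulfilled state
        }, milliseconds);
    });
}

// Rejects after `milliseconds` with an Error carrying `message`.
function failAfter(milliseconds: number, message: string): Promise<never> {
    return new Promise<never>(function(_resolve, reject) {
        setTimeout(function() {
            reject(new Error(message));   // mutates the promise into the rejected state
        }, milliseconds);
    });
}

async function main(): Promise<void> {
    await timeoutWithConstructor(50);
    console.log('timer fulfilled');
    try {
        await failAfter(10, 'boom');
    } catch (e) {
        console.log(e instanceof Error ? e.message : e);   // boom
    }
}
main();
```

The consumer only ever sees the returned promise, so it has no way to call resolve or reject itself.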
Never busy-wait
We’ve seen that the Promise type has one producer operation, then, and two (hidden) mutator operations, resolve and reject. The mutator operations are only available to the original promiser, not to the downstream consumer of the promise.
Notice that a Promise has no direct observer operations.
There is no method you can call to interrogate a promise for its state (“are you still pending?”), or to extract its value.
There is a good reason why those operations are not provided. If they were, clients of the promise might be tempted to busy-wait, waiting for the promise to fulfill. Busy-waiting means sitting in a tight loop waiting for some event to occur, without giving up control. For example, here’s a busy-waiting timer:
async function sleep(milliseconds: number): Promise<void> {
    const now = new Date(); // returns the current system clock time
    const deadline = new Date(now.valueOf() + milliseconds);
    while (new Date() < deadline) {
        // do nothing, just busy-wait until the system clock time reaches deadline
    }
}
This code works in the sense that calling sleep(500) will indeed wait for 500 milliseconds, and its promise will fulfill after that time elapses.
But because sleep never releases control during that time period – notice that its loop body contains no await whatsoever – no other asynchronous functions in the program will have a chance to run during that time.
The program will simply freeze until the entire body of sleep() finishes and returns.
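The freeze can be measured directly. In this sketch, a 10-millisecond timer is started just before a 200-millisecond busy-wait; the timer's callback cannot run until the busy-wait gives up control. The names timeout(), busyWait(), and measureTimerDelay() are hypothetical, assumed for illustration.

```typescript
// Hypothetical timer helper, assumed for illustration: fulfills after `milliseconds`.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>(function(resolve) {
        setTimeout(function() { resolve(); }, milliseconds);
    });
}

// Busy-waits for `milliseconds` without ever giving up control.
function busyWait(milliseconds: number): void {
    const deadline: number = Date.now() + milliseconds;
    while (Date.now() < deadline) {
        // do nothing
    }
}

// Returns how long a 10 ms timer actually took when a busy-wait gets in its way.
function measureTimerDelay(): Promise<number> {
    const start: number = Date.now();
    const delay: Promise<number> = timeout(10).then(function() {
        return Date.now() - start;
    });
    busyWait(200);   // freezes the program for 200 ms
    return delay;
}

measureTimerDelay().then(function(elapsed: number) {
    console.log(elapsed >= 200);   // true: the timer callback had to wait for busyWait
});
```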
If a Promise provided observers, then programmers would be tempted to write busy-waiting code like this:
const promise = readFile('account', { encoding: 'utf-8' });
while ( promise.isPending() ) { // just hypothetical, isPending() doesn't exist
    // busy-wait! bad idea
}
const data: string = promise.get(); // just hypothetical, get() doesn't exist
This code not only freezes the program, but it doesn’t even do what the programmer wants.
The busy-waiting loop never gives up control, which means that we never return to the top-level event-handling loop that is processing promise events.
When the input data that readFile() is waiting for finally arrives, it results in an event on the event queue.
But because we never return to the event-handling loop, the callback that readFile() registered to handle that event is never called.
So readFile() never marks its promise as fulfilled, the putative promise.isPending() operation never returns false, and we never exit the busy-waiting loop.
Busy-waiting is generally a bad idea in concurrent programming (with some rare exceptions like spinlocks), but it is particularly deadly for promises and async/await code.
Always use await or then to operate on the value of a promise, or to react to its state transition from pending to fulfilled.
Summary
This reading talked about asynchronous functions returning promises.
These ideas connect to our three key properties of good software as follows:
- Safe from bugs. Static checking of promise types, and the requirement that a promise be turned into a value with await or then(), ensures that code that depends on an asynchronous computation cannot continue until the values it needs are ready.
- Easy to understand. Using await to transform a promise into a value makes asynchronous code read very much like straight-line synchronous code. But like all concurrency, promises and asynchronous functions can be subtle and produce surprising effects.
- Ready for change. Promises can be composed and combined in ways that are not as straightforward with other concurrency techniques (like threads and workers).