Reading 15: Promises
Software in 6.102
Objectives
This reading discusses concurrent computation using promises.
We start at the highest level, with the promise abstraction, and the await operator and async function declaration that allow concurrent computations to happen in TypeScript in a way that closely resembles familiar synchronous programming.
Then we will dig below the covers to understand more about what is happening with Promise, await, and async.
Promises
A promise represents a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet. The name comes from the notion that it is a promise to finish the computation and provide a result sometime in the future.
The Promise type in TypeScript is generic: a Promise<T> represents a concurrent computation that at some point should produce a value of type T.
Here are examples of concurrent computations using promises. (Some of these are actual functions provided by Node packages; others don’t currently exist as library functions but could be readily implemented.)
- readFile(pathname: string, ...) returning a Promise<string>. The file is loaded concurrently, and the value it promises to eventually produce is the content of the file.
- diskSpace(folder: string) returning a Promise<number>. A folder tree in the filesystem is traversed in the background, the sizes of all of its files are added up, and the resulting number of bytes is the promised result.
- fetch(url: string) returning a Promise<Response>. The URL is opened in the background, and the promised result is an HTTP Response object.
- timeout(milliseconds: number) returning a Promise<void>. A timer runs for the given number of milliseconds, and the promised result has type void (the same as a function that returns no result). This may seem like an empty promise (ha), but we’ll see shortly that even empty promises are useful, because we can trigger additional computation when the timeout computation completes.
Importantly, each of these functions returns its promise immediately. The function starts a long-running computation, like loading a web page or scanning the filesystem, but does not wait for that computation to complete. In lieu of returning a computed value, it returns a promise associated with the concurrent computation that it started. When the computation finishes, its result will be accessible through the promise.
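To make "returns its promise immediately" concrete, here is a minimal sketch of how timeout might be written. It is an assumption, not a library function: it relies on the standard Promise constructor and setTimeout, which this part of the reading has not introduced, and the names started and elapsedMs are only for illustration.

```typescript
// Hypothetical implementation of timeout(): starts a timer and returns
// a promise right away, without waiting for the timer to finish.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>((resolve) => {
        // resolve() fulfills the promise once the timer fires
        setTimeout(() => resolve(), milliseconds);
    });
}

const started: number = Date.now();
const promise: Promise<void> = timeout(200);   // the timer starts counting now
const elapsedMs: number = Date.now() - started;
// The call has already returned, well before 200 ms have passed.
console.log(`timeout() returned after ${elapsedMs} ms`);
```

The printed time should be a small number of milliseconds, because the call only starts the timer; the 200-millisecond wait happens in the background.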
What does this let us do? One benefit we can get from promises is concurrency: by starting multiple computations and collecting a promise for each one, the computations can proceed concurrently with each other:
const promise1 = fetch('http://www.mit.edu/');
const promise2 = fetch('http://www.harvard.edu/');
const promise3 = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers concurrently
More concretely, a Promise<T> is a mutable value with three states:
- pending: the computation associated with the promise has not finished yet.
- fulfilled: the computation has finished, and the promise now holds the value of type T that the computation produced.
- rejected: the computation failed for some reason, and the promise now holds an Error object describing the failure.
A promise starts in the pending state, and eventually transitions into either fulfilled (if the computation completes) or rejected (if the computation throws an exception). It may also stay in the pending state forever (for example, if the computation is waiting for an event that never happens). Once a promise is fulfilled or rejected, it remains in that state; there is no way to reset a promise back to the pending state.
If you print a promise for debugging purposes, you will see its state and the result stored inside it (if it is no longer pending).
For example, at the npx ts-node TypeScript prompt, you can print a promise immediately to see it in the pending state:
> const promise = fs.promises.readFile('account', { encoding: 'utf-8' }); console.log(promise);
Promise { <pending> }
Assuming the file account contains 200 (representing a bank account with $200 in it), then your next command should see that the promise has been fulfilled (since it takes very little time to finish loading the file, relative to your typing speed):
> console.log(promise);
Promise { '200' }
If on the other hand the file account does not exist, you will see that the promise has been rejected. The exception object that would normally have been thrown to describe the error is instead stored in the promise:
> console.log(promise);
Promise {
<rejected> [Error: ENOENT: no such file or directory, open 'account'] { ... }
}
Await
We’ve seen how to start up concurrent computations and obtain Promise values associated with them.
Now how can we interact with the final values produced by those computations?
One simple way is to use the await keyword:
const promise: Promise<string> = fs.promises.readFile('account', { encoding: 'utf-8' });
const data: string = await promise;
// data is now '200'
The await keyword is a built-in operator that converts a value of type Promise<T> into a value of type T.
It waits until the promise has been fulfilled, and then unpacks the promise to extract the promised value.
If the promise is rejected, then await throws an exception instead, using the Error object that the computation stored in the promise.
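The two outcomes of await can be sketched as follows. This is an illustration under assumptions: describeOutcome and demoAwait are hypothetical names, the code is wrapped in async functions (covered later in this reading) so that it runs in any context, and Promise.resolve() and Promise.reject() are standard functions used here to stand in for a file read that succeeded or failed.

```typescript
// Hypothetical helper: awaits a promise and describes what await produced.
async function describeOutcome(promise: Promise<string>): Promise<string> {
    try {
        const value: string = await promise;   // unpacks a fulfilled promise
        return `fulfilled with ${value}`;
    } catch (e) {
        // await rethrows the Error stored in a rejected promise
        return e instanceof Error ? `rejected with ${e.message}` : 'rejected';
    }
}

async function demoAwait(): Promise<string[]> {
    // stand-in for a read that succeeded with the content '200'
    const first = await describeOutcome(Promise.resolve('200'));
    // stand-in for a read that failed
    const second = await describeOutcome(Promise.reject(new Error('no such file')));
    return [first, second];
}
```

Calling demoAwait() should produce the two strings 'fulfilled with 200' and 'rejected with no such file', showing that an ordinary try/catch is enough to handle a rejected promise.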
Note that await is not triggering the computation that the promise represents.
The computation is already underway!
It was set in motion by the original function that produced the promise.
The computation may have made some progress, or even run to completion, by the time execution arrives at the await.
A better mental model for await is that it is handling a deferred return from the function that produced the promise.
The await waits for the computation to finish, and then provides its return value, or else a thrown exception, just like the result of a normal function call.
Using await, now we can wrap up the concurrent computations that we started:
const promise1: Promise<Response> = fetch('http://www.mit.edu/');
const promise2: Promise<Response> = fetch('http://www.harvard.edu/');
const promise3: Promise<Response> = fetch('http://www.tufts.edu/');
// we have now started trying to contact all three web servers concurrently
const response1: Response = await promise1;
const response2: Response = await promise2;
const response3: Response = await promise3;
// we have now received initial responses from all three servers
// (unless one failed and threw an exception)
Note that there is a better way to handle waiting for multiple promises like this, using Promise.all(), which we will discuss later in the reading.
And we can see now why even an empty promise can be useful, just like a function with a void return value:
const promise: Promise<void> = timeout(2000);
// started up a timer for 2000 milliseconds
await promise;
// no useful return value, but -- now we know that 2000 milliseconds have indeed passed
Note that void is a different type from undefined, which we’ve been using in other contexts to denote the idea of “no useful value”. The void type is designed as the return type of functions that don’t return anything. That’s why we use Promise<void> for computations that will not produce a value, and are just running for the sake of some side-effect (like a timer delay). By contrast, undefined represents the lack of a value. The difference is that void tells other programmers that a function shouldn’t be used for its return value, while undefined tells other programmers that the lack of a value is important.
reading exercises
Consider these two functions, both designed to wait for a certain amount of time:
// returns a promise that becomes fulfilled `milliseconds` milliseconds after the call to timeout
function timeout(milliseconds: number): Promise<void>;
// waits for `milliseconds` milliseconds before returning
function busyWait(milliseconds: number): void;
Note that timeout uses a promise, but busyWait does not.
Approximately how long does the following code take to reach the point where it prints done?
const promise1 = timeout(1000);
const promise2 = timeout(2000);
await promise1;
await promise2;
console.log('done');
(missing explanation)
busyWait(1000);
busyWait(2000);
console.log('done');
(missing explanation)
const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
await promise2;
console.log('done');
(missing explanation)
const promise1 = timeout(1000);
const promise2 = timeout(2000);
console.log('done');
(missing explanation)
const promise1 = timeout(1000);
await promise1;
const promise2 = timeout(2000);
console.log('done');
(missing explanation)
const promise = timeout(1000);
busyWait(2000);
await promise;
console.log('done');
(missing explanation)
const promise = timeout(1000);
await promise;
await promise;
await promise;
console.log('done');
(missing explanation)
Asynchronous functions
The functions we have been using that return promises (readFile, fetch, timeout) are examples of asynchronous functions.
An asynchronous function is a function that returns control to its caller before its computation is done.
Contrast that with the more familiar synchronous function, which returns control after its computation is finished, and its return value is already known. Almost every function we’ve used to this point in the course has been synchronous in this way.
In modern TS/JS programming, using promises, an asynchronous function can be declared using the async keyword, and must have return type Promise:
async function getBalance(): Promise<number> {
    const data: string = await fs.promises.readFile('account', { encoding: 'utf-8' });
    const balance: number = parseInt(data);
    if ( isNaN(balance) ) throw new Error('account does not contain a number');
    return balance;
}
Note that the body of an async function can use return and throw statements in the same way that a synchronous function would.
These are automatically converted into effects on the function’s promise.
The return balance statement fulfills the promise with the value of balance, and the throw statement rejects the promise with the given Error object.
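One consequence is that calling an async function never throws at the call site: a throw inside the body only shows up when the promise is awaited. The sketch below illustrates this under assumptions. parseBalance is a hypothetical variant of getBalance that takes the file content as a parameter, so the example does not depend on a file on disk, and demoReturnAndThrow is an illustrative name.

```typescript
// Hypothetical variant of getBalance() that parses a string directly.
async function parseBalance(data: string): Promise<number> {
    const balance: number = parseInt(data);
    if (isNaN(balance)) throw new Error('account does not contain a number');
    return balance;   // fulfills the promise with balance
}

async function demoReturnAndThrow(): Promise<string[]> {
    const outcomes: string[] = [];
    // The call itself does not throw: the throw in the body rejected the promise.
    const bad: Promise<number> = parseBalance('oops');
    try {
        await bad;    // the stored Error is thrown here instead
    } catch (e) {
        outcomes.push(e instanceof Error ? `rejected: ${e.message}` : 'rejected');
    }
    const good: Promise<number> = parseBalance('200');
    outcomes.push(`fulfilled: ${await good}`);
    return outcomes;
}
```

Calling demoReturnAndThrow() should yield 'rejected: account does not contain a number' followed by 'fulfilled: 200'.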
Since getBalance() is an asynchronous function returning a promise, a caller of getBalance() needs to interact with the promise appropriately, i.e. using await:
const myBalance = await getBalance();
Note that await cannot be used inside a synchronous function (i.e., a function not declared async).
TypeScript statically checks this requirement.
It will produce a static error if await occurs inside a non-async function.
reading exercises
Consider this simple program, in which main calls f which calls g:
function main(): void { console.log(f()); }
function f(): number { return g()+50; }
function g(): number { return 0; }
Now suppose g() is changed to use await:
function g(): number { return await getBalance(); }
The edited program is not compiling yet; there are static errors.
Which additional changes need to be made to g()?
(missing explanation)
What changes need to be made to f()?
(missing explanation)
What changes need to be made to main()?
(missing explanation)
Top-level await
We saw above that await cannot be used in a synchronous function; a function must be declared async in order to use await.
But what about top-level code, outside of any function? Consider the previous exercise, in which we had to change synchronous functions into async functions starting from g to f all the way up to the entry-point function main. But somewhere in the top-level code, we will need to have a call to main to kick things off:
async function main(): Promise<void> { ... }
...
main(); // what do we do with the promise returned by main()?
Can we change this line to await main(), when we don’t have a surrounding function to mark async?
The answer unfortunately depends on what kind of JavaScript context the code will run in.
In code which is compiled and run as a modern JavaScript module (also called “ECMAScript 6 module” or “ES6 module” after the version of the JavaScript standard that introduced modules), you can use await in top-level code, and it behaves just as it would inside an async function:
async function main(): Promise<void> { ... }
...
await main(); // waits for main() to finish before moving on -- works in modern ES6 module contexts
Our code in 6.102 is using these new modules as much as possible, so that await works at the top level.
In older JavaScript contexts, unfortunately, the top-level script runs synchronously and is not allowed to use await, so await main() would produce an error.
If you find yourself in that kind of context, the best thing to do is to put all your top-level code inside an asynchronous main function (which we’ve already done here), and call it without waiting for its promise.
Does this mean that the program will exit right away, before the promise is fulfilled?
No. Node keeps the process running as long as there are outstanding operations (such as timers, file reads, or network requests) that an asynchronous function is waiting on, so in practice the process will not exit until the promise returned by main() has either been fulfilled or rejected. So await is not necessary if you are just calling one function at the top level.
Depending on your configuration, you may still get a warning at compile-time that you’re not handling the promise return value, so you can declare that you don’t intend to by preceding the call with the void operator:
async function main(): Promise<void> { ... }
...
void main(); // calls main() but doesn't wait for it to finish -- works in all JavaScript contexts
Concurrency model
JavaScript has only one thread of control per global environment. (As we saw in a previous reading, Workers can create additional threads, but each worker gets a fresh global environment, and generally communicates with other workers or the main program by message passing, not by directly sharing mutable objects in memory.)
This raises some questions: if there is only one thread in a JavaScript environment, then what does it mean to have multiple asynchronous functions in progress at the same time? After an asynchronous function returns a promise to its caller, before having finished its computation, when and how does it get control back, so that it can compute some more and eventually fulfill or reject the promise?
Let’s see how this works with an example that gets the balances from two bank accounts and adds them together:
async function totalBalance(): Promise<number> {
    const checkingPromise = getBalance('checking');
    const savingsPromise = getBalance('savings');
    return (await checkingPromise) + (await savingsPromise);
}
async function getBalance(account: string): Promise<number> {
    const data: string = await fs.promises.readFile(account, { encoding: 'utf-8' });
    return parseInt(data);
}
Suppose a client calls totalBalance().
Here’s what happens.
- totalBalance calls getBalance('checking'), which calls readFile, which starts reading the file and immediately returns a Promise<string> for its contents.
- getBalance('checking') reaches an await for that file promise, which means it must give up control. It constructs a new Promise<number> for its own result, and returns that promise back to its caller, totalBalance.
- totalBalance doesn’t wait for that promise yet, but stores it away in checkingPromise, and proceeds to call getBalance('savings').
- getBalance('savings') evolves in the same way, reaching its await and returning a promise for the savings-account balance, which totalBalance stores in savingsPromise.
- Note that totalBalance has now created two concurrent computations, one loading the checking account and the other the savings account. By holding onto their promises and reserving the await for later, it allows the computations to run independently, rather than forcing one to finish before even starting the other. This is the essence of promise-based concurrency.
- Note that totalBalance must await each of the promises before it can compute the final sum. TypeScript/JavaScript evaluates expressions from left to right (not all languages do…), so await checkingPromise will happen first. Since checkingPromise is still pending, this means that totalBalance constructs a new Promise<number> for its own eventual result, and returns that promise to the original caller.
- The original caller (not shown) continues executing. In order for JavaScript’s concurrency model to work correctly, we depend on the caller to eventually return control back to the JavaScript runtime system (the original caller of the starting point of our code). The runtime system handles low-level processing, like file loading.
- Suppose the savings file finishes loading first, and its promise fulfills with the string "200". The JavaScript runtime system then gives control to getBalance('savings'), which was waiting on that promise, and execution resumes with the await evaluating to "200". getBalance('savings') finishes its execution, fulfills savingsPromise with the number 200, and returns control back to the runtime system. But totalBalance is not awaiting savingsPromise (yet), so nothing further happens.
- The checking file finishes loading, and fulfills with the string "50". The runtime system gives control to getBalance('checking'), which fulfills checkingPromise with the number 50, and returns control back to the runtime system.
- Since totalBalance was waiting for checkingPromise, the runtime system gives control to totalBalance with the value 50. totalBalance then does await savingsPromise, so it gives up control to the runtime system again. But because savingsPromise is already fulfilled with the value 200, the runtime system will very shortly (possibly immediately) give control back again. totalBalance proceeds with its own computation, finally fulfilling its own promise with the sum 250.
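The sequence of events above can be reproduced without real files. The following is a minimal sketch under assumptions: fakeReadFile is a hypothetical stand-in for fs.promises.readFile that delivers in-memory content after a timer delay, the delays are chosen so that the savings file finishes first, and eventLog is an illustrative array that records the order of events.

```typescript
const eventLog: string[] = [];

// Hypothetical in-memory "files": savings loads faster than checking.
const fakeFiles = new Map<string, { content: string; delayMs: number }>([
    ['checking', { content: '50', delayMs: 40 }],
    ['savings', { content: '200', delayMs: 10 }],
]);

// Stand-in for fs.promises.readFile: returns a promise immediately and
// fulfills it with the file content after the file's delay has passed.
function fakeReadFile(account: string): Promise<string> {
    const file = fakeFiles.get(account);
    if (file === undefined) {
        return Promise.reject(new Error(`no such file: ${account}`));
    }
    const { content, delayMs } = file;
    return new Promise<string>((resolve) => {
        setTimeout(() => resolve(content), delayMs);
    });
}

async function getBalance(account: string): Promise<number> {
    const data: string = await fakeReadFile(account);
    eventLog.push(`${account} loaded`);   // runs when the runtime resumes us
    return parseInt(data);
}

async function totalBalance(): Promise<number> {
    const checkingPromise = getBalance('checking');
    const savingsPromise = getBalance('savings');
    eventLog.push('both reads started');  // neither read has finished yet
    return (await checkingPromise) + (await savingsPromise);
}
```

Awaiting totalBalance() should produce 250, with eventLog recording 'both reads started', then 'savings loaded', then 'checking loaded'. The savings read finishes first even though totalBalance awaits the checking promise first.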
This example reveals some high-level points about the flow of control:
- Every await is a place where an asynchronous function gives up control.
- For the first await in an asynchronous function, “giving up control” means returning the function’s own promise to its caller. Subsequent awaits give up control directly back to the runtime system.
- When the await resumes, control comes back from the runtime system.
- An asynchronous function is effectively divided into pieces of computation between awaits, and those pieces can interleave with pieces of other asynchronous function calls.
This model of concurrency is called cooperative or non-preemptive.
There is only one thread of control, and concurrent computations must cooperate to release control to each other at well-defined points (such as await and return).
Note that await always gives up control, even if the promise it would wait for has already been fulfilled.
It may gain control back almost immediately if that’s the case, but first it gives other concurrent computations a chance to continue their execution.
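A small experiment makes this visible. In the sketch below, worker, runWorkers, and trace are hypothetical names, and Promise.resolve() is used to create a promise that is already fulfilled at the moment it is awaited.

```typescript
const trace: string[] = [];

async function worker(name: string): Promise<void> {
    trace.push(`${name} start`);
    await Promise.resolve();   // already fulfilled, but await still gives up control
    trace.push(`${name} end`);
}

async function runWorkers(): Promise<string[]> {
    const a = worker('A');     // runs until A's await, then returns a promise
    const b = worker('B');     // same for B
    trace.push('main: both started');
    await a;
    await b;
    return trace;
}
```

Calling runWorkers() should record 'A start', 'B start', 'main: both started', 'A end', 'B end'. If await did not give up control for an already-fulfilled promise, 'A end' would appear before 'B start'.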
reading exercises
Let’s consider some other ways to write the body of totalBalance, and how they might affect the concurrency of the computation.
Note that types are omitted in the variable declarations below.
Some variables may be number, and others Promise<number>.
Some examples may even fail to compile because of a static type error.
Look carefully at the awaits, and remember that they convert a promise into a value.
What is the concurrency behavior of each of the following?
const savings = getBalance('savings');
const checking = getBalance('checking');
return (await savings) + (await checking);
(missing explanation)
const checking = await getBalance('checking');
const savings = await getBalance('savings');
return checking + savings;
(missing explanation)
const checking = getBalance('checking');
const savings = getBalance('savings');
return checking + savings;
(missing explanation)
const checking = getBalance('checking');
return (await getBalance('savings')) + (await checking);
(missing explanation)
Recall that we have defined timeout(milliseconds) as an asynchronous function that creates a timer: a promise that will fulfill milliseconds after timeout() was called.
Now consider this asynchronous function:
async function clock(milliseconds: number): ________ {
    while (true) {
        await timeout(milliseconds);
        console.log('tick');
    }
}
What is the return type of clock() (the blank in the code above)?
(missing explanation)
Assume the blank is filled in with the correct return type.
(missing explanation)
What would await clock(1000) do?
(missing explanation)
Aggregating promises
When running multiple concurrent computations, it’s often useful to combine their promises using operations similar to AND and OR.
Promise.all() is like a logical AND: it combines an iterable collection of promises into a single promise that waits for all the promises to fulfill, and returns an array of their results:
const allResponses: Array<Response> =
await Promise.all( [ fetch('http://www.mit.edu/'),
fetch('http://www.harvard.edu/'),
fetch('http://www.tufts.edu/') ] );
But if any of the individual promises is rejected, then the promise returned by Promise.all() is rejected as well.
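Both behaviors of Promise.all() can be sketched with timers instead of network requests. The helper delayedValue and the functions gatherInOrder and gatherWithFailure are hypothetical names introduced only for this illustration.

```typescript
// Hypothetical helper: a promise that fulfills with `value` after a delay.
function delayedValue<T>(value: T, milliseconds: number): Promise<T> {
    return new Promise<T>((resolve) => {
        setTimeout(() => resolve(value), milliseconds);
    });
}

// The results come back in the order of the input promises,
// not in the order in which the promises happened to finish.
async function gatherInOrder(): Promise<string[]> {
    return Promise.all([delayedValue('slow', 30), delayedValue('fast', 5)]);
}

// If any input promise is rejected, awaiting Promise.all() throws its Error.
async function gatherWithFailure(): Promise<string> {
    try {
        await Promise.all([delayedValue('ok', 5), Promise.reject(new Error('boom'))]);
        return 'all fulfilled';
    } catch (e) {
        return e instanceof Error ? e.message : 'unknown failure';
    }
}
```

Here gatherInOrder() should produce ['slow', 'fast'] and gatherWithFailure() should produce 'boom'.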
For logical OR, Promise.any() produces a promise that waits for any of the individual promises to successfully fulfill (and returns that promise’s result), and only fails if all the promises fail.
You might use it for running redundant computations that might fail independently:
const firstResponse: Response =
await Promise.any( [ fetch('http://www.mit1.edu/'),
fetch('http://www.mit2.edu/'),
fetch('http://www.mit3.edu/') ] );
Promise.race() is a logical OR that waits for any individual promise to either fulfill or reject, and immediately fulfills or rejects in the same way.
One use of Promise.race() is putting a timeout on an operation:
const responseOrTimeout: Response|void =
await Promise.race( [ fetch('http://www.mit.edu/'),
timeout(5000) ] );
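The same pattern can be tried without a network connection. In this sketch, timeout is a hypothetical implementation built on setTimeout, slowAnswer is a made-up computation that takes about 100 milliseconds, and answerOrTimeout is an illustrative wrapper around Promise.race().

```typescript
// Hypothetical implementation of timeout() using setTimeout.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>((resolve) => {
        setTimeout(() => resolve(), milliseconds);
    });
}

// Made-up slow computation: produces its answer after about 100 ms.
async function slowAnswer(): Promise<string> {
    await timeout(100);
    return 'answer';
}

// Fulfills with the answer if it arrives within limitMs,
// or with no value (undefined) if the timer finishes first.
async function answerOrTimeout(limitMs: number): Promise<string | void> {
    return Promise.race([slowAnswer(), timeout(limitMs)]);
}
```

With these assumptions, answerOrTimeout(300) should produce 'answer' and answerOrTimeout(10) should produce undefined. Note that Promise.race() does not cancel the losing computation: slowAnswer() keeps running in the background even after the timer wins.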
Never busy-wait
We’ve seen that a Promise has one observer operation, await, which lets a client obtain the promise’s value (always giving up control, and waiting if necessary for the promise to fulfill).
But there is no method you can call to interrogate a promise directly for its state (“are you still pending?”), or to extract its value without giving up control.
There is a good reason why those operations are not provided. If they were, clients of the promise might be tempted to busy-wait, waiting for the promise to fulfill. Busy-waiting means sitting in a tight loop waiting for some event to occur, without giving up control. For example, here’s a busy-waiting timer:
async function busyWait(milliseconds: number): Promise<void> {
    const now = new Date().getTime(); // new Date().getTime() is always the current clock time in milliseconds
    const deadline = now + milliseconds;
    while (new Date().getTime() < deadline) {
        // do nothing, just busy-wait until the system clock time reaches deadline
    }
}
This code works in the narrow sense that calling busyWait(500) will indeed wait for 500 milliseconds, and its promise will fulfill after that time elapses.
But because busyWait never releases control during that time period (notice that its loop body contains no await whatsoever), no other asynchronous functions in the program will have a chance to run during that time.
We will never get back to the JavaScript runtime system to give control to them.
The program will simply freeze until the entire body of busyWait() finishes and returns.
So busyWait() is quite useless as an asynchronous function: it can’t run concurrently with any other asynchronous code.
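The freezing effect can be measured. The sketch below is an illustration under assumptions: timeout is a hypothetical implementation built on setTimeout, busyWait is written here as a plain synchronous function, and measureBlockedTimer is a made-up name for the experiment.

```typescript
// Hypothetical implementation of timeout() using setTimeout.
function timeout(milliseconds: number): Promise<void> {
    return new Promise<void>((resolve) => {
        setTimeout(() => resolve(), milliseconds);
    });
}

// Synchronous busy-wait: spins until the clock reaches the deadline.
function busyWait(milliseconds: number): void {
    const deadline = Date.now() + milliseconds;
    while (Date.now() < deadline) {
        // spin without ever giving up control
    }
}

// Starts a 10 ms timer, then busy-waits for 100 ms before awaiting it.
// Returns how long it took until the await on the timer completed.
async function measureBlockedTimer(): Promise<number> {
    const started = Date.now();
    const timer = timeout(10);
    busyWait(100);    // the runtime cannot fire the timer during this loop
    await timer;      // only now can the timer's promise fulfill
    return Date.now() - started;
}
```

Even though the timer was set for 10 milliseconds, measureBlockedTimer() should report at least 100 milliseconds, because the runtime system never got control back while busyWait was spinning.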
If a Promise provided observers other than await, then programmers would be tempted to write busy-waiting code like this:
const promise = readFile('account', { encoding: 'utf-8' });
while ( promise.isPending() ) { // just hypothetical, isPending() doesn't exist
// busy-wait! bad idea
}
const data: string = promise.get() // just hypothetical, get() doesn't exist
This code not only freezes the program, but it doesn’t even do what the programmer wants.
The busy-waiting loop never gives up control, which means that we never return to the top-level runtime system that is processing file input/output.
So readFile() never marks its promise as fulfilled, the putative promise.isPending() operation never returns false, and we never exit the busy-waiting loop.
Busy-waiting is generally a bad idea in concurrent programming (with some rare exceptions like spinlocks), but it is particularly deadly for promises and async/await code.
Instead, use await to operate on the value of a promise, or to react to its state transition from pending to fulfilled.
Summary
This reading talked about asynchronous functions returning promises.
- A promise is a concurrent computation that has been started but might still be unfinished, whose result may not be ready yet.
- Promises allow us to write concurrent code that never busy-waits.
- await waits for and accesses the value that a promise computes.
- Promise.all(), Promise.any(), Promise.race() are ways to aggregate multiple promises together.
- Functions that return a promise are asynchronous.
- These functions give up control to their caller or to the JavaScript runtime system.
- A normal function can be transformed to an asynchronous one by using the async and await keywords and returning a promise type.
These ideas connect to our three key properties of good software as follows:
Safe from bugs. Static checking of promise types, and the requirement that a promise be turned into a value with await, ensures that code that depends on an asynchronous computation cannot continue until the values it needs are ready.
Easy to understand. Using await to transform a promise into a value makes asynchronous code read very much like straight-line synchronous code. But like all concurrency, promises and asynchronous functions can be subtle and produce surprising effects.
Ready for change. Promises can be composed and combined in ways that other concurrency techniques (like threads and workers) do not easily support.