Reading 9: Functional Programming
Software in 6.102
Objectives
In this reading we’ll talk about several design patterns for operating on sequences of elements, and you’ll see how treating functions themselves as first-class values that we can pass around and manipulate in our programs is an especially powerful idea.
First-class functions
Let’s start by reviewing an important Big Idea that you should have already encountered in 6.101 [formerly 6.009]: functions as first-class data values, meaning that they can be stored in variables, passed as arguments to functions, and created dynamically like other values.
For example, Math.sqrt is a reference to an object representing the sqrt function. The type of that object is (x: number) => number. But you can also assign that function to another variable if you like, and it still behaves like sqrt:
const mySquareRoot: (x: number) => number = Math.sqrt;
mySquareRoot(16.0); // returns 4.0
The type that we declared on mySquareRoot is a function type expression. Note that the parameter name x is required! If you write (number) => number, it means “a function with a parameter named number of type any that returns a number”, because omitted types are implicitly any.
You can also pass a reference to a function as a parameter to another function, and use functions the same way you would use any other value in TypeScript, like numbers or strings or other object references. In other words, functions in TypeScript are first-class: they can be treated like any other value in the language — passed as parameters, returned as return values, and stored in variables and data structures.
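To make this concrete, here is a small sketch of our own (applyTwice and plusOne are invented names, not from the reading) that passes one function as a parameter to another:

```typescript
// applyTwice is a hypothetical helper: it takes a function as a
// parameter and calls it twice, showing that functions are ordinary values.
function applyTwice(f: (x: number) => number, x: number): number {
  return f(f(x));
}

const plusOne = (x: number): number => x + 1;

applyTwice(Math.sqrt, 16.0); // sqrt(sqrt(16)) = 2
applyTwice(plusOne, 5);      // 7
```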
Programming languages are full of things that are not first-class.
For example, access control is not first-class: you can’t pass public or private as a parameter into a function, or store it in a data structure. The notion of public and private is an aspect of your program that TypeScript offers no way to refer to or manipulate at runtime.
Similarly, a while loop or if statement is not first-class. You can’t refer to that piece of code all by itself, or manipulate it at runtime.
In old programming languages, only data was first-class: built-in types (like numbers) and user-defined types. But in modern programming languages, like Python and JavaScript, both data and functions are first-class. First-class functions are a very powerful programming idea. The first practical programming language that used them was Lisp, invented by John McCarthy at MIT. But the idea of programming with functions as first-class values actually predates computers, tracing back to Alonzo Church’s lambda calculus. The lambda calculus used the Greek letter λ to define new functions; this term stuck, and you’ll see it as a keyword not only in Lisp and its descendants, but also in Python.
Guido van Rossum, the creator of Python, wrote a blog post about the design principle that led not only to first-class functions in Python, but first-class methods as well: First-class Everything.
Function expressions in TypeScript
We have already been using function expressions, starting as far back as Testing, since the mocha framework uses them:
describe("Math.max", function() {
it("covers a < b", function() {
assert.strictEqual(Math.max(1, 2), 2);
});
});
This code has two function expressions: the first one is passed as an argument to describe(), and the second (nested inside it) is passed as an argument to it().
TypeScript also has a more compact arrow syntax that avoids the need for a function keyword:
describe("Math.max", () => {
it("covers a < b", () => {
assert.strictEqual(Math.max(1, 2), 2);
});
});
If the body of the function consists of only a single expression (like a Python lambda expression), then even the curly braces can be omitted:
it("covers a < b", () => assert.strictEqual(Math.max(1, 2), 2) );
But it turns out that there is a technical difference between arrow functions and function expressions that matters when writing methods: a function expression may redefine this, but an arrow function uses the definition of this from its surrounding context. So inside an instance method, you should always use arrow functions rather than function expressions.
Understanding This, Bind, Call, and Apply in JavaScript has a good explanation of the issues behind the meaning of this in JavaScript.
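A small sketch of our own (the Counter class is invented, not from the reading) showing the rule in action: the arrow function passed to forEach inherits this from the enclosing method, so the instance field updates correctly.

```typescript
class Counter {
  private count = 0;

  public addAll(values: Array<number>): void {
    // The arrow function inherits `this` from addAll, so this.count
    // refers to this Counter instance. A `function` expression here
    // would get its own `this`, and this.count would break.
    values.forEach(value => { this.count += value; });
  }

  public total(): number {
    return this.count;
  }
}

const counter = new Counter();
counter.addAll([1, 2, 3]);
counter.total(); // 6
```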
reading exercises
interface Dog {
name(): string;
breed(): Breed;
loudnessOfBark(): number;
}
dogs: Array<Dog>
The Array.sort method takes a function argument that tells it how to compare two objects in the array; in this case, two Dogs. The type of this argument is (dog1: Dog, dog2: Dog) => number, and it should return a negative number if dog1 < dog2 (meaning dog1 should sort before dog2), a positive number if dog1 > dog2, or zero if dog1 = dog2.
Which of these would sort our dogs from quietest bark to loudest?
dogs.sort(function(dog1: Dog, dog2: Dog) {
return dog2.loudnessOfBark() - dog1.loudnessOfBark();
});
dogs.sort(function(dog1: Dog, dog2: Dog) {
return dog1.loudnessOfBark() > dog2.loudnessOfBark() ? dog1 : dog2;
});
dogs.sort((dog1: Dog, dog2: Dog) => {
return dog1.loudnessOfBark() - dog2.loudnessOfBark();
});
dogs.sort((dog1: Dog, dog2: Dog) => dog1.loudnessOfBark() - dog2.loudnessOfBark());
function quietestFirst(dog1: Dog, dog2: Dog) {
return dog1.loudnessOfBark() - dog2.loudnessOfBark();
};
dogs.sort(quietestFirst());
Abstracting out control
In this reading we discuss map/filter/reduce, a design pattern that substantially simplifies the implementation of functions that operate over sequences of elements.
In our running example, we’ll have lots of sequences — arrays of files; input streams that are sequences of lines; lines that are sequences of words. Map/filter/reduce will enable us to operate on those sequences with no explicit control statements in our code — not a single for loop or if statement.
Suppose we’re given the following problem: write a function that finds all the words in the TypeScript files in your project.
Following good practice, we break it down into several simpler steps and write a function for each one:
- find all the files in the project, by scanning recursively from the project’s root folder
- restrict them to files with a particular suffix, in this case .ts
- open each file and read it in line-by-line
- break each line into words
Writing the individual functions for these substeps, we’ll find ourselves writing a lot of low-level iteration code. For example, here’s what the recursive traversal of the project folder might look like:
import fs from 'fs';
import path from 'path';
/**
* Find names of all files in the filesystem subtree rooted at folder.
* @param folder root of subtree, requires fs.lstatSync(folder).isDirectory() === true
* @returns array of names of all ordinary files (not folders) that have folder as
* their ancestor
*/
function allFilesIn(folder: string): Array<string> {
let files: Array<string> = [];
for (const child of fs.readdirSync(folder)) {
const fullNameOfChild = path.join(folder, child);
if (fs.lstatSync(fullNameOfChild).isDirectory()) {
files = files.concat(allFilesIn(fullNameOfChild));
} else if (fs.lstatSync(fullNameOfChild).isFile()) {
files.push(fullNameOfChild);
}
}
return files;
}
And here’s what the filtering function might look like, which restricts that file array down to just the TypeScript files. Imagine calling this like onlyFilesWithSuffix(files, ".ts"):
/**
* Filter an array of files to those that end with suffix.
* @param filenames array of filenames
* @param suffix string to test
* @returns a new array consisting of only those files whose names end with suffix
*/
function onlyFilesWithSuffix(filenames: Array<string>, suffix: string): Array<string> {
const result: Array<string> = [];
for (const f of filenames) {
if (f.endsWith(suffix)) {
result.push(f);
}
}
return result;
}
Iterator
Before we dive into map/filter/reduce, we’ll look at a simpler design pattern for sequences that also abstracts away from the details of control. An iterator is an object that steps through a sequence of elements and returns the elements one by one.
Iterators are used under the covers in TypeScript when you’re using a for...of loop to step through an array.
An Iterator<E> over a collection of elements of type E has one method:
- next() returns an object {done: boolean, value?: E}
  - where value is the next element in the collection if done is false
  - and there are no more elements if done is true
let filenames: Array<string> = ...;
for (const f of filenames) {
console.log(f);
}
is implemented by TypeScript with code roughly equivalent to this (the translation ignores some details):
let filenames: Array<string> = ...;
let iter: Iterator<string> = filenames[Symbol.iterator]();
for (let result = iter.next(); ! result.done; result = iter.next()) {
const f: string = result.value;
console.log(f);
}
Note that the next() method is a mutator operation for Iterator: it not only returns an element but also advances the iterator, so that the subsequent call to next() will return a different element.
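We can see this mutator behavior by driving an array’s built-in iterator by hand (a quick sketch; in TypeScript the iterator comes from the array’s [Symbol.iterator]() method):

```typescript
const letters: Array<string> = ["a", "b"];
const iter: Iterator<string> = letters[Symbol.iterator]();

// each call to next() both returns a result and advances the iterator
const first = iter.next();  // { value: "a", done: false }
const second = iter.next(); // { value: "b", done: false }
const third = iter.next();  // done is now true: the iterator is exhausted
```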
Why do iterators exist?
There are many kinds of collection data structures (linked lists, maps, hash tables) with different kinds of internal representations.
The iterator concept allows a single uniform way to access them all, so that client code is simpler and the collection implementation can change without changing the client code that iterates over it.
For example, although we could iterate over filenames using indices:
for (let i = 0; i < filenames.length; i++) {
    const f = filenames[i];
    // ...
}
…this code now depends on the length property of Array, as well as the [..] indexing operation, which might be different (or nonexistent) in another data structure. A Set, for example, uses size instead of length, and has no [..] indexing at all.
The iterator design pattern abstracts away from these details, so we can write the same kind of for...of loop for many different data structures.
Most modern languages (including Python, C#, and Ruby) use the notion of an iterator. It’s an effective design pattern — a well-tested solution to a common design problem. Map/filter/reduce is also a design pattern, and we’ll see others as we move through the course.
Iterable vs. Iterator
It’s important to understand the difference between an iterator and an iterable.
An iterable type represents a sequence of elements that can be iterated over.
In TypeScript, the Iterable<E> interface represents an iterable, and requires the type to have a [Symbol.iterator]() method that returns a fresh Iterator object for the sequence. Python has a similar notion of iterable, where the required method is called __iter__.
An iterator, on the other hand, does not represent an entire sequence of elements.
Its state represents a particular point in an iteration over a sequence (so it actually represents the remainder of the sequence that it is iterating over).
But an iterator is a mutable object.
Each time next() is called, the iterator advances to the next element, and previous elements are no longer accessible through the iterator.
Once the iterator reaches the end, it cannot be reset back to the beginning of the sequence.
As a result, when you are working with sequences – storing them in variables, passing them to functions, returning them from functions – you should be using iterable types, not Iterator itself.
Often these iterable types are specific types like Array or Set. But if you are writing a function that doesn’t need any of the particular operations of these types — one that only needs to iterate using for...of — then consider using the supertype Iterable instead.
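For example, a function declared to take Iterable<string> (a sketch of our own; countNonempty is an invented name) works unchanged on arrays, sets, or any other iterable:

```typescript
// countNonempty only needs for...of, so it accepts any Iterable<string>
function countNonempty(strings: Iterable<string>): number {
  let count = 0;
  for (const s of strings) {
    if (s.length > 0) { count++; }
  }
  return count;
}

countNonempty(["a", "", "b"]);          // 2, from an Array
countNonempty(new Set(["a", "", "b"])); // 2, from a Set
```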
reading exercises
Suppose you have a function doSomething() defined like this:
// Uses the strings in `sequence` to compute something
function doSomething(sequence: Iterator<string>): string;
And you have a variable myStrings that you pass to doSomething():
doSomething(myStrings);
Which of the following are true?
MyIterator
To better understand how an iterator works, here’s a simple implementation of an iterator for Array<string>. Because our simple iterator can only handle strings, we’ll also simplify next() to return string|undefined – i.e., either the next value from the array, or undefined if we’ve reached the end. (Note that Iterator<E> can’t do this in general, because undefined might very well be one of the elements in the collection, if it’s part of the E type.)
/**
* A MyIterator is a mutable object that iterates over
* the elements of an Array<string>, from first to last.
* This is just an example to show how an iterator works.
* In practice, you should use the Array's own iterator
* object, returned by its [Symbol.iterator]() method.
*/
class MyIterator {
private readonly array: Array<string>;
private index: number;
// AF(array, index) = the sequence of elements array[index]...array[array.length-1]
// RI: index is a nonnegative integer
/**
* Make an iterator.
* @param array array to iterate over
*/
public constructor(array: Array<string>) {
this.array = array;
this.index = 0;
}
/**
* Get the next element of the array.
* Modifies: this iterator to advance it to the element
* following the returned element.
* @returns the next element in the array,
* or undefined if there are no more elements
*/
public next(): string|undefined {
if (this.index >= this.array.length) {
return undefined;
} else {
const value = this.array[this.index];
++this.index;
return value;
}
}
}
Here’s a snapshot diagram showing a typical state for a MyIterator object in action:
Note that we draw the arrow from array with a double line, to indicate that it’s readonly. That means the arrow can’t change once it’s drawn. But the array object it points to is mutable — elements can be changed within it — and declaring array as readonly has no effect on that.
Mutation undermines an iterator
Let’s try using our iterator for a simple job. Suppose we have an array of filenames, like ["a.txt", "b.txt", "c.pdf"]. We want a function removeFilesWithExtension that will delete filenames with a particular extension, like .txt, from the array, leaving the other filenames behind. For this example, the function should remove the elements "a.txt" and "b.txt", leaving behind the one-element array ["c.pdf"].
Following test-first programming, we write a spec, choose and implement test cases to cover some partitions, and finally implement the function, using our MyIterator class:
/**
* Remove filenames with a given extension.
* Modifies the filenames array by removing filenames that end with the extension.
*
* @param filenames array of filenames
* @param extension must start with ".", e.g. ".txt"
*/
function removeFilesWithExtension(filenames: Array<string>, extension: string): void {
let iter: MyIterator = new MyIterator(filenames);
for (let filename = iter.next(); filename !== undefined; filename = iter.next()) {
if (filename.endsWith(extension)) {
// remove the filename from the array
filenames.splice(filenames.indexOf(filename), 1);
}
}
}
Now we run our test cases, and they work! … almost. This test case fails:
// removeFilesWithExtension(["a.txt", "b.txt", "c.pdf"], ".txt")
// expected ["c.pdf"], actual ["b.txt","c.pdf"]
We got the wrong answer: removeFilesWithExtension left a matching filename behind in the array! Why?
Trace through what happens. It will help to use a snapshot diagram showing the MyIterator object and the Array object, and to update it while you work through the code.
reading exercises
Let’s draw a snapshot diagram to illustrate the bug. You will need to refer back to the source code for the MyIterator class and the removeFilesWithExtension() function above.
We are running the test case removeFilesWithExtension(["a.txt", "b.txt", "c.pdf"], ".txt"). In the box below, use Snapdown to draw a snapshot diagram showing the state of the program right at the start of the body of removeFilesWithExtension(). Open the Snapdown help sidebar to get the details of Snapdown syntax – it’s a little language for drawing diagrams with text.
Note that this isn’t just a bug in our MyIterator. The built-in iterator in Array suffers from the same problem, and so does the for...of loop that’s syntactic sugar for it. So do lists and for loops in Python. Mutating a list that is currently being iterated over is simply not safe in general.
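The same bug is easy to reproduce with the built-in iterator; here is a small sketch using for...of on the same test data:

```typescript
// mutating an array while a for...of loop is iterating over it
const filenames: Array<string> = ["a.txt", "b.txt", "c.pdf"];
for (const filename of filenames) {
  if (filename.endsWith(".txt")) {
    // removing the current element shifts later elements down,
    // so the iterator silently skips over "b.txt"
    filenames.splice(filenames.indexOf(filename), 1);
  }
}
// filenames is now ["b.txt", "c.pdf"], not the expected ["c.pdf"]
```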
Mutation can make simple contracts very complex
This is a fundamental issue with mutable types. Multiple references to the same mutable object (aliasing) may mean that multiple places in your program — possibly widely separated — are relying on that object to remain consistent.
To put it in terms of specifications, contracts can’t be enforced in just one place anymore, e.g. between the client of a class and the implementer of a class. Contracts involving mutable objects now depend on the good behavior of everyone who has a reference to the mutable object.
As a symptom of this non-local contract phenomenon, consider the specifications for iterators and arrays. Try to find where they document the crucial requirement on the client that we’ve just discovered — that you shouldn’t mutate an array while you’re iterating over it. Who takes responsibility for it? The iterator specification? Array? Can you find it?
The need to reason about global properties like this makes it much harder to understand, and be confident in the correctness of, programs with mutable data structures. We still often do it — for performance and convenience — but we pay a big price in bug safety for doing so.
For these reasons, functional programming largely avoids mutation. The map/filter/reduce pattern discussed below produces new sequences of elements, rather than mutating existing ones, and the first-class functions that are passed to map/filter/reduce are generally pure functions, which return an output rather than mutating any of their inputs.
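For example, a functional counterpart to removeFilesWithExtension (a sketch of our own, with a different name because it returns a result instead of mutating its input) avoids the iteration bug entirely:

```typescript
/**
 * @param filenames array of filenames
 * @param extension must start with ".", e.g. ".txt"
 * @returns a new array of the filenames that do NOT end with extension;
 *          the input array is left unchanged
 */
function filesWithoutExtension(filenames: Array<string>, extension: string): Array<string> {
  return filenames.filter(filename => !filename.endsWith(extension));
}

filesWithoutExtension(["a.txt", "b.txt", "c.pdf"], ".txt"); // ["c.pdf"]
```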
Map/filter/reduce abstraction
The map/filter/reduce pattern in this reading does something similar to Iterator, but at an even higher level: it treats the entire sequence of elements as a unit. In this paradigm, the control statements disappear: specifically, the for statements, the if statements, and the return statements in the code from our introductory example will be gone. We’ll also be able to get rid of most of the temporary names (i.e., the local variables filenames, f, and result). And we will be able to stop mutating the sequences we are using – no more calling push or splice on the arrays we are manipulating and creating.
We’ll have three operations for iterable types: map, filter, and reduce.
Let’s look at each one in turn, and then look at how they work together.
We’ll focus on using Array as the iterable type, because it provides map, filter, and reduce operations natively in TypeScript.
Map
Map applies a unary function to each element in the sequence and returns a new sequence containing the results, in the same order:
map : Array<E> × (E → F) → Array<F>
const array: Array<number> = [1, 4, 9, 16];
const result = array.map(Math.sqrt);
This code starts with an array containing the numbers 1, 4, 9, 16 and applies the square root function to each element to produce the values 1.0, 2.0, 3.0, 4.0.
In the call array.map(Math.sqrt):
- map has type Array<number> × (number → number) → Array<number>
- the input array has type Array<number>
- the mapping function Math.sqrt has type number → number
- the return value has type Array<number>
We can write the same expression more compactly as:
[1, 4, 9, 16].map(Math.sqrt);
which is the syntax that future examples will use.
Here is a second example:
["A", "b", "C"].map(s => s.toLocaleLowerCase())
which produces an array containing "a", "b", "c".
(Notice that when we use a standalone function like Math.sqrt, we can refer to it directly. Using a method like toLocaleLowerCase requires a lambda expression to wrap the method call syntax.)
Map is useful even if you don’t care about the return value of the function. When you have a sequence of mutable objects, for example, you may want to map a mutator operation over them. Because mutator operations typically return void, however, you can use forEach instead of map. forEach applies the function to each element of the sequence, but does not collect their return values into a new sequence:
itemsToRemove.forEach(item => mySet.delete(item));
(A lambda is needed here, too. The expression mySet.delete by itself would refer to the delete function without binding mySet to this, which will not work.)
The map and forEach operations capture a common pattern for operating over sequences: doing the same thing to each element of the sequence.
Filter
Our next important sequence operation is filter, which tests each element of an Array<E> with a unary function from E to boolean. Elements that satisfy the predicate are kept; those that don’t are removed. A new sequence is returned; filter doesn’t modify its input sequence.
filter : Array<E> × (E → boolean) → Array<E>
[1, 2, 3, 4].filter(x => x % 2 === 1);
// returns [1, 3]

["x", "y", "2", "3", "a"].filter(s => "abcdefghijklmnopqrstuvwxyz".includes(s));
// returns ["x", "y", "a"]

const isNonempty = (s: string) => s.length > 0;
["abc", "", "d"].filter(isNonempty);
// returns ["abc", "d"]
Reduce
Our final operator, reduce, combines the elements of the sequence together, using a binary function. In addition to the function and the array, it also takes an initial value that initializes the reduction, and that ends up being the return value if the array is empty:
reduce : Array<E> × (E × E → E) × E → E
arr.reduce(f, init) combines the elements of the array together.
Here is one way that it might compute the result:
result_0 = init
result_1 = f(result_0, arr[0])
result_2 = f(result_1, arr[1])
...
result_n = f(result_{n-1}, arr[n-1])
result_n is the final result for an n-element sequence.
Adding numbers is probably the most straightforward example:
[1,2,3].reduce( (x,y) => x+y, 0)
…which computes as shown below:
reduce( [1, 2, 3], + , 0 ) = ((0 + 1) + 2) + 3 = 6
Initial value
There are three design choices in the reduce operation.
First is whether to require an initial value.
In TypeScript, the initial value can be omitted, in which case reduce uses the first element of the sequence as the initial value of the reduction. But if the sequence is empty, then reduce has no value to return, and instead throws a TypeError.
It’s important to keep this in mind when using reducers like max, which have no well-defined initial value:
[5, 8, 3, 1].reduce((x,y) => Math.max(x,y))
// computes max(max(max(5,8),3),1) and returns 8
[].reduce((x,y) => Math.max(x,y))
// throws TypeError!
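One way around this (a sketch of our own; maxOf is an invented name) is to pass the reducer’s identity element explicitly whenever the array might be empty. For max, that identity is -Infinity:

```typescript
// Supplying the identity element as an explicit initial value makes the
// reduction total: Math.max(-Infinity, x) === x for every x.
function maxOf(numbers: Array<number>): number {
  return numbers.reduce((x, y) => Math.max(x, y), Number.NEGATIVE_INFINITY);
}

maxOf([5, 8, 3, 1]); // 8
maxOf([]);           // -Infinity, instead of throwing a TypeError
```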
Reduction to another type
The second design choice is the return type of the reduce operation.
It doesn’t necessarily have to match the element type of the original sequence.
For example, we can use reduce to concatenate an array of numbers (type E) into a string (type F).
This changes the reduce operation in two ways:
- the initial value now has type F
- the binary function is now an accumulator, of type F × E → F, that takes the current result (of type F) and a new element from the sequence (of type E), and produces an accumulated result of type F.
So a more general form of the reduce operation now looks like this:
reduce : Array<E> × (F × E → F) × F → F
In the special case where F is the same type as E, this type signature is the same as the one above.
Here’s a simple example that concatenates a sequence of numbers into a string:
[1,2,3].reduce( (s: string, n: number) => s + n, "" );
// returns "123"
We’ve included parameter type declarations in the lambda expression above in order to clarify the difference between the two parameters of the accumulator function.
If F is the same type as E, it’s OK to omit these type declarations in the lambda expression.
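As another sketch of reduction to a different type (our own example, not from the reading), the accumulator type F can even be a data structure. Here F is Map<string, number>, counting occurrences of each word:

```typescript
// Count occurrences of each word: E is string, F is Map<string, number>.
// The accumulator builds a fresh Map each step, so it stays a pure function
// of its inputs rather than mutating the accumulated result.
const counts: Map<string, number> = ["cat", "dog", "cat"].reduce(
  (result: Map<string, number>, word: string) =>
    new Map(result).set(word, (result.get(word) ?? 0) + 1),
  new Map<string, number>()
);
// counts maps "cat" to 2 and "dog" to 1
```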
Order of operations
The third design choice is the order in which the elements are accumulated.
TypeScript’s reduce operation combines the sequence starting from the left (the first element):
result_0 = init
result_1 = f(result_0, arr[0])
result_2 = f(result_1, arr[1])
...
result_n = f(result_{n-1}, arr[n-1])
Another TypeScript operation, reduceRight, goes in the other direction:
result_0 = init
result_1 = f(result_0, arr[n-1])
result_2 = f(result_1, arr[n-2])
...
result_n = f(result_{n-1}, arr[0])
to produce result_n as the final result.
Here’s a diagram that contrasts the two ways to reduce, from the left or from the right. If the operator is non-commutative (like string concatenation, shown here), then each direction might produce a different answer:
reduce( [1, 2, 3], + , "" ) = (("" + 1) + 2) + 3 = "123"
reduceRight( [1, 2, 3], + , "" ) = (("" + 3) + 2) + 1 = "321"
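In code, the two directions look like this, using the string-concatenation accumulator from earlier:

```typescript
const digits: Array<number> = [1, 2, 3];

const leftToRight: string = digits.reduce((s: string, n: number) => s + n, "");
// (("" + 1) + 2) + 3, i.e. "123"

const rightToLeft: string = digits.reduceRight((s: string, n: number) => s + n, "");
// (("" + 3) + 2) + 1, i.e. "321"
```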
reading exercises
This exercise explores three different ways to reduce an array using Math.min(), finding the minimum value. Unlike the reducing functions in the previous exercises, which have identity elements that are natural to use as initial values of the reduction, Math.min() has no natural identity element.
const array: Array<number> = [...];
…here are three possible ways to compute the minimum value of the array:
const result = array.reduce((x, y) => Math.min(x, y));
let result;
try {
result = array.reduce((x, y) => Math.min(x, y));
} catch (e) {
result = 0;
}
const result = array.reduce((x, y) => Math.min(x, y), Number.POSITIVE_INFINITY);
Back to the motivating example
Going back to the example we started with, where we want to find all the words in the TypeScript files in our project, let’s try creating a useful abstraction for filtering files by suffix:
const endsWith = (suffix: string) => {
return (filename: string) => filename.endsWith(suffix);
}
TypeScript’s string.endsWith is a function string × string → boolean. Our new endsWith wrapper returns functions that are useful as filters. It takes a filename suffix like .ts and dynamically generates a function that we can use with filter to test for that suffix. Given an Array<string> filenames, we can now write, e.g., filenames.filter(endsWith(".ts")) to obtain a new filtered array.
endsWith is a different kind of beast than our usual functions. It’s a higher-order function, meaning that it’s a function that takes another function as an argument, or returns another function as its result, as endsWith does. Higher-order functions are operations on the data type of functions.
We’ve already seen several higher-order functions in this reading, of course: map, filter, and reduce all take a function as an argument. In this case, endsWith is a creator of functions. Its signature is: string → (string → boolean).
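Here is the higher-order function at work; we repeat the endsWith definition from above so the sketch is self-contained:

```typescript
// a higher-order function: takes a suffix, returns a predicate on filenames
const endsWith = (suffix: string) => {
  return (filename: string) => filename.endsWith(suffix);
};

const isTypeScriptFile = endsWith(".ts"); // a freshly generated function

isTypeScriptFile("Main.ts");   // true
isTypeScriptFile("notes.txt"); // false
["Main.ts", "notes.txt", "util.ts"].filter(endsWith(".ts")); // ["Main.ts", "util.ts"]
```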
Now let’s use map and filter to recursively traverse the folder tree:
function allFilesIn(folder: string): Array<string> {
const children: Array<string> = fs.readdirSync(folder)
.map(f => path.join(folder, f));
const descendants: Array<string> = children
.filter(f => fs.lstatSync(f).isDirectory())
.map(allFilesIn)
.flat();
return [
...descendants,
...children.filter(f => fs.lstatSync(f).isFile())
];
}
The first line gets all the children of the folder, which might look like this:
["src/client", "src/server", "src/Main.ts", ...]
The second line is the key bit: it filters the children for just the subfolders using the isDirectory method, and then recursively maps allFilesIn over this array of subfolders! The result might look like this:
[["src/client/MyClient.ts", ...], ["src/server/MyServer.ts", ...], ...]
So we have to flatten it to remove the nested structure, which is what flat() does:
["src/client/MyClient.ts", ..., "src/server/MyServer.ts", ...]
(This pattern of map-and-flatten is common enough that arrays also have a method that does both, which we could have used instead: flatMap.)
Finally we add the immediate children that are plain files (not folders), and that’s our result.
We can also do the other pieces of the problem with map/filter/reduce. Once we have the array of all files underneath the current folder, we can filter it to just the TS files:
const filenames: Array<string> = allFilesIn(".")
.filter(endsWith(".ts"));
Now that we have the files we want to extract words from, we’re ready to load their contents:
const fileContents: Array<Array<string>> = filenames.map(f => {
    try {
        const data = fs.readFileSync(f, { encoding: "utf8", flag: "r" });
        return data.split("\n");
    } catch (e) {
        console.error(e);
        return []; // an unreadable file contributes no lines; without this,
                   // the callback could return undefined, violating the declared type
    }
});
Finally, we can flatten the array of arrays of lines into a simple array of lines:
const lines: Array<string> = fileContents.flat();
and then extract the nonempty words from each line:
const words: Array<string> = lines.map(line => line.split(/\W+/)
.filter(s => s.length > 0))
.flat();
(Here we call split not with a simple string delimiter, but with a regular expression, which will be discussed more in a future reading. This regular expression matches runs of non-word characters.)
And we’re done: we have our array of all words in the project’s TypeScript files! As promised, the control statements have disappeared.
As can be seen in the example above, one common pattern when using map/filter/reduce is method chaining. Since each function returns a new array, calls to map/filter/reduce can be placed one after another.
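To see the whole chained pipeline in one runnable sketch, without touching the filesystem, we can start from made-up in-memory file contents (the sample lines are invented):

```typescript
// hypothetical in-memory "file contents": each inner array is the lines of one file
const fileContents: Array<Array<string>> = [
  ["const x = 1;", ""],
  ["function f() {", "  return x;", "}"],
];

const words: Array<string> = fileContents
  .flat()                                      // all lines of all files
  .map(line => line.split(/\W+/)               // split each line into word candidates
                   .filter(s => s.length > 0)) // drop the empty strings
  .flat();                                     // flatten the per-line word arrays
// words is ["const", "x", "1", "function", "f", "return", "x"]
```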
Benefits of abstracting out control
Map/filter/reduce can often make code shorter and simpler, and allow the programmer to focus on the heart of the computation rather than on the details of loops, branches, and other control statements.
This pattern is so useful that Google put it at the heart of an influential distributed system, MapReduce, that processes massive amounts of data on a cluster of computers.
reading exercises
This function computes the product of just the odd integers in an array:
function productOfOdds(array: Array<number>): number {
let result = 1;
for (let x of array) {
if (x % 2 === 1) {
result *= x;
}
}
return result;
}
Rewrite this code using map, filter, and reduce:
function productOfOdds(array: Array<number>): number {
return ▶▶EXPR◀◀;
}
const x: number = 5;
const y: number = 8;
function isOdd(x: number): boolean {
return x % 2 === 1;
}
const xIsOdd: boolean = x % 2 === 1;
function alwaysOdd(x: number): number {
return isOdd(x) ? x : 1;
}
function identityFunction(x: number): number {
return x;
}
function sum(x: number, y: number): number {
return x + y;
}
const identity = (x: number) => x;
function product(x: number, y: number): number {
return x * y;
}
function alwaysTrue(x: number): boolean {
return true;
}
Assuming this code is part of the file shown above, which of the following expressions can replace ▶▶EXPR◀◀?
array
.map(identityFunction)
.filter(isOdd)
.reduce(product, 1)
array
.map(identity)
.filter(alwaysTrue)
.reduce(product, 1)
array
.map(x)
.filter(alwaysOdd)
.reduce(product, 1)
array
.map(alwaysOdd)
.filter(alwaysTrue)
.reduce(product, 1)
array
.map(identityFunction)
.filter(xIsOdd)
.reduce(x*y, 1)
Avoiding loops in TypeScript
Python makes filter and map easily accessible with list comprehensions:
doubleOdds = [ x*2 for x in arr if x % 2 == 1 ]
… which is like the TypeScript:
const doubleOdds = arr.filter(x => x % 2 === 1).map(x => x*2)
The result of a list comprehension is a list, by virtue of the [ … ] brackets. In Python, comprehensions work over any iterable source, and can also be used to build sets and dictionaries. For example:
arr = [ 4, 7, 5, 6, 7, 4 ] # given a list
{ x for x in arr if x > 5 } # ... create a set of values greater than 5:
# => a set { 6, 7 }
input = { 'apple': 'red', 'banana': 'yellow' } # given a dictionary
{ value: key for key, value in input.items() } # ... create a dictionary with
# key-value pairs inverted
# => a dictionary { 'red': 'apple', 'yellow': 'banana' }
We can do the same things in TypeScript:
const arr = [ 4, 7, 5, 6, 7, 4 ];
new Set(arr.filter(x => x > 5));
// => a Set { 6, 7 }
const input = new Map([ ['apple','red'], ['banana','yellow'] ]);
new Map([ ...input.entries() ].map(([key, value]) => [value, key] as [string, string]));
// => a Map { 'red' => 'apple', 'yellow' => 'banana' }
Breaking down the construction of that inverted map:
- Notice that the Map constructor takes a series of two-element arrays: the first element of each is a key, the second is its value. Map.entries returns an iterable that iterates over two-element key-value arrays.
- Because JavaScript/TypeScript doesn’t provide map and filter on iterables, only on arrays, we use the spread operator ... to stuff those key-value arrays into an array.
- Now we can call map on that array, and for each [key, value] entry, we reverse it to [value, key] and create the new map.
Just as idiomatic Python code will avoid for loops in favor of comprehensions for filtering and mapping data structures, idiomatic TypeScript will take advantage of map and filter in concert with tools like Map.entries and array methods like Array.entries, some, and every to avoid loops. Any time you write a loop, reach for those tools first, instead of an index counter.
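For instance, some and every (shown here with made-up data) replace two common loop patterns: searching for one witness, and checking a property of every element:

```typescript
const files: Array<string> = ["Main.ts", "notes.txt"];

// instead of a loop with a boolean flag and an early break:
const anyTypeScript: boolean = files.some(f => f.endsWith(".ts"));  // is there at least one?
const allTypeScript: boolean = files.every(f => f.endsWith(".ts")); // do they all match?
```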
One more example
Let’s look at a typical database query example.
Suppose we have a database about digital cameras, in which each object is of type Camera with various properties (brand, pixels, cost, etc.). The whole database is in an array called cameras. Then we can describe queries on this database using map/filter/reduce:
// What's the highest resolution Nikon sells?
cameras.filter(camera => camera.brand === "Nikon")
.map(camera => camera.pixels)
.reduce((x,y) => Math.max(x,y));
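With a hypothetical Camera type and made-up sample data, the query runs end to end:

```typescript
// a minimal Camera sketch; the fields and sample data are invented for illustration
interface Camera {
  brand: string;
  pixels: number;
  cost: number;
}

const cameras: Array<Camera> = [
  { brand: "Nikon", pixels: 24000000, cost: 500 },
  { brand: "Canon", pixels: 30000000, cost: 700 },
  { brand: "Nikon", pixels: 45000000, cost: 1800 },
];

// What's the highest resolution Nikon sells?
const highest: number = cameras
  .filter(camera => camera.brand === "Nikon")
  .map(camera => camera.pixels)
  .reduce((x, y) => Math.max(x, y));
// highest is 45000000
```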
Relational databases use the map/filter/reduce paradigm (where it’s called project/select/aggregate). SQL (Structured Query Language) is the de facto standard language for querying relational databases. A typical SQL query looks like this:
select max(pixels) from cameras where brand = "Nikon"
- cameras is a sequence (of table rows, where each row has the data for one camera)
- where brand = "Nikon" is a filter
- pixels is a map (extracting just the pixels field from each row)
- max is a reduce (combining the values into a single maximum)
Summary
Zooming out, this reading is about functional programming: modeling problems and implementing systems with immutable data and operations that implement pure functions, as opposed to mutable data and operations with side effects.
Because functions in TypeScript are first-class, it is much easier to write higher-order functions that abstract away control statements.
Some languages — Haskell, Scala, OCaml — are strongly associated with functional programming. Many other languages — Swift, Ruby, and so on — use functional programming to a greater or lesser extent.
Functional programming helps make code:
Safe from bugs. Using more immutable data reduces the chance that your code has unintended side effects.
Easy to understand. Chaining operations like map, filter, and reduce instead of using sequential control statements like for and if makes your code much easier to read and follow.
Ready for change. Operating on sequences with higher-order functions keeps the structure of the code close to the structure of the problem, letting the programmer focus on the heart of the computation, so the code can evolve in tandem with its problem model.