6.102
6.102 — Software Construction
Spring 2023

#### Software in 6.102

Safe from bugsEasy to understandReady for change
Correct today and correct in the unknown future. Communicating clearly with future programmers, including future you. Designed to accommodate change without rewriting.

#### Objectives

• Understand the properties of an equivalence relation.
• Understand equality for immutable types defined in terms of the abstraction function and in terms of observation.
• Differentiate between reference equality and object equality.
• Differentiate between observational and behavioral equality for mutable types.

## Introduction

In the previous readings we’ve developed a rigorous notion of data abstraction by creating types that are characterized by their operations, not by their representation. For an abstract data type, the abstraction function explains how to interpret a concrete representation value as a value of the abstract type, and we saw how the choice of abstraction function determines how to write the code implementing each of the ADT’s operations.

In this reading we turn to how we define the notion of equality of values in a data type: the abstraction function will give us a way to cleanly define the equality operation on an ADT.

In the physical world, every object is distinct – at some level, even two identical snowflakes are different, even if the distinction is just the position they occupy in space. So two physical objects are never truly “equal” to each other; they only have degrees of similarity.

In the world of human language, however, and in the world of mathematical concepts, you can have multiple names for the same thing. So it’s natural to ask when two expressions represent the same thing: 1+2, $\sqrt{9}$, and 3 are alternative expressions for the same ideal mathematical value.

The first part of this reading focuses on defining equality for immutable types, so that we can understand the concepts without the complication of mutation. Then we’ll expand to consider mutable types as well.

## Equivalence relation

Let’s start with a look at the mathematical properties that need to be satisfied by a sensible notion of equality.

An equality operation defined on a type $T$ can be regarded as a binary relation $E\subseteq T×T$, where a pair of values $\left(x,y\right)\in E$ if and only if $x$ and $y$ are considered equal according to $E$.

Furthermore, for an equality operation, $E$ must be an equivalence relation, meaning that it is:

reflexive:
symmetric: $\left(t,u\right)\in E\phantom{\rule{thickmathspace}{0ex}}⟹\phantom{\rule{thickmathspace}{0ex}}\left(u,t\right)\in E$
transitive: $\left(t,u\right)\in E\wedge \left(u,v\right)\in E\phantom{\rule{thickmathspace}{0ex}}⟹\phantom{\rule{thickmathspace}{0ex}}\left(t,v\right)\in E$

For ===, the equivalence E is the set of pairs $\left(x,y\right)$ for which x === y. So for ===, these properties can also be written as:

reflexive: $t$ ===
symmetric: $t$ === $u\phantom{\rule{thickmathspace}{0ex}}⟹\phantom{\rule{thickmathspace}{0ex}}u$ === $t$
transitive: $t$ === $u\wedge u$ === $v\phantom{\rule{thickmathspace}{0ex}}⟹\phantom{\rule{thickmathspace}{0ex}}t$ === $v$

An equality operation should always be an equivalence relation, or surprising behavior and bugs will result.

No not None

To see how violating the equivalence relations can produce surprising results, let’s imagine we’re the designers of Python trying to decide how == should behave in various situations. To illustrate the consequences of our decisions, we’ll use these set operations, implemented in a straightforward way, which of course depend on the semantics of ==:

def is_member(x, set):
'''
Returns True iff x is a member of set.
Requires that set contain only unique elements.
'''
for y in set:
if x == y:
return True
return False

def insert(x, set):
'''
Insert an element into a set.
Modifies the set so that set = set U { x }.
Requires and guarantees that set contain only unique elements.
'''
if not is_member(x, set):
set.append(x)

Suppose we decide that Python None (like TypeScript null) is so bad that it should never compare equal to anything. So None == x is false for all possible values of x.

This decision violates:

(missing explanation)

With this rule, what is the outcome of this code that inserts the same value x twice into an empty set?

set = []
insert(x, set)
insert(x, set)
print(len(set))
print(is_member(set, set))

When x is 5, this code prints:

(missing explanation)

When x is None, this code prints:

(missing explanation)

Convertibles are overrated

Now suppose we decide that == should automatically convert different types before comparing them. If the righthand side of == has a different type than the lefthand side, then the righthand side converts itself to the type of the lefthand side before testing for equality.

So if s is a string and n is an integer, then s == n would be the same as s == str(n), and n == s would be the same as n == int(s). If the conversion fails, the equality test returns false. Thus, for example:

• "5" == 5 would be the same as "5" == str(5), which is true.
• 5 == "5" would be the same as 5 == int("5"), which is true.
• "abc" == 5 would be the same as "abc" == str(5), which is false.
• 5 == "abc" would be the same as 5 == int("abc"), which fails to convert, so the result is false.

But Ben Bitdiddle points out that the int() operation that converts a string to an integer ignores leading zeros in a string like "001".

What is the result of these expressions?

7 == "007"

(missing explanation)

"007" == 7

(missing explanation)

These expressions show that this design decision violates:

(missing explanation)

Reading the implementations of insert() and is_member() above carefully, what is printed by this code?

set = []
insert("007", set)
insert(7, set)
print(len(set))

(missing explanation)

And what is printed by this code, whose only difference is the order in which the elements were inserted in the set?

set = []
insert(7, set)
insert("007", set)
print(len(set))

(missing explanation)

## Equality of immutable types

Now that we have a mathematical foundation for all equality operations, let’s look at equality in the context of abstract data types, starting with immutable types.

Formally, we can define equality on immutable types in two ways.

Using the abstraction function. Recall that an abstraction function AF: R → A maps concrete instances of a data type to their corresponding abstract values. To use AF as a definition for equality, we would say that two concrete instances a and b are equal if and only if AF(a) = AF(b).

Using observation. We can say that two objects are equal when they cannot be distinguished by observation – every operation we can apply produces the same result for both objects. Consider the set expressions {1,2} and {2,1}. Using the observer operations available for sets, cardinality |…| and membership ∈, these expressions are indistinguishable:

• |{1,2}| = 2 and |{2,1}| = 2
• 1 ∈ {1,2} is true, and 1 ∈ {2,1} is true
• 2 ∈ {1,2} is true, and 2 ∈ {2,1} is true
• 3 ∈ {1,2} is false, and 3 ∈ {2,1} is false
• … and so on

In terms of abstract data types, “observation” means calling operations on the objects. So two objects are equal if and only if they cannot be distinguished by calling any operations of the abstract data type (which for immutable types means any observer or producer operations).

The two definitions of equality – by abstraction function, and by observation – should be consistent with each other, or something is wrong.

### Example: Duration

Here’s a simple example of an immutable ADT.

class Duration {
// Rep invariant:
//    mins >= 0, secs >= 0
// Abstraction function:
//    AF(mins, secs) = the span of time equal to mins minutes plus secs seconds

/** Make a duration lasting for m minutes and s seconds. */
public constructor(m: number, s: number) {
this.mins = m;
this.secs = s;
}
/** @returns length of this duration in seconds */
public get length(): number {
return this.mins*60 + this.secs;
}
}

Now which of the following values should be considered equal?

const d_1_2  = new Duration(1, 2);
const d_1_3  = new Duration(1, 3);
const d_0_62 = new Duration(0, 62);
const e_1_2  = new Duration(1, 2);

Think in terms of both the abstraction-function definition of equality, and the observational equality definition.

Any second now

Consider the code for Duration and the objects d_1_2, d_1_3, d_0_62, e_1_2 just created above.

Using the abstraction-function notion of equality, which of the following would be considered equal to d_1_2?

(missing explanation)

Eye on the clock

Using the observational notion of equality, which of the following would be considered equal to d_1_2?

(missing explanation)

Letters

Consider the following rep for an abstract data type:

/** Immutable type representing a set of letters, ignoring case */
class LetterSet {

// Abstraction function:
//    AF(s) = the set of the letters that are found in s
//            (ignoring non-letters and alphabetic case)
// Rep invariant:
//    true

/**
* Make a LetterSet consisting of the letters found in chars
* (ignoring alphabetic case and non-letters).
*/
public constructor(chars: string) {
this.s = chars;
}

... // observer and producer operations
}

Using the abstraction-function definition of equality for LetterSet, which of the following should be considered equal to new LetterSet("abc")?

(missing explanation)

Consider the following observer and producer operations that might be included in the LetterSet type.

/**
* @returns the size of this set
*/
public get size(): number { ... }

/**
* @param letter must be a single letter 'a'...'z' or 'A'...'Z'
* @returns true iff this set contains letter, ignoring alphabetic case
*/
public contains(letter: string): boolean { ... }

/**
* @returns the length of the string that this set was constructed from
*/
public get length(): number { ... }

/**
* @returns the union of this and that
*/
public union(that: LetterSet): LetterSet { ... }

/**
* @returns true if and only if all the letters in this set are lowercase
*/
public isAllLowercase(): boolean { ... }

/**
* @returns the first letter in s
*/
public first(): string { ... }

We want to choose a subset of these observers and producers such that the observational definition of equality is consistent with the abstraction-function definition of equality. Which of the subsets below would satisfy that goal?

(missing explanation)

Which operations are definitely inconsistent with the abstraction function?

(missing explanation)

Lines

Consider this abstract data type, which has a rep and a complete set of operations, but is missing its AF/RI:

/** Immutable type representing lines in the plane that are neither horizontal nor vertical. */
class MyLine {

// Abstraction function:
//    TODO
// Rep invariant:
//    TODO

/**
*  Make a MyLine that passes through all the points (p[2i], p[2i+1]).
*  p must contain at least 2 points and all points in p must be collinear
*  on a line that is neither horizontal nor vertical.
*  For example, MyLine({ 0,0,  1,2 }) passes through both (0,0) and (1,2).
*/
public constructor(p: Array<number>) { ... }

/**
* @returns the slope of the line.
*/
public get slope(): number { ... }
}

Using the observational definition of equality, which of the following should be considered equal to new MyLine([ 0,0, 1,1 ])?

(missing explanation)

Which of these abstraction functions would yield an abstraction-function definition of equality that consistent with the observational definition of equality? (Assume that the rep invariant is defined appropriately in each case.)

(missing explanation)

## Reference equality vs. value equality

Many languages have two different operations for testing equality between objects, with different semantics.

• reference equality tests whether two references point to the same storage in memory. In terms of the snapshot diagrams we’ve been drawing, two references are equal by reference equality if their arrows point to the same object bubble.
• value equality tests whether two objects represent the same value – in other words, the sense we’ve been talking about in this reading.

For comparison, here are the equality operations for objects in several languages:

 referenceequality valueequality Python is == Java == equals() Objective C == isEqual: C# == Equals() TypeScript/JavaScript === missing

The == operator unfortunately flips its meaning between Python and most other languages.

TypeScript and JavaScript use triple-equals ===, the strict equality operator. The double-equals == operator also exists in these languages, but it does a variety of automatic type conversions that make it difficult to use safely as an equality operator. The automatic type conversions mean that the == is sometimes a reference-equality operation, and sometimes a value-equality operation on converted values. It isn’t even an equivalence relation! (We’ll see why in an exercise below.) The == is not safe from bugs or easy to understand, so modern TS/JS programmers strenuously avoid it.

Note also that this table is only about equality for object types, like we use to define a new ADT. Built-in primitive types, like number or string in TypeScript, follow different rules. For primitives like string in TypeScript, the === operator does value equality, and there is no notion of reference equality at all.

For immutable object types, the reference-equality === operation is generally not the right equality operation. And TypeScript and JavaScript unfortunately have no built-in operation for value equality on object types, unlike the other languages. But when we define a new data type, value equality is a useful operation, so it makes sense to introduce it as a method. Our convention will be to name the method equalValue():

/** Some immutable type T. */
class T {
...

/**
* @param that  value to compare this with
* @returns true iff this and that represent the same abstract value.
*    Must be an equivalence relation (reflexive, symmetric, and transitive).
*/
public equalValue(that: T): boolean;
}

It’s our responsibility to decide what value equality means for values of the data type, and implement equalValue() appropriately. Here’s the equalValue() operation for Duration:

class Duration {
...
public equalValue(that: Duration): boolean {
return this.length === that.length;
}
}

Triple-equals is too strong

Using the same Duration objects we created earlier:

const d_1_2  = new Duration(1, 2);
const d_1_3  = new Duration(1, 3);
const d_0_62 = new Duration(0, 62);
const e_1_2  = new Duration(1, 2);

Which of the following are equivalent to d_1_2 by the === operator, i.e. ____ === d_1_2 is true?

(missing explanation)

### Breaking the equivalence relation

We have to make sure that the definition of equality implemented by equalValue() is actually an equivalence relation as defined earlier: reflexive, symmetric, and transitive. If it isn’t, then other operations that depend on equalValue() (like searching through an array) will behave erratically and unpredictably. You don’t want to program with a data type in which sometimes a equals b, but b doesn’t equal a. Subtle and painful bugs will result.

Here’s an example of how an innocent attempt to make equality more flexible can go wrong. Suppose we wanted to allow for a tolerance in comparing Duration objects, because different computers may have slightly unsynchronized clocks:

private static readonly CLOCK_SKEW: number = 5; // seconds

// returns true iff this and that represent the same abstract value within a clock-skew tolerance
public equalValue(that: Duration): boolean {
return Math.abs(this.length - that.length) <= Duration.CLOCK_SKEW;
}

Which property of the equivalence relation is violated?

Equals-ish

Consider the latest implementation of Duration in the reading, reprinted here for convenience:

class Duration {
// Rep invariant:
//    mins >= 0, secs >= 0
// Abstraction function:
//    AF(mins, secs) = the span of time equal to mins minutes plus secs seconds

/** Make a duration lasting for m minutes and s seconds. */
public constructor(m: number, s: number) {
this.mins = m;
this.secs = s;
}
/** @returns length of this duration in seconds */
public get length(): number {
return this.mins*60 + this.secs;
}

private static readonly CLOCK_SKEW: number = 5; // seconds

/**
* @param that  value to compare this with
* @returns true iff this and that represent the same abstract value.
*    Must be an equivalence relation (reflexive, symmetric, and transitive).
*/
public equalValue(that: Duration): boolean {
return Math.abs(this.length - that.length) <= Duration.CLOCK_SKEW;
}
}

Suppose these Duration objects are created:

const d_0_60 = new Duration(0, 60);
const d_1_00 = new Duration(1, 0);
const d_0_57 = new Duration(0, 57);
const d_1_03 = new Duration(1, 3);

(missing explanation)

Skewed up

(missing explanation)

Buggy equality

(missing explanation)

Oh-no equals-equals

Let’s look at how the JavaScript == operator works. Suppose we have an empty string and two empty arrays:

const s = "";
const arr1 = [];
const arr2 = [];

To answer this question, you will need two things:

Which of the following return true? (Try to figure it out first using just the spec of ==, but if you get stuck you can experiment in the TypeScript Playground.)

(missing explanation)

Considering just its behavior on s, arr1, and arr2, which properties of an equivalence relation are not satisfied by ==?

(missing explanation)

## Equality of mutable types

We’ve been focusing on equality of immutable objects so far in this reading. What about mutable objects?

Equality must still be an equivalence relation.

We also want equality to respect the abstraction function and respect operations. But with mutable objects, there is a new possibility: by calling a mutator on one of the objects before doing the observation, we may change its state and thus create an observable difference between the two objects.

So let’s refine our definition and allow for two notions of equality based on observation:

• observational equality means that two references cannot be distinguished now, in the current state of the program. A client can try to distinguish them only by calling operations that don’t change the state of either object (i.e. only observers and producers, not mutators) and comparing the results of those operations. This tests whether the two references “look” the same for the current state of the objects.
• behavioral equality means that two references cannot be distinguished now or in the future, even if a mutator is called on one reference but not the other. This tests whether the two references will “behave” the same, in this and all future states.

For immutable types, observational and behavioral equality are identical, because there aren’t any mutator methods that can change the state of the objects. So we only need one kind of equality – which for immutable object types is the equalValue() operation that we have already defined, and for immutable primitive types (like string and number) is ===.

For mutable types, it is useful to have both kinds of equality available as operations. The === operator provides behavioral equality, and we can use the equalValue() operation for observational equality:

/**
* Some mutable type T.
* Use === to compare elements of type T for behavioral equality.
*/
class T {
...

/**
* @param that  value to compare this with
* @returns true iff this and that are observationally equivalent.
*    Must be an equivalence relation (reflexive, symmetric, and transitive).
*/
public equalValue(that: T): boolean;

}

Observational equality on arrays
const arrayA = [1, 2, 3];
const arrayB = [1, 2, 3];
const arrayC = arrayB;

If we want to explore whether the array values referred to by arrayA, arrayB, and arrayC are equal by observational equality, which of the following thought experiments could we try?

(missing explanation)

Which objects are equal by observational equality?

(missing explanation)

Behavioral equality on arrays

As before:

const arrayA = [1, 2, 3];
const arrayB = [1, 2, 3];
const arrayC = arrayB;

If we want to explore whether arrayA, arrayB, and arrayC would be equal by behavioral equality, which of the following thought experiments could we try? (Assume you catch exceptions, so that the thought experiment can keep running without crashing the program.)

(missing explanation)

Which objects are equal by behavioral equality?

(missing explanation)

Bag

Suppose Bag<E> is a mutable ADT representing what is often called a multiset, an unordered collection of objects where an object can occur more than once. It has the following operations:

/** make an empty bag */
public constructor();

/** modify this bag by adding an occurrence of e, and return this bag */

/** modify this bag by removing an occurrence of e (if any), and return this bag */
public remove(e: E): Bag<E>;

/** return number of times e occurs in this bag */
public count(e: E): number;

/** use === to compare two Bags for behavioral equivalence. */

/** return true iff this and that are observationally equivalent */
public equalValue(that: Bag<E>): boolean;

Suppose we run this code:

const b1 = new Bag<string>().add("a").add("b");
const b3 = b1.remove("b");
const b4 = new Bag<string>().add("b").add("a"); // swap!

(missing explanation)

Bag behavior

Assuming === provides behavioral equality for Bag as the spec requires, which of the following expressions are true?

(missing explanation)

Bean bag

Assuming Bag.equalValue() is implemented with observational equality as its spec requires, which of the following expressions are true?

(missing explanation)

## “Deep equality” on collections

One thing unfortunately missing from TypeScript/JavaScript is a standard operation for observational equality on the built-in collection types, Array, Set, and Map. For example:

const arr1: Array<number> = [1, 2, 3];
const arr2: Array<number> = [1, 2, 3];

const set1: Set<string> = new Set(["a", "b"]);
const set2: Set<string> = new Set(["b", "a"]);

Since Array and Set are mutable, === implements behavioral equality as expected. In this case, arr1 !== arr2 and set1 !== set2, because each points to a different mutable object in memory.

But there is no standard operator for observational equality, comparable to the equalValue() operation we introduced in the previous section. There is no built-in way to tell that arr1 and arr2 currently represent the same sequence of elements, and likewise that set1 and set2 represent the same set of elements.

Some libraries have tried to rectify this omission by providing a “deep equality” operation, defined not only on collections like Array and Set but on a wide variety of object types:

These operations are “deep” in the sense that they can unravel multiple levels of collections. For example, two values of type Array<Map<T,Set<U>>> can be compared for observational equality.

But these deep equality operations must be used with great care. The spec for assert.deepStrictEqual() has special treatment for built-in collections like Map and Set – for example, ignoring the order of elements – so that it can correctly compare the abstract values of these collections. But for other object types, these deep equality operations blindly compare the rep, field by field, without any knowledge at all of the abstraction function that governs how the rep should be interpreted.

Deep equality can be safe to use when the leaf types of the collections are primitive types like number, string, and boolean – e.g. Array<Map<string,Set<number>>>. But when the leaf types are user-defined object types, like the types Duration and Bag we have used in this reading, then the behavior of deep equality may be unexpected and undesirable.

Drilling deeper

Which of these assertions will succeed? You may find the specifications for assert.strictEqual() and assert.deepStrictEqual() helpful.

(missing explanation)

Drilling too deep

Consider Duration, reprinted here for convenience:

class Duration {
// Rep invariant:
//    mins >= 0, secs >= 0
// Abstraction function:
//    AF(mins, secs) = the span of time equal to mins minutes plus secs seconds

/** Make a duration lasting for m minutes and s seconds. */
public constructor(m: number, s: number) {
this.mins = m;
this.secs = s;
}
/** @returns length of this duration in seconds */
public get length(): number {
return this.mins*60 + this.secs;
}
/** @param that  value to compare this with
*  @returns true iff this and that represent the same abstract value.
*     Must be an equivalence relation (reflexive, symmetric, and transitive). */
public equalValue(that: Duration): boolean {
return this.length === that.length;
}
}

Suppose we have these two Duration values:

const d_0_60 = new Duration(0, 60);
const d_1_00 = new Duration(1, 0);

… which we should consider equal, because they represent the same abstract value.

Which of the following would be correct ways to assert their equality?

(missing explanation)

Now suppose we have:

const arr1: Array<Duration> = [d_0_60];
const arr2: Array<Duration> = [d_1_00];

Which of the following would be correct ways to assert the observational equality of these two arrays?

(missing explanation)

## Hash functions

In most languages (including TypeScript and Python), the Set and Map types are implemented using a hash table, which supports fast O(1) lookup of set elements or map keys. A hash table requires the set element type or map key type to offer a hash function operation, which converts the object value into an integer.

This section discusses the interactions between hash functions and equality. Because TypeScript does not allow implementing the hash function yourself, we will use Python in this section.

In Python, the equality operation for object types is __eq__(..), much as we have defined it for TypeScript. And the hash function operation is __hash__():

def __hash__(self):
'''
Called by built-in function hash() and for operations on members of hashed
collections including set, frozenset, and dict. __hash__() should return an
integer. The only required property is that objects which compare equal
have the same hash value.
'''

To understand why __hash__ needs to be compatible with __eq__, you’ll need to have some idea of how hash tables work. Python sets and dictionaries use a hash table data structure, and depend on the __hash__ method to be implemented correctly for the objects stored in the set and used as keys in the dictionary.

A hash table is a representation for a mapping: an abstract data type that maps keys to values. Hash tables offer constant time lookup, so they tend to perform better than trees or lists. Keys don’t have to be ordered, or have any particular property, except for offering equality and hashing operations.

Here’s how a hash table works. It contains an array that is initialized to a size corresponding to the number of elements that we expect to be inserted. When a key and a value are presented for insertion, we compute the hash code of the key, and convert it into an index in the array’s range (e.g., by a modulo division). The value is then inserted at that index.

The rep invariant of a hash table includes the fundamental constraint that a key can be found by starting from the slot determined by its hash code.

Hash codes are designed so that the keys will be spread evenly over the indices. But occasionally a conflict occurs, and two keys are placed at the same index. So rather than holding a single value at an index, a hash table actually holds a list of key/value pairs, usually called a hash bucket. On insertion, you add a key/value pair to the list in the array slot determined by the hash code. For lookup, you hash the key, find the right slot, and then examine each of the pairs until one is found whose key equals the query key.

Now it should be clear why the __hash__ spec requires equal objects to have the same hash code. If two equal objects had distinct hash codes, they might be placed in different slots. So if you attempt to lookup a value using a key equal to the one with which it was inserted, the lookup may fail.

For mutable objects, Python forbids implementing __hash__:

If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable.

But immutable objects need an implementation of __hash__ in order to be used in those collections.

A simple and drastic way to ensure that the contract is met is for __hash__ to always return some constant value, so every object’s hash code is the same. This satisfies the specification, but it would have a disastrous performance effect, since every key will be stored in the same slot, and every lookup will degenerate to a linear search along a long list.

The standard way to construct a more reasonable hash code that still satisfies the contract is to compute a hash code for each component of the object that is used in the determination of equality (in Python by using hash()), and then combining these, throwing in a few arithmetic operations.

(In the context of Java, where this operation is named hashCode, Josh Bloch’s book Effective Java explains this issue in more detail, and gives some strategies for writing decent hash code functions. The advice is summarized in a good StackOverflow post.)

Note, however, that as long as you satisfy the requirement that equal objects have the same hash code value, then the particular hashing technique you use doesn’t make a difference to the correctness of your code. It may affect its performance, by creating unnecessary collisions between different objects, but even a poorly-performing hash function is better than one that breaks the contract.

In a language like Python or Java where all objects support equality comparison, always implement hashing consistent with equality for immutable types.

### Hashing in TypeScript/JavaScript

In TypeScript and JavaScript, the hash function is built-in and cannot (currently) be overridden for user-defined types. The predefined hash functions are designed to be consistent with ===, so any type that uses === for behavioral equality is safe for use as a Set element or Map key. So Set elements and Map keys can be immutable built-in types (like number or string) or mutable object types (using reference equality). But unfortunately we cannot easily use immutable object types as Set elements or Map keys in TypeScript because Set and Map don’t have a way to determine equality (or get a hash code) for our immutable types.

One approach to this problem is interning, a design pattern that creates exactly one object instance of the immutable type for each abstract value that the program uses, and reuses that instance any time the same value is used again. When exactly one object for each abstract value exists, then observational equality becomes the same as reference equality, and the object hash function (which uses reference equality) can be used to correctly hash identify the object in a Set or Map. Using the interning pattern requires keeping a cache of all the objects of the type that have been created, and a factory function that first consults the cache before creating a new instance of the object.

## Summary

• Equality should be an equivalence relation (reflexive, symmetric, transitive).
• The abstraction function is the basis for equality in immutable data types.
• Mutable data types have two useful notions of equality: behavioral equality (which provides consistent equality over time) and observational equality (which uses the abstraction function). For immutable types, observational and behavioral equality are identical.

In TypeScript, for immutable built-in types (e.g. number, string, bigint):

• Use === for behavioral and observational equality, which are identical because there are no mutators.
• Avoid == because of its automatic conversions.
• Safe for use as Set elements and Map keys.

For immutable object types (e.g. Duration):

• Use equalValue() for both behavioral and observational equality, which are identical because there are no mutators.
• Avoid === because it is too strong, and avoid == because of its automatic conversions.
• Not safe for use as Set elements or Map keys because those data structures compare objects using ===.

For mutable object types:

• Use === for behavioral equality.
• Use equalValue() for observational equality.
• Avoid == because of its automatic conversions.
• Safe for use as Set elements and Map keys.

Equality is one part of implementing an abstract data type, and we’ve already seen how important ADTs are to achieving our three primary objectives. Let’s look at equality in particular:

• Safe from bugs. Correct implementation of equality is essential for clients of a data type, and also highly desirable for writing tests.

• Easy to understand. Clients and other programmers who read our specs will expect our types to implement an appropriate equality operation, and will be surprised and confused if we do not.

• Ready for change. Correctly-implemented equality for immutable types separates equality of reference from equality of abstract value, hiding from clients our decisions about whether values are shared.