Reading 8: Defining ADTs with Interfaces, Generics, Enums, and Functions
Praxis Tutor exercises
Keep making progress on TypeScript by completing the following categories in the Praxis Tutor:
Software in 6.102
Objectives
Today’s class is about various ways to implement abstract data types, including:
- interfaces: separating the interface of an ADT from its implementation;
- generic types: defining a family of ADTs using generic type parameters;
- enumerations: defining an ADT with a small finite set of values;
- global functions operating on an opaque type: rare in TypeScript but common in non-object-oriented languages.
We also discuss subtyping as a relationship between two types determined by their specs, and distinguish it from subclassing, which involves reps as well.
After today’s class, you should be able to:
- define ADTs using classes, interfaces, generics, and enumerations
- determine whether one type is a subtype of another
Interfaces
TypeScript’s interface
is a useful language mechanism for expressing an abstract data type.
An interface in TypeScript is a list of method signatures without method bodies.
A class implements an interface if it declares the interface in its implements
clause, and provides method bodies for all of the interface’s methods.
So one way to define an abstract data type in TypeScript is as an interface.
The type’s implementation is then a class implementing that interface.
One advantage of this approach is that the interface can specify the contract for the client and nothing more. The interface is all a client programmer needs to read to understand the ADT. The client can’t create inadvertent dependencies on the ADT’s rep, because the rep isn’t present in the interface (not even as private fields). The implementation is kept well and truly separated, in a different class altogether.
Another advantage is that multiple different representations of the abstract data type can coexist in the same program, as different classes implementing the interface.
When an abstract data type is represented just as a single class, without an interface, it’s harder to have multiple representations.
In the MyString
example from Abstract Data Types, MyString
was a single class.
We explored two different representations for MyString
, but we couldn’t have both representations for the ADT in the same program.
TypeScript’s static type checking allows the compiler to catch many mistakes in implementing an ADT’s contract. For instance, it is a compile-time error to omit one of the required methods, or to give a method the wrong return type. Unfortunately, the compiler doesn’t check for us that the code adheres to the specs of those methods that are written in documentation comments.
About interfaces in TypeScript
Syntactically, a TypeScript interface
contains the spec of an ADT, namely its public method signatures and public instance methods.
Each instance method signature ends with a semicolon.
An interface does not include information about the rep, so it should not declare any private instance variables or include instance method bodies. It also should not include any abstraction function, rep invariant, or safety from rep exposure argument, as all those depend on having a rep.
Because an interface has no rep, TypeScript doesn’t allow it to be instantiated with new
.
The expression new ArrayLike<number>()
is a static error.
To make an instance of an ArrayLike
type, you need to instead instantiate a class that provides a rep for it, like new Array<number>()
.
reading exercises
Consider this TypeScript interface and TypeScript class, which are intended to represent a curve in the 2D plane:
/** Represents an immutable curve in the plane. */
interface Curve {
/** @returns true if the point (x,y) lies on this curve */
contains(x: number, y: number): boolean;
/** @returns a curve formed by connecting this with that */
/* A */ join(that: Curve): ArrayCurve;
}
/** Implementation of Curve. */
class ArrayCurve implements Curve {
/** make a one-point curve */
public constructor(x: number, y: number) { ... }
/** @returns a curve formed by connecting this with that */
public join(that: Curve): ArrayCurve { ... }
/** @returns the total path length of this curve */
public pathLength(): number { ... }
}
The line labeled A
is a problem because Curve
mentions ArrayCurve
, but ArrayCurve
also mentions Curve
, which is circular.
(missing explanation)
The line labeled A
is a problem because it isn’t representation-independent.
(missing explanation)
ArrayCurve
doesn’t correctly implement Curve
because it’s missing the contains()
method.
(missing explanation)
ArrayCurve
doesn’t correctly implement Curve
because it includes a method pathLength
that Curve
doesn’t have.
(missing explanation)
Subtypes
Recall that a type is a set of values, with associated operations.
For example, the TypeScript ArrayLike
type is defined by an interface, which has the operations length
and [..]
indexing.
If we think about all possible ArrayLike
values, none of them are actually objects of type ArrayLike
: we cannot create instances of an interface.
Instead, those values might be string
objects, or Array
objects, or objects of any class that provides the operations expected by ArrayLike
(namely length
and [..]
indexing).
A subtype is simply a subset of the supertype: string
and Array
are subtypes of ArrayLike
.
“B is a subtype of A” means “every B is an A.” In terms of specifications: “every B satisfies the specification for A.”
That means B is only a subtype of A if B’s specification is at least as strong as A’s specification. When we declare a class that implements an interface, the TypeScript compiler enforces part of this requirement automatically: for example, it ensures that every method in A appears in B, with a compatible type signature. Class B cannot implement interface A without implementing all of the operations declared in A.
But the compiler cannot check that we haven’t weakened the specification in other ways: strengthening the precondition on some inputs to a method, weakening a postcondition, weakening a guarantee that the interface abstract type advertises to clients. If you declare a subtype in TypeScript — implementing an interface is our current focus — then you must ensure that the subtype’s spec is at least as strong as the supertype’s.
How to declare subtypes in TypeScript
To declare that a class B is a subtype of an interface A, use implements
:
class MyArray<T> implements ArrayLike<T> {
...
}
This declaration requires MyArray
to implement (provide method bodies for) all the operations found in ArrayLike
, with specs at least as strong as the specs in ArrayLike
.
TypeScript statically checks the types, but a human programmer must check the rest of the spec.
Note even though we cannot create instances of the ArrayLike
interface, we can still use it to declare variables, parameters, and return types, so long as they are eventually initialized with an object from a concrete subtype class like MyArray
.
You can also declare that an interface is a subtype of another interface, using extends
:
interface ReversibleArrayLike<T> extends ArrayLike<T> {
// inherits signatures and specs of existing ArrayLike operations, and adds new ones:
/** Reverse this array, mutating it in place. */
public reverse(): void;
}
Because ReversibleArrayLike
is also an interface, it doesn’t provide any implementations of ArrayLike
operations.
But it may strengthen the spec of the type, by strengthening existing ArrayLike
operations or adding new operations.
Here, for example, ReversibleArrayLike
adds a reverse
operation.
Structural subtyping in TypeScript
In TypeScript, there is one more way for a type B to be a subtype of another type A: structural subtyping.
With structural subtyping, B doesn’t have to mention A in its declaration (no B implements A
or B extends A
).
But if B provides at least all the operations required by A – the same public methods and public instance variables, with compatible types – then TypeScript will consider B a subtype of A.
Structural subtyping is convenient and frequently necessary in TypeScript, but it opens a hole in type safety, because it allows B to be a subtype of A even if B has an incompatible specification.
Mutability and immutability offer a good example of both the advantages and pitfalls here.
Recall that TypeScript offers a ReadonlyArray
interface, which has all the observers and producers of Array
, but omits the mutators.
As a result, Array
is a structural subtype of ReadonlyArray
, so TypeScript allows us to write useful and good code like this:
const readonlyArr: ReadonlyArray<number> = [ 1, 2, 3 ];
The initializer in this declaration, [ 1, 2, 3 ]
, produces an Array
object.
Because Array
is a structural subtype of ReadonlyArray
, we can then assign it to a variable of type ReadonlyArray
, and henceforth treat it like an immutable value.
TypeScript’s static checking will ensure that we can’t accidentally go the other way:
const arr: Array<number> = readonlyArr;
// static error! ReadonlyArray is *not* a structural subtype of Array,
// because it doesn't provide Array's mutator operations
So that’s how structural subtyping helps us. But here’s the hole in type safety. Suppose that we had kept an alias to the original mutable Array
:
const arr: Array<number> = [ 1, 2, 3 ];
const readonlyArr: ReadonlyArray<number> = arr;
Now we can mutate readonlyArr
, using its alias arr
:
arr.push(4);
console.log(readonlyArr); // prints 1,2,3,4
Using this alias, we have broken the best part of ReadonlyArray
: its immutability!
Even though Array
is a structural subtype of ReadonlyArray
, it is not a true subtype, because the Array
contract does not provide immutability.
So structural subtyping is convenient, but should be used with care.
This example has shown a common use of it — constructing a list, map, or set using a mutable type, and then assigning it to a ReadonlyArray
, ReadonlyMap
, or ReadonlySet
to treat it as immutable from then on.
It’s important to throw away all aliases that use the original mutable type, to avoid the risk shown here.
reading exercises
Let’s define an interface for rectangles:
/** An immutable rectangle. */
interface ImmutableRectangle {
/** @returns the width of this rectangle */
getWidth(): number;
/** @returns the height of this rectangle */
getHeight(): number;
}
We would like to say that every square is a rectangle:
/** An immutable square. */
class ImmutableSquare implements ImmutableRectangle {
private readonly side: number;
/** Make a new side x side square. */
public constructor(side: number) { this.side = side; }
/** @returns the width of this square */
public getWidth(): number { return this.side; }
/** @returns the height of this square */
public getHeight(): number { return this.side; }
}
Does ImmutableSquare.getWidth()
satisfy the spec of ImmutableRectangle.getWidth()
?
Does ImmutableSquare.getHeight()
satisfy the spec of ImmutableRectangle.getHeight()
?
Does the whole ImmutableSquare
spec satisfy the ImmutableRectangle
spec?
(missing explanation)
Now let’s look at mutable versions of rectangles and squares.
/** A mutable rectangle. */
interface MutableRectangle {
// ... same methods as ImmutableRectangle above ...
/** Set this rectangle's dimensions to width x height. */
setSize(width: number, height: number);
}
Surely every square is still a rectangle?
/** A mutable square. */
class MutableSquare implements MutableRectangle /* hopefully? */ {
private side: number;
// ... same constructor and methods as ImmutableSquare above ...
// TODO implement setSize(..)
}
For each possible MutableSquare.setSize(..)
spec below, is it a valid strengthening of MutableRectangle.setSize()
?
/**
* Set this square's dimensions to width x height.
* Requires width = height.
*/
public setSize(width: number, height: number) { ... }
(missing explanation)
/**
* Set this square's dimensions to width x height.
* @throws Error if width != height
*/
public setSize(width: number, height: number) { ... }
(missing explanation)
/**
* If width = height, set this square's dimensions to width x height.
* Otherwise, new dimensions are unspecified.
*/
public setSize(width: number, height: number) { ... }
(missing explanation)
Example: MyString
Let’s revisit MyString
.
Using an interface instead of a class for the ADT, we can support multiple implementations:
/**
* MyString represents an immutable sequence of characters.
*/
interface MyString {
// We'll skip this creator operation for now
// /**
// * @param s
// * @returns MyString representing the sequence of characters in s
// */
// public constructor(s: string) { ... }
/**
* @returns number of characters in this string
*/
length(): number;
/**
* @param i character position (requires 0 <= i < string length)
* @returns character at position i
*/
charAt(i: number): string;
/**
* Get the substring between start (inclusive) and end (exclusive).
* @param start starting index
* @param end ending index. Requires 0 <= start <= end <= string length.
* @returns string consisting of charAt(start)...charAt(end-1)
*/
substring(start: number, end: number): MyString;
}
Here’s our first implementation:
class SimpleMyString implements MyString {
private a: Uint16Array;
public constructor(s: string) {
this.a = new Uint16Array(s.length);
for (let i = 0; i < s.length; ++i) {
this.a[i] = s.charCodeAt(i);
}
}
public length(): number {
return this.a.length;
}
public charAt(i: number): string {
return String.fromCharCode(this.a[i]);
}
public substring(start: number, end: number): MyString {
const that = new SimpleMyString(""); // make a temporarily-empty object
that.a = this.a.slice(start, end); // ... and immediately replace its array
return that;
}
}
And here’s the optimized implementation:
class FastMyString implements MyString {
private a: Uint16Array;
private start: number;
private end: number;
public constructor(s: string) {
this.a = new Uint16Array(s.length);
for (let i = 0; i < s.length; ++i) {
this.a[i] = s.charCodeAt(i);
}
this.start = 0;
this.end = this.a.length;
}
public length(): number {
return this.end - this.start;
}
public charAt(i: number): string {
return String.fromCharCode(this.a[this.start + i]);
}
public substring(start: number, end: number): MyString {
const that = new FastMyString(""); // make a temporarily-empty object
that.a = this.a; // ... and immediately replace its instance variables
that.start = this.start + start;
that.end = this.start + end;
return that;
}
}
How will clients use this ADT? Here’s an example:
const s: MyString = new FastMyString("good morning");
console.log("The first character is: " + s.charAt(0));
Unfortunately, this pattern breaks the abstraction barrier we’ve worked so hard to build between the abstract type and its concrete representations. Clients must know the name of the concrete representation class. Because an interface in TypeScript has no constructor, a client must directly call one of the concrete class constructors. The spec of that constructor won’t appear anywhere in the interface, so there’s no static guarantee that different implementations will even provide the same constructors.
Instead of using a constructor, then, we can implement the creator operation as a factory function:
/**
* @param s
* @returns MyString representing the sequence of characters in s
*/
function makeMyString(s: string): MyString {
return new FastMyString(s);
}
Now a client can use the ADT without breaking the abstraction barrier:
const s: MyString = makeMyString("good morning");
console.log("The first character is: " + s.charAt(0));
Hiding the implementation completely is a tradeoff, because sometimes the client wants a choice of implementations with different characteristics.
reading exercises
Let’s review the code for FastMyString
.
Which of these are useful criticisms:
The abstraction function should be documented
(missing explanation)
The representation invariant should be documented
(missing explanation)
The rep fields should be readonly
so they cannot be reassigned
(missing explanation)
The charAt
specification exposes that the rep contains individual characters, and it shouldn’t.
(missing explanation)
The charAt
implementation should behave more helpfully when i
is greater than the length of the string
(missing explanation)
Why interfaces?
Interfaces are used frequently in real code. Not every class is associated with an interface, but there are a few good reasons to bring an interface into the picture.
- Documentation for both the compiler and for humans. Not only does an interface help the compiler catch ADT implementation bugs, but it is also much more useful for a human to read than the code for a concrete implementation. Such an implementation intersperses ADT-level types and specs with implementation details.
- Allowing performance trade-offs. Different implementations of the ADT can provide methods with very different performance characteristics. Different applications may work better with different choices, but we would like to code these applications in a way that is representation-independent. From a correctness standpoint, it should be possible to drop in any new implementation of a key ADT with simple, localized code changes.
- Methods with intentionally underdetermined specifications. An ADT for finite sets could leave unspecified the element order one gets when converting to an array. Some implementations might use slower method implementations that manage to keep the set representation in some sorted order, allowing quick conversion to a sorted array. Other implementations might make many methods faster by not bothering to support conversion to sorted arrays.
- Multiple views of one class. A TypeScript class may implement multiple interfaces. For instance, a user interface widget displaying a drop-down list is natural to view as both a widget and a list. The class for this widget could implement both interfaces. In other words, we don’t always implement an ADT multiple times just because we are choosing different data structures. Those implementations may be various different sorts of objects that can be seen as special cases of the ADT.
- More and less trustworthy implementations. Another reason to implement an interface more than once might be that it is easy to build a simple implementation that you believe is correct, while you can work harder to build a fancier version that is more likely to contain bugs. You can choose implementations for applications based on how bad it would be to get bitten by a bug.
reading exercises
Suppose you have an abstract data type for rational numbers, which is currently represented as a TypeScript class:
class Rational {
...
}
You decide to change Rational
to a TypeScript interface instead, along with an implementation class called IntFraction
:
interface Rational {
...
}
class IntFraction implements Rational {
...
}
For each piece of code below, taken from your original Rational
class, first identify the role it plays in the ADT. Then decide where it should go in your new ADT design: in the interface Rational
, in the implementation class IntFraction
, or both?
Subclassing
Implementing an interface is one way to make a subtype of another type. Another technique, which you may have seen before, is subclassing. Subclassing means defining a class as an extension, or subclass, of another class, so that:
- the subclass automatically inherits the instance methods of its superclass, including their method bodies
- the subclass can override any of those inherited method bodies by substituting its own method body
- the subclass also inherits the private rep of its superclass
- the subclass can also add new instance methods and rep fields for its own purposes
The boldfaced parts in the list above show how subclassing differs from implementing an interface. An interface has neither method bodies nor rep, so it’s not possible to inherit them when you implement an interface.
Subclassing is a common feature of object-oriented programming languages.
You may have already encountered it in Python, where you can write class SpottedTurtle(Turtle): ...
to define a new class SpottedTurtle
as a subclass of the class Turtle
.
TypeScript provides the same effect with the syntax class SpottedTurtle extends Turtle { ... }
.
Just like implementing an interface, subclassing should imply subtyping.
If SpottedTurtle
is a subclass of Turtle
, then SpottedTurtle
needs to have at least as strong a spec as Turtle
, because TypeScript will allow SpottedTurtle
objects to be used anywhere that a Turtle
might be expected.
But unlike implementing an interface, a subclass inherits not only the spec of its superclass but also its rep – the fields that implement it. At first this seems like a great idea. Reusing implementation ought to make our code more DRY, and more ready for change.
But years of experience by API designers have revealed subtle challenges to making subclassing safe from bugs and ready for change. You can read more if you’re interested, but essentially, inheriting the rep means:
- rep exposure between the superclass and all its subclasses
- rep dependence between superclass and subclasses
- superclass and subclass can inadvertently break each other’s rep invariants
Designing for safe subclassing means that the superclass must now offer two contracts: one for interaction with clients, and one for subclasses. These issues simply do not arise with interfaces.
Overriding and dynamic dispatch
In one respect, however, we can’t get away from thinking about subclassing when we write classes in TypeScript.
Every TypeScript class is automatically a subclass of Object
, and automatically inherits methods like toString()
.
We often need to override this inherited implementation, replacing it with our own implementation, in order for our class to behave correctly.
To illustrate, let’s look at a simpler case: overriding the toString()
method for our FastMyString
class.
The default toString()
provided by Object
is not particularly useful for debugging:
const fms: FastMyString = new FastMyString("hello");
fms.toString() → "[object Object]" // not useful!
When a method has multiple implementations, such as when a class overrides a method like toString()
that already has an implementation in its superclass Object
, then TypeScript has to determine which of the implementations to use when the method is called.
This process is called dispatching to the method, and TypeScript’s rule is dynamic dispatch – it uses the implementation appropriate to the dynamic type of the object, not the static type of the reference that points to the object.
Let’s make this better by overriding toString()
:
class FastMyString {
...
public toString(): string {
let s = "";
for (let i = 0; i < this.length(); ++i) {
s += this.charAt(i);
}
return s;
}
}
const fms: FastMyString = new FastMyString("hello");
fms.toString() → "hello"
In this example, FastMyString
is the type of both the variable fms
and the object it points to.
But they don’t have to have to be the same type.
The variable can be a supertype, while the object itself is from a subtype.
We have already done this when we used a variable of type MyString
to hold a reference to FastMyString
.
But because Object
is a supertype of all classes in TypeScript, we can also do this:
const obj: Object = new FastMyString("hello");
obj.toString() → ?
But now there seems to be an ambiguity.
Object
has its own implementation of toString()
.
When we call obj.toString()
, does it call the original Object
implementation, or the overridden FastMyString
implementation?
Do we get "[object Object]"
or "hello"
?
The dynamic dispatch rule says that the type of the object at runtime determines the method implementation to call.
This type is not known until runtime, hence the word “dynamic.”
The type of the obj
variable, by contrast, is known at compile time.
Using dynamic dispatch, here is what we get:
const obj: Object = new FastMyString("hello");
obj.toString() → "hello"
…the same as we would get from fms.toString()
.
The effect of the dynamic dispatch rule is that it doesn’t matter what type of reference you have pointing to an object, it will always use the object’s dynamic type for method dispatch.
Static checking uses static types
Since we’re now looking at situations where the static type of a variable differs from the dynamic type of the object it points to, it’s worth revisiting what static checking can and can’t do. Consider this example:
const obj: Object = new FastMyString("hello");
obj.charAt(0) → ?
What will happen? The charAt
method does indeed exist as a method on the FastMyString
object that obj
points to, so at runtime this code could conceivably work.
But charAt
is not a method on Object
. Since obj
has static type Object
, the TypeScript compiler will consider obj.charAt(0)
to be a static type error, so it will not even compile the code.
TypeScript relies on the static type of an expression at compile time to ensure that the method call will be always possible at runtime.
Because obj
has type Object
, at runtime it may point to any object, including Object
, Set
, Date
, or many other objects that don’t have a charAt
method.
Static type checking requires that TypeScript only allow expressions that are guaranteed to have an operation to call at runtime.
You can’t use a supertype reference (here Object
) to call a method that’s defined only on a subtype (FastMyString
).
reading exercises
let arr: ReadonlyArray<string> = ["abc"];
let obj: Object = arr;
/* next expression here */
What does each of the following do, if substituted for /* next expression here */
in the code above?
(missing explanation)
arr = obj as ReadonlyArray<string>;
(missing explanation)
(missing explanation)
(missing explanation)
Generic types
A useful feature of the TypeScript type system, and other modern statically-typed languages too, is a generic type: a type whose specification is in terms of a placeholder type to be filled in later.
The collection types are good examples of this.
Set<T>
is the ADT of finite sets of elements of some other type T
.
Instead of writing separate specifications and implementations for Set<string>
, Set<Integer>
, and so on, we can design one interface Set<T>
, which represents an entire family of set ADTs.
Here is a simplified version of the Set
interface, MySet
:
/**
* A mutable set.
* @param <T> type of elements in the set
*/
interface MySet<T> {
// example observer operations
/**
* Get size of the set.
* @returns the number of elements in this set
*/
size(): number;
/**
* Test for membership.
* @param e an element
* @returns true iff this set contains e
*/
has(e: T): boolean;
// example mutator operations
/**
* Modifies this set by adding e to the set.
* @param e element to add
*/
add(e: T): void;
/**
* Modifies this set by removing e, if found.
* If e is not found in the set, has no effect.
* @param e element to remove
*/
delete(e: T): void;
}
Notice how the specs are in terms of our abstract notion of a set; it would be malformed to mention the details of any particular implementation of sets with particular private fields.
These specs should apply to any valid implementation of the MySet
ADT.
We will also want a creator operation:
// example creator operation
/**
* Make an empty set.
* @param <U> type of elements in the set
* @returns a new set instance, initially empty
*/
function makeMySet<U>(): MySet<U> { ... }
The make
operation is implemented as a factory function.
Clients will write code like:
const strings: MySet<string> = makeMySet();
and the compiler will infer that the new MySet
should be a set of string
objects.
Note that a TypeScript syntax requirement arises in the method signature above: a function like makeMySet
needs to independently declare its type parameter before the parameter list, here shown as function makeMySet<U>
.
We called the type parameter U
instead of T
just to highlight that it’s an arbitrary name, just like the name of a value parameter.
The instance methods above don’t have this extra <...>
declaration because they always use the type parameter declared by the enclosing class or interface, interface MySet<T>
.)
Implementing generic interfaces
Suppose we want to implement the generic MySet<T>
interface above.
We can either write a non-generic implementation that replaces T
with a specific type, or a generic implementation that keeps the placeholder.
Generic interface, non-generic implementation.
Let’s implement MySet<T>
for a particular type T
.
We’ll implement a set of digits:
class DigitSet implements MySet<number>
Wherever the interface mentions placeholder type T
, the DigitSet
implementations replace T
with number
.
For example:
Note that this representation for DigitSet
is not suited for representing sets of arbitrary-type elements.
Its rep cannot represent a MySet<string>
without careful work to define a new rep invariant and abstraction function that can handle arbitrary strings.
Generic interface, generic implementation.
We can also implement the generic MySet<T>
interface without picking a type for T
.
In that case, we write our code blind to the actual type that clients will choose for T
.
|
|
A generic implementation can only rely on details of a placeholder type T
that are explicitly included in the generic interface specification.
HashSet
can’t call methods on its elements that are only declared for some specific type like string
, because T
might be any type.
reading exercises
Consider again our MySet
interface:
interface MySet<T> {
has(e: T): boolean;
// ...
}
What is wrong (if anything) with the following pieces of code that try to create new ADTs based on MySet
?
/** Represents a set of integers. */
/*1*/ class IntSet implements MySet<number> {
/*2*/ public has(e: T): boolean { ... }
// ...
}
(missing explanation)
/** Represents a set that can grow but never shrink. */
/*1*/ interface GrowingSet<Z> extends MySet<T> {
/*2*/ has(z: Z): boolean;
// ...
}
(missing explanation)
/** Represents a set whose for-iteration returns the elements in the order they were added. */
/*1*/ class OrderedSet<T> implements MySet<T> {
/*2*/ public has(e: T): boolean { ... }
// ...
}
(missing explanation)
Consider another possible operation on our MySet
interface:
interface MySet<T> {
/**
* Picks an element from the set.
* @returns an element in the set
* @throws NoSuchElementError if set is empty
*/
pick(): T;
// ...
}
This operation is underdetermined – it may have many possible legal return values for a particular set.
Suppose SimpleSet
implements MySet
, and wants to write its own spec for pick
:
class SimpleSet<T> implements MySet<T> {
private readonly elementList: Array<T> = [];
// ...
/**
* <new spec here>
*/
public pick(): T {
...
}
}
(missing explanation)
(missing explanation)
(missing explanation)
Enumerations
Sometimes an ADT has a small, finite set of immutable values, such as:
- months of the year: January, February, …
- days of the week: Monday, Tuesday, …
- compass points: north, south, east, west
- line caps in a line drawing: butt, round, square
A type like this may be used as part of a more complex type (like DateTime
or Latitude
), or as a parameter that changes the behavior of a method (like drawLine
).
When the set of values is small and finite, it makes sense to define all the values as named constants, called an enumeration.
TypeScript has the enum
construct to make this convenient:
enum Month { JANUARY, FEBRUARY, MARCH, ..., DECEMBER };
This enum
defines a new type name, Month
, in the same way that class
and interface
define new type names.
It also defines a set of named values, which we write in all-caps because they are effectively public static readonly
constants.
So you can now write:
const thisMonth = Month.MARCH;
This idea is called an enumeration because you are explicitly listing the elements of the set, and TypeScript is assigning numbers to them as their default rep values.
In the simplest use case for enumerations, the only operation you need is testing equality between values:
if (day === Day.SATURDAY || day === Day.SUNDAY) {
console.log("It's the weekend");
}
In that sense, using an enumeration can feel like you’re using primitive number
constants.
TypeScript even supports using them in switch
statements (which otherwise only allow primitive number types and their wrappers, and string
, but not other objects):
switch (direction) {
case Direction.NORTH: return "polar bears";
case Direction.SOUTH: return "penguins";
case Direction.EAST: return "elephants";
case Direction.WEST: return "llamas";
}
But unlike number
values, enumerations have more static checking:
const firstMonth: Month = Day.MONDAY; // static error: MONDAY has type Day, not type Month
TypeScript actually offers two kinds of enumerations: numeric enumerations, where the rep is a number, and string enumerations, where the rep is a string.
The default is a numeric enumeration, so the Month
enumeration above simply uses a number to represent each month, assigned in order from 0, so JANUARY
is 0, FEBRUARY
is 1, etc.
To get a string enumeration instead, just initialize each member of the enumeration to a distinct string value:
enum Direction {
NORTH = "north",
SOUTH = "south",
EAST = "east",
WEST = "west"
}
This is largely an implementation detail, because we are talking about the rep of the enumeration. But it affects the toString()
operation that is automatically generated for these enumerations. Printing a value of a numeric enumeration will just print its number (so console.log(Month.FEBRUARY)
prints 1), but for a string enumeration, it prints the string value (console.log(Direction.EAST)
prints east
).
Thus debugging may be easier with string enumerations.
reading exercises
Consider these three alternative ways to name the semester you’re registering for:
function startRegistrationFor(semester: string, year: number): void { ... }
...
startRegistrationFor("Fall", 2023);
const FALL = "Fall";
function startRegistrationFor(semester: string, year: number): void { ... }
...
startRegistrationFor(FALL, 2023);
enum Semester { IAP, SPRING, SUMMER, FALL };
function startRegistrationFor(semester: Semester, year: number): void { ... }
...
startRegistrationFor(Semester.FALL, 2023);
Which of the following are correctly-stated advantages or disadvantages of each approach?
(missing explanation)
(missing explanation)
(missing explanation)
(missing explanation)
(missing explanation)
Getters and setters
We often have observer operations that take no arguments, like this one for our MyString
interface:
interface MyString {
/**
* @returns the number of characters in this string
*/
length(): number;
// ...
}
What if we instead want to represent this observer operation (syntactically) as if it were an instance variable?
Then clients can omit the parentheses and use s.length
on a variable s
of type MyString
, just like we use Set.size
or Array.length
for those built-in types.
Declaring this observer property in the interface is straightforward:
interface MyString {
/**
* the number of characters in this string
*/
readonly length: number;
// ...
}
(Note the important readonly
– we don’t want clients to try to resize the string by assigning to this property.)
But doesn’t this sacrifice representation independence?
Doesn’t it constrain the representation we choose for MyString
, by forcing us to have a public length
instance variable in our rep?
In languages like TypeScript that support getter methods, the answer is no.
Using the keyword get
, we can define a getter method that is called whenever a client reads the length
instance variable:
class FastMyString implements MyString {
public get length(): number {
return this.end - this.start;
}
// ...
}
But a different implementation of MyString
might choose to represent length
directly as an instance variable:
class SimpleMyString implements MyString {
public readonly length;
// ...
}
The important thing is that FastMyString
and SimpleMyString
can decide whether the observer is implemented by a getter function or a variable, without affecting clients.
From the client’s point of view, length
just looks like an instance variable.
For a public instance variable that is intended to look reassignable, we can define both a getter method (observer) and a setter method (mutator). The setter method can do whatever is necessary to make the rest of the rep consistent with the reassigned value – and even throw an exception if the client has assigned a bad value.
ADTs in non-OOP languages
One more way to define an abstract data type is simply as a group of globally-accessible functions that operate on an opaque data type. This pattern is rarely seen in object-oriented languages like TypeScript and Python, but often appears in older programming languages like C. Here is an example of some file I/O in C:
FILE* f = fopen("out.txt", "w"); // open a file for writing
fputs("hello", f); // write to the file
fclose(f); // close the file
In this code, the abstract data type is FILE
, representing an open file.
The functions fopen
, fputs
, and fclose
are operations of the type, taking a FILE
as an argument or returning it as a return value.
fputs
expects the file as its second argument, oddly enough.
Because C is not object-oriented, there is no notion of classes, methods, fields, or even private
, but representation independence is still achieved.
The client has no way to look inside a FILE
value.
The only way to use a FILE
is by passing it to operations provided by the type.
One key takeaway from this is that the notion of an abstract data type does not depend on language features like classes, or interfaces, or public
/private
access control.
Data abstraction is a powerful design pattern that is ubiquitous in software engineering.
ADTs in TypeScript
We’ve now completed our TypeScript toolbox of ADT concepts from the first ADTs reading. The new concepts from this reading are highlighted in yellow below.
Constructor |
|||
Summary
TypeScript interfaces help us formalize the idea of an abstract data type as a set of operations that must be supported by a type.
Safe from bugs. An ADT is defined by its operations, and interfaces do just that. When clients use an interface type, static checking ensures that they only use methods defined by the interface. If the implementation class exposes other methods — or worse, has visible representation — the client can’t accidentally see or depend on them. When we have multiple implementations of a data type, interfaces provide static checking of the method signatures.
Easy to understand. Clients and maintainers know exactly where to look for the specification of the ADT. Since the interface doesn’t contain fields or implementations of instance methods, it’s easier to keep details of the implementation out of the specifications.
Ready for change. We can easily add new implementations of a type by adding classes that implement an interface. If we avoid constructors in favor of factory functions, clients will only see the interface. That means we can switch which implementation class clients are using without changing their code at all.
TypeScript enumerations allow defining an ADT with a small finite set of immutable values. Compared to the old-fashioned alternatives of special integer values or special strings, enumerations help make our code:
Safe from bugs. Static checking ensures that clients can’t use values outside the finite set, or confuse two different enumerated types.
Easy to understand. Named constants are not as magical as integer literals, and named types explain themselves better than
number
orstring
.Ready for change. The code that uses a particular enumeration is easy to find by searching for the enumeration’s type name, so that it can be reviewed and changed when the enumeration changes. (IDE refactoring tools can even do some of these changes automatically.) Code that depends on a particular set of string or integer constants is much harder to distinguish from code that uses strings and integers for other purposes.