6.031
6.031 — Software Construction
Fall 2021

Reading 12: Defining ADTs with Interfaces, Generics, Enums, and Functions

TypeScript Tutor exercises

Keep making progress on TypeScript by completing this category in the TypeScript Tutor:

Software in 6.031

Safe from bugsEasy to understandReady for change
Correct today and correct in the unknown future. Communicating clearly with future programmers, including future you. Designed to accommodate change without rewriting.

Objectives

Today’s class is about various ways to implement abstract data types, including:

  • interfaces: separating the interface of an ADT from its implementation;
  • generic types: defining a family of ADTs using generic type parameters;
  • enumerations: defining an ADT with a small finite set of values;
  • global functions operating on an opaque type: rare in TypeScript but common in non-object-oriented languages.

We also discuss subtyping as a relationship between two types determined by their specs, and distinguish it from subclassing, which involves reps as well.

After today’s class, you should be able to:

  • define ADTs using classes, interfaces, generics, and enumerations
  • determine whether one type is a subtype of another

Interfaces

TypeScript’s interface is a useful language mechanism for expressing an abstract data type. An interface in TypeScript is a list of method signatures without method bodies. A class implements an interface if it declares the interface in its implements clause, and provides method bodies for all of the interface’s methods. So one way to define an abstract data type in TypeScript is as an interface. The type’s implementation is then a class implementing that interface.

One advantage of this approach is that the interface can specify the contract for the client and nothing more. The interface is all a client programmer needs to read to understand the ADT. The client can’t create inadvertent dependencies on the ADT’s rep, because the rep isn’t present in the interface (not even as private fields). The implementation is kept well and truly separated, in a different class altogether.

Another advantage is that multiple different representations of the abstract data type can coexist in the same program, as different classes implementing the interface. When an abstract data type is represented just as a single class, without an interface, it’s harder to have multiple representations. In the MyString example from Abstract Data Types, MyString was a single class. We explored two different representations for MyString, but we couldn’t have both representations for the ADT in the same program.

TypeScript’s static type checking allows the compiler to catch many mistakes in implementing an ADT’s contract. For instance, it is a compile-time error to omit one of the required methods, or to give a method the wrong return type. Unfortunately, the compiler doesn’t check for us that the code adheres to the specs of those methods that are written in documentation comments.

About interfaces in TypeScript

Syntactically, a TypeScript interface contains the spec of an ADT, namely its public method signatures and public instance methods. Each instance method signature ends with a semicolon.

An interface does not include information about the rep, so it should not declare any private instance variables or include instance method bodies.

Because an interface has no rep, TypeScript doesn’t allow it to be instantiated with new. The expression new ArrayLike<number>() is a static error. To make an instance of an ArrayLike type, you need to instead instantiate a class that provides a rep for it, like new Array<number>().

reading exercises

TypeScript interfaces

Consider this TypeScript interface and TypeScript class, which are intended to represent a curve in the 2D plane:

      /** Represents an immutable curve in the plane. */
      interface Curve {

          /** @returns true if the point (x,y) lies on this curve */
          contains(x:number, y:number):boolean;

          /** @returns a curve formed by connecting this with that */
/* A */   join(that:Curve):ArrayCurve;

      }

      /** Implementation of Curve. */
      class ArrayCurve implements Curve {

          /** make a one-point curve */
          public constructor(x:number, y:number) { ... }

          /** @returns a curve formed by connecting this with that */
          public join(that:Curve):ArrayCurve { ... }

          /** @returns the total path length of this curve */
          public pathLength():number { ... }

      }

The line labeled A is a problem because Curve mentions ArrayCurve, but ArrayCurve also mentions Curve, which is circular.

(missing explanation)

The line labeled A is a problem because it isn’t representation-independent.

(missing explanation)

ArrayCurve doesn’t correctly implement Curve because it’s missing the contains() method.

(missing explanation)

ArrayCurve doesn’t correctly implement Curve because it includes a method pathLength that Curve doesn’t have.

(missing explanation)

Subtypes

Recall that a type is a set of values, with associated operations. For example, the TypeScript ArrayLike type is defined by an interface, which has the operations length and [...] indexing.

If we think about all possible ArrayLike values, none of them are actually objects of type ArrayLike: we cannot create instances of an interface. Instead, those values might be string objects, or Array objects, or objects of any class that provides the operations expected by ArrayLike (namely length and [...] indexing).

A subtype is simply a subset of the supertype: string and Array are subtypes of ArrayLike.

“B is a subtype of A” means “every B is an A.” In terms of specifications: “every B satisfies the specification for A.”

That means B is only a subtype of A if B’s specification is at least as strong as A’s specification. When we declare a class that implements an interface, the TypeScript compiler enforces part of this requirement automatically: for example, it ensures that every method in A appears in B, with a compatible type signature. Class B cannot implement interface A without implementing all of the operations declared in A.

But the compiler cannot check that we haven’t weakened the specification in other ways: strengthening the precondition on some inputs to a method, weakening a postcondition, weakening a guarantee that the interface abstract type advertises to clients. If you declare a subtype in TypeScript — implementing an interface is our current focus — then you must ensure that the subtype’s spec is at least as strong as the supertype’s.

How to declare subtypes in TypeScript

To declare that a class B is a subtype of an interface A, use implements:

class MyArray<T> implements ArrayLike<T> {
    ...
}

This declaration requires MyArray to implement (provide method bodies for) all the operations found in ArrayLike, with specs at least as strong as the specs in ArrayLike. TypeScript statically checks the types, but a human programmer must check the rest of the spec.

You can also declare that an interface is a subtype of another interface, using extends:

interface SortedArrayLike<T> extends ArrayLike<T> {
    ...
}

Because SortedArrayLike is also an interface, it doesn’t provide any implementations of ArrayLike operations. But it may strengthen the spec of the type, by strengthening existing ArrayLike operations or adding new operations.

Structural subtyping in TypeScript

In TypeScript, there is one more way for a type B to be a subtype of another type A: structural subtyping. With structural subtyping, B doesn’t have to mention A in its declaration (no B implements A or B extends A ). But if B provides at least all the operations required by A – the same public methods and public instance variables, with compatible types – then TypeScript will consider B a subtype of A.

Structural subtyping is convenient and frequently necessary in TypeScript, but it opens a hole in type safety, because it allows B to be a subtype of A even if B has an incompatible specification. Mutability and immutability offer a good example of both the advantages and pitfalls here. As we saw in the Mutability and Immutability reading, TypeScript offers a ReadonlyArray interface, which has all the observers and producers of Array, but omits the mutators. As a result, Array is a structural subtype of ReadonlyArray, so TypeScript allows us to write useful and good code like this:

const readonlyArr: ReadonlyArray<number> = [ 1, 2, 3 ];

The initializer in this declaration, [ 1, 2, 3 ], produces an Array object. Because Array is a structural subtype of ReadonlyArray, we can then assign it to a variable of type readonlyArr, and henceforth treat it like an immutable value. TypeScript’s static checking will ensure that we can’t accidentally go the other way:

const arr: Array<number> = readonlyArr; 
// static error!  ReadonlyArray is *not* a structural subtype of Array,
// because it doesn't provide Array's mutator operations

So that’s how structural subtyping helps us. But here’s the hole in type safety. Suppose that we had kept an alias to the original mutable Array:

const arr: Array<number> = [ 1, 2, 3 ];
const readonlyArr: ReadonlyArray<number> = arr;

Now we can mutate readonlyArr, using its alias arr:

arr.push(4);
console.log(readonlyArr); // prints  1,2,3,4

Using this alias, we have broken the best part of ReadonlyArray: its immutability! Even though Array is a structural subtype of ReadonlyArray, it is not a true subtype, because the Array contract does not provide immutability.

So structural subtyping is convenient, but should be used with care. This example has shown a common use of it — constructing a list, map, or set using a mutable type, and then assigning it to a ReadonlyArray, ReadonlyMap, or ReadonlySet to treat it as immutable from then on. It’s important to throw away all aliases that use the original mutable type, to avoid the risk shown here.

reading exercises

Immutable shapes

Let’s define an interface for rectangles:

/** An immutable rectangle. */
interface ImmutableRectangle {
    /** @returns the width of this rectangle */
    getWidth():number;
    /** @returns the height of this rectangle */
    getHeight():number;
}

We would like to say that every square is a rectangle:

/** An immutable square. */
class ImmutableSquare implements ImmutableRectangle {
    private readonly side:number;
    /** Make a new side x side square. */
    public constructor(side:number) { this.side = side; }
    /** @returns the width of this square */
    public getWidth():number { return this.side; }
    /** @returns the height of this square */
    public getHeight():number { return this.side; }
}

Does ImmutableSquare.getWidth() satisfy the spec of ImmutableRectangle.getWidth()?

Does ImmutableSquare.getHeight() satisfy the spec of ImmutableRectangle.getHeight()?

Does the whole ImmutableSquare spec satisfy the ImmutableRectangle spec?

(missing explanation)

Mutable shapes

Now let’s look at mutable versions of rectangles and squares.

/** A mutable rectangle. */
interface MutableRectangle {
    // ... same methods as ImmutableRectangle above ...
    /** Set this rectangle's dimensions to width x height. */
    setSize(width:number, height:number);
}

Surely every square is still a rectangle?

/** A mutable square. */
class MutableSquare implements MutableRectangle /* hopefully? */ {
    private side:number;
    // ... same constructor and methods as ImmutableSquare above ...
    // TODO implement setSize(..)
}

For each possible MutableSquare.setSize(..) spec below, is it a valid strengthening of MutableRectangle.setSize()?

/** 
 * Set this square's dimensions to width x height.
 * Requires width = height.
 */
public setSize(width:number, height:number) { ... }

(missing explanation)

/** 
 * Set this square's dimensions to width x height.
 * @throws Error if width != height
 */
public setSize(width:number, height:number) { ... }

(missing explanation)

/**
 * If width = height, set this square's dimensions to width x height.
 * Otherwise, new dimensions are unspecified.
 */
public setSize(width:number, height:number) { ... }

(missing explanation)

Example: MyString

Let’s revisit MyString. Using an interface instead of a class for the ADT, we can support multiple implementations:

/**
 * MyString represents an immutable sequence of characters.
 */
interface MyString { 

    // We'll skip this creator operation for now
    // /**
    //  * @param s 
    //  * @returns MyString representing the sequence of characters in s
    //  */
    // public constructor(s:string) { ... }

    /**
     * @returns number of characters in this string
     */
    length():number;

    /**
     * @param i character position (requires 0 <= i < string length)
     * @returns character at position i
     */
    charAt(i:number):string;

    /**
     * Get the substring between start (inclusive) and end (exclusive).
     * @param start starting index
     * @param end ending index.  Requires 0 <= start <= end <= string length.
     * @returns string consisting of charAt(start)...charAt(end-1)
     */
    substring(start:number, end:number):MyString;
}

Here’s our first implementation:

class SimpleMyString implements MyString {

    private a: Uint16Array;

    public constructor(s: string) {
        this.a = new Uint16Array(s.length);
        for (let i = 0; i < s.length; ++i) {
           this.a[i] = s.charCodeAt(i);
        }
    }

    public length(): number {
        return this.a.length;
    }

    public charAt(i: number): string {
        return String.fromCharCode(this.a[i]);
    }

    public substring(start: number, end: number): MyString {
        const that = new SimpleMyString("");  // make a temporarily-empty object
        that.a = this.a.slice(start, end);    // ... and immediately replace its array
        return that;
    }
}

And here’s the optimized implementation:

class FastMyString implements MyString {

    private a: Uint16Array;
    private start:number;
    private end:number;

    public constructor(s: string) {
        this.a = new Uint16Array(s.length);
        for (let i = 0; i < s.length; ++i) {
            this.a[i] = s.charCodeAt(i);
        }
        this.start = 0;
        this.end = this.a.length;
    }

    public length(): number {
        return this.end - this.start;
    }

    public charAt(i: number): string {
        return String.fromCharCode(this.a[this.start + i]);
    }

    public substring(start: number, end: number): MyString {
        const that = new FastMyString(""); // make a temporarily-empty object
        that.a = this.a;                   // ... and immediately replace its instance variables 
        that.start = this.start + start;
        that.end = this.start + end;
        return that;
    }
}

How will clients use this ADT? Here’s an example:

const s:MyString = new FastMyString("good morning");
console.log("The first character is: " + s.charAt(0));

Unfortunately, this pattern breaks the abstraction barrier we’ve worked so hard to build between the abstract type and its concrete representations. Clients must know the name of the concrete representation class. Because an interface in TypeScript has no constructor, a client must directly call one of the concrete class constructors. The spec of that constructor won’t appear anywhere in the interface, so there’s no static guarantee that different implementations will even provide the same constructors.

Instead of using a constructor, then, we can implement the creator operation as a factory function:

/**
 * @param s 
 * @returns MyString representing the sequence of characters in s
 */
function makeMyString(s:string):MyString {
    return new FastMyString(s);
}

Now a client can use the ADT without breaking the abstraction barrier:

const s:MyString = makeMyString("good morning");
console.log("The first character is: " + s.charAt(0));

Hiding the implementation completely is a tradeoff, because sometimes the client wants a choice of implementations with different characteristics.

reading exercises

Code review

Let’s review the code for FastMyString. Which of these are useful criticisms:

The abstraction function should be documented

(missing explanation)

The representation invariant should be documented

(missing explanation)

The rep fields should be readonly so they cannot be reassigned

(missing explanation)

The charAt specification exposes that the rep contains individual characters, and it shouldn’t.

(missing explanation)

The charAt implementation should behave more helpfully when i is greater than the length of the string

(missing explanation)

Why interfaces?

Interfaces are used frequently in real code. Not every class is associated with an interface, but there are a few good reasons to bring an interface into the picture.

  • Documentation for both the compiler and for humans. Not only does an interface help the compiler catch ADT implementation bugs, but it is also much more useful for a human to read than the code for a concrete implementation. Such an implementation intersperses ADT-level types and specs with implementation details.
  • Allowing performance trade-offs. Different implementations of the ADT can provide methods with very different performance characteristics. Different applications may work better with different choices, but we would like to code these applications in a way that is representation-independent. From a correctness standpoint, it should be possible to drop in any new implementation of a key ADT with simple, localized code changes.
  • Methods with intentionally underdetermined specifications. An ADT for finite sets could leave unspecified the element order one gets when converting to an array. Some implementations might use slower method implementations that manage to keep the set representation in some sorted order, allowing quick conversion to a sorted array. Other implementations might make many methods faster by not bothering to support conversion to sorted arrays.
  • Multiple views of one class. A TypeScript class may implement multiple interfaces. For instance, a user interface widget displaying a drop-down list is natural to view as both a widget and a list. The class for this widget could implement both interfaces. In other words, we don’t always implement an ADT multiple times just because we are choosing different data structures. Those implementations may be various different sorts of objects that can be seen as special cases of the ADT.
  • More and less trustworthy implementations. Another reason to implement an interface more than once might be that it is easy to build a simple implementation that you believe is correct, while you can work harder to build a fancier version that is more likely to contain bugs. You can choose implementations for applications based on how bad it would be to get bitten by a bug.

reading exercises

Suppose you have an abstract data type for rational numbers, which is currently represented as a TypeScript class:

class Rational {
    ...
}

You decide to change Rational to a TypeScript interface instead, along with an implementation class called IntFraction:

interface Rational {
    ...
}

class IntFraction implements Rational {
    ...
}

For each piece of code below, taken from your original Rational class, first identify the role it plays in the ADT. Then decide where it should go in your new ADT design: in the interface Rational, in the implementation class IntFraction, or both?

Interface + implementation 1
private readonly numerator:number;
private readonly denominator:number;

This piece of code is, or is part of:
(check all that apply)

It should be put in:
 

(missing explanation)

Interface + implementation 2
//   denominator > 0
//   numerator and denominator are relatively prime

This piece of code is, or is part of:
(check all that apply)

It should be put in:
 

(missing explanation)

Interface + implementation 3
//   AF(numerator, denominator) = numerator / denominator

This piece of code is, or is part of:
(check all that apply)

It should be put in:
 

(missing explanation)

Interface + implementation 4
    /**
     * @param that another Rational
     * @returns a Rational equal to (this / that)
     */

This piece of code is, or is part of:
(check all that apply)

It should be put in:
 

(missing explanation)

Interface + implementation 5
    isZero():boolean

This piece of code is, or is part of:
(check all that apply)

It should be put in:
 

(missing explanation)

Interface + implementation 6
    return this.numerator === 0;

This piece of code is, or is part of:
(check all that apply)

It should be put in:
 

(missing explanation)

Subclassing

Implementing an interface is one way to make a subtype of another type. Another technique, which you may have seen before, is subclassing. Subclassing means defining a class as an extension, or subclass, of another class, so that:

  • the subclass automatically inherits the instance methods of its superclass, including their method bodies
    • the subclass can override any of those inherited method bodies by substituting its own method body
  • the subclass also inherits the private rep of its superclass
  • the subclass can also add new instance methods and rep fields for its own purposes

The boldfaced parts in the list above show how subclassing differs from implementing an interface. An interface has neither method bodies nor rep, so it’s not possible to inherit them when you implement an interface.

Subclassing is a common feature of object-oriented programming languages. You may have already encountered it in Python, where you can write class SpottedTurtle(Turtle): ... to define a new class SpottedTurtle as a subclass of the class Turtle. TypeScript provides the same effect with the syntax class SpottedTurtle extends Turtle { ... }.

Just like implementing an interface, subclassing should imply subtyping. If SpottedTurtle is a subclass of Turtle, then SpottedTurtle needs to have at least as strong a spec as Turtle, because TypeScript will allow SpottedTurtle objects to be used anywhere that a Turtle might be expected.

But unlike implementing an interface, a subclass inherits not only the spec of its superclass but also its rep – the fields that implement it. At first this seems like a great idea. Reusing implementation ought to make our code more DRY, and more ready for change.

But years of experience by API designers have revealed subtle challenges to making subclassing safe from bugs and ready for change. You can read more if you’re interested, but essentially, inheriting the rep means:

  • rep exposure between the superclass and all its subclasses
  • rep dependence between superclass and subclasses
  • superclass and subclass can inadvertently break each other’s rep invariants

Designing for safe subclassing means that the superclass must now offer two contracts: one for interaction with clients, and one for subclasses. These issues simply do not arise with interfaces.

Overriding and dynamic dispatch

In one respect, however, we can’t get away from thinking about subclassing when we write classes in TypeScript. Every TypeScript class is automatically a subclass of Object, and automatically inherits methods like toString(). We often need to override this inherited implementation, replacing it with our own implementation, in order for our class to behave correctly.

To illustrate, let’s look at a simpler case: overriding the toString method for our FastMyString class. The default toString() provided by Object is not particularly useful for debugging:

const fms:FastMyString = new FastMyString("hello");
fms.toString() → "[object Object]" // not useful!

When a method has multiple implementations, such as when a class overrides a method like toString() that already has an implementation in its superclass Object, then TypeScript has to determine which of the implementations to use when the method is called. This process is called dispatching to the method, and TypeScript’s rule is dynamic dispatch – it uses the implementation appropriate to the dynamic type of the object, not the static type of the reference that points to the object.

Let’s make this better by overriding toString():

class FastMyString {
    ...
    public toString():string {
      let s = "";
      for (let i = 0; i < this.length(); ++i) {
          s += this.charAt(i);
      }
      return s;
    }
}

After this change, we get:

const fms:FastMyString = new FastMyString("hello");
fms.toString() → "hello"
fms:FastMyStringFastMyString"hello"obj:ObjectFastMyString"hello"

In this example, FastMyString is the type of both the variable fms and the object it points to. But they don’t have to have to be the same type. The variable can be a supertype, while the object itself is from a subtype. We have already done this when we used a variable of type MyString to hold a reference to FastMyString. But because Object is a supertype of all classes in TypeScript, we can also do this:

const obj:Object = new FastMyString("hello");
obj.toString() → ?

But now there seems to be an ambiguity. Object has its own implementation of toString(). When we call obj.toString(), does it call the original Object implementation, or the overridden FastMyString implementation? Do we get "[object Object]" or "hello"?

The dynamic dispatch rule says that the type of the object at runtime determines the method implementation to call. This type is not known until runtime, hence the word “dynamic.” The type of the obj variable, by contrast, is known at compile time.

Using dynamic dispatch, here is what we get:

const obj:Object = new FastMyString("hello");
obj.toString() → "hello"

…the same as we would get from fms.toString().

The effect of the dynamic dispatch rule is that it doesn’t matter what type of reference you have pointing to an object, it will always use the object’s type for method dispatch.

reading exercises

Static vs. dynamic types
let arr:ReadonlyArray<string> = ["abc"];
let obj:Object = arr;

What is the static type of arr?

(missing explanation)

What is the dynamic type of arr?

(missing explanation)

What is the static type of obj?

(missing explanation)

What is the dynamic type of obj?

(missing explanation)

Static typing vs. dynamic dispatch
const arr:ReadonlyArray<string> = ["abc"];
const obj:Object = arr;
/* next expression here */

What does each of the following do, if substituted for /* next expression here */ in the code above?

arr.toString()

(missing explanation)

obj.toString()

(missing explanation)

arr.length

(missing explanation)

obj.length

(missing explanation)

Assignment
let arr:ReadonlyArray<string> = ["abc"];
let obj:Object = arr;
/* next expression here */

What does each of the following do, if substituted for /* next expression here */ in the code above?

arr = obj;

(missing explanation)

arr = obj as ReadonlyArray<string>;

(missing explanation)

obj = "abc";

(missing explanation)

arr = "abc";

(missing explanation)

Generic types

A useful feature of the TypeScript type system, and other modern statically-typed languages too, is a generic type: a type whose specification is in terms of a placeholder type to be filled in later.

The collection types are good examples of this. Set<T> is the ADT of finite sets of elements of some other type T. Instead of writing separate specifications and implementations for Set<string>, Set<Integer>, and so on, we can design one interface Set<T>, which represents an entire family of set ADTs.

Here is a simplified version of the Set interface, MySet:

/**
 * A mutable set.
 * @param <T> type of elements in the set
 */
interface MySet<T> {

    // example observer operations

    /**
     * Get size of the set.
     * @returns the number of elements in this set
     */
     size(): number;

    /**
     * Test for membership.
     * @param e an element
     * @returns true iff this set contains e
     */
    has(e:T): boolean;

    // example mutator operations

    /**
     * Modifies this set by adding e to the set.
     * @param e element to add
     */
    add(e:T): void;

    /**
     * Modifies this set by removing e, if found.
     * If e is not found in the set, has no effect.
     * @param e element to remove
     */
    delete(e:T): void;

}

Notice how the specs are in terms of our abstract notion of a set; it would be malformed to mention the details of any particular implementation of sets with particular private fields. These specs should apply to any valid implementation of the MySet ADT.

We will also want a creator operation:

// example creator operation
/**
 * Make an empty set.
 * @param <U> type of elements in the set
 * @returns a new set instance, initially empty
 */
 function makeMySet<U>():MySet<U> { ... }

The make operation is implemented as a factory function. Clients will write code like:
const strings:MySet<string> = makeMySet();
and the compiler will infer that the new MySet should be a set of string objects.

Note that a TypeScript syntax requirement arises in the method signature above: a function like makeMySet needs to independently declare its type parameter before the parameter list, here shown as function makeMySet<U>. We called the type parameter U instead of T just to highlight that it’s an arbitrary name, just like the name of a value parameter. The instance methods above don’t have this extra <...> declaration because they always use the type parameter declared by the enclosing class or interface, interface MySet<T>.)

Implementing generic interfaces

Suppose we want to implement the generic MySet<T> interface above. We can either write a non-generic implementation that replaces T with a specific type, or a generic implementation that keeps the placeholder.

Generic interface, non-generic implementation. Let’s implement MySet<T> for a particular type T. We’ll implement a set of digits:

class DigitSet implements MySet<number>

Wherever the interface mentions placeholder type T, the DigitSet implementations replace T with number. For example:

/** represents a mutable set of elements of type T */
interface MySet<T> {

    // ...

    /**
     * Test for membership.
     * @param e an element
     * @returns true iff this set contains e
     */
    has(e:T): boolean;

    /**
     * Modifies this set by adding e to the set.
     * @param e element to add
     */
    add(e:T): void;

    // ...
}
/** represents a mutable set drawn from the digits {0...9} */
class DigitSet implements MySet<number> {

    private readonly digitInSet:Array<boolean> = 
        [ false, false, false, false, false,
          false, false, false, false, false ];


    /**
     * Test for membership.
     * @param e an element, must be integer {0...9}
     * @returns true iff this set contains e
     */
    public has(e:number):boolean {
        this.checkRep();
        return this.digitInSet[e];
    }

    /**
     * Modifies this set by adding e to the set.
     * @param e element to add, must be integer {0...9}
     */
    public add(e:number): void {
        this.digitInSet[e] = true;
        this.checkRep();
    }
    // ...
}

Note that this representation for DigitSet is not suited for representing sets of arbitrary-type elements. Its rep cannot represent a MySet<string> without careful work to define a new rep invariant and abstraction function that can handle arbitrary strings.

Generic interface, generic implementation. We can also implement the generic MySet<T> interface without picking a type for T. In that case, we write our code blind to the actual type that clients will choose for T.

interface MySet<T> {

    // ...
                                               
class HashSet<T> implements MySet<T> {

    // ...
     

A generic implementation can only rely on details of a placeholder type T that are explicitly included in the generic interface specification. HashSet can’t call methods on its elements that are only declared for some specific type like string, because T might be any type.

reading exercises

Implementing generics

Consider again our MySet interface:

interface MySet<T> {
    has(e:T): boolean;
    // ...
}

What is wrong (if anything) with the following pieces of code that try to create new ADTs based on MySet?

      /** Represents a set of integers. */
/*1*/ class IntSet implements MySet<number> {
/*2*/     public has(e:T):boolean { ... }
          // ...
      }

(missing explanation)

      /** Represents a set that can grow but never shrink. */
/*1*/ interface GrowingSet<Z> extends MySet<T> {
/*2*/     has(z:Z):boolean;
          // ...
      }

(missing explanation)

      /** Represents a set whose for-iteration returns the elements in the order they were added. */
/*1*/ class OrderedSet<T> implements MySet<T> {
/*2*/     public has(e:T):boolean { ... }
          // ...
      }

(missing explanation)

Changing the spec

Consider another possible operation on our MySet interface:

interface MySet<T> {
    /**
     * Picks an element from the set.
     * @returns an element in the set
     * @throws NoSuchElementError if set is empty
     */
    pick():T;

    // ...
}

This operation is underdetermined – it may have many possible legal return values for a particular set.

Suppose SimpleSet implements MySet, and wants to write its own spec for pick:

class SimpleSet<T> implements MySet<T> {

    private readonly elementList:Array<T> = [];
    // ...

    /**
     * <new spec here>
     */
    public pick():T {
        ...
    }
}

(missing explanation)

(missing explanation)

(missing explanation)

Hiding the implementation class

Suppose we wanted MySet to have its own creator operation emptySet(), so that clients don’t have to know about the existence of a specific implementation like SimpleSet.

(missing explanation)

(missing explanation)

Enumerations

Sometimes an ADT has a small, finite set of immutable values, such as:

  • months of the year: January, February, …
  • days of the week: Monday, Tuesday, …
  • compass points: north, south, east, west
  • line caps in a line drawing: butt, round, square

A type like this may be used as part of a more complex type (like DateTime or Latitude), or as a parameter that changes the behavior of a method (like drawLine).

When the set of values is small and finite, it makes sense to define all the values as named constants, called an enumeration. TypeScript has the enum construct to make this convenient:

enum Month { JANUARY, FEBRUARY, MARCH, ..., DECEMBER };

This enum defines a new type name, Month, in the same way that class and interface define new type names. It also defines a set of named values, which we write in all-caps because they are effectively public static readonly constants. So you can now write:

const thisMonth = Month.MARCH;

This idea is called an enumeration because you are explicitly listing the elements of the set, and TypeScript is assigning numbers to them as their default rep values.

In the simplest use case for enumerations, the only operation you need is testing equality between values:

if (day === Day.SATURDAY || day === Day.SUNDAY) {
  console.log("It's the weekend");
}

In that sense, using an enumeration can feel like you’re using primitive number constants. TypeScript even supports using them in switch statements (which otherwise only allow primitive number types and their wrappers, and string, but not other objects):

switch (direction) {
    case Direction.NORTH: return "polar bears";
    case Direction.SOUTH: return "penguins";
    case Direction.EAST:  return "elephants";
    case Direction.WEST:  return "llamas";
}

But unlike number values, enumerations have more static checking:

const firstMonth: Month = Day.MONDAY; // static error: MONDAY has type Day, not type Month 

TypeScript actually offers two kinds of enumerations: numeric enumerations, where the rep is a number, and string enumerations, where the rep is a string. The default is a numeric enumeration, so the Month enumeration above simply uses a number to represent each month, assigned in order from 0, so JANUARY is 0, FEBRUARY is 1, etc. To get a string enumeration instead, just initialize each member of the enumeration to a distinct string value:

enum Direction {
  NORTH = "north",
  SOUTH = "south",
  EAST = "east",
  WEST = "west"
}

This is largely an implementation detail, because we are talking about the rep of the enumeration. But it affects the toString() operation that is automatically generated for these enumerations. Printing a value of a numeric enumeration will just print its number (so console.log(Month.FEBRUARY) prints 1), but for a string enumeration, it prints the string value (console.log(Direction.EAST) prints east). Thus debugging may be easier with string enumerations.

reading exercises

Semesters

Consider these three alternative ways to name the semester you’re registering for:

  • with a string literal:
function startRegistrationFor(semester:string, year:number):void { ... }
...
startRegistrationFor("Fall", 2023);
  • with a named constant of type string:
const FALL = "Fall";
function startRegistrationFor(semester:string, year:number):void { ... }
...
startRegistrationFor(FALL, 2023);
  • with a value of an enumeration:
enum Semester { IAP, SPRING, SUMMER, FALL };
function startRegistrationFor(semester:Semester, year:number):void { ... }
...
startRegistrationFor(Semester.FALL, 2023);

Which of the following are correctly-stated advantages or disadvantages of each approach?

(missing explanation)

(missing explanation)

(missing explanation)

(missing explanation)

(missing explanation)

Getters and setters

We often have observer operations that take no arguments, like this one for our MyString interface:

interface MyString {
    /**
     * @returns the number of characters in this string
     */
    length():number;

    // ...
}

What if we instead want to represent this observer operation (syntactically) as if it were an instance variable? Then clients can omit the parentheses and use s.length on a variable s of type MyString, just like we use Set.size or Array.length for those built-in types. Declaring this observer property in the interface is straightforward:

interface MyString {
    /**
     * the number of characters in this string
     */
    readonly length: number;

    // ...
}

(Note the important readonly – we don’t want clients to try to resize the string by assigning to this property.)

But doesn’t this sacrifice representation independence? Doesn’t it constrain the representation we choose for MyString, by forcing us to have a public length instance variable in our rep?

In languages like TypeScript that support getter methods, the answer is no. Using the keyword get, we can define a getter method that is called whenever a client reads the length instance variable:

class FastMyString implements MyString {

    public get length(): number {
        return this.end - this.start;
    }

    // ... 
}

But a different implementation of MyString might choose to represent length directly as an instance variable:

class SimpleMyString implements MyString {

    public readonly length;

    // ...
}

The important thing is that FastMyString and SimpleMyString can decide whether the observer is implemented by a getter function or a variable, without affecting clients. From the client’s point of view, length just looks like an instance variable.

For a public instance variable that is intended to look reassignable, we can define both a getter method (observer) and a setter method (mutator). The setter method can do whatever is necessary to make the rest of the rep consistent with the reassigned value – and even throw an exception if the client has assigned a bad value.

ADTs in non-OOP languages

One more way to define an abstract data type is simply as a group of globally-accessible functions that operate on an opaque data type. This pattern is rarely seen in object-oriented languages like TypeScript and Python, but often appears in older programming languages like C. Here is an example of some file I/O in C:

FILE* f = fopen("out.txt", "w"); // open a file for writing
fputs("hello", f); // write to the file
fclose(f);  // close the file

In this code, the abstract data type is FILE, representing an open file. The functions fopen, fputs, and fclose are operations of the type, taking a FILE as an argument or returning it as a return value. fputs expects the file as its second argument, oddly enough.

Because C is not object-oriented, there is no notion of classes, methods, fields, or even private, but representation independence is still achieved. The client has no way to look inside a FILE value. The only way to use a FILE is by passing it to operations provided by the type.

One key takeaway from this is that the notion of an abstract data type does not depend on language features like classes, or interfaces, or public/private access control. Data abstraction is a powerful design pattern that is ubiquitous in software engineering.

ADTs in TypeScript

We’ve now completed our TypeScript toolbox of ADT concepts from the first ADTs reading:

ADT concept

Ways to do it in TypeScript

Examples

Abstract data type

Class

Date

Interface + class(es) 1

ArrayLike and Array

Enum 2

PenColor

Creator operation

Constructor

Array()

Static (factory) method

Array.of()

Constant 3

Number.POSITIVE_INFINITY

Observer operation

Instance method

String.charAt()

Static method

Object.entries()

Getter

Map.size

Producer operation

Instance method

String.trim()

Static method

Math.floor()

Mutator operation

Instance method

Array.push()

Static method

Object.assign()

Setter

Representation

private fields

Summary

TypeScript interfaces help us formalize the idea of an abstract data type as a set of operations that must be supported by a type.

This helps make our code…

  • Safe from bugs. An ADT is defined by its operations, and interfaces do just that. When clients use an interface type, static checking ensures that they only use methods defined by the interface. If the implementation class exposes other methods — or worse, has visible representation — the client can’t accidentally see or depend on them. When we have multiple implementations of a data type, interfaces provide static checking of the method signatures.

  • Easy to understand. Clients and maintainers know exactly where to look for the specification of the ADT. Since the interface doesn’t contain fields or implementations of instance methods, it’s easier to keep details of the implementation out of the specifications.

  • Ready for change. We can easily add new implementations of a type by adding classes that implement an interface. If we avoid constructors in favor of factory functions, clients will only see the interface. That means we can switch which implementation class clients are using without changing their code at all.

TypeScript enumerations allow defining an ADT with a small finite set of immutable values. Compared to the old-fashioned alternatives of special integer values or special strings, enumerations help make our code:

  • Safe from bugs. Static checking ensures that clients can’t use values outside the finite set, or confuse two different enumerated types.

  • Easy to understand. Named constants are not as magical as integer literals, and named types explain themselves better than number or string.

  • Ready for change. The code that uses a particular enumeration is easy to find by searching for the enumeration’s type name, so that it can be reviewed and changed when the enumeration changes. (IDE refactoring tools can even do some of these changes automatically.) Code that depends on a particular set of string or integer constants is much harder to distinguish from code that uses strings and integers for other purposes.