# Reading 19: Writing a Program with Abstract Data Types

#### Software in 6.031

#### Objectives

After today’s class, you should:

- Understand how to apply specification, testing, and iteration at different levels of software development: from single functions, to abstract data types, to entire programs

## Introduction

Bringing together what we’ve learned about abstract data types and recursive data types, this reading demonstrates the development of a recursive ADT to solve a simple problem.

## Recipes for programming

Recall the test-first programming approach for

**Writing a procedure** (a static method):

**Spec.**Write the spec, including the method signature (name, argument types, return types, exceptions), and the precondition and the postcondition as a Javadoc comment.**Test.**Create systematic test cases and put them in a JUnit class so you can run them automatically.- You may have to go back and change your spec when you start to write test cases. Just the process of writing test cases puts pressure on your spec, because you’re thinking about how a client would call the method. So steps 1 and 2 iterate until you’ve got a better spec and some good test cases.
- Make sure at least some of your tests are
*failing*at first. A test suite that passes all tests even when you haven’t implemented the method is not a good test suite for finding bugs.

**Implement.**Write the body of the method. You’re done when the tests are all green in JUnit.- Implementing the method puts pressure on both the tests and the specs, and you may find bugs that you have to go back and fix. So finishing the method may require changing the implementation, the tests, and the specs, and bouncing back and forth among them.

Let’s broaden this to a recipe for

**Writing an abstract data type**:

**Spec.**Write specs for the operations of the datatype, including method signatures, preconditions, and postconditions.**Test.**Write test cases for the ADT’s operations.- Again, this puts pressure on the spec. You may discover that you need operations you hadn’t anticipated, so you’ll have to add them to the spec.

**Implement.**For an ADT, this part expands to:**Choose rep.**Write down the private fields of a class, or the variants of a recursive datatype. Write down the rep invariant and abstraction function as a comment.**Assert rep invariant.**Implement a`checkRep()`

method that enforces the rep invariant. This is critically important if the rep invariant is nontrivial, because it will catch bugs much earlier.**Implement operations.**Write the method bodies of the operations, making sure to call`checkRep()`

in them. You’re done when the tests are all green in JUnit.

And let’s broaden it further to a recipe for

**Writing a program** (consisting of ADTs and procedures):

**Choose datatypes.**Decide which ones will be mutable and which immutable.**Choose procedures.**Write your top-level procedure and break it down into smaller steps.**Spec.**Spec out the ADTs and procedures. Keep the ADT operations simple and few at first. Only add complex operations as you need them.**Test.**Write test cases for each unit (ADT or procedure).**Implement simply first.**Choose simple, brute-force representations. The point here is to put pressure on the specs and the tests, and try to pull your whole program together as soon as possible. Make the whole program work correctly first. Skip the advanced features for now. Skip performance optimization. Skip corner cases. Keep a to-do list of what you have to revisit.**Iterate**. Now that it’s all working, make it work better. Reimplement, optimize, redesign if necessary.

## Problem: matrix multiplication

Suppose we want to represent matrix multiplications in a way that allows us to optimize the computation.

For example, if *a*, *b* are scalar constants, and *X* is a big matrix, then

is slow to compute because it loops over the matrix *X* twice: once to multiply it by *a*, and then again to multiply it by *b*.
It’d be equivalent and cheaper to do:

That’s just one example of an optimization we could make by rearranging the products in a matrix multiplication. (Remember however that matrix multiplication is associative but not commutative; only scalars commute.)

## Spec

### Choose datatypes

Let’s call this a `MatrixExpression`

.
To make the definitions easier to read, we’ll abbreviate that to `MatExpr`

.

## Test

Let’s test `optimize()`

.
Partitions:

- partition on number of scalars in expression: 0, 1, 2, >2
- partition on position of scalar in expression tree: left child, right child, left-child-of-left-child, left-child-of-right-child, right-child-of-left-child, right-child-of-right-child, more than 2 levels deep

## Implement

### Choose a rep

This problem is a natural one for a recursive datatype.

```
/** Represents an immutable expression of matrix and scalar products. */
public interface MatrixExpression {
// Datatype definition:
// MatExpr = Identity + Scalar(double)
// + Matrix(double[][]) + Product(MatExpr, MatExpr)
// ...
}
class Identity implements MatrixExpression {
public Identity() {
}
}
class Scalar implements MatrixExpression {
private final double value;
// RI: true
// AF(value) = the real scalar represented by value
public Scalar(double value) {
this.value = value;
}
}
class Matrix implements MatrixExpression {
private final double[][] array;
// RI: array.length > 0, and all array[i] are equal nonzero length
// AF(array) = the matrix with array.length rows and array[0].length columns
// whose (row,column) entry is array[row][column]
public Matrix(double[][] array) {
this.array = array; // note: danger!
}
}
class Product implements MatrixExpression {
private final MatrixExpression m1;
private final MatrixExpression m2;
// RI: m1's column count == m2's row count, or m1 or m2 is scalar
// AF(m1, m2) = the matrix product m1*m2
public Product(MatrixExpression m1, MatrixExpression m2) {
this.m1 = m1;
this.m2 = m2;
}
}
```

### Choose an identity

It’s always good to have a value in the datatype that represents nothing, so that we can avoid using `null`

.
For a matrix product, the natural choice is the identity matrix — an empty product expression is just the identity anyway.
So let’s define that:

```
/** Identity for all matrix computations. */
public static final MatrixExpression I = new Identity();
```

Unfortunately, we’ll see that this is not a perfect situation: other MatExprs might *also* be the identity.

### Implementing `make`

: use factory methods

Let’s start implementing, starting with the `make()`

creators.

We don’t want to expose our concrete rep classes `Scalar`

, `Matrix`

, and `Product`

, so that clients won’t depend on them and we’ll be able to change them later (being *ready for change*).

So we’ll create static factory methods in `MatrixExpression`

to implement `make()`

:

```
/** @return a matrix expression consisting of just the scalar value */
public static MatrixExpression make(double value) {
return new Scalar(value);
}
/** @return a matrix expression consisting of just the matrix given */
public static MatrixExpression make(double[][] array) {
return new Matrix(array);
}
```

### Implementing `isIdentity`

: don’t use `instanceof`

Now let’s implement the `isIdentity`

observer.
Here’s a **bad** way to do it:

```
// don't do this
public static boolean isIdentity(MatrixExpression m) {
if (m instanceof Scalar) {
return ((Scalar)m).value == 1;
} else if (m instanceof Matrix) {
// ... check for 1s on the diagonal and 0s everywhere else
} else ... // do the right thing for other variant classes
}
```

In general, using `instanceof`

in object-oriented programming is a bad smell.
Break the operation down into pieces that are appropriate for each class, and write instance methods instead:

#### reading exercises

class Matrix implements MatrixExpression {
private final double[][] array;
// RI: array.length > 0, and all array[i] are equal nonzero length
// AF(array) = the matrix with array.length rows and array[0].length columns
// whose (row,column) entry is array[row][column]
...
public boolean isIdentity() {
if (array.length != array[0].length) {

}
for (int row = 0; row < array.length; row++) {
for (int col = 0; col < array[row].length; col++) {
double expected = (row == col) ? 1 : 0;
if (array[row][col] != expected) {
}
}
}
}
}

(missing explanation)

```
class Product implements MatrixExpression {
private final MatrixExpression m1, m2;
// RI: m1's column count == m2's row count, or m1 or m2 is scalar
// AF(m1, m2) = the matrix product m1*m2
...
public boolean isIdentity() {
return m1.isIdentity() ❓ m2.isIdentity();
}
}
```

Pick the best operator to replace ❓:

(missing explanation)

Implementing `isIdentity`

exposes an issue that we *should* have first discovered by writing test cases: we will not always return `true`

for a `Product`

whose value is the identity matrix (e.g. A × A^{-1}).
We probably want to resolve this by *weakening* the spec of `isIdentity`

, so that it doesn’t guarantee to identify every multiplicative identity:

### Implementing `optimize`

without `instanceof`

Let’s implement `optimize()`

.
Again, here’s a **bad** way to do it, which will quickly get us mired in weeds:

```
// don't do this
class Product implements MatrixExpression {
public MatrixExpression optimize() {
if (m1 instanceof Scalar) {
...
} else if (m2 instanceof Scalar) {
...
}
...
}
}
```

If you find yourself writing code with `instanceof`

checks all over the place, you need to take a step back and rethink the problem.

In particular, to optimize the scalars, we’re going to need two recursive helper operations, so we’ll add them to our abstract datatype:

These expressions will allow us to pull the scalars out of an expression and move them together in a single multiplication expression.

```
/** Represents an immutable expression of matrix and scalar products. */
public interface MatrixExpression {
// ...
/** @return the product of all the scalars in this expression */
public MatrixExpression scalars();
/** @return the product of all the matrices in this expression in order.
* times(scalars(), matrices()) is equivalent to this expression. */
public MatrixExpression matrices();
}
```

Now we’ll implement them as expected:

```
class Identity implements MatrixExpression {
// no fields
...
public MatrixExpression scalars() { return this; }
public MatrixExpression matrices() { return this; }
}
class Scalar implements MatrixExpression {
private final double value;
...
public MatrixExpression scalars() { return this; }
public MatrixExpression matrices() { return I; }
}
class Matrix implements MatrixExpression {
private final double[][] array;
...
public MatrixExpression scalars() { return I; }
public MatrixExpression matrices() { return this; }
}
class Product implements MatrixExpression {
private final MatrixExpression m1, m2;
...
public MatrixExpression scalars() {
return times(m1.▶▶A◀◀, m2.▶▶B◀◀);
}
public MatrixExpression matrices() {
return times(m1.▶▶C◀◀, m2.▶▶D◀◀);
}
}
```

#### reading exercises

With these helper functions, `optimize()`

can just separate the scalars and the matrices:

```
class Identity implements MatrixExpression {
...
public MatrixExpression optimize() { return this; }
}
class Scalar implements MatrixExpression {
...
public MatrixExpression optimize() { return this; }
}
class Matrix implements MatrixExpression {
...
public MatrixExpression optimize() { return ▶▶A◀◀; }
}
class Product implements MatrixExpression {
...
public MatrixExpression optimize() {
return times(▶▶B◀◀);
}
}
```