Problem Set 2: Multi-Startup Set

The deadlines for this problem set are shown on the course calendar.

The purpose of this problem set is to practice designing, testing, and implementing abstract data types. This problem set focuses on implementing mutable types where the specifications are provided to you; the next problem set will focus on immutable types where you must choose specifications.

Design Freedom and Restrictions

On several parts of this problem set, the classes and methods will be yours to specify and create, but you must pay attention to the PS2 instructions sections in the provided documentation.

You must satisfy the specifications of the provided interfaces, classes, and functions. You are not permitted to change the specifications at all for most modules. You may strengthen, but must not otherwise change, the specification of similarity in Problem 5.

On this problem set, Didit provides less feedback about the correctness of your code:

It is your responsibility to examine Didit feedback and make sure your code compiles and runs properly for grading.
However, correctness is your responsibility alone, and you must rely on your own careful specification, testing, and implementation to achieve it.

Please remember to push early: we cannot guarantee timely Didit builds, especially near the problem set deadline.

Get the code

To get started,

Ask Didit to create a remote psets/ps2 repository for you on github.mit.edu.
Clone the repo. Find the git clone command at the top of your Didit assignment page, copy it entirely, paste it into your terminal, and run it.
Run npm install, then open in Visual Studio Code. See Problem Set 0 for if you need a refresher on how to create, clone, or set up your repository.

Overview

On this problem set we will use the data structures built in to JavaScript as building blocks to build a data structure of our own, and then build one small application of that ADT.

Interval sets

In problems 1—3, we will implement the interval set abstract type, a mutable set of unique labels where each label is associated with a non-overlapping interval on the number line.

For example, we can picture the interval set { A = [0,5), B = [10,30), C = [30,35) } as:

In this example, the labels are the unique strings "A", "B", and "C"; and each is associated with an interval, which in our type will be an interval of bigint values. Every interval is half-open (also called half-closed), meaning that it includes its start point but not its end point, so [0,5) represents { x ∈ ℝ | 0 ≤ x < 5 }.

Read the TypeDoc documentation for IntervalSet generated from src/intervalset.ts.

Our interval sets will support labels of any type. IntervalSet<Label> is a generic type similar to Array<E> or Map<K,V>. In the specification of Array, E is a placeholder for the type of elements in the list. In Map, K and V are placeholders for the types of keys and values. Only later does the client of Array or Map choose particular types for these placeholders by constructing, for example, a Array<Clown> or a Map<Car,number>.

For this problem set, we will implement IntervalSet twice, with two different reps, to practice choosing abstraction functions and rep invariants and preventing rep exposure. There are many good reasons for a library to provide multiple implementations of a type, and we’re following that model.

The implementations will be in src/intervalset-impls.ts.

Multi-interval sets

In problem 4, using interval sets, we will implement the multi-interval set type, a mutable set of labels where each label is associated with one or more globally-non-overlapping intervals.

For example, { A = [[0,5),[20,25)], B = [[10,20),[25,30)], C = [[30,35)] }:

Once again, the labels are "A", "B", and "C". Label "C" is associated with one interval, labels "A" and "B" each with two.

Read the TypeDoc documentation for MultiIntervalSet generated from src/multiintervalset.ts.

Similarity between multi-interval sets

With our multi-interval set datatype in hand, the MVP for your new “” startup needs one crucial piece of proprietary technology — a way to measure the similarity between multi-interval sets. That is problem 5.

For example, given { A = [[0,5),[20,25)], B = [[10,20),[25,30)] } and { A = [[20,35)], B = [[10,20)], C = [[0,5)] }:

Our similarity function will support a client-supplied definition of label similarity. For example, let’s suppose a binary definition: labels are completely similar (value 1) when they are equal, and completely dissimilar (value 0) otherwise.

The interval similarity of an interval spanned by both sets is the similarity of their labels. Those spanned by only one set, or by neither set, have interval similarity 0.

The similarity between multi-interval sets will be defined as a ratio of the extent spanned by the two sets, so the similarity between these two sets with the binary definition of label similarity is:

( 5×0 + 5×0 + 10×1 + 5×1 + 5×0 + 5×0 ) ÷ ( 35 - 0 ) = 15 / 35 ≈ 0.42857

(Hover or tap on individual terms to highlight that part of the figure above.)

However, suppose the client instead specifies that "A" and "C" have similarity 0 but "A" and "B" are closer, similarity ½. Using that definition:

( 5×0 + 5×0 + 10×1 + 5×1 + 5×0.5 + 5×0 ) ÷ ( 35 - 0 ) = 17.5 / 35 = 0.5

Clients of similarity define label similarity by providing a list of pairs of labels with their numeric similarity between 0 and 1. The similarity between all other pairs of labels is either 1 (if they are equal) or 0 (otherwise).

Read the TypeDoc documentation for similarity generated from src/similarity.ts.

Problem 1: Test `IntervalSet<string>`

Devise, document, and implement tests for IntervalSet<string>.

For now, we’ll only test (and then implement) interval sets with string labels. Later, we’ll implement a generic IntervalSet<Label> that can handle other kinds of labels too.

Write your testing strategy and your tests in test/intervalset.test.ts.

In order to run your tests on multiple implementations of the IntervalSet interface:

src/intervalset-impls.ts defines a function implementationsForTesting() that the tests will use to access your different implementation classes. You should not change this function.
test/intervalset.test.ts has an extra bit of code around Mocha’s describe to iterate over your implementations and run the test suite on each of them. You should not change this line.

The tests use the name SomeIntervalSet to refer to the current implementation class being tested, and you must use new SomeIntervalSet<string>(..) to get fresh empty sets, of type IntervalSet<string>, in your tests. Do not use the makeIntervalSet function, and do not try to call the constructor for a specific implementation like RepMapIntervalSet. See the provided test for an example.

Your tests in intervalset.test.ts must be legal clients of the IntervalSet spec.

Refer to Abstract Data Types: Testing an abstract data type. When testing instance methods, remember to partition the state of input this.

reading exercises

Partitions

Looking at intervalset.ts and intervalset-impls.ts, which of the following might you mention in your partitions for the operation add : IntervalSet<string> × bigint × bigint × string → void?

whether start is less than end

whether this is the empty set

whether an interval with label exists in this

whether startMap contains label

whether startMap and endMap are the same size

(missing explanation)

Throws

The spec of IntervalSet.add(..) includes a postcondition that specifies when an error is thrown. How could we use assert.throws to test this operation?

const is = new SomeIntervalSet<string>();
is.add(0n, 2n, "Z");

assert.throws(is.add);

assert.throws(is.add, 0n, 2n, "A");

assert.throws(is.add(0n, 2n, "A"));

assert.throws(() => { is.add(0n, 2n, "A"); });

(missing explanation)

Commit to Git. Once you’re happy with your solution to this problem, commit and push!

Now we’ll implement interval sets — twice.

Problem 2: Implement `RepMapIntervalSet`

First we’ll implement RepMapIntervalSet for string labels. Then we’ll make that implementation generic, for any kind of label. In the next problem, we’ll implement RepArrayIntervalSet.

Write your implementations in src/intervalset-impls.ts.

For all the classes you write in this problem set:

Document the abstraction function and representation invariant.
And along with the AF and RI, document how the type prevents rep exposure. See Documenting the AF, RI, & SRE.
Implement checkRep and use it to check the rep invariant.
Implement toString with a useful human-readable representation of the abstract value.

All your classes must have clear documented specifications. This means every method will have a TypeDoc comment, except when you inherit a spec from an interface:

When a method is specified in an interface and then implemented by a concrete class — for example, add(..) in RepMapIntervalSet is specified in IntervalSet — use @inheritDoc to point to the interface spec, indicating the method has the same spec. Unless the concrete class needs to strengthen the spec, don’t write the spec again (DRY).

Think carefully about your rep invariants and abstraction functions before you start writing code.

Consider how the required ADT operations, as well as checkRep and toString, will work given your choices.
Performance is not our objective: you should avoid exponential growth, but don’t be afraid to write e.g. quadratic code, especially in your first iterations.
Correctness, clarity, and changeability are our objectives, and none are served by pages of spaghetti code. Plan instead.

Helper functions. You can define small private (non-exported) helper functions in any of the files you are allowed to change. Since you cannot import these functions into a test file, you cannot test them, which is why they must be small. If you want to write a helper function or class or interface of any complexity, or export one to use in another file in src/ or test/ or both, then you should place it in src/utils.ts and write tests for it in test/utils.test.ts. Do not export anything new from any other source code file.

As you iterate and discover useful helper functions — for example, Interval does not provide an equality comparison or other helpful operations — consider extracting those functions to utils.ts.

reading exercises

Looking at intervalset-impls.ts, what is the input space of RI : R → boolean for RepMapIntervalSet?

(missing explanation)

Looking at intervalset-impls.ts, what does an input to AF : R → A for RepMapIntervalSet look like?

(missing explanation)

For an input that satisfies the rep invariant, what does the output look like?

(missing explanation)

Interval Set

Which describes exactly one abstract interval set?

A. a string-label interval set with no intervals

B. a string-label interval set with one interval

C. a string-label interval set with one interval [0,5)

D. a string-label interval set with one interval [0,5) labeled “A”

E. a string-label interval set with one interval [x,y) labeled z, where x = 0, y = 5, and z = “A”

F. a string-label interval set with one interval, where t = [0,5) and u = “A”

G. a string-label interval set where labels = [“A”, “B”] and intervals = [[0,5), [10,30)]

(missing explanation)

Interval

Given:

const i1 = new Interval(0n, 1n);
const i2 = new Interval(0n, 1n);

What is the value of: i1 === i2

True

False

(missing explanation)

Should you use i1 == i2 instead?

Yes

(missing explanation)

Should you use assert.deepStrictEqual(i1, i2) instead?

Yes

(missing explanation)

Map

Given:

const map: Map<string, Array<string>> = new Map([ ["A", []] ]);

What happens with the line of code below?

if (map.has("B")) { console.log(map.get("B").length); }

(missing explanation)

2.1. Implement `RepMapIntervalSet<string>`

For RepMapIntervalSet, you must use the rep provided:

private readonly startMap: Map<string, bigint> = new Map();
private readonly endMap: Map<bigint, bigint> = new Map();

You may not add fields to the rep or choose not to use one of the fields. You must use the names startMap and endMap, even if they are not the names you would choose. The constructor must create an empty interval set.

All TypeScript classes inherit Object.toString() with its very underdetermined spec. If your toString() documentation comment simply refers to that spec with @inheritDoc, RepMapIntervalSet keeps the same spec. The choice not to strengthen the spec is deliberate: since toString is intended for human consumption, keeping its spec weak helps prevent other code from depending on its particular format.

Because of this very weak spec, you do not need to test toString.

Run your IntervalSetTest tests on RepMapIntervalSet by running npm test -- -f RepMapIntervalSet.

Correction

The provided specs of IntervalSet and MultiIntervalSet exclude labels that are null or undefined. For its bad behavior with respect to ===, NaN should also be excluded:

Labels […] may not be null, undefined, or NaN.

Commit to Git. Once you’re happy with your solution to this problem, commit and push!

2.2. Implement generic `RepMapIntervalSet<Label>`

Nothing in your implementation should rely on the fact that labels are of type string in particular.

The spec of IntervalSet says that labels may be of arbitrary type, as long as they can be compared for equality using ===. Let’s change our implementation to support labels of any type that meets this condition.

First, change the declaration of RepMapIntervalSet to read:

export class RepMapIntervalSet<Label> implements IntervalSet<Label> { ... }

Generic labels and toString()

You cannot call toString() on objects of generic type Label (because it statically includes null and undefined, which we’ve excluded by precondition).

Instead, given lbl: Label…

1. Use concatenation:
const str: string = "" + lbl;

2. Use a template string:
const str: string = `${lbl}`;

Then update your implementation to support any type of label, using placeholder Label instead of string. Roughly speaking, you should be able to find-and-replace the instances of string in your code with Label!

This refactoring will make RepMapIntervalSet a generic type.

Any place you previously referred to RepMapIntervalSet without a generic parameter, you may need to add it.

2.3. Test other types of labels

In principle, because your implementations in this problem and in the next do not know or care about the actual type of labels, and will only compare them using === according to the IntervalSet specification, your test suite with string labels should be sufficient.

Nevertheless, there are bugs that a test suite with only string labels cannot find! For example, if the implementation accidentally compares labels after converting them to strings, distinct objects with the same toString() value will appear equal when they should be unequal.

To enable your test suite to create interval sets with other kinds of labels, remove the line at the bottom of src/intervalset-impls.ts with the comment remove this line in Problem 2.3, and replace it with the commented-out line immediately below.

You don’t need to understand the IntervalSetCtor type alias, but please ask if you have questions.

Then improve your test suite to cover labels that don’t behave like built-in strings, which should require adding or modifying only a couple tests.

Commit to Git. Once you’re happy with your solution to this problem, commit and push!

Problem 3. Implement `RepArrayIntervalSet`

3.1. Implement generic `RepArrayIntervalSet<Label>`

For RepArrayIntervalSet, you must use the rep provided:

private readonly labelList: Array<Label> = [];
private readonly valueList: Array<bigint> = [];

You may not add fields to the rep or choose not to use one of the fields. You must use the names labelList and valueList, even if they are not the names you would choose. The constructor must create an empty interval set.

Note that JavaScript arrays can be made sparse by omitting elements at some indices. But sparse arrays can have surprising behavior (e.g., map() and forEach() just skip over the missing elements rather than visiting them as undefined values). We use Array<T> to indicate a dense (non-sparse) array of type T. If you choose to use sparse arrays in your rep, you must document that choice clearly.

Run your IntervalSet tests on RepArrayIntervalSet by running npm test -- -f RepArrayIntervalSet.

Run the tests on both implementations by running, for example, npm test -- -f IntervalSet.

If you cannot figure out how to make one of your implementations generic while satisfying the TypeScript compiler’s static type checking, please visit lab or office hours, or ask a question on Piazza.

3.2. Wrap up `IntervalSet`

Finally, pick one of your IntervalSet implementations for clients to use, and write the one-line implementation of makeIntervalSet() in src/intervalset.ts.

At this point in the problem set, we’ve completed the test-first implementation of IntervalSet.

Your test suites should be passing and have excellent code coverage from the coverage tool.
This is a good opportunity for self code review: remove dead code and debugging console.log output, DRY up duplication, and so on.

Commit to Git. Once you’re happy with your solution to this problem, commit and push!

Problem 4: `MultiIntervalSet<Label>`

Just as we implemented the IntervalSet ADT by relying on other ADTs as building blocks, we can use IntervalSet as part of the specification and as part of the representation of another ADT.

4.1. Test `MultiIntervalSet`

Devise, document, and implement tests for MultiIntervalSet in test/multiintervalset.test.ts.

Your tests in multiintervalset.test.ts must be legal clients of the MultiIntervalSet spec. As before, since the spec of toString is very weak, you do not need to test toString.

4.2. Implement `MultiIntervalSet`

Implement MultiIntervalSet in src/multiintervalset.ts.

You must use IntervalSet in the rep of MultiIntervalSet. For example, IntervalSet instances might be elements in an array or values in a map. Your MultiIntervalSet should rely only on the specs of IntervalSet, not on the details of a particular IntervalSet implementation, and it should use the makeIntervalSet() factory function to obtain IntervalSet instances. The implementation of MultiIntervalSet is otherwise entirely up to you.

Run all the IntervalSet and MultiIntervalSet tests by running npm test.

Commit to Git. As always, once you’re happy with your solution to this problem, commit and push!

Problem 5: Fame and fortune with `IntervalSet`

This problem description was randomly generated

If you don’t like it, please reload the page to receive a different problem.

You are the founder and CTO of , the hottest new startup that just called “ for , but .” In your MVP, you’re leveraging the most advanced artificial intelligence algorithms and machine learning architecture to deliver unprecedented breakthrough results in the world of .

Of course, that’s all a bunch of . As CTO, you realize it all comes down to managing multi-interval sets that represent . works by comparing . You need to develop tools for measuring the similarity of multi-interval sets: completely different will score 0, with the score approaching 1 as they become more alike.

And best of all, when you pivot your startup to and rebrand as , the startup in , you’ll use those multi-interval sets to represent ! In the new startup, you’re going to compare — so your success in the startup world all comes down to this one problem.

5.1 Test `similarity`

Devise, document, and implement tests for similarity(..) in test/similarity.test.ts.

You may strengthen the spec of similarity but must not weaken or otherwise change it. Your tests in similarity.test.ts must be legal clients of your spec.

5.2 Specify, test, and implement a helper ADT

You will implement similarity(..) in src/similarity.ts.

Plan that implementation now, with the following requirement: you must make use of one more ADT, whose design is entirely up to you, that improves the safety from bugs, ease of understanding, and/or readiness for change of your similarity implementation.

Specify your new ADT as a class at the bottom of src/similarity.ts; then test it in test/similarity.test.ts; and implement it.

5.3 Implement `similarity`

Your similarity code should rely only on the specs of MultiIntervalSet and the other ADTs, not on the details of any particular implementation.

It should make good use of the new helper ADT you designed, again relying only on its spec and not on its implementation.

The implementation of similarity is otherwise entirely up to you.

Run all your tests by running npm test.

Before you’re done

Make sure you have documented specifications, in the form of properly-formatted TypeDoc comments, for all your types and operations.
Make sure you have documented abstraction functions and representation invariants, in the form of a comment near the field declarations, for all your implementations.

In addition, document how the type prevents rep exposure. See Documenting the AF, RI, & SRE.

Make sure all types use checkRep to check the rep invariant and implement toString with a useful human-readable representation of the abstract value.
Make sure you have a thorough, principled test suite for every type.

Submitting

Make sure you commit AND push your work to your repository on github.mit.edu. We will use the state of your repository on github.mit.edu as of 10:00pm on the deadline date. When you git push, the continuous build system attempts to compile your code and run the public tests (which are only a subset of the autograder tests). You can always review your build results at didit.mit.edu/6.102/sp25.

Didit feedback is provided on a best-effort basis:

There is no guarantee that Didit tests will run within any particular timeframe, or at all. If you push code close to the deadline, the large number of submissions will slow the turnaround time before your code is examined.
If you commit and push right before the deadline, it’s okay if the Didit build finishes after the deadline, because the commit-and-push was on time.
Passing some or all of the public tests on Didit is no guarantee that you will pass the full battery of autograding tests — but failing them is almost sure to mean lost points on the problem set.

Grading

Your overall ps2 grade will be computed as approximately:
~40% alpha autograde (including online exercises) + ~10% alpha manual grade + ~35% beta autograde + ~15% beta manual grade

The autograder test cases will not change from alpha to beta, but their point values will. Test cases may be worth zero points on the alpha and nonzero points on the beta, or vice versa.

Manual grading of the alpha may examine any part of your solution, including specs, test suites, implementations, and your Git commit history. Manual grading of the beta may examine any part, and how you addressed manual grading and code review feedback from the alpha.

Problem Set 2: Multi-Startup Set

Get the code

Overview

Interval sets

Multi-interval sets

Similarity between multi-interval sets

Problem 1: Test IntervalSet​<string>

reading exercises

Problem 2: Implement RepMapIntervalSet

reading exercises

2.1. Implement RepMapIntervalSet<string>

2.2. Implement generic RepMapIntervalSet<Label>

2.3. Test other types of labels

Problem 3. Implement RepArrayIntervalSet

3.1. Implement generic RepArrayIntervalSet<Label>

3.2. Wrap up IntervalSet

Problem 4: Multi­IntervalSet<Label>

4.1. Test MultiIntervalSet

4.2. Implement MultiIntervalSet

Problem 5: Fame and fortune with IntervalSet

5.1 Test similarity

5.2 Specify, test, and implement a helper ADT

5.3 Implement similarity

Before you’re done

Submitting

Grading

Problem 1: Test `IntervalSet<string>`

Problem 2: Implement `RepMapIntervalSet`

2.1. Implement `RepMapIntervalSet<string>`

2.2. Implement generic `RepMapIntervalSet<Label>`

Problem 3. Implement `RepArrayIntervalSet`

3.1. Implement generic `RepArrayIntervalSet<Label>`

3.2. Wrap up `IntervalSet`

Problem 4: `MultiIntervalSet<Label>`

4.1. Test `MultiIntervalSet`

4.2. Implement `MultiIntervalSet`

Problem 5: Fame and fortune with `IntervalSet`

5.1 Test `similarity`

5.3 Implement `similarity`