Problem Set 1: Flashcards

Alpha due: Monday, September 23, 2019, 10:00 pm
Code reviews due: Friday, September 27, 2019, 11:00 am
Beta due: Monday, September 30, 2019, 10:00 pm

The purpose of this problem set is to give you practice with test-first programming and specifications. Given a set of specifications, you will write unit tests that check for compliance with the specifications, and then implement code that meets the specifications. You will also write a stronger version of one of the specifications, and test and implement your stronger spec.

Get the code

To get started,

Ask Didit to create a remote psets/ps1 repository for you on GitHub.mit.
Clone the repo. Find the ssh:// URL at the top of your Didit assignment page and run: git clone «URL»
Import into Eclipse. See Problem Set 0 for if you need a refresher on how to create, clone, or import your repository.

Overview

Have you ever used flashcards – maybe to improve your vocabulary either in your native language or in a foreign language you want to learn? A flashcard has two sides, a front and a back, which might be a vocabulary word and its meaning. To use a flashcard for learning practice, you look at the front and try to recall what’s on the back, then look at the back to see if you were right. Flashcards are an example of active learning, which is much more effective than passive reading or listening.

A flashcard system is a scheduling algorithm that decides when to practice each flashcard for the most efficient learning. Flashcard systems take advantage of spaced repetition, the principle that we learn more durably when practice is spaced out over time rather than crammed together, and furthermore that the rate of forgetting something is slower the more we have practiced it, so that the spacing between practice can increase as a concept or skill becomes more learned.

This problem set uses a spaced repetition flashcard algorithm that we will call the Modified-Leitner system, because it is inspired by the Leitner system but differs in the details. In a Leitner system, flashcards are grouped into numbered learning buckets. Low-numbered buckets contain flashcards that are less well-learned, so low-numbered buckets are scheduled for practice more often than high-numbered buckets. As a flashcard receives more and more successful practice, it moves up to higher buckets, until finally reaching a retired bucket (e.g., bucket 5). Flashcards in the retired bucket are considered well-enough learned not to need practice anymore.

Here is the specific behavior of our Modified-Leitner algorithm:

A new card, before its first practice, is assumed to be in bucket 0.
On day n of the learning process, the user collects all the buckets that are scheduled for practice that day, and practices every card found in those buckets.
- Cards in the retired bucket are not scheduled for practice.
- Bucket 0 cards are practiced every day, bucket 1 cards every two days, bucket 2 cards every four days, and so forth. In general, bucket i is practiced every 2ⁱ days.
- The user is of course free to practice new cards, or unscheduled cards of their own choice, or to practice cards repeatedly.
Whenever the user practices a flashcard:
- if they get the card wrong – they are unable to correctly recall what’s on the back of the card – then the card is put back in bucket 0.
- if they find the card easy – getting it right without much mental effort – then the card moves up from bucket i to bucket i+1 (unless the card is already in the retired bucket).
- if they find the card hard – getting it right or almost right but only with effort – then the card moves down to bucket i-1 (unless the card is already in bucket 0).

This problem set consists of a group of methods that can be used to implement the Modified-Leitner flashcard system:

toBucketSets() converts a Map representation of learning buckets into a List-of-Set representation
getBucketRange() finds which buckets contain flashcards, as a rough measure of progress
practice() selects cards to practice on a particular day
update() updates a card’s bucket number after a practice trial
deduplicate() consolidates a set of flashcards by finding redundant cards

In addition to implementing these methods yourself, you will also write a test suite for each method, carefully following the method’s spec. Your test suites will be graded in part by running them against various staff implementations of the method, some of which obey the spec and some which don’t. A good test suite should pass the good implementations and reject the bad ones.

You are also asked to change exactly one of these method specs, deduplicate(). Its current spec is very weak, so you will strengthen the spec, and test and implement your stronger spec. Unlike your test suites for other methods, your deduplicate() tests won’t be run against any staff implementations, because you’ll be changing the spec.

Steps

Since we are doing iterative test-first programming, your workflow for each method should follow the order specs → tests → implementation, iterating as needed. Make steady progress across methods and down the workflow, in the following pattern:

Read this entire handout first.
Read the specs for all methods carefully.
For toBucketSets() and getBucketRange():
1. Write tests for one method. Add/commit/push.
2. Write tests for the other method. Add/commit/push.
3. Implement one method. Add/commit/push.
4. Implement the other method. Add/commit/push.
For practice() and update():
1. Write tests for one method. Add/commit/push.
2. Write tests for the other method. Add/commit/push.
3. Implement one method. Add/commit/push.
4. Implement the other method. Add/commit/push.
For deduplicate():
1. Write tests for the staff-provided weak spec. Add/commit/push.
2. Code a trivial implementation of the weak spec, to exercise your tests. Add/commit/push.
3. Strengthen the spec, following instructions later in this handout. Add/commit/push.
4. Add tests for the revised spec. Add/commit/push.
5. Implement the revised spec. Add/commit/push.
6. Iterate as needed. Add/commit/push after each step.

	`toBucketSets`	`getBucketRange`	`practice`	`update`	`deduplicate`
spec	1	1	1	1	1	4c	4f
test	2a	2b	3a	3b	4a	4d	4f
implement	2c	2d	3c	3d	4b	4e	4f

The next few sections of this handout have more specific advice about these steps:

Specifications
Testing
Implementation
Strengthening deduplicate()

Use git add/git commit/git push after every step that changes your code. Committing frequently – whenever you’ve written some tests, fixed a bug, or added a new feature – is a good way to use version control, and will be a good habit to have for your team projects and as a software engineer in general, so start forming the habit now. Your git commit history for this problem set should demonstrate test-first programming, have frequent small commits, and include short but descriptive commit messages.

What you can and can’t change

In order for your overall program to meet the specification of this problem set, you are required to keep some things unchanged:

Don’t change these classes at all: the classes Flashcard and Answer should not be modified at all.
Don’t change these class names: the classes BucketSets, BucketRange, Practice, Update, Deduplicate, BucketSetsTest, BucketRangeTest, PracticeTest, UpdateTest, and DeduplicateTest must use those names and remain in the flashcards package.
Don’t change the method signatures and specifications: the public methods toBucketSets(), getBucketRange(), practice(), and update() must use the method signatures and the specifications that we provided.
Don’t change the method signature or weaken the specification: the public method deduplicate() must have a stronger spec than we provided.
Don’t include illegal test cases: the tests you implement in BucketSetsTest, BucketRangeTest, PracticeTest, and UpdateTest must respect the specifications that we provided for the methods you are testing.

Aside from these requirements, however, you are free to add new public and private methods and new public and private classes if you wish.

Specifications

Before you start, read the specs carefully, and take notes about what you observe. You can either read the Java source files directly in Eclipse, or read the Javadoc documentation for all classes generated from the source files.

Keep in mind these facts about specifications:

Some specs have preconditions. Recall from the specs reading that when preconditions are violated by the client, the behavior of the method is completely unspecified.
Some specs have underdetermined postconditions. Recall that underdetermined postconditions allow a range of behavior. When you’re implementing such a method, the exact behavior of your method within that range is up to you to decide. When you’re writing a test case for the method, you must allow the implementation you’re testing to have the full range of variation, because otherwise your test case is not a legal client of the spec as required above.
Public methods can be used anywhere. These methods are independent modules, which might be called by various parts of a flashcard system, not necessarily just the Modified-Leitner algorithm. A method implementation must be able to handle all inputs that satisfy the precondition, even if they don’t arise in the Modified-Leitner algorithm. And it may return any output that satisfies its postcondition, even if that doesn’t seem useful in the Modified-Leitner algorithm.

Testing

You should partition each method’s inputs and outputs, write down your partitions in a testing strategy comment, and choose test cases to cover the partitions.

For each method, you’ll find its spec and implementation in a Java file in the src folder, and a corresponding JUnit test class file in the test folder. Separating implementation code from test code is a common practice in development projects. It makes the implementation code easier to understand, uncluttered by tests, and easier to package up for release.

The Test class for a method may already have example tests in it, which are provided as models. You are recommended to throw away those example tests and write your own.

Some advice about testing:

Your test cases should be chosen using partitioning. This approach is explained in the reading about testing.
Include a comment at the top of each test suite class describing your testing strategy. Examples are shown in the reading about testing.
Your test cases should be small and well-chosen. Don’t use a large set of data for each test. Instead, create inputs carefully chosen to test the partition you’re trying to test.
Your tests should find bugs. We will grade your test cases in part by running them against buggy implementations and seeing if your tests catch the bugs. So consider ways an implementation might inadvertently fail to meet the spec, and choose tests that will expose those bugs.
Your tests must be legal clients of the spec. We will also run your test cases against legal, variant implementations that still strictly satisfy the specs, and your test cases should not complain for these good implementations. That means that your test cases can’t make extra assumptions that are only true for your own implementation.

If you wish to write test cases that test a stronger spec than we provide, you should put those tests in a separate JUnit test class, so that we don’t try to run them on staff implementations that only satisfy the weaker spec. For example, you could put stronger tests of practice() in a class called MyPracticeTest in the flashcards package in the test folder alongside the other JUnit test classes. The explicit exception to this rule is deduplicate(), whose stronger spec tests should go in DeduplicateTest.
Put each test case in its own JUnit method. This will be far more useful than a single large test method, since it pinpoints where the problem areas lie in the implementation.
Avoid static fields in test classes. Any mutable data in them would be shared by all the tests, so use instance fields instead. When JUnit runs the @Test methods in a class, it creates a new instance of the test class for each method.
Be careful calling helper methods from testing code. Your test cases in, say, BucketSetsTest.java must not call a new helper method that you have defined in BucketSets.java or some other src/ class. We detach your testing code from the rest of your solution when we run your test suites against staff implementations, so your code in src/ will not be available to your test suites. Put helper methods needed by testing code into test/.
Again, keep your tests small. Don’t use unreasonable amounts of resources (such as lists or strings of length MAX_INT). We won’t expect your test suite to catch bugs related to running out of resources; every program fails when it runs out of resources.
Add/commit/push frequently. Whenever you do a nontrivial amount of work – e.g. after writing a testing strategy comment; after choosing some test inputs; after writing JUnit methods and seeing that they pass – you should add/commit/push your work with git.

Implementation

Implement each method, and revise your implementation and your tests until all your tests pass.

Some advice about implementation:

Helper methods. If you want to write helper methods for your implementation, then make them public static methods of a new class, alongside the other classes. For example, you could put them in a Utilities class in the flashcards package of the src folder, and write tests for them in UtilitiesTest in test folder. Feel free to put multiple methods into one class.
Don’t call testing code. Don’t put a helper method in the test/ folder if you need to call it from implementation code in src/. Testing code should be completely detachable from the implementation, so that the implementation can be packaged up separately for deployment. We also detach your testing code when we run staff tests against your implementation. Put helper methods needed by implementation code into src/.
Eliminate warnings. Revise your code to address all the yellow-highlighted warnings from Eclipse.
Check testing coverage. Do glass box testing and revise your tests until you have satisfactory code coverage.
Review your own code. Read your code critically with an eye to making it as SFB, ETU, and RFC as possible.
Test frequently. Rerun your tests whenever you make a change to your code.
Add/commit/push frequently. Whenever you do a nontrivial amount of work – e.g. after writing a method body and seeing that the tests pass, or after finding a bug, adding a regression test for it, and fixing it – you should add/commit/push your work with git.

After you’ve implemented all the methods, you can run Main.java to see the methods used in a simulated flashcard application. Main.java is not part of the spec of this problem set, is not used for grading, and you are free to edit it as you wish.

Strengthening `deduplicate()`

Almost all the specs in this problem set should be used exactly as-is, with one exception. The deduplicate() method currently has a rather weak spec. You should at first use this weak spec to write some tests and a simple implementation. Make sure your tests respect the weak spec, and that your implementation passes your tests. Add/commit/push.

Then revise the spec of deduplicate() by editing the Javadoc comment in Deduplicate.java. Your new spec should be:

stronger than the original spec
deterministic
helpful, in the sense of helping a user in realistic situations

When revising the spec, you will need to carefully consider three words:

Similar. Your stronger spec will have to completely define what “similar” means for flashcards. At a minimum, your revised spec should require that:

cards with identical fronts are similar, as in the original weak spec
cards whose fronts are the same English noun in singular and plural forms are similar, for example:
- ”sound” and “sounds”
- “bus” and “buses”
- “synergy” and “synergies”
- “ray” and “rays”
Exceptions to the general rules, and very specialized rules that apply only to a small number of English words, do not need to be part of your spec.
one additional similarity rule of your choice not about English vocabulary
- e.g. a grammatical form or spelling variation in some other language, or a rule for another learning topic that flashcards might be useful for

Canonical. Your spec should also define what the “canonical” card is for a group of similar cards, in a clear, deterministic, and helpful way, which is stronger than the original weak spec.

Information. Thinking about canonical cards will also involve thinking about the “information” content of cards, and how to combine information from multiple cards, deterministically.

For simplicity, your spec should not require the implementation to refer to a list of exceptions or a dictionary. So your spec may be unhelpfully inaccurate on irregular words that would require an exception list or dictionary lookup. For example, your spec may treat “mouse” and “mice” as not similar even though they are actually the same English noun, while it treats “these” and “theses” as similar even though they are completely different words.

Keep in mind that you are strengthening the spec, so deduplicate can still be called in all the situations it could be used in the original weak spec. For example, some of the flashcards may not be about vocabulary at all, but about math: e.g. 3x7 / 21 and 5x5 / 25. A spec that assumed only vocabulary cards mattered, and collapsed all these weird math cards into a single canonical card, would indeed be a legal strengthening of deduplicate, but not a helpful one to a poor 9-year-old trying to learn the multiplication table.

So this is a design problem, and you will need to think broadly and carefully. In the criteria given for your spec above, “stronger” and “deterministic” are likely to be clear, but “helpful” is likely to involve tradeoffs and design decisions. There is no single right answer, but there are better and worse decisions.

After you change your spec in Deduplicate.java, revise your tests in DeduplicateTest.java and your implementation in Deduplicate.java so that the implementation satisfies your stronger spec and passes your updated tests.

Some advice about this problem:

Work iteratively. Bite off the requirements one bit at a time. First handle identical card fronts – update the spec, write tests, then implement. Then for English nouns: spec, test, implement. Then for your new rule: spec, test, implement.
Keep your original tests if possible. Each time you change the spec, revise your test suite to test your new spec by updating the testing strategy and test cases in DeduplicateTest.java. Keep the strategy and tests you wrote for the original weak staff-provided spec, so that your DeduplicateTest.java is checking your implementation for compliance with the original weak spec as well as your stronger spec.
Revise specs and tests as needed. If you find while you are testing or implementing that you need to revise your spec, do so.
Add/commit/push frequently. If you commit after every step of this process (spec, commit, test, commit, implement, commit, etc.), then Git will keep track of your old versions of spec, tests, and code, so that you can refer back to them or even recover them if you have to. You don’t have to preserve your implementation of the original staff-provided weak spec in your final code, because your Git history should have it.

Submitting

Make sure you commit AND push your work to your repository on Athena. We will use the state of your repository on Athena as of 10:00pm on the deadline date. When you git push, the continuous build system attempts to compile your code and run a subset of the autograder tests. You can always review your build results at didit.csail.mit.edu/6.031/fa19.

Didit feedback is provided on a best-effort basis:

There is no guarantee that Didit tests will run within any particular timeframe, or at all. If you push code close to the deadline, the large number of submissions will slow the turnaround time before your code is examined.
If you commit and push right before the deadline, the Didit build does not have to complete in order for that commit to be graded.
Passing some or all of the public tests on Didit is no guarantee that you will pass the full battery of autograding tests — but failing them is almost sure to mean lost points on the problem set.

Grading

Your overall ps1 grade will be computed as approximately:
33% alpha autograde + 7% alpha manual grade + 45% beta autograde + 15% beta manual grade

The autograder test cases will not change from alpha to beta, but their point values will. In order to encourage test-first programming, alpha autograding will put more weight on your tests and less weight on your implementations. On the beta, autograding will look for both good testing and good implementations.

Manual grading of the alpha will examine your test suites, your revised deduplicate() spec, and your git commit history. Manual grading of the beta may examine any part, including re-examining the test suites and spec, and how you addressed code review feedback.