6.031 — Software Construction
Fall 2017

Problem Set 3: Memely

Iteration 1 due
Monday, October 30, 2017, 10:00 pm
Code reviews due
Friday, November 3, 2017, 11:00 am
Iteration 2 due
Monday, November 6, 2017, 10:00 pm

In this problem set, we will explore parsers, recursive data types, and equality for immutable types.

Compared to the previous problem sets, we are imposing very few restrictions on how you structure your code. In addition, much of the code that you write for problems in this problem set will depend heavily on how you decided to implement earlier parts of the problem set. We strongly recommend that you read through the entire assignment before writing any code.

Design Freedom and Restrictions

On several parts of this problem set, the classes and methods will be yours to specify and create, but you must pay attention to the PS3 instructions sections in the provided documentation.

You must satisfy the specifications of the provided interfaces and methods. You are, however, permitted to strengthen the provided specifications or add new methods. On this problem set, unlike previous problem sets, we will not be running your tests against any other implementations.

On this problem set, Didit provides less feedback about the correctness of your code:

  • It is your responsibility to examine Didit feedback and make sure your code compiles and runs properly for grading.
  • However, correctness is your responsibility alone, and you must rely on your own careful specification, testing, and implementation to achieve it.

Please remember to push early: we cannot guarantee timely Didit builds, especially near the problem set deadline.

Get the code

To get started,

  1. Ask Didit to create a remote psets/ps3 repository for you on Athena.
  2. Pull the repo from Athena using Git:

git clone ssh://[username]@athena.dialup.mit.edu/mit/6.031/git/fa17/psets/ps3/[username].git ps3

If you need a refresher on how to create, clone, and import your repository, see Problem Set 0.

Overview

Image memes are fun. But they would be even more fun if we could program them. In this problem set, we’ll implement a language that generates image memes.

The meaning of an expression in this language is a generated image, a 2D array of pixels with a specific width and height. All widths and heights in the language are positive integers.

The simplest primitives of the language are images, represented by filenames:

myroom.jpg

and captions, represented as quoted strings:

"This is my room"

An image filename loads an image from the specified file. Even though operating systems allow a variety of characters in a filename, in this language a filename may contain only letters, digits, periods, and forward slashes /. The image file format may be any format that the Java Image I/O package is capable of reading. If the image in the file has zero width or zero height (which is actually forbidden by most image formats), then behavior is undefined.

A caption produces an image showing the given text as a single line with no word-wrapping. A caption may contain any characters except newlines and double-quotes (though how the font actually displays unusual characters, like emojis, is unspecified). The font, color, and size of the text are unspecified, but the text must be reasonably readable. (If you choose to use a font other than the default font, make sure all the public tests render correctly on Didit.) The width and height must be big enough to not crop any of the text; extra margin or padding is permitted but unspecified.

Two expressions can be glued together side-by-side, e.g.:

A.jpg | B.jpg | C.jpg

produces a horizontal three-panel cartoon strip.

Expressions can be glued together top-to-bottom, e.g.:

1.jpg --- 2.jpg --- 3.jpg

produces a vertical three-panel strip.

Space, tab, carriage return, and newline characters around symbols are irrelevant and ignored, and any string of at least 3 - characters can be used to mean a single --- operator, so this can also be written:

1.jpg
--------
2.jpg
------
3.jpg
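
The lexical rules so far (filenames, captions, and the dash operator) can be written down as regular expressions. The sketch below is only an illustration: the class TokenSketch and its methods are hypothetical and are not part of the provided code, and it assumes that "letters" means ASCII letters and that an empty caption is allowed, neither of which this handout states.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the provided starting code.
final class TokenSketch {
    // Assumption: "letters" means ASCII letters only.
    static final Pattern FILENAME = Pattern.compile("[A-Za-z0-9./]+");
    // A caption is double-quoted and contains no newlines or double quotes.
    // Assumption: the empty caption "" is accepted.
    static final Pattern CAPTION = Pattern.compile("\"[^\"\\n]*\"");
    // Any run of three or more dashes means the single top-to-bottom glue operator.
    static final Pattern VERTICAL_GLUE = Pattern.compile("-{3,}");

    static boolean isFilename(String s) { return FILENAME.matcher(s).matches(); }
    static boolean isCaption(String s) { return CAPTION.matcher(s).matches(); }
    static boolean isVerticalGlue(String s) { return VERTICAL_GLUE.matcher(s).matches(); }

    public static void main(String[] args) {
        System.out.println(isFilename("img/boromir.jpg"));    // true
        System.out.println(isFilename("my-room.jpg"));        // false: dashes are not allowed
        System.out.println(isCaption("\"This is my room\"")); // true
        System.out.println(isCaption("\"no"));                // false: unterminated caption
        System.out.println(isVerticalGlue("--------"));       // true
        System.out.println(isVerticalGlue("--"));             // false: fewer than three dashes
    }
}
```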

Side-by-side gluing has higher precedence than top-to-bottom gluing, so:

A1.jpg  |  B1.jpg  |  C1.jpg
-----------------------------
A2.jpg  |  B2.jpg  |  C2.jpg

produces a 6-panel cartoon laid out in 2 rows of 3 images each. Parentheses can be used to group expressions and override precedence, so if the six image files are all the same size, here is another way to write the same layout:

(A1.jpg --- A2.jpg) | (B1.jpg --- B2.jpg) | (C1.jpg --- C2.jpg)

When images of different sizes are glued together side-by-side, the taller image is shrunk so that both images have the same height. Similarly, when different-sized images are glued top-to-bottom, the wider image is shrunk so that both have the same width. When shrinking an image, the aspect ratio (width/height) must be preserved as closely as possible, but widths and heights are rounded to the nearest integer.
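
As a worked instance of these rules, here is a minimal sketch of the size arithmetic for the two glue operators. The class and method names are hypothetical. It assumes that Math.round (which rounds halves up) is an acceptable reading of "rounded to the nearest integer", and it does not address the case where a very thin image would round to zero width or height.

```java
import java.awt.Dimension;

// Hypothetical helper for illustration; computes sizes only, no pixels.
final class GlueSizeSketch {
    /** Scales size to the given height, preserving aspect ratio as closely as possible. */
    static Dimension scaleToHeight(Dimension size, int newHeight) {
        int newWidth = (int) Math.round((double) size.width * newHeight / size.height);
        return new Dimension(newWidth, newHeight);
    }

    /** Scales size to the given width, preserving aspect ratio as closely as possible. */
    static Dimension scaleToWidth(Dimension size, int newWidth) {
        int newHeight = (int) Math.round((double) size.height * newWidth / size.width);
        return new Dimension(newWidth, newHeight);
    }

    /** Size of left | right: the taller operand is shrunk to the shorter one's height. */
    static Dimension sideBySide(Dimension left, Dimension right) {
        int height = Math.min(left.height, right.height);
        int width = scaleToHeight(left, height).width + scaleToHeight(right, height).width;
        return new Dimension(width, height);
    }

    /** Size of top --- bottom: the wider operand is shrunk to the narrower one's width. */
    static Dimension topToBottom(Dimension top, Dimension bottom) {
        int width = Math.min(top.width, bottom.width);
        int height = scaleToWidth(top, width).height + scaleToWidth(bottom, width).height;
        return new Dimension(width, height);
    }

    public static void main(String[] args) {
        Dimension strip = sideBySide(new Dimension(200, 150), new Dimension(200, 160));
        System.out.println(strip.width + "x" + strip.height); // 388x150
        Dimension column = topToBottom(new Dimension(200, 150), new Dimension(200, 160));
        System.out.println(column.width + "x" + column.height); // 200x310
    }
}
```

In the first call the 200x160 image is shrunk to height 150, which gives a width of 187.5 that rounds to 188, so the glued strip is 388 pixels wide.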

The language has two more combiner operators that overlay an image onto another image:

img/boromir.jpg ^ "One does not simply"

places the caption over the top part of the image so that the tops of both are aligned, and

img/boromir.jpg _ "walk into Mordor"

places the caption over the bottom of the image, so that the bottoms are aligned. The left-hand side and right-hand side of ^ and _ can be any expression, not just a caption, though these are most often used with captions. For both operators, if the two images have different widths, then the wider image is shrunk (preserving aspect ratio and rounding as above) so that the widths match. The resulting image’s height must be the maximum of the two image heights.
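
The overlay size rule can be sketched the same way. The names below are hypothetical, and the sketch assumes that the two heights are compared after the wider operand has been shrunk, which matches the 30x47 result in the layout examples of Problem 3.

```java
import java.awt.Dimension;

// Hypothetical helper for illustration; the same size rule serves both ^ and _.
final class OverlaySizeSketch {
    /** Size of an overlay of two images: widths are matched, heights are not added. */
    static Dimension overlay(Dimension first, Dimension second) {
        int width = Math.min(first.width, second.width);
        int firstHeight = scaledHeight(first, width);
        int secondHeight = scaledHeight(second, width);
        // Assumption: the maximum is taken over the heights after shrinking.
        return new Dimension(width, Math.max(firstHeight, secondHeight));
    }

    private static int scaledHeight(Dimension size, int newWidth) {
        return (int) Math.round((double) size.height * newWidth / size.width);
    }

    public static void main(String[] args) {
        Dimension result = overlay(new Dimension(200, 310), new Dimension(30, 30));
        System.out.println(result.width + "x" + result.height); // 30x47
    }
}
```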

Finally, any expression can be explicitly resized using the @ operator:

myface.jpg @ 300x200

which rescales the image (i.e. stretches or shrinks it) so that its width is 300 pixels and height is 200 pixels. Resizing does not have to preserve the aspect ratio of the image. Resizing to zero width or height has undefined behavior.

The precedence of the operators goes in the order @ ^ _ | ---. The resize @ operator has highest precedence, applied first. The top-overlay operator ^ has next highest precedence, followed by the bottom-overlay operator _, so that A ^ B _ C means (A ^ B) _ C. Then the side-by-side glue operator | is applied, and finally the top-to-bottom glue operator --- has lowest precedence. Precedence can be overridden by parentheses.
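
To make the precedence order concrete, the following sketch rewrites an expression with every grouping made explicit. It is a small hand-written recursive-descent parser used purely for illustration; it is not the ParserLib approach this problem set asks for, and every name in it is hypothetical. It assumes ASCII filenames, groups repeated operators to the left (a choice this handout leaves open), and allows a chain of several @ resizes.

```java
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written illustration only; the problem set itself asks for a ParserLib grammar.
final class PrecedenceSketch {
    private static final Pattern SIZE = Pattern.compile("@[ \\t\\r\\n]*[0-9]+x[0-9]+");
    private static final Pattern CARET = Pattern.compile("\\^");
    private static final Pattern UNDERSCORE = Pattern.compile("_");
    private static final Pattern BAR = Pattern.compile("\\|");
    private static final Pattern DASHES = Pattern.compile("-{3,}");
    private static final Pattern OPEN = Pattern.compile("\\(");
    private static final Pattern CLOSE = Pattern.compile("\\)");
    private static final Pattern CAPTION = Pattern.compile("\"[^\"\\n]*\"");
    private static final Pattern FILENAME = Pattern.compile("[A-Za-z0-9./]+");

    private final String input;
    private int pos = 0;

    private PrecedenceSketch(String input) { this.input = input; }

    /** Returns input with explicit parentheses around every operator application. */
    static String parenthesize(String input) {
        PrecedenceSketch parser = new PrecedenceSketch(input);
        String result = parser.vertical();
        parser.skipWhitespace();
        if (parser.pos != input.length()) {
            throw new IllegalArgumentException("unexpected input at position " + parser.pos);
        }
        return result;
    }

    // One method per precedence level, lowest first; each level parses the next-higher one.
    private String vertical() { return binary(DASHES, " --- ", this::horizontal); }
    private String horizontal() { return binary(BAR, " | ", this::bottomOverlay); }
    private String bottomOverlay() { return binary(UNDERSCORE, " _ ", this::topOverlay); }
    private String topOverlay() { return binary(CARET, " ^ ", this::resize); }

    private String binary(Pattern operator, String symbol, Supplier<String> operand) {
        String left = operand.get();
        while (accept(operator) != null) {
            left = "(" + left + symbol + operand.get() + ")"; // assumption: left-associative
        }
        return left;
    }

    private String resize() {
        String operand = primary();
        for (String size = accept(SIZE); size != null; size = accept(SIZE)) {
            operand = "(" + operand + " @ " + size.substring(1).trim() + ")";
        }
        return operand;
    }

    private String primary() {
        if (accept(OPEN) != null) {
            String inner = vertical();
            if (accept(CLOSE) == null) {
                throw new IllegalArgumentException("expected ) at position " + pos);
            }
            return inner;
        }
        String token = accept(CAPTION);
        if (token == null) { token = accept(FILENAME); }
        if (token == null) {
            throw new IllegalArgumentException("expected filename or caption at position " + pos);
        }
        return token;
    }

    /** Skips whitespace, then consumes and returns a match of token at pos, or returns null. */
    private String accept(Pattern token) {
        skipWhitespace();
        Matcher m = token.matcher(input).region(pos, input.length());
        if (!m.lookingAt()) { return null; }
        pos = m.end();
        return m.group();
    }

    private void skipWhitespace() {
        while (pos < input.length() && " \t\r\n".indexOf(input.charAt(pos)) >= 0) { pos++; }
    }

    public static void main(String[] args) {
        System.out.println(parenthesize("A ^ B _ C"));       // ((A ^ B) _ C)
        System.out.println(parenthesize("A | B --- C | D")); // ((A | B) --- (C | D))
    }
}
```

Each precedence level is a method that parses operands from the next-higher level, which is how the order @ ^ _ | --- ends up encoded in the call structure.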

The system has a console user interface where users may input expressions and see their results. When the user enters an expression on the console, that expression becomes the current expression and is echoed back to the user, possibly with reformatting (user input follows the > prompt):

These are example outputs, not fully determined outputs. Your system’s output may vary within the bounds of the spec. Examples of variation include whitespace, parentheses, simplification, operator representation, number representation, font face, text color, text size, and error messages.

> img/boromir.jpg ^ "One does not simply" _ "walk into Mordor"
(img/boromir.jpg^"One does not simply")_"walk into Mordor"

> img/tech1.png ---------- img/tech2.png ---------- img/tech3.png
img/tech1.png --- (img/tech2.png --- img/tech3.png)

A command starts with !. The command operates on the current expression, and may also update the current expression. Valid commands:

!layout
produces a fully laid-out expression, in which every subexpression is explicitly augmented with its computed width and height using the @ operator, and updates the current expression to it.

!generate
generates the image represented by the current expression and displays it in a window. This command does not update the current expression, and this command can be run on any valid current expression, not just the result of !layout.

Entering an invalid expression prints an error but does not update the current expression. The error should include a human-readable message but is not otherwise specified.

More examples:

> img/tech1.png | img/tech2.png
img/tech1.png|img/tech2.png

> img/tech3.png -|-|- img/tech4.png
unknown expression

> !layout
(img/tech1.png@200x150|img/tech2.png@200x140)@387x140

> !generate


The three things that a user can do at the console correspond to three provided method specifications in the code for this problem set:

  • Expression.parse()
  • Commands.layout()
  • Commands.generate()

These methods are used by Main to provide the user interface described above.

Problem 1: we will create the Expression data type to represent expressions in the program.

Problem 2: we will create the parser that turns a string into an Expression, and implement Expression.parse().

Problems 3-4: we will add new Expression operations for laying out an expression and generating an image from an expression, and implement Commands.layout() and Commands.generate().

Recommendation: do problems 1-4 first for a subset of the language with only three features: filenames, side-by-side glue |, and resize @. The Didit public tests use only these operators, and iter1 autograding will focus on these operators as well. Getting this minimal subset to work first will help you understand how all the pieces of the problem set work together.

Then go back and extend what you’ve done to support the additional features: captions, top-to-bottom glue ---, and the overlay operators ^ and _.


Problem 1: Representing Expressions

Define an immutable, recursive abstract data type to represent expressions as abstract syntax trees.

Your AST should be defined in the provided Expression interface (in Expression.java) and implemented by several concrete variants, one for each kind of expression. Each variant should be defined in its own appropriately-named .java file. You should have separate variant classes for each operator in the language.

Concrete syntax in the input, such as parentheses and whitespace, should not be represented at all in your AST.

It should be legal to create an abstract syntax tree containing filenames that do not currently exist in the filesystem. Only operations that actually require the image data (like layout and generate) should try to read the files.

For creating your AST type, you may find these examples useful:

1.1 Expression

To repeat, your data type must be immutable and recursive. Follow the recipe for creating an ADT:

  • Spec. Choose and specify operations. For this part of the problem set, the only operations Expression needs are creators and producers for building up an expression, plus the standard observers toString(), equals(), and hashCode(). We are strengthening the specs for these standard methods; see below.
  • Test. Partition and test your operations in ExpressionTest.java, including tests for toString(), equals(), and hashCode(). Note that we will not run your tests on any implementations other than yours.
  • Code. Write the rep for your Expression as a data type definition in a comment inside Expression. Implement the variant classes of your data type.

Remember to include a Javadoc comment above every class and every method you write; define abstraction functions and rep invariants, and write checkRep; and document safety from rep exposure.
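
The shape of such a data type might look like the sketch below, which covers only two variants. It is a hedged illustration, not a prescribed design: Expr is a hypothetical stand-in for the provided Expression interface, the factory method names are invented, and the variants are nested in one class only so that the sketch fits in a single block (in your solution each variant belongs in its own file).

```java
// Hypothetical sketch; Expr stands in for the provided Expression interface.
final class AdtSketch {
    /**
     * An immutable image expression.
     * Data type definition:
     *   Expr = Filename(name: String) + SideBySide(left: Expr, right: Expr)
     */
    interface Expr {
        /** @return an expression for the image stored in the file called name */
        static Expr filename(String name) { return new Filename(name); }

        /** @return an expression that glues left and right side by side */
        static Expr sideBySide(Expr left, Expr right) { return new SideBySide(left, right); }
    }

    static final class Filename implements Expr {
        private final String name;
        // Abstraction function: AF(name) = the image loaded from the file called name
        // Rep invariant: name is nonempty and has only letters, digits, periods, slashes
        // Safety from rep exposure: the only field is private, final, and an immutable String

        Filename(String name) {
            this.name = name;
            checkRep();
        }

        private void checkRep() {
            assert name.matches("[A-Za-z0-9./]+"); // assumption: ASCII letters only
        }

        @Override public String toString() { return name; }
    }

    static final class SideBySide implements Expr {
        private final Expr left;
        private final Expr right;
        // Abstraction function: AF(left, right) = the image of left glued to the left
        //   of the image of right
        // Rep invariant: left and right are non-null
        // Safety from rep exposure: fields are private and final, and Expr is immutable

        SideBySide(Expr left, Expr right) {
            this.left = left;
            this.right = right;
            checkRep();
        }

        private void checkRep() {
            assert left != null;
            assert right != null;
        }

        @Override public String toString() { return "(" + left + " | " + right + ")"; }
    }

    public static void main(String[] args) {
        Expr strip = Expr.sideBySide(Expr.filename("A.jpg"),
                Expr.sideBySide(Expr.filename("B.jpg"), Expr.filename("C.jpg")));
        System.out.println(strip); // (A.jpg | (B.jpg | C.jpg))
    }
}
```

Note that building the tree never touches the filesystem, so filenames that do not exist are accepted.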

1.2 toString()

Define the toString() operation on Expression so it can output itself as a string. This string must be a valid expression as defined above. You have the freedom to decide how to format the output with whitespace and parentheses for readability, but the output must denote the same image as the original expression.

Your toString() implementation must be recursive, and must not use instanceof.

Use the @Override annotation to ensure you are overriding the toString() inherited from Object.

Remember that your tests must obey the spec. If your toString() tests expect a certain formatting of whitespace and parentheses, you must specify this formatting in your spec.

1.3 equals() and hashCode()

Define the equals() and hashCode() operations on your AST to implement structural equality.

Structural equality defines two expressions to be equal if:

  1. the expressions contain the same filenames, captions, numbers, and operators;
  2. those filenames, captions, numbers, and operators are in the same order, read left-to-right;
  3. and they are grouped in the same way.

For example, the AST for a.jpg ^ "title" is not equal to the AST for "title" ^ a.jpg, but it is equal to the ASTs for a.jpg^"title", (a.jpg ^ "title"), and (a.jpg) ^ ("title"). The AST for (A | B) --- (C | D) is not equal to the AST for (A --- C) | (B --- D) even though they both may generate the same 2x2 comic strip.

For n-ary groupings where n is greater than 2:

  • Such expressions must be equal to themselves. For example, the ASTs for A | B | C and (A | B | C) must be equal.
  • However, whether they are equal or not to different groupings with the same image meaning is not specified, and you should choose an appropriate specification and implementation for your AST. For example, you must determine whether the ASTs for (A | B) | C and A | (B | C) are equal.

Remember: concrete syntax, including parentheses, should not be represented in your AST. Grouping, for example, should be reflected in the AST’s structure.

Be sure that AST instances which are considered equal according to this definition and according to equals() also satisfy observational equality.

Your equals() and hashCode() implementations must be recursive. Only equals() can use instanceof, and hashCode() must not.

Remember to use the @Override annotation.
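
A minimal sketch of recursive structural equality, under the assumption of a design with separate Filename, Caption, and TopOverlay variants (all names here are hypothetical, and the variants are nested only to keep the sketch in one block). equals() uses instanceof to reject other variants and then recurses into the children; hashCode() recurses through Objects.hash without inspecting types.

```java
import java.util.Objects;

// Hypothetical sketch of structural equality on three variants.
final class EqualitySketch {
    interface Expr { }

    static final class Filename implements Expr {
        private final String name;
        Filename(String name) { this.name = name; }

        @Override public boolean equals(Object that) {
            return that instanceof Filename && name.equals(((Filename) that).name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    static final class Caption implements Expr {
        private final String text;
        Caption(String text) { this.text = text; }

        @Override public boolean equals(Object that) {
            return that instanceof Caption && text.equals(((Caption) that).text);
        }
        @Override public int hashCode() { return text.hashCode(); }
    }

    static final class TopOverlay implements Expr {
        private final Expr base;
        private final Expr overlay;
        TopOverlay(Expr base, Expr overlay) {
            this.base = base;
            this.overlay = overlay;
        }

        @Override public boolean equals(Object that) {
            if (!(that instanceof TopOverlay)) { return false; }
            TopOverlay other = (TopOverlay) that;
            // Recursive: delegates to the equals() of each child, in order.
            return base.equals(other.base) && overlay.equals(other.overlay);
        }

        // Recursive: Objects.hash calls hashCode() on each child.
        @Override public int hashCode() { return Objects.hash(base, overlay); }
    }

    public static void main(String[] args) {
        Expr first = new TopOverlay(new Filename("a.jpg"), new Caption("title"));
        Expr same = new TopOverlay(new Filename("a.jpg"), new Caption("title"));
        Expr swapped = new TopOverlay(new Caption("title"), new Filename("a.jpg"));
        System.out.println(first.equals(same));                  // true
        System.out.println(first.hashCode() == same.hashCode()); // true
        System.out.println(first.equals(swapped));               // false
    }
}
```

Because parentheses and whitespace never reach the tree, the inputs a.jpg^"title" and (a.jpg) ^ ("title") would both produce the value called first above.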

Commit to Git. Once you’re happy with your solution to this problem, commit and push!


Problem 2: Parsing Expressions

Now we will create the parser that takes a string and produces an Expression value from it. The entry point for your parser should be Expression.parse(), whose spec is provided in the starting code.

Examples of valid inputs:

A.jpg | B.jpg --- C.jpg | D.jpg
A.jpg --- B.jpg --- C.jpg
((A_B)^C)---D
base.jpg_("all your base"---"are belong to us")

Examples of invalid inputs:

A - B - C

the --- operator must have at least three dashes

A --- --- B

no spaces within the --- operator

|A|B|

the | operator must be binary

"no

unterminated caption

my-room.jpg

filenames can’t have dashes

Examples of optional inputs (extensions to the language that you may want to design and support):

'caption'

single-quoted captions

"One does not simply"{Helvetica 96pt black}

font face, size, color of text

ghost.jpg * 50%

make image translucent

You may consider the optional inputs invalid, or you may choose to support additional features (like new operators) in the input. However, your system may not produce an output with a new feature unless that feature appeared in its input. This way, a client who knows about your extensions can trigger them, but clients who don’t know won’t encounter them unexpectedly.

2.1 Write a grammar

Write a ParserLib grammar for expressions as described in the overview. A starting ParserLib grammar file can be found in src/memely/Expression.g. This starting grammar recognizes filenames, side-by-side gluing, and resizing with @.

For more information on ParserLib, see:

2.2 Implement Expression.parse()

Implement Expression.parse() by following the recipe:

  • Spec. The spec for this method is given, but you may strengthen it if you want to make it easier to test. Remember that it should be legal to parse an expression containing filenames that do not currently exist in the filesystem, since parsing does not depend on image data from the files.

  • Test. Write tests for Expression.parse() and put them in ExpressionTest.java. Note that we will not run your tests on any implementations other than yours.

    Now that you are implementing Expression.parse(), it’s a good idea to review the spec for Expression.toString(), which specifies a testable relationship between parse(), equals(), and toString().

  • Code. Implement Expression.parse() so that it calls the parser generated by your ParserLib grammar. The reading on parsers discusses how to call the parser and construct an abstract syntax tree from it, including code examples. The starting code for this problem set includes a skeletal ExpressionParser.java that you can work from.

2.3 Run the console interface

Now that Expression values can be both parsed from strings with parse(), and converted back to strings with toString(), you can try entering expressions into the console interface.

Run Main. In Eclipse, the Console view will allow you to type expressions and see the result. Try some of the expressions from the top of this handout.

Commit to Git. Once you’re happy with your solution to this problem, commit and push!


Problem 3: Layout

The layout operation takes an expression and produces a fully laid-out expression as its result, with resize operators inserted as needed to show the size of every subexpression.

For example, the following are correct layout results:

img/tech3.png --- img/tech4.png
(img/tech3.png@200x150 --- img/tech4.png@200x160)@200x310

(img/tech3.png --- img/tech4.png) _ img/black.png
((img/tech3.png@200x150 --- img/tech4.png@200x160)@200x310 _ img/black.png@30x30)@30x47

img/black.png@100x400
img/black.png@100x400
because the filename was already enclosed by an explicit resize operator
also correct: (img/black.png@30x30)@100x400
also correct: ((img/black.png@30x30)@100x400)@100x400
redundant parentheses and redundant resize operators are okay

"tech support"
"tech support"@widthxheight
width and height of a caption may vary depending on platform and font choice

Incorrect layouts:

img/tech5.png | img/tech6.png
incorrect: (img/tech5.png@200x150|img/tech6.png@200x160)@387.5x150
correct: (img/tech5.png@200x150|img/tech6.png@200x160)@388x150
(because sizes must be integers)

(img/black.png | img/black.png)@10x5
incorrect: (img/black.png@5x5|img/black.png@5x5)@10x5
correct: (img/black.png@30x30|img/black.png@30x30)@10x5
(because the subexpression img/black.png|img/black.png must be laid out before the resize operator @10x5 is applied)
also correct: ((img/black.png@30x30|img/black.png@30x30)@60x30)@10x5

This operation is where you should implement the resizing rules described in the overview. For getting sizes of image files and text captions, you may find the following useful:
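
For image files, one possibility is the Java Image I/O package named in the overview. The sketch below is an assumption about how you might use it, not the required approach: sizeOf and roundTrip are hypothetical names, and the temporary PNG exists only so that the example runs without any of the problem set's image files. Measuring caption text is not covered here.

```java
import java.awt.Dimension;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of the provided starting code.
final class ImageSizeSketch {
    /**
     * @return the pixel size of the image stored in file
     * @throws IOException if the file cannot be read or is not in a supported format
     */
    static Dimension sizeOf(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) { // ImageIO.read returns null when no reader understands the file
            throw new IOException("no image reader for " + file);
        }
        return new Dimension(image.getWidth(), image.getHeight());
    }

    /** Writes a blank width x height PNG to a temporary file, then measures it. */
    static Dimension roundTrip(int width, int height) {
        try {
            File temp = File.createTempFile("size-sketch", ".png");
            temp.deleteOnExit();
            BufferedImage blank = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            ImageIO.write(blank, "png", temp);
            return sizeOf(temp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Dimension size = roundTrip(30, 20);
        System.out.println(size.width + "x" + size.height); // 30x20
    }
}
```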

3.1 Add an operation to Expression

You should implement layout as a method on your Expression data type, defined recursively. The signature and specification of the method are up to you to design, but it would be wise for your layout operation to return Expression rather than String. Follow the recipe:

  • Spec. Define your operation in Expression and write a spec.
  • Test. Put your tests in ExpressionTest.java. Note that we will not run your tests on any implementations other than yours.
  • Code. The implementation must be recursive. It must not use instanceof, nor any equivalent operation you have defined that checks the type of a variant.

You may find it useful to add more operations to Expression to help you implement the layout operation. For example, an operation like size() that computes an expression’s size and returns it as a Dimension value (or a similar type of your own devising) would make it easier to implement the size rules. Spec/test/code these additional operations using the same recipe, and make them recursive as well where appropriate. Your helper operations should not simply be a variation on using instanceof to test for a variant class.
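
One way these pieces might fit together is sketched below, with size() and layout() declared on the interface and implemented by each variant, so that no variant needs instanceof. All names are hypothetical. KnownImage is a stand-in leaf whose size is supplied directly, where a real filename variant would read the image file, and the toString() formatting is deliberately minimal.

```java
import java.awt.Dimension;

// Hypothetical sketch; variants are nested only to keep it in one block.
final class LayoutSketch {
    interface Expr {
        /** @return the width and height of the image this expression denotes */
        Dimension size();
        /** @return an equivalent expression with every subexpression wrapped in a resize */
        Expr layout();
    }

    /** Stand-in leaf with a known size; a real variant would get it from the image file. */
    static final class KnownImage implements Expr {
        private final String name;
        private final int width;
        private final int height;
        KnownImage(String name, int width, int height) {
            this.name = name;
            this.width = width;
            this.height = height;
        }
        @Override public Dimension size() { return new Dimension(width, height); }
        @Override public Expr layout() { return new Resize(this, size()); }
        @Override public String toString() { return name; }
    }

    static final class SideBySide implements Expr {
        private final Expr left;
        private final Expr right;
        SideBySide(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }
        @Override public Dimension size() {
            Dimension l = left.size();
            Dimension r = right.size();
            int height = Math.min(l.height, r.height);
            return new Dimension(scaledWidth(l, height) + scaledWidth(r, height), height);
        }
        private static int scaledWidth(Dimension d, int newHeight) {
            return (int) Math.round((double) d.width * newHeight / d.height);
        }
        @Override public Expr layout() {
            // Lay out the children first, then annotate the glued result with its own size.
            return new Resize(new SideBySide(left.layout(), right.layout()), size());
        }
        @Override public String toString() { return "(" + left + "|" + right + ")"; }
    }

    static final class Resize implements Expr {
        private final Expr inner;
        private final int width;
        private final int height;
        Resize(Expr inner, Dimension size) {
            this.inner = inner;
            this.width = size.width;
            this.height = size.height;
        }
        @Override public Dimension size() { return new Dimension(width, height); }
        @Override public Expr layout() { return new Resize(inner.layout(), size()); }
        // Minimal formatting: a resize of a resize prints as x@AxB@CxD.
        @Override public String toString() { return inner + "@" + width + "x" + height; }
    }

    public static void main(String[] args) {
        Expr pair = new SideBySide(new KnownImage("img/tech1.png", 200, 150),
                new KnownImage("img/tech2.png", 200, 140));
        System.out.println(pair.layout());
        // (img/tech1.png@200x150|img/tech2.png@200x140)@387x140
    }
}
```

In this sketch the children are annotated with their own natural sizes, and the shrinking caused by gluing shows up only in the size attached to the enclosing expression.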

3.2 Implement Commands.layout()

In order to connect your layout operation to the user interface, we need to implement the Commands.layout() method.

  • Spec. The spec for this operation is given, but you may strengthen it if you want to make it easier to test.
  • Test. Write tests for layout() and put them in CommandsTest.java. These tests will likely be very similar to the tests you used for your lower-level layout operation, but they must use Strings instead of Expression objects. Note that we will not run your tests on any implementations other than yours.
  • Code. Implement layout(). This should be straightforward: parse the expression, call your layout operation, and convert the result back to a string.

3.3 Run the console interface

We’ve now implemented the layout command in the console interface. Run Main and try some layouts in the Console view.

Commit to Git. Once you’re happy with your solution to this problem, commit and push!


Problem 4: Image Generation

The generate operation takes an expression and generates an image from it. You may strengthen this spec if you wish.

For example, the following are correct output for generated images:

img/tech1.png|img/tech2.png

img/boromir.jpg^"ONE DOES NOT SIMPLY"_"WALK INTO MORDOR"

img/black.png@600x50 ^
"          TECH SUPPORT          "
----------------------------------
img/tech1.png | img/tech2.png | img/tech3.png
----------------------------------
img/black.png@600x25 ^
(  "     What my friends think I do     "@200x25 
 | "     What my mom thinks I do     "@200x25 
 | "     What society thinks I do     "@200x25)
----------------------------------
img/black.png@600x25
----------------------------------
img/tech4.png | img/tech5.png | img/tech6.png
----------------------------------
img/black.png@600x25 ^
(  "     What my boss thinks I do     "@200x25 
 | "     What I think I do     "@200x25
 | "     What I actually do     "@200x25)
----------------------------------
img/black.png@600x25

For generating images, you may find the following useful:

4.1 Add an operation to Expression

You should implement generation as a method on your Expression data type, defined recursively. The signature and specification of the method are up to you to design. Follow the recipe:

  • Spec. Define your operation in Expression and write a spec.
  • Test. Put your tests in ExpressionTest.java. Note that we will not run your tests on any implementations other than yours.
  • Code. The implementation must be recursive (perhaps by calling recursive helper methods). It must not use instanceof, nor any equivalent operation you have defined that checks the type of a variant class.

You may find it useful to add more operations to Expression to help you implement the generate operation. Spec/test/code them using the same recipe, and make them recursive as well where appropriate. Your helper operations should not simply be a variation on using instanceof to test for a variant class.

Since your generate operation produces an image, your tests may need to inspect properties of that image, like its width, height, and possibly some values of its pixels. Examples.java included in the starting code has examples of doing this inspection.
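
As a rough illustration of both producing and inspecting an image, the sketch below glues two solid-color images side by side with java.awt.image.BufferedImage and then checks the result's size and two of its pixels. It is not taken from Examples.java; the class and method names are hypothetical, and solid-color inputs are used only so that the pixel values are predictable.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical sketch; operates on raw images rather than on Expression values.
final class GenerateSketch {
    /** @return a width x height image filled with a single color */
    static BufferedImage solid(int width, int height, Color color) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        g.setColor(color);
        g.fillRect(0, 0, width, height);
        g.dispose();
        return image;
    }

    /** @return left and right glued side by side, with the taller one shrunk to match */
    static BufferedImage sideBySide(BufferedImage left, BufferedImage right) {
        int height = Math.min(left.getHeight(), right.getHeight());
        int leftWidth = scaledWidth(left, height);
        int rightWidth = scaledWidth(right, height);
        BufferedImage out = new BufferedImage(leftWidth + rightWidth, height,
                BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        // drawImage with an explicit width and height scales the source to fit.
        g.drawImage(left, 0, 0, leftWidth, height, null);
        g.drawImage(right, leftWidth, 0, rightWidth, height, null);
        g.dispose();
        return out;
    }

    private static int scaledWidth(BufferedImage image, int newHeight) {
        return (int) Math.round((double) image.getWidth() * newHeight / image.getHeight());
    }

    public static void main(String[] args) {
        BufferedImage strip = sideBySide(solid(4, 4, Color.RED), solid(2, 2, Color.BLUE));
        System.out.println(strip.getWidth() + "x" + strip.getHeight()); // 4x2
        System.out.println(strip.getRGB(0, 0) == Color.RED.getRGB());
        System.out.println(strip.getRGB(3, 1) == Color.BLUE.getRGB());
    }
}
```

A test in this style can compare getWidth(), getHeight(), and selected getRGB() values against what the size rules predict.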

4.2 Implement Commands.generate()

In order to connect your generate operation to the user interface, we need to implement the Commands.generate() method.

  • Spec. The spec for this operation is given, but you may strengthen it if you want to make it easier to test.
  • Test. Write tests for generate() and put them in CommandsTest.java. These tests will likely be very similar to the tests you used for your lower-level generate operation, but they should use Strings instead of Expression objects. Note that we will not run your tests on any implementations other than yours.
  • Code. Implement generate(). This should be straightforward: simply parsing the expression and calling your generate operation.

4.3 Run the console interface

We’ve now implemented the generate command in the console interface. Run Main and try using it in the Console view.

Commit to Git. Once you’re happy with your solution to this problem, commit and push!


Before you’re done

  • Make sure you have documented specifications, in the form of properly-formatted Javadoc comments, for all your types and operations.

  • Make sure you have documented abstraction functions and representation invariants, in the form of a comment near the field declarations, for all your implementations.

    With the rep invariant, also say how the type prevents rep exposure.

    Make sure all types use checkRep() to check the rep invariant and implement toString() with a useful representation of the abstract value.

  • Make sure you have satisfied the Object contract for all types. In particular, you will need to specify, test, and implement equals() and hashCode() for all immutable types.

  • Use @Override when you override toString(), equals(), and hashCode(), to gain static checking of the correct signature.

    Also use @Override when a class implements an interface method, to remind readers where they can find the spec.

  • Make sure you have a thorough, principled test suite for every type. Note that Expression’s variant classes are considered part of its rep, so a single good test suite for Expression covers the variants too.


Submitting

Make sure you commit AND push your work to your repository on Athena. We will use the state of your repository on Athena as of 10:00 pm on the deadline date. When you git push, the continuous build system attempts to compile your code and run some basic tests. You can always review your build results at didit.csail.mit.edu.

Didit feedback is provided on a best-effort basis:

  • There is no guarantee that Didit tests will run within any particular timeframe, or at all. If you push code close to the deadline, the large number of submissions will slow the turnaround time before your code is examined.
  • If you commit and push right before the deadline, the Didit build does not have to complete in order for that commit to be graded.
  • Passing some or all of the public tests on Didit is no guarantee that you will pass the full battery of autograding tests — but failing them is almost sure to mean lost points on the problem set.

Grading

Your overall ps3 grade will be computed as approximately:
35% iter1 autograde + 5% iter1 manual grade + 45% iter2 autograde + 15% iter2 manual grade

The autograder test cases will not change from iter1 to iter2, but their point values will. In order to encourage incremental development, iter1 autograding will put more weight on tests using only filenames, |, and @, and less weight on tests that use the other features of the language (captions, ---, ^, and _). On iter2, autograding will look at the entire language.

Manual grading of iter1 will examine the specs of your Expression operations, the internal documentation (data type definition, AF, RI, etc.) of your Expression data type, and the implementations of Expression variants. Manual grading of iter2 may examine any part, including re-examining ADT implementations, and how you addressed code review feedback.