6.031
6.031 — Software Construction
Fall 2018

Norn Specification

The purpose of this system is to represent and evaluate mailing list expressions. The result of evaluating a mailing list expression is a set of recipients who should receive an email message. Note that this system won’t send email itself. It will just produce a comma-separated list of recipient email addresses, suitable for copy and paste into an mail client.

It will also produce visualizations that explain the structure and meaning of mailing list expressions to users.

Recipients

A recipient is an email address with a username and domain name, like bitdiddle@mit.edu.

Usernames are nonempty case-insensitive strings of letters, digits, underscores, dashes, periods, and plus signs (e.g. bitdiddle+nospam).

Domain names are nonempty case-insensitive strings of letters, digits, underscores, dashes, and periods. This system does not enforce any other constraints on domain names. For example, we don’t care how many dot-separated parts a domain name has, or whether it ends with a valid top-level domain like .com or .edu.

List expressions

The simplest list expressions are:

  • the empty string, which means no recipients;
  • an email address, as defined above, which represents a single recipient;
  • a list name, which represents the set of recipients of a mailing list.

List names are nonempty case-insensitive strings of letters, digits, underscores, dashes, and periods (e.g. course.6).

For all the operators below, list expression subexpressions may be any valid list expression.

Set operators

A list expression may also use these set operators:

  • e,f means set union: recipients in either e or f;
  • e!f means set difference: recipients in e but not in f;
  • e*f means set intersection: recipients in both e and f.

Note that ! in this language is a binary operator, not a unary operator like it is in Java.

Of these three operators, * has highest precedence, and , has lowest precedence, so a,b!c*d is equivalent to a,(b!(c*d)). Operators at the same level of precedence group from left to right, so a!b!c means (a!b)!c. This is the same order as the usual arithmetic operators, in which 9-3-5 means (9-3)-5 rather than 9-(3-5).

Parentheses may be used to group subexpressions:

  • (e) represents the same recipients as e.

List definitions

A list expression may define a list name:

  • listname=e defines listname as the expression e, and returns the set of recipients of e.

The name definition operator = has lower precedence than ,, so that x=a,b is equivalent to x=(a,b).

Binary operators should be evaluated from left to right, so that list definitions in the left-hand side expression e must be substituted into the right-hand side expression f. For example, the meaning of (room=alice@mit.edu)*room should be alice@mit.edu.

The behavior of list definitions nested inside other list definitions, like a=(b=c), is not explicitly specified. The system should do something reasonable and self-consistent.

Sequencing

Finally, a list expression may be a sequence of list expressions separated by semicolons:

  • e ; f represents the recipients produced by f, after substituting the expressions of all named list definitions found in e

For example, consider x = a@mit.edu,b@mit.edu ; x * b@mit.edu. After substituting for x in the second part of the expression, this expression is equivalent to (a@mit.edu,b@mit.edu) * b@mit.edu, which represents the single recipient b@mit.edu.

Note that a semicolon is a separator between expressions, not a terminator. An expression ending with a semicolon, like a@mit.edu;, actually ends with the empty expression, so it evaluates to the empty set of recipients.

Semicolons have the lowest precedence of all the operators.

Whitespace

Whitespace characters around operators, email addresses, and list names are irrelevant and ignored, so a,b,c means the same as a , b , c. Whitespace characters are spaces, tabs (\t), carriage returns (\r), and linefeeds (\n).

List definition rules

Lists that depend on each other can be created in any order. For example, these two expressions should have the same result:

suite=room1,room2; room1=alice@mit.edu; room2=bob@mit.edu; suite
room1=alice@mit.edu; room2=bob@mit.edu; suite=room1,room2; suite

Undefined lists are empty. If a list name is evaluated without being previously defined, it represents the empty set of recipients.

A list can be edited, that is, redefined using its previous definition. For example, room1=alice@mit.edu; room1=room1,eve@mit.edu; room1 should evaluate to the recipients alice@mit.edu,eve@mit.edu. Generally, in a name definition listname=e, any previous definition for listname should be substituted wherever listname occurs in e. If listname has no previous definition, then the empty expression (i.e., no recipients) should be substituted for listname.

Dependent lists should see the edits. The edits to a list should be reflected in other lists that depend on it. For example, room1=alice@mit.edu; room2=bob@mit.edu; suite=room1,room2; room1=eve@mit.edu; suite should evaluate to the recipients eve@mit.edu,bob@mit.edu. Note that this means that for a name definition listname=e, no list names other than listname should be substituted into e until listname is actually used.

Mail loops are not allowed. Mutually-recursive list definitions like a=b;b=a should produce an error with an informative error message. Note that the rules above already prevent mail loops involving a single list, because a=a is not a recursive definition, but instead an edit to list a that simply makes no changes to a. But mail loops can arise when multiple lists depend on each other, so the system must detect and prevent those loops.

Console user interface

The mailing list expression language should be implemented in an interactive system that allows the user to enter expressions at the console and see how they expand into lists of recipients. When a valid list expression is entered at the console, the output should be a valid list expression with the same meaning but consisting only of email addresses and commas.

For example, we could interact with it like this (user input in green):

These are example outputs, not fully-determined outputs

Your system’s output may vary within the bounds of the spec. Examples of variation include whitespace, capitalization, and order.

> hobbits = bilbo@shire, frodo@shire, sam@shire, merry@shire, pippin@shire
bilbo@shire, frodo@shire, sam@shire, merry@shire, pippin@shire

> gandalf@cosmos, hobbits
gandalf@cosmos, bilbo@shire, frodo@shire, sam@shire, merry@shire, pippin@shire

> bagginses = bilbo@shire, frodo@shire
bilbo@shire, frodo@shire

> hobbits !bagginses
merry@shire, sam@shire, pippin@shire

Semantically, the interactive system’s output should be the result of evaluating the concatenation of all (non-erroneous) inputs that have been entered by the user, separated by ;. So the last result printed above should be the same as the result of the single concatenated expression:

hobbits = bilbo@shire, frodo@shire, sam@shire, merry@shire, pippin@shire ;
gandalf@cosmos, hobbits ;
bagginses = bilbo@shire, frodo@shire ;
hobbits !bagginses

Even though this is the specified semantics, your system does not have to actually implement it this way. It makes more sense to create a mutable data type to store list definitions, so that you do not have to repeatedly evaluate old expressions.

If the user enters an invalid list expression, the system should print an informative human-readable error message. This message is not further specified.

Here is an example that demonstrates editing mailing lists:

These are example outputs, not fully-determined outputs

Your system’s output may vary within the bounds of the spec. Examples of variation include whitespace, capitalization, and order.

> fellowship = frodo@shire, sam@shire, merry@shire, pippin@shire
frodo@shire, sam@shire, merry@shire, pippin@shire

> fellowship = fellowship, strider
frodo@shire, sam@shire, merry@shire, pippin@shire

> strider = aragorn@arnor
aragorn@arnor

> fellowship
frodo@shire, sam@shire, merry@shire, pippin@shire, aragorn@arnor

> fellowship = fellowship, gandalf@cosmos, gimli@erebor, legolas@mirkwood, boromir@gondor
frodo@shire, sam@shire, merry@shire, pippin@shire, aragorn@arnor, gandalf@cosmos, gimli@erebor, legolas@mirkwood, boromir@gondor

> fellowship = fellowship !gandalf@cosmos !boromir@gondor
frodo@shire, sam@shire, merry@shire, pippin@shire, aragorn@arnor, gimli@erebor, legolas@mirkwood

> fellowship = 


> fellowship

Loading and saving

The console interface should provide /save and /load commands that save and load all currently-defined named lists to a user-provided filename:

These are example outputs, not fully-determined outputs

Your system’s output may vary within the bounds of the spec. Examples of variation include whitespace, capitalization, and order.

> hobbits
bilbo@shire, frodo@shire, sam@shire, merry@shire, pippin@shire

> /save LOTR.txt

> hobbits =


> /load LOTR.txt

> hobbits
bilbo@shire, frodo@shire, sam@shire, merry@shire, pippin@shire

The contents of the file should be a single valid list expression — specifically, a sequence of list definitions. The behavior of the load command should be as if that list expression were provided as input to the console UI.

The save command should store list definitions in such a way that when reloaded they can still be edited as described above.

If /load is used on a filename that does not contain a valid list expression, or if /save is used on a filename that cannot be opened for writing (e.g. because the name contains illegal characters or refers to a folder that doesn’t exist), the system should print an informative human-readable error message.

Filenames cannot contain newlines but are not otherwise constrained by this spec.

Web user interface

In order to be more usable, the system will also be a web server, offering a web interface served from port 8080 using the standard HTTP protocol.

The system must be designed to be safe for use by a console user and multiple web users at the same time. All users should share the same set of list definitions. If one user defines or redefines a list, all users should subsequently see that edit.

The only request the system must support is evaluating a list expression:

GET /eval/list-expression

where list-expression is a list expression as defined above, but with all whitespace omitted. The result of this GET request should be an HTML page that, when viewed in a web browser, provides two user interface features:

  • displays the set of recipients represented by list-expression, so that the user can copy and paste them
  • presents a visualization that explains how the expression was evaluated

Visualization

The visualization must show, in some form:

  • the complete structure of the expression, in terms of operators and their operands
  • all the recipients involved in evaluating the expression, whether or not those recipients are included in the result

If list-expression is not a valid list expression, the system’s HTML response should show an informative human-readable error message.

The display of the user interface and the specifics of the visualization are entirely up to you. The visualization may be implemented with HTML, images, or a combination of both. It cannot require JavaScript. You should test it manually in at least two of Firefox, Chrome, Safari, and Edge.

For example, if the user on the console has defined:

> bagginses = bilbo@shire, frodo@shire
bilbo@shire, frodo@shire

> wizards = gandalf@cosmos, saruman@cosmos
gandalf@cosmos, saruman@cosmos

Then when a user on the same machine visits this URL in their web browser:

http://localhost:8080/eval/wizards,bagginses

These are example outputs, not fully-determined outputs

Your system’s output may vary within the bounds of the spec, which requires only that you show the complete structure of operators & operands and all the recipients involved in evaluation.

the web server might respond with the following HTML:

gandalf@cosmos, saruman@cosmos, bilbo@shire, frodo@shire
<hr>
( wizards &cup; bagginses )
<hr>
wizards = gandalf@cosmos, saruman@cosmos
<hr>
bagginses = bilbo@shire, frodo@shire

which appears in the web browser like this:

Correction

The example originally claimed to visualize:

http://localhost:8080/eval/wizards=gandalf@cosmos,saruman@cosmos;wizards,bagginses

but it was only showing the structure of the last sub­expression in the sequence.

The visualization must include the full recursive structure of the input expression. For this longer input, while the set of recipients would be the same, the visualization must also show the operands of ;, of =, and of the , inside the definition of wizards.

gandalf@cosmos, saruman@cosmos, bilbo@shire, frodo@shire
( wizards ∪ bagginses )
wizards = gandalf@cosmos, saruman@cosmos
bagginses = bilbo@shire, frodo@shire

and shows how the wizards and bagginses lists were unioned to compute the overall result.

The visualization should not include recipients or list names that are irrelevant to evaluating the expression, unless that information is clearly marked as ancillary (e.g., annotating bilbo@shire and frodo@shire to show that they also are part of the list hobbits; or including all current list definitions in the visualization, but highlighting the definitions used in the requested expression).

The visualization may use lines, colors, or shapes to communicate structure, and recipients or lists may be identified by acronyms, icons, images, etc. as long as the visualization assigns them uniquely and consistently, and provides some way to see the full email address of every recipient (e.g. with a tooltip or by consulting a key).

The provided code for the project demonstrates using the HTTP server library to render a visualization that includes images generated dynamically by the server.

Going further

If you want to go beyond the spec, you might extend the list expression language with new features, e.g.:

  • user-written comments, like // this is a comment
  • regular expressions that expand into a set of list names, like /6.\d+-students/
  • macro definitions, like {1}-minus-fascists = {1} !all-faculty

You might extend the console user interface with additional commands, e.g.:

  • showing the definition of a list instead of evaluating it, e.g. /show hobbits
  • seeing all currently-defined list names, e.g. /showall
  • debugging an expression with a console visualization of its evaluation, e.g. /debug hobbits

You might extend the web interface with requests for other console commands. For example, the web interface as specified doesn’t support loading and saving, because /load and /save are not list expressions.

Or you might extend your visualization to make it more usable or more aesthetically pleasing, to make it interactive, or to allow the user to manipulate list expressions in addition to just showing them.

None of these features are required. Optional features make your system more powerful and interesting, but your grade will be based on whether your code is safe from bugs, easy to understand, and ready for change, not by how many extra features it has.