6.033 2011 Design Project 2

I. Due Dates and Deliverables

You will be working on the second design project in teams of three students who share the same recitation instructor. There are four deliverables for this design project:
  1. A list of team members emailed to your TA by April 7, 2011.
  2. One copy of a design proposal (not exceeding 1,200 words), due at 5pm on April 21, 2011.
  3. One copy of a design report (not exceeding 5,000 words), due at 5pm on May 5, 2011.
  4. A five-minute in-recitation presentation, on May 10, 2011. In consultation with the chair of the faculty we have determined that the assignment follows the spirit of the end-of-term rules.

All deliverables should be submitted via the online submission site.

As with real life system designs, 6.033 design projects are under-specified, and it is your job to complete the specification in a sensible way given the overall requirements of the project. As with designs in practice, the specifications often need some adjustment as the design is fleshed out. We recommend that you start early so that you can evolve your design over time. A good design is likely to take more than just a few days to put together.

II. The Problem

Ben Bitdiddle watched the ``Twitter revolutions'' in several countries over the past few months, and was disappointed to see how government was able to prevent communications by shutting down ISPs in these countries. Even though many people in these countries had computers, laptops, and cell phones with wireless communication capability, they weren't able to exploit it to organize themselves. Ben wants to come up with a distributed messaging system that does not require any central server or network infrastructure, and allows users to send messages and subscribe to other users' messages, similar to a publish-subscribe system.

Consider the scenario of a densely populated urban area such as Cairo, which might have about 10 million people with WiFi-enabled devices (Metropolitan Cairo has about 20 million people), where each device is often (but not always) within radio range of other devices. Many users will want to follow (subscribe to) messages from a small number of popular topics; for instance, you can think of one topic having upwards of 1M subscribers. On the other hand, less popular topics may only have a handful of subscribers. Publishers send messages to a topic (identified by a string), and users subscribe only to topics, and not to individual publishers. There's no access control: any publisher can send a message to a topic, and any user can subscribe to a topic. (Topics are kind-of like ``hash-tags'' in Twitter, except that users subscribe to a topic instead of manually searching for a hash-tag on Twitter. The reason for the difference is that there is no single server that's always online and can always perform a hash-tag search in this design.) Our system should allow all publishers to post messages to topics, and have the messages delivered to subscribers of those topics in a timely manner (e.g., in a few hours at most). The goal is for a publisher to be able to announce a gathering at Tahrir square the next morning, and for most subscribers to receive this message in time.

A wireless radio works by allowing one node to broadcast a message to any other node within radio range; those nodes will in turn receive this message, and process it accordingly. You can assume that the wireless layer provides you with four functions:

If several nodes are within radio range of each other, only one of them can transmit at a time -- multiple transmissions would collide with each other, making it impossible for receivers to receive any of the messages. You can assume that the send function takes care of ensuring this, similar to how Ethernet's collision detection works. You can think of this as a fixed amount of bandwidth available for communication between nodes in the same physical area.

You can assume a densely populated area of 10M users, where each node is often within radio range of 5 other nodes. Assume that 10% of the nodes are offline at any given time, although some nodes (e.g., an office or a home desktop system) are much more reliable than others (e.g., cell phones or laptops). You can assume that there are at least 1% highly reliable nodes, and that they are evenly distributed over the area in question. The average wireless rate is 1 Mbit/sec.

III. Requirements

The challenges you should address in your design project are as follows:

  1. How does a user subscribe to a topic? How does a message from some publisher sent to this topic find out about all of the subscribers? What if the user is offline when the message is sent, or if the publisher is offline when the user subscribes?
  2. How does your system support popular topics that may have 10K-100K subscribers without requiring any single computer with sufficient bandwidth to handle this traffic, and without requiring any single computer to be always online? How do publishers send messages to such popular topics?
  3. How does your system deal with individual users or groups of users that turn off their computers when a message was published, or whose wireless radios were out of range of the publisher at the time the message was sent?
  4. What happens if some user's computer breaks permanently, or moves to a different location in the network?
  5. How does a user move to a different computer, if their original computer broke? Can they preserve their subscriptions?

You do not need to address security, anonymity, or authentication in your design project (although it is an optional challenge problem).

After you have designed your system, you should evaluate it under the following scenario:

You should specify how much traffic will be caused in this scenario, how much storage will be required, and when the subscribers will receive the messages.

Optional challenge problems:

IV. Design proposal

The design proposal should summarize your design in 1,200 words or fewer. It should outline the protocol and algorithms the nodes use to publish messages, to subscribe to topics, and to route messages from publishers to subscribers.

You do not have to present a detailed rationale or analysis in your proposal. However, if any of your design decisions are unusual (particularly creative, experimental, or risky) or if you deviate from the requirements, you should explain and justify those decisions in your proposal.

You will receive feedback from your TA in time to adjust your final report.

V. Design report

Your report should explain your design. It should discuss the major design decisions and tradeoffs you made, and justify your choices. It should discuss any limitations of which you are aware. You should assume that your report is being read by someone who has read this assignment, but has not thought carefully about this particular design problem. Give enough detail that your project can be turned over successfully to an implementation team. Your report should convince the reader that your design satisfies the requirements in Section III.

Use this organization for your report:

Here are a few tips:

While the Writing Program will not be grading DP2, you should feel free to ask them for help.

VI. Presentation

You will have only about five minutes for your presentation. The audience will be very familiar with the problem, so you can get right to the guts of your solution.

VII. How we evaluate your work

Your recitation instructor will assign your report a grade that reflects both the design itself and how well your report presents the design. These are the main high-level grading criteria:

VII. Clarifications