6.033 2011 Design Project 2
I. Due Dates and Deliverables
You will be working on the second design project in teams of three
students who share the same recitation instructor. There are four
deliverables for this design project:
- A list of team members emailed to your TA by April 7, 2011.
- One copy of a design proposal (not exceeding 1,200 words),
due at 5pm on April 21, 2011.
- One copy of a design report (not exceeding 5,000 words),
due at 5pm on May 5, 2011.
- A five-minute in-recitation presentation, on May 10, 2011.
In consultation with the chair of the faculty we have determined
that the assignment follows the spirit of the end-of-term rules.
All deliverables should be submitted via the online submission site.
As with real life system designs, 6.033 design projects are
under-specified, and it is your job to complete the specification in
a sensible way given the overall requirements of the project. As with
designs in practice, the specifications often need some adjustment as
the design is fleshed out. We recommend that you start early so that
you can evolve your design over time. A good design is likely to take
more than just a few days to put together.
II. The Problem
Ben Bitdiddle watched the ``Twitter revolutions'' in several countries
over the past few months, and was disappointed to see how government was
able to prevent communications by shutting down ISPs in these countries.
Even though many people in these countries had computers, laptops,
and cell phones with wireless communication capability, they weren't
able to exploit it to organize themselves. Ben wants to come up with a
distributed messaging system that does not require any central server or
network infrastructure, and allows users to send messages and subscribe
to other users' messages, similar to a publish-subscribe system.
Consider the scenario of a densely populated urban area such as Cairo,
which might have about 10 million people with WiFi-enabled devices
(Metropolitan Cairo has about 20 million people), where each device is
often (but not always) within radio range of other devices. Many users
will want to follow (subscribe to) messages from a small number of
popular topics; for instance, you can think of one topic having upwards
of 1M subscribers. On the other hand, less popular topics may only
have a handful of subscribers. Publishers send messages to a topic
(identified by a string), and users subscribe only to topics, and not to
individual publishers. There's no access control: any publisher can send
a message to a topic, and any user can subscribe to a topic. (Topics are
kind-of like ``hash-tags'' in Twitter, except that users subscribe
to a topic instead of manually searching for a hash-tag on Twitter.
The reason for the difference is that there is no single server that's
always online and can always perform a hash-tag search in this design.)
Our system should allow all publishers to post messages to topics, and
have the messages delivered to subscribers of those topics in a timely
manner (e.g., in a few hours at most). The goal is for a publisher
to be able to announce a gathering at Tahrir square the next morning,
and for most subscribers to receive this message in time.
A wireless radio works by allowing one node to broadcast a message to
any other node within radio range; those nodes will in turn receive this
message, and process it accordingly. You can assume that the wireless
layer provides you with four functions:
scan()
returns the list of nodes.
broadcast(msg)
broadcasts msg to all nodes within radio range,
i.e., likely every node returned by scan()
.
send(node, msg)
broadcasts msg to all nodes within radio range,
with a header indicating that only the specified node should look
at it.
receive()
returns any messages received from nearby nodes.
If several nodes are within radio range of each other, only one of them
can transmit at a time -- multiple transmissions would collide with each
other, making it impossible for receivers to receive any of the messages.
You can assume that the send function takes care of ensuring this,
similar to how Ethernet's collision detection works. You can think of
this as a fixed amount of bandwidth available for communication between
nodes in the same physical area.
You can assume a densely populated area of 10M users, where each node is
often within radio range of 5 other nodes. Assume that 10% of the nodes
are offline at any given time, although some nodes (e.g., an office or a
home desktop system) are much more reliable than others (e.g., cell phones
or laptops). You can assume that there are at least 1% highly reliable
nodes, and that they are evenly distributed over the area in question.
The average wireless rate is 1 Mbit/sec.
III. Requirements
The challenges you should address in your design project are as follows:
-
How does a user subscribe to a topic? How does a message from some
publisher sent to this topic find out about all of the subscribers?
What if the user is offline when the message is sent, or if the
publisher is offline when the user subscribes?
-
How does your system support popular topics that may have 10K-100K
subscribers without requiring any single computer with sufficient
bandwidth to handle this traffic, and without requiring any single
computer to be always online? How do publishers send messages to
such popular topics?
-
How does your system deal with individual users or groups of users
that turn off their computers when a message was published, or whose
wireless radios were out of range of the publisher at the time the
message was sent?
-
What happens if some user's computer breaks permanently, or moves
to a different location in the network?
-
How does a user move to a different computer, if their original
computer broke? Can they preserve their subscriptions?
You do not need to address security, anonymity, or authentication in
your design project (although it is an optional challenge problem).
After you have designed your system, you should evaluate it under the
following scenario:
- Ten popular topics have 1M subscribers each, and get a message every
hour. These topics each get a new subscriber every minute.
- There are also 1M other topics, each with 5 subscribers, and each
topic gets a new message every 5 minutes. However, the 5 subscribers
are often physically near the publisher, meaning that these messages
are of physically local interest only.
You should specify how much traffic will be caused in this scenario,
how much storage will be required, and when the subscribers will receive
the messages.
Optional challenge problems:
- Provide message authentication, meaning that a subscriber can be
reasonably confident that any message he or she sees was written
by the author the message claims to be from. Signing messages is
a good first step towards solving this problem, but how does the
subscriber know the publisher's key?
- How can your system deal with malicious nodes that may be run by
the government or some other adversary? There are many threats
you might consider from a malicious node, such as a node that
simply drops subscription requests from users, or a node that
actively sends messages to other nodes in an attempt to prevent
message delivery.
- Can you provide some form of anonymity or plausible deniability,
so that a government cannot tell what messages a publisher might
be writing, or what messages a user might be subscribing to?
IV. Design proposal
The design proposal should summarize your design in 1,200 words or
fewer. It should outline the protocol and algorithms the nodes use
to publish messages, to subscribe to topics, and to route messages
from publishers to subscribers.
You do not have to present a detailed rationale or analysis in
your proposal. However, if any of your design decisions are unusual
(particularly creative, experimental, or risky) or if you deviate from
the requirements, you should explain and justify those decisions in
your proposal.
You will receive feedback from your TA in time to adjust your final
report.
V. Design report
Your report should explain your design. It should discuss the major
design decisions and tradeoffs you made, and justify your choices.
It should discuss any limitations of which you are aware. You should
assume that your report is being read by someone who has read this
assignment, but has not thought carefully about this particular
design problem. Give enough detail that your project can be turned
over successfully to an implementation team. Your report should
convince the reader that your design satisfies the requirements
in Section III.
Use this organization for your report:
- Title page: Give your report a title that reflects the
subject and scope of your project. Include your names, email address,
recitation instructor, section time(s), and the date on the title page.
- No table of contents.
- Introduction: Summarize what your design is intended to achieve,
outline the design, explain the major trade-offs
and design decisions you have made, and justify those trade-offs
and decisions.
- Design: Explain your design. Identify your design's main
components, state, algorithms, and protocols. You should sub-divide
the design, with corresponding subsections in the text, so that the
reader can focus on and understand one piece at a time. Explain why
your design makes sense as well as explaining how it works. Use
diagrams, pseudo-code, and worked examples as appropriate.
- Analysis: Explain how you expect your design to behave in different
scenarios. What scenarios might pose problems for performance
or correctness? What do you expect to be the
scalability limits of your design?
- Conclusion: Briefly summarize your design and provide
recommendations for further actions and a list of any problems that must
be resolved before the design can be implemented.
- Acknowledgments and references: Give credit to individuals whom
you consulted in developing your design. Provide a list of references
at the end using the IEEE citation-sequence system ("IEEE style")
described in the
Mayfield
Handbook.
- Word count. Please indicate the word count of your report at the
end of the document.
Here are a few tips:
- Use ideas and terms from the course notes when appropriate; this
will save you space (you can refer the reader to the relevant section
of the notes) and will save the reader some effort.
- Before you explain the solution to any given problem,
say what the problem is.
- Before presenting the details of any given design component,
ensure that the purpose and requirements of that
component are well described.
- It's often valuable to illustrate an idea using an example, but
an example is no substitute for a full explanation of the idea.
- You may want to separate the explanation of a component's
data structures (or packet formats) from its algorithms.
- Explain all figures, tables, and pseudo-code; explain what is
being presented, and what conclusions the reader should draw.
While the Writing Program will not be grading DP2, you should feel
free to ask them for help.
VI. Presentation
You will have only about five minutes for your presentation. The audience
will be very familiar with the problem, so you can get right to the guts
of your solution.
VII. How we evaluate your work
Your recitation instructor will assign your report a grade that reflects
both the design itself and how well your report presents the design. These
are the main high-level grading criteria:
- Clarity.
Is the design described well enough to be understood, evaluated, and implemented?
- Correctness. Does the design achieve the requirements laid out in
Section III? Does your report give a convincing analysis that your design
works under the described scenario? Will the design be stable if the
workload is slightly different from the described scenario?
- Simplicity.
Is the level of complexity in the design justified?
VII. Clarifications
- ``Physically close'' does not necessarily mean within wireless radio
broadcast range.
- Even for unpopular topics, your design must support an occasional user
from far away deciding to subscribe to this topic.
- If a node is powered off, or is out of radio range, it must reasonably
catch up on messages to topics it subscribed to, which it may have
missed in the meantime, once it powers up and comes within radio range
of other nodes. You can define what ``reasonable'' means in your
system---for example, it seems OK for a laptop to miss messages if
it was offline for a month---but make sure your design works for the
application mentioned above (announcing a meeting in Tahrir square).
- You cannot assume that devices have GPS units. As an optional challenge
problem, you can explain how your design would change if some devices
had access to GPS, but your base design has to work without GPS data.
- You can assume that the
send()
function is reliable: if
it indicates that transmission was successful, then the other node
received the message.
Keep in mind that the other node can crash shortly after receiving
your message, so this assumption may not be that important or useful.
The broadcast()
function is not reliable: it does not
request (or wait for) acknowledgments.