Design Project #2: Peer-to-peer File Sharing

This is a team project, so form a team of 3 people, where all of you have the same recitation instructor. Please send email to your recitation instructor as soon as possible with the names and email addresses of your team members. Please let us know by noon on Thursday, April 19th at the latest. And get to work! Your team's report is due on Thursday, May 10th, in recitation. Questions and answers about DP2 will also be posted here; check periodically for updates.

New: keep an eye on the DP2 FAQ for updates and clarifications.

Introduction

Your job is to design a peer-to-peer file sharing application. Your design should allow Internet users to cooperatively pool free disk space and network bandwidth for use in serving public-access data.

Many Internet users have useful and popular data, but do not have the server or network resources required to satisfy the demand for that data. For example, an open source developer might wish to publish a software distribution; the average demand for that distribution may be low, but the instantaneous demand after a new release may be too high for the developer's server or network connection to handle. Your design should allow such users to share disk space and bandwidth in a way that spreads the total load over all the participants. Your design should avoid the common requirement that each publisher of data own a network link with capacity equal to the peak demand for that data.

You should focus your design efforts on the availability and authenticity of the stored data; on avoiding centralized components; and on good performance with large numbers of clients and servers.

Your task

Your system should support the following model of use. Internet users, perhaps thousands or millions of them, will volunteer their machines for use as servers in your system. Publishers of data should be able to insert named documents into your system, causing them to be stored (in whatever manner you determine) on the servers. A publisher should be able to update a document without changing its name. A consumer of data should be able to retrieve a document from your system using its name, and verify the authenticity of the resulting data. You cannot assume that all the participating servers are trustworthy.

Your system should implement an API like the following for publishers and consumers:

insert(document, name): attempts to store or update a document under the given name. insert() is allowed to return a failure indication if, for example, it detects an un-authorized update.
get(name): if a publisher has inserted a document under the given name, return the document contents. Otherwise return failure.

You should also have some procedure for deleting or garbage collecting useless documents, or a good argument for why such a feature is not needed.

One of your design decisions is to figure out a reasonable naming scheme. You may find it useful to assume that names typically come from links embedded in documents, much like URLs; this means that you don't have to make it particularly convenient for users to remember and type document names. Consider using either content hashes (a cryptographic hash of the document's content) or public keys as document names. In the latter case, you could demand that publishers include a digital signature in each document; consumers could check the signature against the name/key, and your servers could reject inserts (including updates) for which the name/key did not match the signature. One place to look for naming ideas is the keys and searching section of the FreeNet paper.

The challenge in your project is to support the above API with no central control; your system should be truly cooperative, or in today's buzzwords peer-to-peer. Your system will be composed of the Internet hosts of a large number of volunteer users. None of these hosts will be particularly reliable or powerful. You cannot assume the existence of some special set of very reliable or high-capacity servers. You can't assume the existence of any formal organization that runs or maintains your system.

More precisely, your job is to design a cooperative system that:

Defines a naming scheme for documents;
Allows publishers to insert and update named documents;
Allows consumers to retrieve documents by name;
Allows consumers to verify that a retrieved document hasn't been tampered with (the retrieved document is the same as the published document);
Allows the consumers to retrieve a document, even though some of the nodes in your system have failed; remember the nodes are run by ordinary Internet users.

A user of your systems retrieves documents through a Web browser, but you are allowed to design a program that runs on the user's computer that intercepts the browser's HTTP requests and feeds them into your system.

Chord Interface

To simplify the problem, we provide you with a distributed hash system, called Chord, that maps keys to nodes. Chord keys are arbitrary 160-bit values; you'll probably want to use SHA-1 to generate them in some way based on your document names. You might use Chord to map a document name to the server that stores it.

Chord is described on a separate page. You don't have to understand its inner workings in order to use it, so you can safely skip the description. However, you may find that understanding Chord will give you better opportunities for optimization.

Chord assigns every participating node a 160-bit ID. Chord maps key k to the node with the smallest ID that follows k; that node is called k's successor. A Chord node does not keep a complete list of all other Chord nodes in order to perform this mapping. Instead, each node knows about a few other nodes, and sends "find successor" messages to those other nodes to help it map keys to nodes.

Chord presents you with this API:

join() Call this function when your software starts up. It will cause the node on which it is called to be added to the Chord system. The new node will exchange messages with some existing nodes to make them aware of its existance. join() returns the calling node's Chord ID.
lookup(key) Returns the Chord ID and IP address of key's successor Chord node. This typically involves sending find successor messages to other Chord nodes.
event_register(fn) Arranges for Chord to call function fn whenever a node with ID close to the current node's ID joins or leaves the Chord ring. You would supply such a function if you wanted to be notified when other nodes join and leave the system. Chord will call fn as fn(join?, id, addr), where the join? flag indicates if a node is joining or leaving, id is the node's ID, and addr is the node's IP address. Chord only uses event_register() to keep a node up to date about the identities of its immediate successor and precessor in the ID space.
lookup_register(fn) Arranges for Chord to call fn at each step of a call to lookup(key) on the current node. Chord will call fn as fn(key, id, addr), where id is the Chord ID of a node it is about to send a find successor message to, and addr is that node's IP address.

You can change the details of this interface if you need to, as long as the change is consistent with the Chord implementation. If you change the interface, your report should include an outline of how to implement your changes.

Other peer-to-peer systems

Peer-to-peer systems are a hot topic and quite a few have been proposed or implemented. They differ in the services they offer and the guarantees they provide.

Napster allows people to share music files across the Internet. It involves a central server, containing a database of everybody's music files. You ask the central server to perform a lookup for you, which returns the IP addresses of a few Napster users who have the music you want. Then you directly contact one of those users' computers and ask for the file. The main drawback of Napster is the central server, which is a potential performance and reliabilty problem. Napster retrieves documents using keyword search, which you don't need to implement; you should identify documents using unique IDs or names.

GNUtella provides distributed search, as well as direct peer to peer transfer of the resulting files. Each GNUtella node knows about a few other neighbor nodes; this results in a mesh of nodes. When a node wants to find a file, it floods a query for the file over the whole mesh, starting with the nodes it knows about. When a node receives a query, it tries to match that query against locally-stored files. If there's a match, it returns the file to the original querier. Otherwise the node forwards the queries to its neighbors. This approach eliminates the central component of Napster. However, every node is involved in every lookup, so GNUtella doesn't scale well to large numbers of nodes.

FreeNet provides an interface similar to the one we are asking for in this assignment. Each document has an identifier, clients ask for documents using identifiers, and FreeNet figures out which server node is likely to store a document based on the identifier. You may find FreeNet a useful source of ideas about naming and authenticity checking. FreeNet provides anonymity for both consumers and publishers of data, and makes malicious deletion of data difficult. However, FreeNet's focus on anonymity causes it to sometimes fail to find documents that exist. You are not required to implement anonymity.

The Eternity Service is a distributed storage service that guarantees that documents will never disappear. Its guarantees are quite strong; it can even defend against legal attacks by governments. You are not required to provide eternal storage.

Your report

Your report should include an explanation of how your system works, and a justification of why your design is good. You should include pseudo-code and diagrams as aids to (but not replacements for) your explanation.

Your paper must not exceed ten 11-point, single-spaced pages in length. Please use 1-inch margins.

Your report should describe the operation of your system. It should describe how your insert() and get() functions work. It should describe how to publish a document, how users name documents, and how users decide whether document content is authentic. You should characterize the extent to which server and network failures affect the availability of documents.

Make sure your report touches on the following points. You should not try to design a system that is perfect in all of these areas, but you should at least be able to describe its properties.

How does your system balance the load of serving a single extremely popular document? Consider both small (1000-byte) and large (1-megabyte) files. How long are users likely to have to wait while downloading such files?
How well does your system handle failure? Suppose a randomly selected 5% of your nodes crash; how likely is that to make some piece of data not available?
How scalable is your system? Would your system make sense with thousands of nodes? Millions?
How does your system implement deletion of files?
If a publisher updates a file, is it possible for a user to see an inconsistent mix of old and new data? Or for a user to observe new data followed by out-of-date data? Could a malicious party cause such inconsistency to occur?

Writing Suggestions

This section provides some suggestions and guidelines on writing style and some of the things we look for in your report.

1. Suggestions on Writing Style

Your paper should be as long as is necessary to explain the problem, your solution, the alternatives you considered, and the reasons for your choices. It should be no longer than that. Again, please make sure your paper is no longer than ten pages.

A good paper begins with an abstract. The abstract is a very short summary of the entire paper. It is not an outline of the organization of the paper! It states the problem to be addressed (in one sentence). It states the essential points of your solution, without any detailed justification. And it announces any conclusions you have drawn. Good abstracts can fit in 100-150 words, at most.

The body of your paper should expand the points made in the abstract. Here you should:

Explain the problem and the externally imposed constraints. You should explain to your intended audience the background of the problem in terms that the audience can understand.
Explain and elaborate your solution. Be sure to explain the approach or architecture conceptually before delving into details, components, and mechanics. (Remember, you aren't writing a mystery novel!) Present any analysis clearly and concisely.
Show how your solution satisfies the constraints and solves the problem (or how it fails to do so); explain how the properties of your solution that result from choices you have made in your design are reasonable or desirable in the context of the problem statement.
Briefly describe the alternative approaches that you considered and rejected, and why you rejected them. Your paper will be more convincing if you say not just why your idea is good, but why it is better than the alternatives. (For example, if another approach would meet all of the objectives perfectly, but the cost would be 100 times higher, then you should mention that as a reason for choosing your less general but cheaper approach.)
Document your sources, giving a list of all references (including personal communications). The style of your citations (references) and bibliography should be similar to the styles in the technical papers you're reading in this class. In particular, a bibliography at the end and no citations in the text of your paper is insufficient; we'd like to see what specific pieces of information you learned from where as we read your paper.

Write for an audience consisting of colleagues who took 6.033 five years ago. That is, they understand the underlying system and network concepts and have a fair amount of experience applying them in various situations, but they have not thought carefully about the particular problem you are dealing with. Give enough detail that you rdesign could be turned over to some programmers for implementation with some confidence that you won't surprised by the result.

2. How do we evaluate your report?

When evaluating your report, your instructor will look at both content and writing.

Some content considerations:

Does your solution actually address the stated problem?
Do you explain your decisions and the trade-offs?
How complex is your solution? Simple is better, yet sometimes simple won't do the job. But unnecessary complexity is bad.
Does your solution fit well with the rest of the system? If your solution requires modifying every piece of hardware, software, and data in sight, it won't be credible, unless you can come up with a very good story why everything needs to be changed.
How extensible is your design? Are there opportunities for later addition of desirable features that you decided to omit?
Is your analysis clear?

Some writing considerations:

Is the report easy to comprehend?
Is it well organized and coherent?
Does it use diagrams where appropriate? (A frequent problem when people use word processors is that they try to express everything in words, either because the word processor doesn't make it easy to include diagrams, or they haven't ever learned how to use the drawing features. Pictures can communicate some ideas far better.)
Does it use the concepts, models, and terminology introduced in 6.033? If not, does it have a good reason for using a different universe of discourse?
Does it address the intended audience?
Is there a good abstract and bibliography?

You can find other helpful suggestions on writing this kind of report in the M.I.T. Writing Program's on-line guide to writing Design and Feasibility Reports. You may also want to look at the Mayfield Handbook's explanation of IEEE documentation style. A very good book on writing style is: "The Elements of Style," by William Strunk Jr. and E. B. White, Third Ed., MacMillan Publishing Co., New York, NY, 1979.

VII. Schedule

Your report is due at the beginning of recitation Thursday, May 10th, 2001.

Go to 6.033 Home Page Questions or Comments: 6.033-tas@mit.edu