Introduction. One of the features of 6.033 is that we discuss real systems, both successful and unsuccessful. To get beyond discussion and give you some direct experience with designing systems we also assign two design projects. As in the real world, these projects have (we hope) a simple high-level problem statement but when you get into the system design you will find that there are many hard choices to make. And, as in the real world, it is your job to explore, understand, and explain what the best choices are, how to reconcile sometimes conflicting goals, and how to keep the complexity of your design under control.
This term, the first of these design projects relates to caching, networks, and the World-Wide Web.
The problem. MITnet connects to the rest of the Internet via NEARNET (New England Academic and Research Network), a consortium that provides links among the various universities and research laboratories in New England, and gateways to the Internet backbone providers. NEARNET charges the Institute a fixed monthly fee for this service.
This design project is based on a problem that has not actually come up, but it easily could, so from here on the problem statement is fantasy: The people who run NEARNET are pondering the idea of changing the method of calculating the fee: they want to start basing the fee on the number of bytes transferred in and out of M. I. T. Since M. I. T. is one of the busiest network sites in New England, they suspect that this will be a big revenue generator for NEARNET. They can easily install counters in M. I. T.'s NEARNET port, so from the point of view of NEARNET administration, this proposal looks like a big win.
The Provost, who is responsible for the Institute budget, is worried about this proposal, because increased revenue for NEARNET means a bigger budget deficit for M. I. T. But since the Provost is a faculty member in Computer Science, he believes that the problem is solvable with technology. He knows that a large fraction of the traffic between M. I. T. and the rest of the Internet is for World-Wide Web pages, and that often the same Web page is requested by many different users at M. I. T. So he has asked you to design a cache that can be located somewhere inside M. I. T. and that can hold frequently accessed Web pages from outside M. I. T.
Your job is to analyze the situation based on what you have learned in 6.033 and its prerequisites about the World-Wide Web, caching, and networks. Then develop a complete design for such a cache.
One potential complication is that the Laboratory for Computer Science has been encountering bottlenecks in moving traffic across the campus from Technology Square to the NEARNET gateway, and that laboratory is discussing a proposal to install a second NEARNET gateway in Tech Square.
The design project. Propose a design of something (you get to decide what: a server, a gateway feature, a protocol, a network topology, a routing strategy, a modified network browser, or some combination of these and other things) that introduces a cache that can reduce the flow of traffic for repeatedly requested WWW pages from outside M. I. T., while at the same time introducing a minimum of new problems. You do NOT have to implement the design.
Before jumping to the design stage, you should do some more reading:
Your solution should attempt to achieve the following (presumably) desirable properties:
Your report. Your paper should be 5 to 10 pages in length. You should start by explaining to your intended audience the background of the problem in terms that audience can understand. Next, describe your solution and explain how well it achieves (or fails to achieve) the desirable properties, and any other properties that you notice are worth providing. Throughout the paper you should justify each of your design decisions, especially in relation to alternative decisions that you could have made. You will be more convincing if you say not only why your idea is good, but why it is better than the alternatives. (For example, if another approach would be perfect except that the cost would be infinite, you should point that out.) If you can find any statistics about network traffic to support your proposal (or to support a recommendation that a cache is a bad idea) that would be a useful addition to your report.
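As one illustration of how traffic statistics might be turned into an argument, the following minimal sketch replays a request trace and counts the bytes that would cross the NEARNET gateway with and without a cache. Everything in it is an assumption made for illustration: the function name `external_bytes`, the trace format of (URL, size) pairs, the made-up trace itself, and the least-recently-used replacement policy. It counts only page bytes fetched from outside, ignores request bytes and stale pages, and is not a recommended design.

```python
from collections import OrderedDict


def external_bytes(trace, cache_capacity_bytes):
    """Return the page bytes fetched through the gateway for a request trace.

    trace: iterable of (url, size_in_bytes) pairs in request order (hypothetical format).
    cache_capacity_bytes: total cache size; 0 means no cache at all.
    """
    cache = OrderedDict()  # url -> size, ordered from least to most recently used
    used = 0               # bytes currently held in the cache
    fetched = 0            # bytes that had to come from outside
    for url, size in trace:
        if url in cache:
            cache.move_to_end(url)  # hit: served locally, nothing crosses the gateway
            continue
        fetched += size             # miss: the page crosses the gateway
        if size > cache_capacity_bytes:
            continue                # page can never fit, so do not cache it
        while used + size > cache_capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict least recently used
            used -= evicted_size
        cache[url] = size
        used += size
    return fetched


# A made-up six-request trace; real numbers would come from measured traffic.
trace = [("a", 100), ("b", 200), ("a", 100), ("c", 300), ("a", 100), ("b", 200)]
without_cache = external_bytes(trace, 0)
with_cache = external_bytes(trace, 400)
print(without_cache, with_cache)
```

Running the same function over a measured trace with several cache sizes would give a rough curve of billed bytes against cache capacity, which is one kind of evidence a report could use for or against a cache.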
Write for an audience consisting of colleagues who took 6.033 five years ago. That is, they understand the underlying concepts and have a fair amount of experience applying them in various situations, but they are not familiar with the particular problem you are dealing with. Assume that your paper will also be used to convince whoever is responsible for deciding what design to use to choose your design. Finally, give enough detail that you could turn the project over to an implementor with some confidence that you won't be surprised by the result.
When evaluating your report, your instructor will be looking at both content and writing...
Content considerations:
Schedule: Your report is due in recitation Thursday, March 21, 1996.
6.033 Handout 11, issued 3/7/96