M.I.T. DEPARTMENT OF EECS

6.033 - Computer System Engineering Handout 13 - March 4, 1999

Design Project #1 - Web Server Replication

Introduction. One of the features of 6.033 is that we discuss real systems, both successful and unsuccessful. To get beyond discussion and give you some direct experience with designing systems, we also assign two design projects. As in the real world, these projects have (we hope) a simple high-level problem statement, but when you get into the system design you will find that there are hard choices to make. And, as in the real world, it is your job to explore, understand, and explain what the best choices are, how to reconcile sometimes conflicting goals, and how to keep the complexity of your design under control. This term, the first of these design projects relates to World-Wide Web technologies.

The Problem.

The ACME company has encountered a significant problem with the speed of their Web service. Their server currently consists of a single machine (www.ACME.com) in San Francisco connected to the Internet via a high-speed link. They are currently serving several million documents per day. However, there are days when they serve a significantly greater number of requests. The average number of requests has also been going up steadily since they started the Web site. In addition, they expect a major influx of users when they start adding interesting dynamic content to their Web site. To add to ACME's problems, most users from abroad and some users in the U.S.A. have begun to complain about unacceptable response times. They are unsure whether their link to the Internet, the speed of their server, and/or the bandwidth of the backbone Internet links are the cause of the poor performance. However, ACME does know that the bulk of their customers have upgraded to new fast cable and ADSL links and that the customers' Internet links are not the performance bottleneck.

To summarize their problems:

A clever 6.033 student studies this problem and proposes solving it by creating identical copies of the Web server at various locations in the U.S. and around the world. Users will use the replica that best suits their needs and access data from it. This will make it easy to handle a large number of requests and ensure that all individuals get good response times. Easier said than done.

In the Web today, support for such mirror sites requires significant user intervention. Often, users are presented with a list of sites on a Web page and are asked to pick one. This is how the download sites for popular items such as the Netscape browser work. Other times, users must choose a server site by picking the correct server name (such as www.ibm.com, www.ibm.com.uk, www.ibm.com.ca, etc.).

Your job is to design a new automated Web server replica system. A user should be able to request the Web page http://www.ACME.com and your system should fetch the Web page from the replicated servers. Your primary goal is to maximize the number of requests that the entire system can handle while trying to minimize each individual's observed response time.
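To make the tension between the two goals concrete, here is a minimal sketch of one naive selection policy. It is an illustration only, not a recommended design. The replica names, the measurements, and the cost model (round-trip time plus queueing delay) are all hypothetical assumptions, not part of the assignment.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str             # hypothetical replica label
    rtt_ms: float         # assumed measured round-trip time from the client
    queued_requests: int  # assumed number of requests waiting at the replica
    service_ms: float     # assumed average time to serve one request


def estimated_response_ms(replica):
    """Crude estimate: network delay plus time to drain the queue and serve us."""
    return replica.rtt_ms + (replica.queued_requests + 1) * replica.service_ms


def pick_replica(replicas):
    """Return the replica with the lowest estimated response time."""
    return min(replicas, key=estimated_response_ms)


if __name__ == "__main__":
    candidates = [
        Replica("san-francisco", rtt_ms=180, queued_requests=2, service_ms=10),
        Replica("london", rtt_ms=20, queued_requests=30, service_ms=10),
    ]
    print(pick_replica(candidates).name)
```

Under these made-up numbers the nearby replica is not the best choice, because its queue outweighs its shorter round trip. A real design must also say who gathers these measurements, how fresh they are, and where the decision is made.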

You may change clients, servers and/or protocols as needed. However, ease of deployment is an important factor in a good design. If your design changes significant portions of the Web infrastructure, you should provide a deployment plan as part of your design.

At a minimum, your design must meet the following requirements:

In addition, you should answer the following questions in your report:

You may also want to consider the following issues:

You do not need to address the following concerns in your design:


You should assume that HTTP/1.0 is currently used by the server and browsers. See http://web.mit.edu/rfc/rfc1945.txt for the specification. Athena users can access the RFC directly with the following command: "attach rfc; more mit/rfc/rfc1945.txt". This RFC is very long and not all of it is relevant to this project. You may want to skim it or read only parts of it.
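As a quick orientation before reading the RFC, the sketch below shows the general shape of an HTTP/1.0 exchange: a request line followed by optional headers and a blank line, and a response that begins with a status line. The helper names are hypothetical. The optional Host header is an assumption borrowed from HTTP/1.1; RFC 1945 does not define it.

```python
def build_http10_get(path, host=None):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    lines = ["GET {} HTTP/1.0".format(path)]
    if host is not None:
        # Assumption: Host is an HTTP/1.1 header, not defined by RFC 1945.
        lines.append("Host: {}".format(host))
    # Header lines end with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(response):
    """Split the first line of a response into (version, status code, reason)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_http10_get("/index.html"))
    sample = b"HTTP/1.0 302 Moved Temporarily\r\nLocation: http://www.ACME.com/\r\n\r\n"
    print(parse_status_line(sample))
```

The 302 status code with a Location header is one existing HTTP/1.0 mechanism by which a server can point a browser at a different URL. Whether it suits your design is for you to decide.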

In addition, www.w3.org has a wealth of other information. You may also wish to look at www.netscape.com and www.microsoft.com for background information on their browsers. Also, since you are allowed to change protocols as necessary, you may incorporate any of the proposed extensions of HTTP or features of HTTP/1.1 in your solution.

The following reading will also be useful.


Your report. Your paper should be 8 to 10 pages in length. You should start by explaining to your intended audience the background of the problem in terms that the audience can understand. Next, describe what you have decided are the desirable properties of a solution, and why. Then give your solution and explain how well it achieves (or fails to achieve) the desirable properties. Throughout the paper you should justify each of your design decisions, especially in relation to alternative decisions that you could have made. You will be more convincing if you say not just why your idea is good, but why it is better than the alternatives. (For example, if another approach would meet all of the objectives perfectly, but the cost would be 100 times higher, then you should mention that as a reason for choosing your less general but cheaper approach.)

Write for an audience consisting of colleagues who took 6.033 five years ago. That is, they understand the underlying system and network concepts and have a fair amount of experience applying them in various situations, but they have not thought carefully about the particular problem you are dealing with. Assume that your paper will also be used to convince your friend's computer guru that you have the right idea. Finally, give enough detail that your friend can turn the project over to that guru for implementation with some confidence that you won't be surprised by the result.

When evaluating your report, your instructor will be looking at both content and writing.

Content considerations:

Does your solution fit well with the rest of the system? If your solution requires modifying every piece of hardware, software, and data in sight, it won't be credible, unless you can come up with a very good story why everything needs to be changed.

How extensible is your design? Are there opportunities for later addition of desirable features that you decided to omit?

Writing considerations:

You can find other helpful suggestions on writing this kind of report in the M.I.T. Writing Program's on-line guide to writing Design and Feasibility Reports.

Phase II writing considerations: If you are enrolled in the 6.033 writing practicum, you don't need to do anything special; your practicum instructor will explain how the report will get you credit for the Phase II writing requirement. If you are not enrolled in the practicum, AND you want us to forward your design project report to the writing program as your Phase II writing project, please say so on the cover page, and make sure that your report is at least 8 pages long. Note also that the writing program has a rule that they will accept only reports that earn a B or better from the class in which they originate. Finally, be aware that the second design project will probably be a team project, and thus much more difficult to tailor to the needs of the writing program than this one.

Collaboration: This project is an individual effort. You are welcome to discuss the problem and ideas for a solution with your friends, but if you include any of their ideas in your solution you should explicitly give them credit, and you should be the sole author of your report.

Schedule: Your report is due in recitation on Thursday, March 18, 1999.


Questions or Comments: 6.033-tas@mit.edu