Preparation for MapReduce recitation
- Read MapReduce.
- Skip sections 4 and 7.
This paper was published in 2004 at the biennial USENIX Symposium on Operating Systems Design and Implementation (OSDI), one of the premier conferences in computer systems. (OSDI alternates with the equally prestigious ACM Symposium on Operating Systems Principles (SOSP), where Eraser, the paper you read for a previous recitation, appeared.)
After reading through Section 3, you should be able to understand and explain Figure 1 (the "Execution overview"). You might take a look at Part IX of the 2014 quiz to check your understanding (solutions). After reading Sections 5 and 6, you should understand the real-world performance of MapReduce. An example question that you should be able to answer: How do stragglers affect performance?
As you read, think about the following:
- MapReduce has a constrained programming model. Are the benefits of using MapReduce worth that constraint?
- What types of failures does MapReduce handle, and how does it handle them?
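To make the "constrained programming model" concrete, here is a minimal single-process sketch of the word-count example from Section 2 of the paper. The user writes only map_fn and reduce_fn. The run_mapreduce driver is a hypothetical stand-in for the library, written for illustration only: it does no partitioning, parallelism, or failure handling, and it is not Google's implementation or API.

```python
from collections import defaultdict


def map_fn(doc_name, contents):
    """User-supplied map: emit (word, 1) for every word in one document."""
    for word in contents.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """User-supplied reduce: combine all values emitted for one key."""
    return sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver (an assumption, not the real library).

    It runs every map call, groups the intermediate values by key,
    and then runs one reduce call per key.
    """
    intermediate = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    return {k: reduce_fn(k, vs) for k, vs in sorted(intermediate.items())}


# Made-up input documents, for illustration only.
docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
result = run_mapreduce(docs, map_fn, reduce_fn)
print(result)
```

Notice what the user code cannot do: map_fn sees one input record at a time, and reduce_fn sees only the values for a single key. As you read, consider what the library gains from those restrictions.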
Questions for Recitation
Before you come to this recitation, write up (on paper) a brief answer to each of the following questions (really, we don't need more than a couple of sentences for each). If your TA has asked you to email your answers instead, you may do that, but they are still due before your recitation begins.
Your answers to these questions should be in your own words, not direct quotations from the paper.
- What are the performance goals of MapReduce (both the programming model + its implementation)?
- How was MapReduce implemented at Google to meet those goals?
- Why was MapReduce implemented in this way?