Preparation for Recitation on GFS

Because some of you like to read the papers ahead of time, we will continue to post recitation content about a week in advance. However, you should not post your reflections for this paper until the start of the relevant week.

Read the GFS paper here. You've seen GFS before: it is the system that MapReduce relied on to replicate files. In a normal semester, GFS is the paper that would fall between Lectures 15 and 16.

GFS is a system that replicates files across machines. It's meant for an environment where lots of users are writing to the files, the files are really big, and failures are common. Sections 2-4 of the paper describe the design of GFS, Section 5 discusses how GFS handles failures, and Sections 6-7 detail the authors' evaluation and real-world usage of GFS.

To check whether you understand the design of GFS, you should be able to answer the following questions: What is the role of the master? How does a read work? How does a write work? If you don't know the answers to these questions, you should post on your teaching-team Piazza so that you can discuss; we will assume you know the answers to these questions going into your videochat.

As you read, think about:

Reflection

Below are three questions for you to reflect on as you read the paper. You will post your reflection, or respond to another student's reflection, on your Teaching Team Piazzas during the week of March 30th. You do not need to email responses to these questions to your TA.

As for posting and responding to reflections, we'd like to see a decent amount of both. To that end:

Now, for the questions themselves. There are many possible answers for each. We're expecting you to thoughtfully consider these questions, not come up with the single "best" answer. Your answers to these questions should be in your own words, not direct quotations from the paper.

  1. Why does GFS make its simplifying assumptions? (Alternatively, what becomes simpler because of those assumptions?)
  2. Could we use a RAID-like parity scheme, but in a wide area? Propose a scheme, or state some of the challenges with such a scheme.
  3. GFS replicates every file three times. Why do you think they chose three (not, e.g., four)? Do you think the overhead of this replication is worth it?
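
As background for question 2, here is a minimal sketch of what "RAID-like parity" means in its simplest form: one XOR parity block computed over several equal-sized data blocks, from which any single lost block can be rebuilt. It is not from the GFS paper and it is not a wide-area scheme; it runs on one machine and ignores latency, bandwidth, and partial writes, which are the issues the question asks you to think about. The function names and the sample blocks are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be non-empty in number and equal in length")
    # zip(*blocks) yields one tuple of ints per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving only the missing block.
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical data: three 8-byte "chunks" stored on three machines.
data = [b"chunk-A!", b"chunk-B!", b"chunk-C!"]
parity = xor_parity(data)  # stored on a fourth machine

# Suppose the machine holding data[1] fails.
rebuilt = recover_missing([data[0], data[2]], parity)
assert rebuilt == data[1]

# Storage cost relative to the raw data size.
parity_overhead = (len(data) + 1) / len(data)  # 3 data blocks + 1 parity block
replication_overhead = 3.0                     # three full copies of every block
```

In this sketch the parity layout stores about 1.33 times the raw data while tolerating one lost block, whereas three-way replication stores 3 times the raw data and tolerates two lost copies. Note that recovery here has to read every surviving block, which is one of the costs to weigh when the blocks live far apart.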