Read "The Tail at Scale" by Jeffrey Dean and Luiz André Barroso. Although it's our paper for Week 3 and not Week 4, this paper serves as a bridge between the first part of 6.1800, where we've been focused on how a single machine works, and the rest of the class, where we focus on systems made up of multiple machines. Many of the systems that we'll study after spring break will build on the ideas in this paper. As you read, consider the following:

  • The paper talks about "fan-out" architectures. What is a "fan-out" architecture? What might Google be using a fan-out architecture for, and how?
  • Why does latency, and in particular the tail of the latency distribution in a system, matter? Who is impacted by it?
  • The paper presents "hedged requests" as one way to decrease tail latency. What are the trade-offs when using this technique? What has the potential to improve, and what has the potential to get worse?
  • Describe the problem that "tied requests" are meant to solve.

Submit your answers to these questions on Canvas by 12:00pm on Friday 2/20. You should be writing a few sentences in response to each question (so we don't need you to write an essay for each one, but we're also expecting more than one-word answers). Your responses should be in your own words, not direct quotations from the paper.