Spring 2007



FAQ

Preparation for Recitation 16

Read The Design and Implementation of a Log-Structured File System (reading #15 in the course packet).

As you probably remember from DP1, the file system is the part of an operating system that manages, among other things, how files are laid out on disk. In 6.033, you have so far seen several examples of file systems: the UNIX File System (presented in the UNIX paper and in Appendix 2-A) and your own high-performance-but-low-reliability file system, designed for DP1. Some of the tutorials also covered the Berkeley Fast File System (FFS).

In today's reading, the authors depart radically from FFS and propose, instead, a file system in which all of the data is stored in an append-only log, that is, a flat file that can be modified only by having data added to the end of it. In Chapter 9, we also hear about logs, specifically how they help achieve reliability. For today's reading, the purpose of the log is to achieve good performance.

To understand why an append-only log might lead to high performance, we quickly review the basics of disks (which you should recall from DP1). Data on a disk is stored on a constantly spinning platter and read by a head that floats above the platter. The platter is broken down into concentric rings called tracks; each track is divided into sectors. The time required to read or write data to a disk can be broken down into three components: moving the head to the correct track, waiting for the correct sector to rotate under the head, and transferring the data off the disk. The first two operations (known collectively as a "seek") are very expensive (and aren't getting cheaper very fast): together, they can take 10 or more milliseconds. Once the disk head is at the correct location, it can transfer data relatively quickly (at around 40 MB/s on today's high-end disks or 10 MB/s on the lower-quality disks DP1 assumed).

Now, given this description, the goal of a log-structured file system is to minimize seeks by treating the disk as an infinite append-only log. For example, the file system software simply appends new files to the end of the log. For this strategy to translate into higher performance, the file system software must:

  • Write in large batches to amortize the cost of a seek (if any).
  • Always start writing from the last place it stopped writing.
  • Avoid reading; reads are likely to cause a seek.

As you read the paper, consider how the authors achieve these three goals. Keep in mind that the "infinite log" is actually on a finite disk and ask how that constraint affects their goals. Also, ask yourself whether certain file access patterns by applications might make it hard for a log-structured file system to avoid seeks.

Questions or comments regarding 6.033? Send e-mail to the 6.033 staff at or to the 6.033 TAs at

Top // 6.033 home // $Id: lfs.html,v 1.2 2007/04/02 17:21:56 benmv Exp $