M.I.T. DEPARTMENT OF EECS

6.033 - Computer System Engineering Handout 5 - February 22, 2001

Assignment 3: March 1 through March 9


For Recitation: Thursday, March 1

Read Savage's paper, "Eraser: A Dynamic Data Race Detector for Multi-threaded Programs" (reading #10). If you read this paper sequentially, you might not reach all the important issues. Skim over the section describing the Lockset algorithm. After you understand the main concepts of the paper, return to the Lockset algorithm. We do not expect you to memorize this algorithm. Rather, we expect you to learn about synchronization issues.

In Figure 2 of the paper, remember that "y:=y+1" is not necessarily atomic. First, the processor loads the value of y into a register. Next, the processor increments the register value. Finally, the value is written back to y. Can you find interleavings which result in different values of y? For your convenience, here are some useful definitions:

Static prevention
When all operations occur before execution to help prevent problems.
Dynamic detection
When operations occur during execution to help detect problems.
Bohr bug /bohr buhg/
Jim Gray calls a bug with a repeatable, deterministic behavior a Bohr bug (e.g., a program which always fails when you divide by zero).
heisenbug /hi:'zen-buhg/
Jim Gray calls a bug with non-deterministic (or timing dependent) behavior a heisenbug. These are particularly nasty bugs (e.g., a data race).

The Eraser paper refers to "static detection" rather than "static prevention". In dynamic detection, a program will try to catch errors when they happen, but will halt execution when errors arise. In static prevention, a program will try to catch errors before they happen -- preventing errors from happening in the first place.

Complete the following hands-on assignment and turn it in at the beginning of recitation.

Introduction

This hands-on assignment will help you visualize how the UNIX process scheduler allocates CPU time in various situations. You'll use a program called csmeter, whose C source you can download from csmeter.c.

csmeter runs for about two seconds, continuously asking UNIX for the current time of day. It maintains an array indicating, for each period of a tenth of a millisecond, whether it ran during that period. Then it emits a line of output for each period, each with the time (in milliseconds) followed by a 1 if csmeter was running during that period, and a 0 otherwise. csmeter gives itself a low scheduling priority, so that it's likely to run only if no other process wants to run.

You can compile and run csmeter as follows:

athena% add gnu
athena% gcc -O -o csmeter csmeter.c
athena% ./csmeter > csmeter.out

You'll want to view the output of csmeter using gnuplot:

athena% add gnu
athena% gnuplot
gnuplot> set xlabel "Time (milliseconds)"
gnuplot> plot "csmeter.out"

By looking at the pattern of marks at Y=0 in the graph you can see a picture of when other processes were running. csmeter's low priority means that its presence won't affect when other processes are scheduled (this isn't actually quite true: the scheduler will give csmeter a little bit of CPU time no matter what).

You should run csmeter only on single-CPU machines. If you run it on a multi-CPU machine, then you can't use the fact that csmeter is running to conclude that no other processes are running; they might be running on a different CPU. You can check how many CPUs an Athena machine has with

athena% sysinfo | grep CPUs
If you see 1, you're OK; otherwise, try another machine. Most (perhaps all) of the Athena workstations have one CPU, while many of the dialup machines have two.

Example

Below is a sample graph of csmeter output. It means that csmeter got most or all of the CPU for the first second, then no CPU at all for half a second, then most or all of the CPU for the last half second. This is evidence that no processes other than csmeter wanted to run for the first second and last 1/2 second; but that from 1000 to 1500 milliseconds some other process (or system activity) was consuming all CPU time.

Experiments

Try running csmeter at the same time as a variety of other activities on the same machine. You should run csmeter in one window (redirecting its output to a file), and the other activity in another window on the same machine. Look at each output file with gnuplot and explain to yourself what you are seeing.

First, just run csmeter by itself on an idle machine, redirecting its output to file out1.

Second, run two csmeters at the same time in different windows, capturing their output in two different files, say out2a and out2b. csmeter doesn't start measuring until the start of the next wall-clock second after you start it, so your two graphs are likely to line up nicely; if they don't, try again. Plot the two graphs together:

gnuplot> plot "out2a", "out2b"

Third, run csmeter at the same time as a compute-intensive process, and capture csmeter's output in out3. You can use this as your compute-intensive process:

athena% perl -e 'while(1){}'
Perl is scripting language reminiscent of C; while(1){} is an infinite loop. You should stop perl with control-C after csmeter finishes.

Fourth, run csmeter at the same time as a disk I/O intensive process, capturing csmeter's output in out4. Use this as your I/O intensive process:

athena% tar cf /dev/null /usr/.
This tar command reads (and discards) the contents of all files in /usr. Again, stop the tar with control-C after csmeter finishes.

Questions

For each of the four experiments, explain why the graphs look the way they do. If there are long runs of continuous CPU activity, why are they there? Likewise, if there are short or instantaneous bursts of CPU, what causes them? When the CPU scheduler switches between csmeter and whatever else is running, what do you think causes it to make those switches?

Pay special attention to the differences between out2 and out3, and between out3 and out4.

Your answers for each experiment should be no more than a few sentences.

Please also note how much time you spent on this assignment.

What To Hand In

For each of the four experiments, print out your graph and your explanation. Hand them in to your TA by the beginning of recitation of March 1.

For Lecture: Monday, March 5

This lecture is the first lecture of five covering networking. Read Chapter 4, section A of the 6.033 class notes. Today's lecture will be an introduction to networking.

For Recitation: Tuesday, March 6

For recitation and your one-pager, read Metcalfe and Boggs's paper, "Ethernet: distributed packet switching for local computer networks" (Reading #11). You should address the following question in your one-pager:
The paper describes experimental Ethernet, which had a bandwidth of 3 Mbps (million bits per second). The latest commercial standard, Gigabit Ethernet, is rated at one billion bits per second. Describe the impact on the basic parameters of an Ethernet network (minimum packet size, maximum cable length) of moving from experimental to commercial grades. Explain clearly the effects of incommensurate scaling as the bandwidth of an Ethernet system is increased.

For Lecture: Wednesday, March 7

Read Chapter 4, section B of the 6.033 class notes. Today's lecture will be on network layers.

Review session for the first quiz will be in Rm 34-101 from 7-9pm.

For Recitation: Thursday, March 8

Design Project 1 is assigned today. Look out for the Design Project on the web.
No Reading or Hands-on assignments today. Study for Quiz 1.

Design Project 1 is due in two weeks: Thursday, March 22.

 

For Quiz 1: Friday, March 9

Quiz 1 will be held from 2-3pm on Friday, March 9, 2001. The quiz will cover all the material up to and including the March 6th recitation (R8). The quiz will be open book. That means you can bring along any printed or written materials that you think might be useful. Calculators are allowed, though not necessary. 6.033 quizzes typically begin with half a dozen independent questions that focus on material in the readings followed by one or two longer, multi-part questions that deal with concepts of computer systems covered in lecture. The quiz will be, at least in part, multiple choice. The quiz will be held in 34-101 and in 10-250. See the chart below to determine which location you should go to for the quiz.

            Last Name      Location
              A-K           34-101
              L-Z           10-250

There will be a quiz review from 7-9pm, Wednesday, March 7th. Room 34-101. During this time we will go over an outline of the covered subjects and explain a few questions from the Problems and Solutions section of the class notes, which were in the update handed out in lecture on Wednesday, February 28.

The quiz is being held in a regularly scheduled class hour. The date was announced at the beginning of the term, so you should not have problems with scheduling conflicts. If, nevertheless, you have managed to create a conflict, contact Prof. Kaashoek at kaashoek@mit.edu as soon as possible to resolve the problem.

System aphorism of the week

A complex system that works is invariably found to have evolved from a simple system that works" (J. Gall, Systemantics).
Go to 6.033 Home Page Questions or Comments: 6.033-tas@mit.edu