6.033 Handout 7: Assignment 3

February 20 through February 26

For Recitation, Thursday, February 20

To prepare for this recitation, read Chapter 15 through Section 15.5 of Tanenbaum's Modern Operating Systems. This chapter describes the Mach operating system better than any of the papers written by Mach's own authors. Mach is one of many microkernel systems built over the last few decades, but one of the few that has had a considerable impact on industry. It forms the foundation of, for example, the NeXT operating system, and Apple recently announced that it will base its next operating system on NeXT.

Don't be confused if this chapter uses slightly different terminology than we have used in lecture; this is one of the unpleasant features of studying computer systems. Because the field is moving fast, there is no standard set of definitions. For example, the words "thread", "process", and "task" are thoroughly overloaded (in the 6.035 sense): sometimes a thread is the same thing as a task, while other times a task is the same thing as a process. It is a mess, and you have to figure out the meaning from context. Ambiguity is an unfortunate but inescapable property of computer systems.

For Lecture, Monday, February 24

This is the last lecture on infrastructure. It deals with abstractions and primitives for concurrency control, a complicated and important topic. Many of the hardest-to-find bugs in programs are concurrency bugs, because they are often difficult to reproduce. Jim Gray calls these bugs Heisenbugs, in contrast to Bohrbugs, which behave deterministically. To prepare for lecture, read Tanenbaum sections 2.2 (interprocess communication), 2.3 (classical IPC problems), and 12.1 (threads).
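
To see why such bugs are elusive, here is a minimal sketch (ours, not from the readings; it assumes POSIX threads are available): two threads increment a shared counter without a lock. Whether any updates are lost depends on how the threads happen to interleave, so the program misbehaves on some runs and not on others.

    /* A Heisenbug in miniature: the increment is a load, an add,
       and a store, so two threads can overwrite each other's
       updates -- but only under certain interleavings. */
    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;          /* shared, unprotected */

    static void *increment(void *arg) {
        for (int i = 0; i < 1000000; i++)
            counter++;                /* the race is here */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* often less than 2000000, and different on every run */
        printf("counter = %ld\n", counter);
        return 0;
    }

Protecting the counter with a lock would make the result deterministic again; the abstractions for imposing that kind of discipline are the subject of this lecture.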

For Recitation, Tuesday, February 25

The last reading on the topic of computer system organization is #10 (Andrew D. Birrell, "An introduction to programming with threads"). For class, read pages 1 through 12; if threads catch your interest, pages 13-22 are also well worth reading, and if you get really interested, go ahead and read the whole paper. Don't be confused by the particular programming language used in the paper; focus on the semantics of the thread primitives.
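
As a rough guide to that focus, here is a sketch (ours, not Birrell's code, using POSIX threads rather than the paper's notation) of the core primitives the paper discusses: forking a thread, holding a mutex while touching shared data, and waiting on and signaling a condition variable.

    /* Birrell's primitives rendered (approximately) in pthreads. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonEmpty = PTHREAD_COND_INITIALIZER;
    static int queueLength = 0;       /* shared data protected by m */

    static void *producer(void *arg) {
        pthread_mutex_lock(&m);          /* roughly: LOCK m DO ... */
        queueLength++;
        pthread_cond_signal(&nonEmpty);  /* roughly: Signal(c) */
        pthread_mutex_unlock(&m);
        return NULL;
    }

    static void *consumer(void *arg) {
        pthread_mutex_lock(&m);
        while (queueLength == 0)         /* re-test after waking */
            pthread_cond_wait(&nonEmpty, &m);  /* roughly: Wait(m, c) */
        queueLength--;
        pthread_mutex_unlock(&m);
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);  /* roughly: Fork */
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        printf("queueLength = %d\n", queueLength);
        return 0;
    }

Note the while loop around the wait: Birrell explains why a woken thread must re-test its condition before proceeding.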

The question for your one-pager is:

One issue not addressed in Birrell's paper is what happens if something goes wrong with a thread that has locked a piece of data. For instance, Solaris 2.4 has a bug that occasionally causes sendmail applications to crash, leaving messages in a "locked" state such that other sendmail applications will refuse to deliver them (as reported in Friday's Tech: "Bugs, E-mail Bombs Hinder Mail Servers").

With threads in the same address space, this wouldn't happen, since if one thread crashed, the entire address space would crash with it. But suppose we wanted to be able to recover from the situation where two threads within the same address space are deadlocked (as in the example on page 10 of Birrell's thread paper; a sketch of such a deadlock appears below). What additional information (if any) would Birrell's thread scheduler need to keep in order to detect this sort of deadlock, and what would it need to be able to do to kill and restart the deadlocked threads without breaking any data invariants?

Note that in 6.033 we have defined threads and address spaces but not processes. One way to think about a process in the Solaris sense is as an address space containing multiple threads.
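
For concreteness, here is a sketch of the kind of deadlock the question refers to (the lock names and the ordering are our invention, in the spirit of Birrell's page 10 example, again using POSIX threads): each thread acquires one lock and then blocks trying to acquire the other.

    /* Two threads take two locks in opposite orders.  Under the
       wrong interleaving, each holds one lock and waits forever
       for the other: a deadlock. */
    #include <pthread.h>

    static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

    static void *thread1(void *arg) {
        pthread_mutex_lock(&a);
        pthread_mutex_lock(&b);   /* blocks if thread2 holds b */
        pthread_mutex_unlock(&b);
        pthread_mutex_unlock(&a);
        return NULL;
    }

    static void *thread2(void *arg) {
        pthread_mutex_lock(&b);
        pthread_mutex_lock(&a);   /* blocks if thread1 holds a */
        pthread_mutex_unlock(&a);
        pthread_mutex_unlock(&b);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_create(&t2, NULL, thread2, NULL);
        pthread_join(t1, NULL);   /* may never return */
        pthread_join(t2, NULL);
        return 0;
    }

As you write your one-pager, consider what bookkeeping would let a scheduler notice that each of these threads is waiting for a lock the other holds.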

For Lecture, Wednesday, February 26

This is the first of four lectures covering networking. In preparation, read Tanenbaum section 10.1 (layered protocols) and reading #11 (part I of Halstead's 6.033 networking notes).

------------

System aphorism of the week
A complex system that works is invariably found to have evolved from a simple system that works. (J. Gall, Systemantics)

6.033 Handout 7, issued 2/18/96