6.033 Discussion Suggestions

6.033--Computer System Engineering

Suggestions for classroom discussion

Topic: Ken L. Thompson. Reflections on trusting trust. Communications of the ACM 27, 8 (August, 1984), pages 761-763.

By J. H. Saltzer, 4 April 2000, minor update April 2001. (Incomplete draft)

Note: in the paper, figure 2.1 is mislabeled 2.2 and vice-versa.

Thompson's paper helps emphasize how difficult it is to be sure that you know what your software does. One point that is worth emphasizing here is that writing all your software yourself, although it could in principle solve the problem, is overwhelmingly impractical; one has no choice but to rely on, and thus trust, software from other sources.

How do you add a new feature to a language? Suppose we want to extend Java to know about a new data type, the fuzzy boolean (it has value T, F, and Maybe). And suppose the Java compiler is actually written in the C language.
(
1. Modify the C source of the Java.0 compiler to
  - recognize the new feature
  - generate appropriate code when it sees the feature being used.
  Call the resulting source the Java.1 compiler.
2. Compile C source of Java.1 compiler to produce Java.1 compiler binary.
3. Replace Java.0 binary with Java.1 binary [N.B. a circle becomes a square.]
4. Compile App source with Java.1 binary to produce App binary.
At this point App source doesn't use fuzzy booleans, so it didn't matter. If you didn't damage the compiler, the App binary should be the same as before.)
This question sounds silly, but bear with me...Can we modify App source to use fuzzy booleans? (Yes. Remember this answer.)
Can we modify the Java.0 source to use fuzzy booleans? (No, because the Java source is written in C, not Java.0 or Java.1.)
Now consider a C compiler written in the C language. How do we add the fuzzy boolean type? (Same as before, replacing Java.0 with C.0 and Java.1 with C.1. [modify the picture])
Will the C.1 compiler correctly compile App? (Yes, if App uses only features that the C.1 compiler knows about.)
Will the C.1 compiler correctly compile C.0? (Yes, if C.0 uses only features that the C.1 compiler knows about.)
Will the C.1 compiler correctly compile C.1? (Yes, again.)
So can we trash all of our copies of the old C.0 compiler source. (Not a good idea. If we later discover a fatal flaw in our C.1 compiler we would like to be able to go back.)
Can we modify the App source to use fuzzy booleans? (Yes. But then App.1 becomes dependent on the continued existence of the new C.1 compiler. If we go back to the old compiler, App.1 won't compile--but at least all programs that don't use fuzzy booleans still will. This is the price of being leading-edge.)
Can we modify the C.1 compiler source to use fuzzy booleans? (Yes. It is now the C.2 compiler. But the C.2 compiler will be dependent on the continued existence of the new C.1 compiler binary.)
At what point does this thing stabilize? (We should compile C.2 source with C.2 binary to make sure that it reproduces C.2 binary. If it does, we are done.)
What does all this have to do with Thompson's paper? (Only that we now know exactly how it is that the C.2 compiler not only learned a new trick but it uses the new trick it learned. The C.2 compiler is now depending on the continued existence of either the C.1 or the C.2 compiler.)
So is there anything different about adding vertical tab to the list of tricks? (Not initially--it is the same story. You can't use the new feature with the old compiler, so you must implement the new feature using only features of the old compiler. And once you have a new compiler, you can use the new feature. What gets interesting is that in this case you can use the new feature to implement the new feature. That is a bit startling. And once you do that, the actual implementation (the correspondence between \v and 010) is no longer visible in the source. That is a bit scary.)
Since publication of this article, there have been two proposals for handling trust in programs that you didn't personally write: (1) signed programs, and (2) sandboxes. Do these solve the problem that Thompson raises?
(A signed program means that if it harms you, you know whom to sue. This may actually help, assuming that you can trust the chain of verification, and that it leads to someone that you can identify in the real world.
The sandbox sounds good, but you probably will not be willing to run the output of the compiler only in the sandbox. The reason you acquired the compiler was to produce application programs, and you would like to be able to apply those programs to your data, which you don't want in the sandbox.)

In case you are wondering about the fourth citation in the paper, which reads "4. Unknown Airforce Document.", Ken says that it is...

Karger, P.A., and Schell, R.R.
Multics Security Evaluation: Vulnerability Analysis.
ESD-TR-74-193, Vol II, June 1974, page 52.
(Karger was a 6.033 student; Schell participated in the graduate seminar that kicked off the development of 6.033.)

Comments and suggestions: Saltzer@mit.edu