
6.033--Computer System Engineering

Suggestions for classroom discussion


Topic: Appendix 6-A of chapter 6; Why are there so many protection system failures?

By J. H. Saltzer, April 1998, updated January 2001, major addition January 2002, updated April 2003


Chapter 6 first provides a general perspective on how to think about building secure systems, followed by a set of design principles, and then goes on for a hundred or more pages of detail. One might expect that, after reading all this stuff, people would now know how to do it and get it right.

Then, Appendix 6-A relates several war stories of protection system failures that occurred over a 40-year span. Failures from decades past might be explained as learning mistakes that helped lead to the better understanding now provided in chapter 6. But the design principles of chapter 6 were actually formulated and published back in 1975. And the appendix includes several examples of recent failures, reinforced by regular reports in the media of yet another virus, worm, or distributed denial-of-service attack.

Two related discussions can follow from these observations. The first is narrowly focused on why specific failures happened. The second is much broader: why do so many failures still happen?


1. What really went wrong?

The news often carries a current story that can be a good place to start, if there is enough information available to figure out what happened. If there isn't, that gives a good lead-in to discussing the old stories, which usually do have enough to work with.

Starting either with the current news or with a few examples selected from the appendix, ask, "what really went wrong here?" The goal is to get beyond the obvious first-level goof and instead discover which design principles were violated, or what created an environment that allowed the first-level error to happen and go undiscovered until it became a problem.

It may help to start by writing on the board the list of design principles from Section A.3 of the chapter that are candidates for violation:

Note that the last two are generic fallbacks that can be used to explain almost any failure case. If none of the more specific design principles seems to apply, perhaps the class can help formulate an additional design principle.

For starters, listed below are a few obvious examples; you and your class will probably find more. (If you report them to me, I will add them to the list, for the benefit of future recitation instructors.)
War story   Violated design principle

6-A.1       use fail-safe defaults: always overwrite when deallocating (and again when allocating!)
6-A.2       principle of least privilege: the user of the name file shouldn't have to see passwords.
6-A.3       paranoid design: don't let amateurs dabble in cryptomathematics!
6-A.4.1     principle of least privilege
6-A.4.2     be explicit, think safety-net
6-A.5.1     complete mediation: verify a value before using it
6-A.5.2     complete mediation: every reference to user data must be checked
6-A.8       • complete mediation: should not be able to run any command in the user's environment
            • economy of mechanism: a complex system overwhelms the user with choices
            • use fail-safe defaults: the checklist should start empty; let the user fill it in
            • open design: the script language should be documented
            • least privilege: automatically launched programs should run in a padded cell
            • defense in depth: completely missing
            (From the length of the list one might guess that this one is a Microsoft design.)
6-A.9       psychology: if allowed, people will reuse passwords
6-A.10      minimize common mechanism
6-A.11.1    • be explicit: the DES package documentation should have large warnings
            • feedback paths: the programmer should not ignore status
            • separation of privilege: one programmer for the client, another for the server
            • psychology: ignoring status is common; the load_key designer ignored that
            • fail-safe defaults: when loaded with a bad key, don't use the identity transformation! (see the sketch after this table)
6-A.11.2    be explicit: a known-weak system should be surrounded by large red flags
6-A.12      complete mediation, think safety-net (least privilege is probably implicated, too)
6-A.13      be explicit: every displayed line should have an alert prefix
6-A.14      economy of mechanism: don't add a special-case version
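
To make the 6-A.11.1 entry concrete, here is a minimal Python sketch of that failure pattern. The toy XOR cipher, the eight-byte key rule, and the function names are all invented for illustration; they are not the interface of the actual DES package in the war story.

    class BadKeyError(Exception):
        pass

    def toy_encrypt(data: bytes, key: bytes) -> bytes:
        # Toy stand-in for a real cipher: XOR with a repeating key. A key
        # of all zero bytes makes this the identity transformation.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def load_key_unsafe(key: bytes) -> bytes:
        # The 6-A.11.1 failure mode: the error case falls back to a
        # convenient default key, and the caller never checks status.
        if len(key) != 8:
            return b"\x00" * 8          # ciphertext will equal plaintext
        return key

    def load_key_failsafe(key: bytes) -> bytes:
        # Fail-safe default: a bad key stops the computation rather than
        # silently shipping cleartext.
        if len(key) != 8:
            raise BadKeyError("key must be exactly 8 bytes")
        return key

    secret = b"attack at dawn"
    assert toy_encrypt(secret, load_key_unsafe(b"short")) == secret  # cleartext!
    try:
        load_key_failsafe(b"short")
    except BadKeyError:
        print("refused to proceed with a bad key (fail-safe)")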

A good second question for each case is to ask the class if anyone knows of any other examples of the same or a closely similar problem.

This discussion leads naturally into the second topic: Why are there so many protection system failures?


2. If we know how to build secure systems, why does the real world of the Internet, corporate servers, desktop computers, and personal computers seem to be so vulnerable? (Note: Section H of chapter 6 now contains a version of this discussion.)

The question does not have a single, simple answer. A lot of different things are tangled together, and it can be instructive to ask your class to help tease them apart. The first step might be to break the question into several more specific questions:

Answers to some of these questions may be found in the following list:

More on firewalls:

Obviously, firewalls help prevent some access. So don't they solve the problem?

Firewalls are a band-aid that doesn't address the underlying problem: the end systems (the clients and servers inside the firewall) are insecure. So while they do help block certain kinds of attack, the downside is that they tend to create a false sense of security that can be damaging in the end. Here are several examples:

The best place for a firewall is on each individual computer system. A system should be configured to accept network requests only for services that it needs to supply (fail-safe defaults). And the system itself should be designed to be reasonably paranoid. That way, a failure in one system probably won't lead to penetration of all the other systems of that organization. If one starts by properly securing all of the individual systems inside an organization, then an external firewall may be a useful addition, following the principle of defense in depth. But it should be viewed as a secondary, not a primary defense.
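
The "accept only needed services" rule in the paragraph above is just the fail-safe defaults principle applied to a host's network interface. Here is a minimal Python sketch; the allow list of ports and services is an invented example, not a recommendation:

    # Per-host fail-safe defaults: a request is accepted only if its port
    # appears on an explicit allow list; everything else is refused by
    # default, with no need to enumerate the bad cases.
    ALLOWED_PORTS = {22: "ssh", 443: "https"}   # example services only

    def accept_request(port: int) -> bool:
        # Default deny: unlisted ports are rejected.
        return port in ALLOWED_PORTS

    assert accept_request(443)
    assert not accept_request(23)   # telnet is not offered, so refused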

M.I.T. does not have a firewall. The policy is instead to encourage individual systems to protect themselves. The policy is enforced by shutting off the network connections of systems that don't. The gateways to the outside world do have egress filtering in them--they do not allow packets to go out to the Internet with a return address that MIT does not advertise. And during worm attacks, temporary ingress filters are sometimes put in place to reduce the rate at which those worms get footholds on unsecured MIT systems.
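
For illustration, here is a small Python sketch of that egress rule. The 18.0.0.0/8 prefix stands in for "addresses MIT advertises" (it was historically MIT's block) and is not a statement of current MIT routing policy.

    import ipaddress

    # Egress filter sketch: forward an outbound packet only if its source
    # (return) address falls within a prefix this site advertises.
    ADVERTISED = [ipaddress.ip_network("18.0.0.0/8")]   # illustrative prefix

    def egress_ok(source_address: str) -> bool:
        addr = ipaddress.ip_address(source_address)
        return any(addr in net for net in ADVERTISED)

    assert egress_ok("18.26.0.1")        # advertised source: allowed out
    assert not egress_ok("10.0.0.1")     # spoofed source: dropped at the gateway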


Comments and suggestions: Saltzer@mit.edu