6.033--Computer System Engineering
Suggestions for classroom discussion
Topic: Appendix of chapter 6; Why are there so many protection system failures?
By J. H. Saltzer, April 1998, updated January 2001, major addition January 2002, updated April 2003
Chapter six provides first a general perspective on how to think about building
secure systems, followed by a set of design principles. It then goes on for
a hundred or more pages giving details. One might expect, after reading all this
stuff, that people should now know how to do it and get it right.
Then, Appendix 6-A relates several war stories of protection system failures
that have occurred over a 40-year time span. Failures from decades past might be
explained as mistakes made while learning, mistakes that helped lead to the better
understanding now provided in chapter 6. But the design principles of chapter six were actually
formulated and published back in 1975. And that appendix includes several
examples of recent failures, which are reinforced by regular reports in
the media about yet another virus, worm, or distributed denial-of-service attack.
Two related discussions can follow from these observations. The first is narrowly focused on why specific failures happened. The second is much broader: why do so many failures still happen?
1. What really went wrong?
The news often has one or another current story that may be
a good one to start with (if there is enough information available to figure
out what happened). If there isn't, that gives a good lead-in to discussing
the old stories, which usually do have enough to work with.
Starting either with the current news or with a few examples selected
from the appendix, ask, "What really went wrong here?" The
goal is to get beyond the obvious first-level goof, and instead discover
what design principles were violated, or what created an environment that
allowed the first-level error to happen and not be discovered before it
became a problem.
It may help to start by writing on the board the list of design principles
from Section A.3 of the chapter that are candidates for violation:
- economy of mechanism
- fail-safe defaults
- complete mediation
- open design
- be explicit
- separation of privilege
- use the least privilege that gets the job done
- minimize common mechanism (this is the end-to-end argument)
- design feedback paths and analyze failures
- mere mortals must be able to figure out how to use it correctly
- defense in depth
- paranoid design
Note that the last two are generic fallbacks that can be used to explain
almost any failure case. If none of the more specific design principles seems to
apply, perhaps the class can help formulate an additional design principle.
For starters, listed below are a few obvious examples; you and your class
will probably find more. (If you report them to me, I will add them to the
list, for the benefit of future recitation instructors.)
War story | Violated design principle
----------|--------------------------
6-A.1     | use fail-safe defaults: always overwrite when deallocating (and again when allocating!); see the sketch just after this table
6-A.2     | principle of least privilege: a user of the name file shouldn't have to see passwords
6-A.3     | paranoid design: don't let amateurs dabble in cryptomathematics!
6-A.4.1   | principle of least privilege
6-A.4.2   | be explicit, think safety-net
6-A.5.1   | complete mediation: verify a value before using it
6-A.5.2   | complete mediation: every reference to user data must be checked
6-A.8     | complete mediation: should not be able to run any command in the user's environment
          | economy of mechanism: a complex system overwhelms the user with choices
          | use fail-safe defaults: the checklist should start empty; let the user fill it in
          | open design: the script language should be documented
          | least privilege: automatically launched programs should run in a padded cell
          | defense in depth: completely missing
          | (From the length of the list one might guess that this one is a Microsoft design.)
6-A.9     | psychology: if allowed, people will reuse passwords
6-A.10    | minimize common mechanism
6-A.11.1  | be explicit: the DES package documentation should carry large warnings
          | design feedback paths: the programmer should not have ignored the returned status
          | separation of privilege: one programmer for the client, another for the server
          | psychology: ignoring status is common; the load_key designer ignored that
          | fail-safe defaults: when loaded with a bad key, don't fall back to the identity transformation!
6-A.11.2  | be explicit: a known weak system should be surrounded by large red flags
6-A.12    | complete mediation, think safety-net (least privilege is probably implicated, too)
6-A.13    | be explicit: every displayed line should have an alert prefix
6-A.14    | economy of mechanism: don't add a special-case version
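To make the first table entry concrete, here is a minimal sketch, in C with hypothetical names, of what "always overwrite when deallocating (and again when allocating)" might look like in an allocator wrapper. The fail-safe default is that a bookkeeping slip never hands a new user storage that still holds the previous user's data.

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical wrappers illustrating the fail-safe default of 6-A.1:
       never hand out, or give back, storage that still holds old contents.
       (A production version would use an erase call the compiler cannot
       optimize away, such as explicit_bzero, where one is available.) */

    void *cautious_alloc(size_t size) {
        void *block = malloc(size);
        if (block != NULL)
            memset(block, 0, size);   /* erase again at allocation time, in
                                         case the erase at deallocation time
                                         was somehow skipped */
        return block;
    }

    void cautious_free(void *block, size_t size) {
        if (block == NULL)
            return;
        memset(block, 0, size);       /* erase at deallocation time */
        free(block);
    }

Either erase alone would have prevented the 6-A.1 failure; doing both is a small instance of defense in depth.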
A good second question for each case is to ask the class if anyone
knows of any other examples of the same or a closely similar problem.
This discussion leads naturally into the second topic: Why are
there so many protection system failures?
2. Why are there so many protection system failures?
If we know how to build secure systems, why does the
real world of the Internet, corporate servers, desktop computers, and personal
computers seem to be so vulnerable? (Note: Section H of chapter 6 now contains a version of this discussion.)
The question does not have a single, simple answer. A lot of different things are tangled
together, and it can be instructive to ask your class to help tease them apart. The first step might be to break the question into several more specific questions:
- The Internet protocols do not, by default, provide authentication of message sources or privacy of message contents. Why not?
- Some widely used personal computer operating systems (Windows 95/98/ME and Mac OS up through version 9) do not come with enforced modularity that creates strong internal firewalls. Why not?
- Inadequately secured computers are attached to the Internet. Why?
- Unix systems, commonly used as servers, have enforced modularity, but they seem to have a lot of buffer overflows, which can allow subversion of modular boundaries. Why are these failures so common? (A sketch of the classic programming error appears just after this list.)
- Why doesn't security certification help more?
- MIT has what seems to be an effective authentication system (Kerberos). Why aren't such systems deployed more widely?
- Many organizations have installed network firewalls between their internal network and the Internet. Do they really help? (extended discussion below)
- We are hearing reports that wireless network (WiFi or 802.11b) security is awful. This is a brand-new design. Why is it so vulnerable?
- Cable TV scrambling systems, DSS (satellite TV) security, the CSS system for protecting DVD movie content, and the proposed music watermarking system were all compromised almost immediately following their deployment. Why were these systems so easy to break?
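For the buffer-overflow question, here is a minimal C sketch (hypothetical code, not drawn from any particular server) of the classic error and of the length-checked copy that removes it:

    #include <stdio.h>
    #include <string.h>

    /* Classic unchecked copy: if the request is longer than 64 bytes, the
       copy runs past the end of the buffer and overwrites whatever is
       adjacent on the stack, possibly including the return address. */
    void handle_request_unsafe(const char *request) {
        char buffer[64];
        strcpy(buffer, request);           /* no length check: the overflow */
        printf("handling: %s\n", buffer);
    }

    /* The repair is a length-checked copy: every use of data that arrives
       from outside must be checked (complete mediation). */
    void handle_request_safe(const char *request) {
        char buffer[64];
        strncpy(buffer, request, sizeof buffer - 1);
        buffer[sizeof buffer - 1] = '\0';  /* strncpy may not terminate */
        printf("handling: %s\n", buffer);
    }

    int main(void) {
        handle_request_safe("GET /index.html");
        return 0;
    }

Part of the answer to "why so common?" is visible in the sketch: the unsafe and safe versions differ by only two lines, the language does not complain about the unsafe one, and both behave identically on every input the programmer is likely to test.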
Some of the answers to some of the questions may be found in the following list:
- Stupidity.
- Government interference.
- Lack of regulation.
- Cost.
- Deployed systems resist change. Compatibility. Persistence of old software.
- Technology is changing too fast; the security designers can't keep up.
- Lack of awareness.
- Honest opinions that the problem isn't that important. (Reality or perception?)
- Dishonest claims that the problem isn't that important. (With what ulterior motive?)
- Using systems in environments they weren't designed for.
- Chapter six notwithstanding, no one really knows how to build a secure system.
- Authentication infrastructure is {choose one: economically, politically, realistically} very hard to develop.
More on firewalls:
Obviously, firewalls help prevent some access. So don't they solve the problem?
Firewalls are a band-aid that doesn't address the underlying problem: the end systems (the clients and servers inside the firewall) are insecure. So while they do help block certain kinds of attack, the downside is that they tend to create a false sense of security that can be damaging in the end. Here are several examples:
- There is no point in having a firewall that blocks all communication, so every firewall has a few holes intentionally drilled through it. But if any one of these holes can be exploited, *all* of the unprotected systems inside the firewall immediately become vulnerable. And there are constant pressures on the network administrator to drill additional holes, because people want to get useful work done, using new protocols whose implementation may be buggy.
- Walk down the street of a business district with an open laptop listening for Wi-Fi/802.11 networks. You will generally find one; it will be unprotected, not even by a password; you can connect to it; and it will be on the inside of the company's firewall. You can now attack any of those insecure clients and servers. The reason the wireless network is unprotected is partly ignorance, and partly the false sense of security engendered by the firewall. If they hadn't had a firewall, the guy who set up the wireless access point might have been more careful, or someone might have called him on the fact that he didn't password-protect it.
- It is typical to find that in addition to the connection to the Internet, there are also dial-in modems to desktop systems inside the firewall. There is such a modem on my desk at M.I.T. Dial that number, get into the unprotected system that answers, and now you are inside the firewall and ready to attack the other, unprotected servers and clients.
- Well-known folklore among security specialists is that the biggest source of computer security problems in business is unauthorized actions by authorized people--corrupt, dishonest, or unthinking employees (there is an advertisement stressing this point that appears frequently in the Wall Street Journal, captioned "but what about Ruth in accounting?"). The firewall provides no protection at all against this threat, but if the individual computers were secured, that would help limit the amount of damage any one employee can cause.
The best place for a firewall is on each individual computer system. A system should be configured to accept network requests only for services that it needs to supply (fail-safe defaults). And the system itself should be designed to be reasonably paranoid. That way, a failure in one system probably won't lead to penetration of all the other systems of that organization. If one starts by properly securing all of the individual systems inside an organization, then an external firewall may be a useful addition, following the principle of defense in depth. But it should be viewed as a secondary, not a primary defense.
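As one small illustration of that fail-safe default at the level of a single service (a hypothetical C sketch, not a configuration recipe for any particular operating system): a server that is needed only by local programs can bind to the loopback address, so that by default it accepts no requests from the network unless someone deliberately opens it up.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        if (listener < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(8080);              /* hypothetical port */
        /* Fail-safe default: bind to 127.0.0.1 rather than INADDR_ANY, so
           the service is unreachable from outside this machine. */
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

        if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("bind");
            close(listener);
            return 1;
        }
        listen(listener, 8);
        printf("listening only on 127.0.0.1:8080\n");
        close(listener);
        return 0;
    }

The design choice is the same one the paragraph above describes: exposure to the network should require an explicit decision, not be the accidental result of starting a program.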
M.I.T. does not have a firewall. The policy is instead to encourage individual systems to protect themselves. The policy is enforced by shutting off the network connections of systems that don't. The gateways to the outside world do have egress filtering in them--they do not allow packets to go out to the Internet with a return address that MIT does not advertise. And during worm attacks, temporary ingress filters are sometimes put in place to reduce the rate at which those worms get footholds on unsecured MIT systems.
Comments and suggestions: Saltzer@mit.edu