Commentary on Distributed Systems

by J. H. Saltzer

Seventh SOSP, Asilomar, 11 December 1979.
[transcribed to HTML from hand-written notes]

[SLIDE: COMMENTARY--while audience is assembling]

Anita Jones asked me to make a few personal comments about the state of the world in distributed systems. This is an interesting opportunity that I am not quite sure how to cope with. First of all, I don't know what distributed systems are. Second, there were no papers presented on the left-handed least unlikely last page-removal selection algorithm, so I can't reuse my comments from last time.

In preparation for this commentary, about six months ago Anita sent me copies of most of the papers to be presented on distributed systems. This appeared to be a unique opportunity to get an advance look, but I hadn't fully appreciated that these were to be nth generation copies of unrefereed drafts, for the majority of which one must imagine what the figures might look like. I don't recommend the experience.

Despite these obstacles, I've come up with a short list of 7 comments. Now the intent is that these comments stimulate further comment and discussion, so the plan is that I will carry on for a while--no more than half an hour, I hope, and then open the floor to further comments and rejoinders, for which I will act as moderator. So--please be thinking of suitable things to contribute. If we haven't run down by then, sometime around 4:30 I will call a halt to the proceeding so that we can all go out to watch the sunset.

My comments are numbered.
There are 7 of them.
So you can keep track, I'll keep you posted on the comment number.

COMMENT #1 - ABOUT SYSTEMS

You may not have noticed a small slip-up in the printing of the proceedings of this symposium. The title page accidentally got reproduced as the bound cardboard cover. The cardboard cover is supposed to look like a special issue of Operating Systems Review. Fortunately, I've got a slide copy of the missing cover.

[SLIDE: OPERATING SYSTEMS REVIEW]

The reason I bring it up has more to do with the contents.
Now that Lantz and Rashid renamed their paper, every session but one of the conference has a paper or title with some variant of the word distributed in it. This suggests that we might consider a minor revision in the cover page as indicated in this overlay.

[OVERLAY SLIDE: DISTRIBUTED]

That suggestion is almost certainly wrong, and I would like to explain why I think that.

Why not? Proposal implies a misunderstanding of the business that is very common.

The systems business is the study of glue
Systems people do integration.

What this means in practice is

  1. Language people do their thing
  2. Database people do theirs
  3. Communication people do theirs
  4. Hdw architects do theirs
  5. etc.

What's left is what the systems people tackle - it is the rest of the problem - implies an inclusive view, keeping things from slipping thru the cracks, picking up loose ends, not an exclusive view of what is important

Historically, automatic operation, and therefore scheduling was one of the first missing pieces, to replace operators on roller skates.

More recently, file systems, protection, coordination, installation management, virtual memory...

As these things became understood, the

take pieces into their respective folds, and absorb them as their own,

leaving the remaining glue to us.

It seems to me that today we have a technology/cost revolution on our hands that potentially puts a complete computer on every desk, in every room, next to every sensor or actuator. The system problem is again the glue. How do you take advantage of the technology?

That is the subject. It happens to involve "distributed" systems today.

Distributed System def.: The glue required to harness low-cost hardware technology of the 80s.

Components used to be memory, processors.
abstraction: virtual memory, virtual processors.
problem: resource management.

Today's components:

Today's glue and loose ends:

COMMENT 2

[SLIDE: PATTERN]

Pattern for case study papers in the system area.

  1. Say Distributed is good.
  2. Communication over wires is done by messages.
  3. Post facto wire placement -> Let's use messages for everything.
  4. Resolve arguments 1 thru 99 over details of message semantics.
  5. [If implementation is up by the paper deadline] Express astonishment over path length to send a message.

This pattern has occurred often enough that it is starting to get tiresome.

No need to say any more.

COMMENT 3 - Personal Computers

[SLIDE: 3] Someone once advised me that you should have one slide for every idea. I wasn't able to think of anything very imaginative for this comment, but maybe this slide will allow you to keep your place.

Question - is the era of the time-sharing system gone forever?

I thought that maybe it was, until I heard that there is available now, for $50, a TWO-USER BASIC for your APPLE Computer. Add a second TV and you are in business.

"When there are cycles about, someone will covet them."

Comparison with the Automobile is actually quite fruitful.

Likely a similar situation with Personal computers, which are a little like automobiles with memory:

COMMENT 4 - Robustness may accidentally degenerate into unplanned total dependence

[SLIDE: NEW HAZARD]

(from early experience at PARC and MIT)

Distributed Systems provide robustness by

          IF service-available USE service
          ELSE do-it-yourself FI     
e.g.,
          IF WWV-can-be-received USE WWV for time-of-day
          ELSE use local calendar clock FI     

Leads to unplanned total dependence!

Do-it-yourself rarely invoked

If WWV works all of the time, you may not notice that your own clock is broken.
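
The hazard can be sketched in a few lines of Python. This is only an illustration of the IF/ELSE pattern on the slide, not code from the talk: the names time_of_day, wwv_up, wwv_down and broken_local_clock are hypothetical, and the reading 1000.0 is a made-up value.

```python
from typing import Callable, Optional


def time_of_day(service: Callable[[], Optional[float]],
                local_clock: Callable[[], float]) -> float:
    """IF service-available USE service ELSE do-it-yourself FI."""
    reading = service()      # None means the service cannot be received
    if reading is not None:
        return reading
    return local_clock()     # do-it-yourself path, rarely invoked


def wwv_up() -> Optional[float]:
    return 1000.0            # hypothetical reading from the time service


def wwv_down() -> Optional[float]:
    return None              # the service cannot be received


def broken_local_clock() -> float:
    raise RuntimeError("local calendar clock is broken")


# While the service works, the broken fallback is never called,
# so nothing looks wrong.
healthy_looking = time_of_day(wwv_up, broken_local_clock)

# The first outage is also the first time the fallback runs,
# and that is when the fault finally shows up.
try:
    time_of_day(wwv_down, broken_local_clock)
    outage_result = "ok"
except RuntimeError as exc:
    outage_result = f"failed: {exc}"
```

As long as wwv_up is the service, the caller has no evidence that the fallback ever worked, which is the sense in which the dependence is total and unplanned.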

Have run into this problem more than once in trying to disentangle some software from the Xerox environment into our environment with a less-rich collection of services.

A little like the electric power generator that uses electric motors to pump oil into the bearings.

COMMENT 5

There are two unavoidable facts of life about distributed systems:

  1. Things really happen in parallel
    so coordination must be done right
  2. Things fail independently
    so recovery finally has to be faced up to.

It seems to be a fact of life that you can't separate these problems. This leads to the next slide.

[SLIDE: The Recovery and Coordination Jungle]

Tomorrow's session is worth watching very carefully. If you really get to where you understand that stuff you will find a whole lot of previously disconnected facts beginning to make sense.

There is one problem though

The people doing the work are unsure of themselves, too.

--> Papers are three years in draft (in contrast w/ biology, where as soon as you think you have a result you rush to publish in Science).

COMMENT 6 - Too many not-quite-orthogonal abstractions

[SLIDE - LOTS OF...]

(Resembles quantum chromodynamics)

My mind is getting overloaded with overlapping concepts.

We need a Galileo or a Newton or a Kepler to restore simplicity to these proceedings.

COMMENT 7 - Is there anything left to do?

[SLIDE: HEADLINES WE HAVEN'T SEEN YET]

ComputerWorld headlines, evidence that systems is mature, nothing to do:

That is my last comment. Thank you for listening patiently.