6.033--Computer System Engineering

Suggestions for classroom discussion of:

Effy Oz. When Professional Standards are lax: the CONFIRM failure and its lessons. Communications of the ACM 37, 10 (October, 1994) pages 30-36.

by J. H. Saltzer, April 24, 1996, from 1995 notes.


Hilton Hotels, Marriott, Budget Rent-a-car, and American Airlines joint reservation system.
1988-1992.
Spent $125M before abandoning project.

One fruitful approach to discussing this paper is to look in it for
examples of the problems identified in our discussions of complexity and
in Brooks' book.

1.  The Therac-25 paper and this paper are the only ones we have seen
this term that provide any level of detail about disasters.  There are
practically no others.  Why don't we see more papers that explain
disasters?  (Companies don't like to discuss losing ideas; they want to
project a corporate image that the company is associated with winning
ideas.  There may be lawsuits hanging on who gets blamed.  There are
usually multiple causes and it is hard work to tease them apart to see
what really happened.  The individuals involved may be embarrassed to
admit that they made key, incorrect decisions, or they may feel that
their future career would be jeopardized.  Everyone wants to get on with
the next project, not do post-mortems on the last one.  Many such
projects trail on for a surprisingly long time before they are finally
put out of their misery; during this time the original principals have
all moved on to other projects and are hard to locate.)

2.  Can we find examples of things warned about by Brooks?  (The
second-system effect is the main one.  The new system is supposed to be
much better than the existing systems.)

3.  What evidence is there of the second-system effect?  (Buzzwords in
the discussion of its objectives:  "state of the art," "functionally
richer," "costs will be less," "superior to any current reservation
system," "completed in time to outpace the competition.")

4.  Is there any evidence of the Mythical Man-Month?  (The preliminary
design team spent $1.5M in only 5 months, a rate of $300K/month.  If the
staff earned $100K/year there must have been at least 35 people working
on that preliminary design.  The main project was budgeted at $56M over
52 months, a rate of $1.1M/month, enough to pay for about 130
professionals at the same salary.)
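
The staffing arithmetic in item 4 can be checked with a short
calculation.  This is only a sketch: the function name
implied_headcount is made up for illustration, and it assumes, as the
notes do, a flat $100K/year per person with the whole budget going to
salaries.  Real overhead would lower the headcount the money supports.

```python
def implied_headcount(total_cost, months, salary_per_year=100_000):
    """Return (monthly spending rate, people that rate could pay for).

    Assumes every dollar goes to salaries at a flat yearly rate
    (an assumption taken from the notes, not a figure from the paper).
    """
    monthly_rate = total_cost / months
    return monthly_rate, monthly_rate * 12 / salary_per_year


# Preliminary design: $1.5M spent in 5 months.
prelim_rate, prelim_staff = implied_headcount(1_500_000, 5)

# Main project: $56M budgeted over 52 months.
main_rate, main_staff = implied_headcount(56_000_000, 52)

print(f"preliminary design: ${prelim_rate:,.0f}/month, ~{prelim_staff:.0f} people")
print(f"main project:       ${main_rate:,.0f}/month, ~{main_staff:.0f} people")
```

Under that assumption the two budgets work out to roughly 36 and 129
people.  Either number is large enough to raise Brooks' concern about
communication overhead in big teams.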

5.  Is there any evidence of the bad-news diode?  (Everywhere.  Spring
1990: "employees estimated CONFIRM would not be ready in time; they
were instructed to change their revised dates so that they reflect the
original project calendar."   Spring 1991: employees were told to
change timetables to meet the schedule or be fired. Management was
eventually fired for not revealing the true status of the
project.  Summer 1991:  Consultant hired; his review was too negative,
his report was buried and he was fired.)

6.  Another interesting approach is to plot a time-line:

Sep., '88  project launch, completion target June, 1992, cost $55M

Feb., '90  1 quarter behind schedule

Oct., '90  1 year behind schedule after 2 years of work
           (fell 9 months farther behind in 8 months)

Feb., '91  Replan:  new cost $92M
                    June '92 will have only some features
                    Full features in March, '93

Apr. 30, '92  15-18 months from completion = Oct 30, '93, 7 months behind
           schedule after 1 year of work on Replan.
           Second estimate:  2 years behind schedule after 1 year of
           work.

Jul., '92  Project cancelled, $125M spent, plus opportunity losses of
           $160M.
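
The slip figures in the time-line can be reproduced with simple month
arithmetic.  The sketch below is illustrative only: the helper names
months_between and add_months are hypothetical, dates are reduced to
(year, month) pairs, and the April 1992 estimate uses the pessimistic
end of the 15-18 month range, as the time-line does.

```python
def months_between(start, end):
    """Whole months from start to end, each given as (year, month)."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


def add_months(ym, n):
    """Return the (year, month) that falls n months after ym."""
    total = ym[0] * 12 + (ym[1] - 1) + n
    return (total // 12, total % 12 + 1)


# Feb. '90 to Oct. '90: from one quarter behind to one year behind.
behind_feb_90, behind_oct_90 = 3, 12               # months behind schedule
elapsed = months_between((1990, 2), (1990, 10))    # calendar months elapsed
extra_slip = behind_oct_90 - behind_feb_90         # additional months lost
slip_rate = extra_slip / elapsed                   # months lost per month worked

# Apr. '92 estimate (18 months to go) against the Replan's March '93 date.
new_finish = add_months((1992, 4), 18)
behind_replan = months_between((1993, 3), new_finish)

print(elapsed, extra_slip, slip_rate, new_finish, behind_replan)
```

A slip rate above 1.0 means the project was losing more than a month
of schedule for every month of work in that interval.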

8.  The paper suggests that the heart of the problem lies in ethical
considerations.  Do you think that if all the management people had
acted ethically according to Oz's prescription, the system would have
been successfully delivered on time?  (Almost certainly not.  The
article doesn't mention that they chose to develop the system
using an IBM system extension called "Transaction Processing Facility"
(TPF).  Unfortunately, if you use TPF your application must be written
in machine language.   The admonition to "use sharp tools" has been
ignored. Chances are this project was doomed from the outset; the
management attempts to hide the problems probably only had the effect
of delaying recognition of the disaster.)

9.  Are these system objectives beyond what the state of the art can
accomplish?  (No.  Hyatt put together a system based on Unix, the
Informix database system, and the Novell Tuxedo transaction processing
monitor. The prototype was working in three months.  The entire
project, including hardware, cost $15M, and the system is apparently
working just fine, handling 1,000 booking agents.  It is claimed to
have been delivered ahead of schedule, under budget, with extra features.)


Comments and suggestions: Saltzer@mit.edu