Note: The following is the text version of a paper originally prepared with
the word processor RUNOFF. Thomas Van Vleck kindly rendered the RUNOFF
format into HTML. The figures are currently not available in any of the
on-line versions. The full citation of the paper is
Jerome H. Saltzer and Kenneth T. Pogran.
A Star-Shaped Ring Network with High Maintainability.
NBS-Mitre Local Area Communications Network Symposium (May, 1979)
pages 179-190. Reprinted in Computer Networks 4, 5 (October 1980)
pages 239-244.
A Star-Shaped Ring Network with High Maintainability
J. H. Saltzer
K. T. Pogran
Key Words: Networks; Local Networks; Ring Networks; Star Networks;
Data Communications; Network Maintenance; Network Serviceability.
Abstract
Ring networks exhibit a number of desirable properties: they are
simple in concept and in implementation; one-way point-to-point signal
transmission minimizes analog circuitry and design problems; the cost
of a small net is small; and transmission speed is not limited by
propagation time. However, there are several potential reliability
problems: all repeaters must be powered and operating reliably at
all times; cutting of any transmission line in the ring will disrupt the
entire network; and trouble-shooting may require visiting each node
with test equipment.
The advantages of ring networks are sufficiently attractive to merit
careful attack on the reliability and maintainability problems.
This paper describes a physical organization of a ring network that we
term a "star-shaped ring."(1) This organization addresses the reliability
issues mentioned above by looping all internode links back through a
central location so that broken lines or repeaters can be bypassed.
This approach allows automatic recovery if the ring is accidentally
broken. In addition, it simplifies trouble-shooting and, because the
lengths of transmission links can be normalized, permits the use of still
simpler transmission circuitry. Finally, the "star-shaped ring" appears to
be a satisfactory approach to installing a network that can grow to
a large number of nodes in a typical office building.
Introduction
The M.I.T. Laboratory for Computer Science is engaged in the
design and implementation of a local area network with the following
principal goals, desiderata, and constraints:
1) Interconnection of independent desktop computers within a building,
as many as one per office. A smaller number of typically larger computers
must also be attachable.
2) Application to clusters of two or three computers as well as to
clusters of a hundred or more.
3) Extension to a campus- or site-wide network of, say, 10,000 computers.
4) Bandwidth in the 1 to 10 Mb/sec. range, so that file transfer
can be a convenient part of interactive operations, even though average
load may be only a few percent of the peak capacity.
This set of goals led us to conclude that the best technology
choice has little to do with the usual analyses of performance,
collision rate, or bit error rates, but rather with more mundane issues
such as which technology is easiest to install, reconfigure, and maintain
in a typical office building. We were first attracted to the technology
of a repeater ring because it appeared to have one advantage of each of
the alternative broadcast (Ethernet) and star configurations, without
the corresponding disadvantage of each:
a) The transmission system of a star is point-to-point, allowing
the use of simpler analog technology than the Ethernet in which every
receiver must be able reliably to hear transmissions from every transmitter.
b) The hardware complement of the Ethernet is a transceiver at each
node and so remains a constant percentage of the system cost as
the network grows. This means that the cost of getting a small
network started is small.
The centralized star switch tends to have a substantial fixed
cost component that discourages its use in small configurations
that might later grow.
The repeater ring network uses point-to-point transmission and also
has a hardware complement--a repeater at each node--that is a constant
percentage of the system cost as the network grows. Thus it captures
both advantages.
In turn, a ring of repeaters appears to introduce a
reliability/availability disadvantage since every component must work
properly all the time. Our analysis of these considerations suggested
that the advantages of the ring configuration were interesting enough to
warrant devising a strategy that directly attacks the reliability
and maintainability problems. The remainder of this paper describes
what promises to be a successful such strategy, based on the observation
that the cited advantages of a ring do not depend on the actual routing
of the wire cables that link the repeaters. An earlier paper describes
in detail the design of a small-delay ring repeater and the hardware
and software protocols used to achieve reliable local packet
transport, and those details are not repeated here.(2)
The basic ring
Proposers of ring networks usually draw pictures
as in figure one, with the implication that the
cables interconnecting individual repeaters follow any convenient,
reasonably direct route from one repeater to the next. Installing a
ring network with that approach could be expected to expose the
following problems:
1) Cable Vulnerability. The physical ring
trails widely through the building, and is
vulnerable at every point to accidents. If a link is accidentally
severed or shorted, the entire ring is inoperable until
the problem can be isolated and a new cable
installed.
2) Repeater Failure. Since the repeaters are
all in series, failure of any repeater would
make the ring inoperable.
For the kind of office environment for which our ring is intended,
it will be common for many of the nodes not to be in operation
at any time. But the repeaters must always operate.
3) Perambulation. When either a repeater or a cable
linking two repeaters
fails, locating the failure requires perambulation of the ring,
and thus access to all offices containing repeaters and wire runs
containing cables.
Portable test equipment is also required.
4) Installation headaches.
Installation of a new repeater requires selecting two repeaters
that are supposed to be directly linked, verifying that they actually are
linked (that is, that the documentation of network topology is up-to-date)
and installing new cables from each of them to the site of the new repeater.
This installation approach has several
consequences. The length of cable driven by the source repeater will
change, possibly requiring retuning of its transmission circuitry. The
old cable, now unused, is likely to be abandoned;
one would expect the walls and ceilings to accumulate
forgotten wire. Finally, the ring may develop a wildly irregular topology,
thereby increasing the effort involved in perambulation.
Bypass relays
The ring network repeater can be designed
to provide a partial solution to the problem of repeater failure, since that
is almost certainly the most significant of the four
problems. The repeater can provide an "I-am-healthy" signal that
operates a mechanical relay that would, if not energized, bypass that
repeater, as in figure two. Thus turning off the primary power at a node would
kill the "I-am-healthy" signal, the relay would de-energize, and its contacts
would cut the repeater out of the ring. This approach replaces the
repeater failure problem with four new problems that are (one hopes)
easier to cope with:
2a) Repeater self-check failure. There is still a class of repeater
failures that can disrupt the network, namely those in which the
repeater's internal self-check circuits insist that
the repeater is healthy when it really isn't. The frequency of
such failures ought to be substantially smaller than that of all repeater
failures, however.
2b) Relay contact failure. The relay
contacts may fail, thus disrupting the ring.
2c) Line length variation. The length of
transmission cable connecting a healthy repeater
to its next healthy neighbor may vary widely, depending on how many
intervening repeaters are bypassed. Thus the transmitter/receiver
system must be designed to operate over a wide range of transmission
distances, perhaps automatically adjusting to the cable length.
(This problem is a rapidly varying version of one of the installation
headache problems.)
2d) Bypass disruption. At any instant a bypass
relay may cut in or out, and its contacts may
"bounce" for a few tens of milliseconds, thus destroying any message
or token currently circulating around the ring.
The software recovery protocols of the ring network are assumed to be
designed to take occasional lost packets, messages, or tokens in stride, so
the bypass disruption problem is not an important concern, so long as it
happens only occasionally.
The relay contact failure problem may be addressed by connecting
normally closed relay contacts in parallel as shown in figure two, so that
two relay contacts must fail simultaneously before a de-energized relay
can disrupt the ring.
Reed relays that can reliably transmit high data rate
digital signals seem to be readily available.
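The line length variation problem (2c) can be made concrete with a small sketch. The Python fragment below is illustrative only and is not part of the design described here: the function name healthy_spans and the cable lengths are assumptions chosen for the example. It models a conventional ring with a bypass relay at each repeater and computes the length of cable that each healthy transmitter must drive to reach its next healthy neighbor.

```python
def healthy_spans(link_lengths, healthy):
    """Map each healthy repeater to the cable length it must drive.

    link_lengths[i] is the (hypothetical) cable length in metres from
    repeater i to repeater (i + 1) % n. healthy[i] is True when repeater
    i asserts its "I-am-healthy" signal, False when it is bypassed.
    """
    n = len(link_lengths)
    if len(healthy) != n:
        raise ValueError("need one health flag per repeater")
    spans = {}
    for i in range(n):
        if not healthy[i]:
            continue  # a bypassed repeater drives nothing
        length = link_lengths[i]
        j = (i + 1) % n
        # Walk past bypassed repeaters, adding each one's outgoing link.
        while not healthy[j]:
            length += link_lengths[j]
            j = (j + 1) % n
        spans[i] = length
    return spans


# Assumed example: five repeaters, with repeaters 1 and 2 powered off.
example_links = [30, 45, 20, 60, 25]
example_health = [True, False, False, True, True]
print(healthy_spans(example_links, example_health))
```

In this assumed configuration repeater 0 must drive 95 m of cable instead of its usual 30 m, which is the kind of variation the transmission system would have to tolerate.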
The star-shaped ring: step one
The perambulation and installation headache problems
can be attacked simultaneously simply by rearranging the inter-repeater
cables so that they always loop back through a single room, called the
wire center, as in figure three.(3)
With this star-shaped arrangement, it is not necessary for a trouble-shooter
to have keys to every office containing a repeater or to carry test equipment
from point to point. With access to the
signal on every cable as it passes through the wire center, one can launch
a message into the ring, observe the signals on successive cables to see how
far it gets,
and then reconfigure the ring to disconnect temporarily the repeater or
the cable that seems not to be working. Then access is needed only to
the area containing the troubled repeater or cable.
Installation of new repeaters is also regularized. Two cables are
installed from the location of a new repeater to the
wire center. Then the new repeater is spliced into the ring entirely
by rearranging wires at the wire center. At no point is one tracing old
cables through walls or ceilings or abandoning them there when making
new installations. Unanticipated installations can be
handled without creating a hodgepodge of wires crisscrossing through the
walls and ceilings; physical planning of a network is thus simplified.
The order of repeaters on the ring is determined at
the wire center and can be rearranged if for some reason
reordering seems necessary. (E.g., when trouble-shooting a
repeater-repeater transmission failure, quick rearrangement might be a useful
technique to isolate the problem to the transmitter or receiver side of the
link).
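The trouble-shooting procedure at the wire center (launch a message, then observe successive cables to see how far it gets) amounts to a linear scan. The following Python sketch shows one way to express it. The names locate_fault, taps, and signal_present are hypothetical, and the probe is simulated here; in practice it would stand for a test instrument connected at the wire center.

```python
def locate_fault(taps, signal_present):
    """Find where a test message stops travelling around the ring.

    taps is the list of cable labels in ring order, starting just after
    the point where the message is launched. signal_present(tap) reports
    whether the message was observed on that cable.

    Returns (last_good, first_bad), or None if the message was seen on
    every cable. last_good is None when even the first cable is silent.
    """
    last_good = None
    for tap in taps:
        if not signal_present(tap):
            return last_good, tap
        last_good = tap
    return None


# Simulated ring of three repeaters; each has a cable to it and from it.
taps = ["to-A", "from-A", "to-B", "from-B", "to-C", "from-C"]
# Assumption for the example: repeater B has stopped repeating.
live_cables = {"to-A", "from-A", "to-B"}
print(locate_fault(taps, lambda tap: tap in live_cables))
```

Here the scan reports that the message was seen on the cable to B but not on the cable from B, so only repeater B and its cable need to be visited.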
The star-shaped ring: step two
A further refinement of the wire center concept considerably reduces
the wire vulnerability problem, simplifies
maintenance and installation further, and normalizes
transmission line lengths. This refinement involves simply
moving the bypass relays from the repeaters to the wire center.
The primary impact of centralizing the relays is that it provides for
automatic bypass of the vulnerable data transmission lines as well as
of the repeaters. To
control the relay, an extra pair of wires runs from the repeater
to the wire center. In practice, installation of a repeater would be
accomplished by pulling a single cable containing two data
transmission lines and the relay control pair from the area of the
repeater to the wire center. A single plug connects all of these lines
to the repeater. Almost any accident, ranging from chopping the cable
to kicking out the power supply plug of the repeater,
has the effect of bypassing both the repeater
and the data transmission lines to and from that repeater.
An important additional effect of bypassing the data transmission
lines that lead to bypassed repeaters is that the path from one working
repeater to the next working repeater always consists of the same cable
run to the wire center, some number of bypass relays, and exactly one
cable run to a receiver. This path varies mostly in the length of the
final cable run; thus the range of signal levels to which the transmission
system must automatically adapt is much smaller. Further, at the
wire center an appropriate level-setting attenuator can be placed in series
with short cables. Then a transmitter will always "see" essentially the same cable
length no matter what the configuration of the ring.
Thus the line length variation problem can be eliminated by this arrangement,
allowing more flexibility in the design choices for the transmission system.
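As a rough illustration of the level-setting idea, the Python sketch below chooses an attenuator value for each cable so that every run looks electrically like the longest one. It assumes, for simplicity, that cable loss is a fixed number of decibels per metre; real cable loss also depends on frequency. The function names, the cable lengths, and the figure of 0.05 dB per metre are all assumptions made for the example.

```python
def pad_values_db(cable_lengths_m, loss_db_per_m):
    """Attenuation to add to each cable so all runs match the longest.

    cable_lengths_m maps a repeater name to the length of its cable run
    to the wire center. The loss model (constant dB per metre) is a
    simplifying assumption.
    """
    longest = max(cable_lengths_m.values())
    return {
        name: (longest - length) * loss_db_per_m
        for name, length in cable_lengths_m.items()
    }


def path_loss_db(tx, rx, cable_lengths_m, loss_db_per_m, pads):
    """Loss from transmitter tx to receiver rx through the wire center."""
    tx_leg = cable_lengths_m[tx] * loss_db_per_m + pads[tx]
    rx_leg = cable_lengths_m[rx] * loss_db_per_m + pads[rx]
    return tx_leg + rx_leg


# Assumed cable runs (metres) and an assumed loss of 0.05 dB per metre.
lengths = {"A": 20.0, "B": 50.0, "C": 35.0}
pads = pad_values_db(lengths, 0.05)
for tx, rx in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(tx, rx, round(path_loss_db(tx, rx, lengths, 0.05, pads), 2))
```

With the pads in place every transmitter-to-receiver path in this example shows the same loss, whichever intervening repeaters happen to be bypassed.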
Physical realization of the star-shaped ring
Serviceability of the ring can be further enhanced by a suitable
physical realization of the interconnections at the wire center. The
design of figure four illustrates.
A printed circuit board is constructed with, say, eight bypass relays in a row,
and eight connectors into which cables to repeaters can be plugged.
Each relay is connected to the next when its coil is de-energized. When
energized, the relay cuts into the ring a pair of transmission paths that
lead through a socket that can contain an attenuating
network (to normalize cable lengths) to the
repeater connector at the edge of the board. Current for the relay coil
comes from the repeater.
A light-emitting diode is connected across the relay coil for visual
observation of the "I-am-healthy" signal.
The sequence of normally closed relay contacts leads to
connectors at the top and bottom edges of the board. A "ring continuity
cable" runs from the top connector around to the bottom one, completing
the ring. In a sense, this ring of normally closed relay contacts on
a single board in a controlled environment is the data communication
ring, from the point of view of identifying what must be working to
allow the ring to operate. If any two repeaters can energize their bypass
relays, they can communicate, even though all other potential ring participants
may have failed, powered down, or tripped over and disconnected their cables.
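The behavior of the relay board can be summarized in a toy model. The Python class below is a sketch under stated assumptions, not a description of the actual hardware: the class name WireCenter and its methods are hypothetical, and the board size of eight follows the "say, eight" of the text. A de-energized relay simply passes the signal to the next relay, so the ring at any moment consists of the repeaters whose relays are energized, in slot order.

```python
RELAYS_PER_BOARD = 8  # "say, eight" bypass relays per printed circuit board


class WireCenter:
    """Toy model of the chain of bypass relays at a wire center."""

    def __init__(self, boards=1):
        # One slot per relay, in ring order; None means no cable plugged in.
        self.slots = [None] * (boards * RELAYS_PER_BOARD)
        self.energized = set()  # repeaters asserting "I-am-healthy"

    def plug_in(self, slot, repeater):
        if self.slots[slot] is not None:
            raise ValueError(f"slot {slot} is already in use")
        self.slots[slot] = repeater

    def set_healthy(self, repeater, healthy):
        if repeater not in self.slots:
            raise KeyError(repeater)
        if healthy:
            self.energized.add(repeater)
        else:
            self.energized.discard(repeater)

    def add_board(self):
        # Models cabling one more board into the continuity chain.
        self.slots.extend([None] * RELAYS_PER_BOARD)

    def ring_order(self):
        # Only energized relays cut their repeater into the ring.
        return [r for r in self.slots if r is not None and r in self.energized]


center = WireCenter()
center.plug_in(0, "alpha")
center.plug_in(3, "beta")
center.plug_in(5, "gamma")
center.set_healthy("alpha", True)
center.set_healthy("gamma", True)
print(center.ring_order())  # beta is plugged in but not energized
```

In this model any two energized repeaters form a working ring regardless of the state of the others, which mirrors the observation above.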
Installation of a new ring participant is accomplished by installing
a cable from the repeater to the wire center, attaching a
connector, and plugging it into an unused bypass relay. No disruption
of ring operation occurs. If all of the bypass relays have
repeaters attached to them, another printed circuit board containing eight
more relays can be installed next to this one, and then in a matter of
seconds cabled into the ring by interchanging top ring continuity
cable connectors. (This kind of ring expansion may be scheduled at times
when a few-second disruption is tolerable. Then, addition of repeaters
can be accomplished at any time.)
Note that trouble-shooting with this physical configuration is
especially straightforward. If the ring stops working, it is almost certainly
because some repeater's "I-am-healthy" line is incorrectly energized. One
starts by unplugging all the cable connectors to all the repeaters, checking
for ring continuity, and then plugging in the cables to the repeaters
one at a time to see which one (or ones) seem to disrupt continuity. Any
cable whose reattachment causes trouble can be left unattached for the
moment until it can be checked out more carefully. In this way the
ring can be brought back into operation for the correctly behaving
participants quite rapidly. Further, by rearranging the ring
continuity cables that interconnect one printed circuit board to the
next it is possible quickly to isolate problems to a group of eight
relay/cable combinations. Finally, if a relay or printed circuit board
component fails, that board can quickly be replaced or, at worst, bypassed.
Wire center interconnection
One might envision equipping a wire center with up to eight
boards, each with eight relays, producing a ring of sixty-four
nodes, probably enough to handle a typical building floor or wing.
Another floor or wing would have its own wire center. The simplest
way to interconnect the wire centers, if they are not too far apart,
is to interconnect the ring continuity cables of the two wire centers.
However, if the wire centers are very far apart, the center-to-center
cables will add to the data transmission path between some repeaters
but not between others, and the cable length variability problem
will reappear. A better approach
would be to build a special wire-center-to-wire-center repeater that
would plug into both wire centers in the same way as a node repeater.
If traffic is light enough to allow 128 nodes to be arranged in a single
ring, this repeater would be a simple non-addressable full duplex
repeater. If traffic grows to the point that two 64-node rings would be
a better arrangement, the repeater could be replaced by a filtering bridge that
forwards to the other ring only
messages that are addressed to nodes not on this ring segment. (This
same basic filtering bridge design could also be used to divide a 64-node
ring at a single wire center into two or more smaller rings, if
necessary.)
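The forwarding rule of such a filtering bridge is simple enough to sketch. The Python fragment below is a minimal illustration, assuming that each packet carries a destination address and that each side of the bridge knows which nodes are on its own ring. The names Packet, FilteringBridge, and should_forward, and the node names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source: str
    destination: str
    payload: bytes = b""


class FilteringBridge:
    """One direction of a bridge: decides what leaves the local ring."""

    def __init__(self, local_nodes):
        # How this set is learned or configured is outside the sketch.
        self.local_nodes = frozenset(local_nodes)

    def should_forward(self, packet):
        # Forward only packets addressed to nodes not on this ring.
        return packet.destination not in self.local_nodes


# Assumed example: ring 1 holds n1 and n2; n65 lives on the other ring.
ring1_side = FilteringBridge({"n1", "n2"})
print(ring1_side.should_forward(Packet("n1", "n2")))   # stays on ring 1
print(ring1_side.should_forward(Packet("n1", "n65")))  # crosses the bridge
```

A complete bridge would hold one such filter for each direction, so that purely local traffic on either ring never loads the other.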
Automatic ring recovery
A further possibility for automating maintenance is to
connect a microprocessor-controlled repeater to one of the repeater
connectors, and program the microprocessor to launch a test packet
around the ring periodically (say, once per second). The microprocessor
would also control, perhaps through another set of relay coils, the
continuity between each repeater's "I-am-healthy" line and its bypass relay.
If a test packet fails to make it around the ring, the microprocessor
would force de-energization of all of the bypass relays (except
its own), check for ring continuity, and then reconnect the
"I-am-healthy" lines one at a time, testing ring continuity after each
reconnection. When it finds an "I-am-healthy" line that
disrupts ring continuity it might broadcast a
trouble report around the ring as a way to call for a repairman.
Since no other peripheral devices are needed, and the program is fixed, the
extra cost of this automatic test system could be very small.
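The recovery procedure just described can be sketched in a few lines of Python. This is an illustration under assumptions rather than a specification of the maintainer's program: the function name recover_ring is hypothetical, and ring_ok stands for whatever the microprocessor does to launch a test packet and see whether it returns, given the set of "I-am-healthy" lines currently connected.

```python
def recover_ring(lines, ring_ok):
    """Reconnect "I-am-healthy" lines one at a time, isolating bad ones.

    lines is the list of repeater lines under the maintainer's control.
    ring_ok(connected) must return True when a test packet makes it
    around the ring with exactly the given lines connected.

    Returns (connected, faulty): the lines left connected and the lines
    whose reconnection disrupted ring continuity.
    """
    connected = set()
    # Step 1: with every other relay forced off, the bare ring must work.
    if not ring_ok(frozenset(connected)):
        raise RuntimeError("ring is broken even with all repeaters bypassed")
    faulty = []
    # Step 2: reconnect one line at a time, testing after each.
    for line in lines:
        connected.add(line)
        if not ring_ok(frozenset(connected)):
            connected.discard(line)  # leave the offender bypassed
            faulty.append(line)      # candidate for a trouble report
    return connected, faulty


# Simulated ring in which repeater C claims to be healthy but is not.
bad = {"C"}
result = recover_ring(["A", "B", "C", "D"], lambda conn: not (conn & bad))
print(result)
```

The sketch makes one test per repeater, so at a test rate on the order of one per second a wire center of sixty-four nodes would be restored in roughly a minute; that estimate depends entirely on the assumed test rate.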
It is not clear how useful this notion of a microprocessor-controlled
automatic ring maintainer may be in a network
consisting only of a single, small wire center. It may be overkill to
automate the function because the ring may be very reliable anyway,
and there is a danger that the
microprocessor will fail in such a way as to disrupt the network, or that
its automatic operation will be so effective that it reduces the
incentive to repair misbehaving nodes.
On the other hand, in a campus-wide or industrial site-wide network with 100
interconnected wire centers,
the automatic maintenance feature may be very helpful in getting
service attention quickly and in rapidly restoring operation.
Conclusion and progress report
The star-shaped ring strategy has the decentralized control
advantage of a ring network, and at the same time the
centralized maintenance advantage of a star network. It thus captures some of
the better properties of each while avoiding the key disadvantages
of both.
At the M.I.T. Laboratory for Computer Science we have, since
January, 1979, been operating a ring network that runs at 1 Mb/sec. and
that now has five nodes consisting of Digital Equipment Corporation PDP-11
and VAX-11/780 computers. This ring uses the traditional shortest
convenient path from one node to the next for wire runs.
In anticipation of delivery of as many as 100 desktop computers
during 1980, the ring network hardware interface, which includes
the repeater, is being redesigned, simplified, and increased in
speed. The star-shaped ring strategy is planned for this expanded
ring network.
Footnotes
1) The star-shaped ring of this paper has a different set of
goals from the similarly named Star-Ring, which is an I/O
bus with properties of both ring and star topology. See Potvin, J.N.,
et al., "Star-Ring: A Computer Intercommunication and I/O System,"
Information Processing 71, North Holland Publishing Co. (1972)
pp. 721-718.
2) Clark, D.D., Pogran, K.T., and Reed, D.P., "An Introduction
to Local Area Networks," Proc. IEEE 66, 11 (November, 1978)
pp. 1497-1517.
3) The wire center is an idea borrowed from years of telephone
practice. The IBM 2790 transmission loop provides a kind of wire
center in that a single ring of many repeaters is broken
into four segments, each of which comes back to a central controller.
Again despite superficial similarities the design
goals of that system are substantially different from the one described
here. See Hippert, R.O., "IBM 2790 Transmission Loop," IBM J. of
R. & D. 14, 6 (November, 1970) pp. 662-667.