
6.033--Computer System Engineering

Suggestions for classroom discussion


Topic: Robert M. Metcalfe and David R. Boggs. Ethernet: Distributed packet switching for local computer networks. Communications of the ACM 19, 7 (July 1976), pages 395-404.

By J. H. Saltzer, March, 1995, revised March 1996, 1997, 2001, 2003, and 2004.


A prerequisite to discussion of the Ethernet paper is a clear understanding of the role of the link layer. One approach is to draw a picture of a workstation that has several physical links coming out the back (a serial line to a computer in the next room, a telephone line, a dialup line, and an Ethernet) and ask the class to help develop the interface between the network layer and the link layer. This interface specification helps to clarify the need to distinguish between point-to-point links and multi-drop links, the need for the network layer to tell the link layer which link to send the message on, and the distinction between the addresses used in the network layer (e.g., IP addresses) and the addresses used in the link layer (e.g., telephone numbers or Ethernet station addresses).
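
A minimal sketch of such an interface in Python (all the names here--Link, LinkLayer, send--are invented for illustration; they come from neither the paper nor any real network stack):

        # Sketch of a network-layer -> link-layer interface.
        # Every name is illustrative, not from a real stack.

        class Link:
            """One physical link, with its own kind of link-layer address."""
            def __init__(self, name, is_multidrop):
                self.name = name                  # e.g. "serial0", "eth0"
                self.is_multidrop = is_multidrop  # Ethernet: True; serial line: False

            def transmit(self, link_address, frame):
                # On a point-to-point link the address is unused; on a
                # multi-drop link (Ethernet) it selects the receiver.
                ...

        class LinkLayer:
            def __init__(self, links):
                self.links = {link.name: link for link in links}

            def send(self, link_name, link_address, payload):
                # The network layer must name WHICH link to use, and must
                # supply a link-layer address (a telephone number, an
                # Ethernet station address, ...), never an IP address.
                self.links[link_name].transmit(link_address, payload)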

  • Buzzword patrol:
    1.1 Aloha: A predecessor radio network that used a protocol similar to the Ethernet's.
    1.1 Imp: Interface Message Processor, an early router.
    1.1 Arpanet: The predecessor of the Internet.
    1.3 CATV: cable television.
    3. Synchronous Time Division Multiplexing (STDM): what chapter 4 calls "isochronous multiplexing".
    3.4 Best efforts: should be "best effort".

  • Technology check: What are the relevant technology changes since this paper was written in May 1975?

    (January 2004: On a monomode optical fiber, you can send 40 Gigabits/second over a distance of 10 kilometers without repeaters. The product is 400,000 gigabit-meters per second, compared with the 1 gigabit-meter per second mentioned in the paper. The technology has improved by a factor of 4×10^5 in 28 years, an improvement rate of nearly 60% per year.)
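
    As a quick check of that arithmetic (a back-of-the-envelope sketch in Python; the numbers are the ones quoted above):

        # Bandwidth-distance product, 2004 fiber vs. the 1975 paper.
        paper = 1e9             # 1 gigabit-meter/second (from the paper)
        fiber = 40e9 * 10_000   # 40 Gb/s over 10 km = 4e14 bit-meters/second
        ratio = fiber / paper   # 4e5
        years = 28
        annual = ratio ** (1 / years) - 1
        print(f"improvement: {ratio:.0e}x, about {annual:.0%} per year")
        # prints: improvement: 4e+05x, about 59% per year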

  • Yes, but what does it cost?

    (That would be using Dense Wave Division Multiplexing (DWDM), and the end stations would cost between $10K and $100K. If you want to make it cheaper, omit the 40-way multiplexing, reduce the data rate to 1 Gigabit/second and the cost drops to under $1K at each end. Cost per bit transmitted is still about the same, but the entry cost is much lower.)

  • Perhaps it is worthwhile listing some different things that are labeled "Ethernet":
    Experimental Ethernet: 3 Megabits/second
    Commercial Ethernet: 10 Megabits/second
    Fast Ethernet: 100 Megabits/second
    Gigabit Ethernet: 1000 Megabits/second
    The paper is discussing the Experimental Ethernet; it is important to keep this in mind.

  • [suggested by Larry Rudolph] What are the alternatives to Ethernet? To answer this it might help first to answer another question: What problem does the Ethernet target?

    (The LAN problem: with minimum cost and fuss, interconnect the computers within one building. Be prepared for a lot of flux as people move computers around and get new ones. Design target: 100 computers. The chief alternatives are a full mesh, a central switch, a token ring, and the Ethernet itself.)

  • So what are the advantages/disadvantages of these four?

    ([minus signs for disadvantages, plus signs for advantages]

    full mesh: -For 100 computers, need 4,950 links. -Expensive, especially for labor. -Adding one more computer requires crawling through the ceiling to reach 100 different places. -Each computer needs 99 ports. +Problems are easy to isolate: if computer 14 can't talk to computer 37, it is immediately clear which equipment and wires to inspect.

    central switch (telephone): -Setting up connections takes time, requires state, and may block other connections. -Single point of failure (but the telephone people figured out how to make it very reliable). -A new wire for every computer. -Hard to start small (2 computers) and grow to 250; the initial switch needs to anticipate the maximum size. +Easy to understand. +When the network goes down, you can isolate the problem by visiting just one place. +Uses point-to-point communication, the virtues of which we discuss later.

    central switch (computer): -Possible performance bottleneck in the middle. -The single point of failure looks real this time. -A new wire for every computer. +When the network goes down, you can isolate the problem by visiting just one place. +Uses point-to-point communication, the virtues of which we discuss later.

    token ring: -If a repeater fails, the entire network goes down (fixed with relays). -When it fails, isolating the failure requires perambulating the building (fixed with a star topology). +Uses point-to-point communication, the virtues of which we discuss later.

    ethernet: -Very hard to isolate failures--see the next question. -Uses broadcast communication, which is fragile. +Can start with two computers; the initial cost is very low. +The shared medium is completely passive, making it reliable. +Since the hub was invented, you can configure it as a star to ease maintenance.

    Other virtues/failings will come out as we study the details.)

  • [suggested by Martin Rinard] What are the ways that an Ethernet may fail? (The two most common problems are lack of termination, which creates echoes, and transceivers that start jabbering and won't shut up. Both can affect the *whole* Ethernet, and both can be hard to find in the original bus configuration. With the original Ethernet, the classic rule for finding trouble was "look for the ladder": most Ethernet disruptions were caused by someone doing something in the wire trays or above the false ceilings. So one of the ideas praised in the paper fell by the wayside, in favor of running wires to a central point.)

  • If a star is such a win, why not just have a central switch? (The real question is, "what is in the central switch?" The original concept was one wire from the switch to each computer, and the switch set up connections between computers that wanted to talk. The switched Ethernet differs from that model in a couple of subtle ways. First, I can put two or more computers at my end of the wire in my office; the contention mechanism is still useful. Second, the switch in the center doesn't make connections; it is a simple packet forwarder. And note that an Ethernet switch generally costs twice as much per port as an Ethernet hub, so central switching apparently has an identifiable price tag.)

    Students generally come away from a first reading of this paper with a superficial understanding of the mechanics that can be tuned up a lot with focused discussion. The next few items build on what was learned about digital transmission lines in 6.004.

  • First, the physical mechanism. The usual acronym for the protocol used by the Ethernet is "CSMA/CD", which stands for Carrier Sense (the paper uses the term "carrier detection"), Multiple Access, with Collision Detection (the paper uses the term "interference detection"). Let's start with "Multiple Access (MA)". What's that?

    (There are 100 computers sharing the same wire. So we need a protocol to decide who gets to transmit. Today, the protocol is called "Media Access Control", or MAC. Unfortunately the same three-letter abbreviation also applies to a computer brand and to "message authentication code", which is a completely unrelated authentication scheme.)

  • Now, let's try "Carrier Sense (CS)". Section 3.5.1 calls it "Carrier detection". And the first sentence says that the bits are "phase encoded". What is "Carrier sense/detection"?

    (It means that you know the sender is actually sending something, rather than sitting quietly doing nothing.)

  • Then what is phase encoding, and why are they using it? (The paper gives only half of the explanation.)

    [Thanks to Larry Rudolph:] Imagine that you are sending the data to a friend across the street with a flashlight at night. Light ON means a ONE, light OFF means a ZERO. How can the person watching distinguish no transmission (carrier off) from a long string of zeros?

    (No way. But suppose you instead have a coding convention that "light ON for one second followed by light OFF for one second" means ONE and "light OFF for one second followed by light ON for one second" means ZERO. Now the person watching can easily tell that something is being sent.)
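
    The flashlight convention in that answer is exactly the phase (Manchester) encoding the paper uses. A minimal Python sketch (function names invented for illustration):

        # Phase (Manchester) encoding: each data bit becomes two signal
        # levels of opposite sense, so a live transmitter always produces
        # transitions--which is what carrier sense detects.

        def manchester_encode(bits):
            # 1 -> high,low and 0 -> low,high (one of the two conventions)
            out = []
            for b in bits:
                out += [1, 0] if b else [0, 1]
            return out

        def carrier_present(signal):
            # An idle wire (flashlight off) is constant: no transitions.
            return any(signal[i] != signal[i + 1] for i in range(len(signal) - 1))

        print(manchester_encode([1, 0, 0]))     # [1, 0, 0, 1, 0, 1]
        print(carrier_present([0, 0, 0, 0]))    # False: nothing being sent
        print(carrier_present(manchester_encode([0, 0])))  # True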

  • So what is the role of "Carrier Sense" in the protocol?

    (The first step in the MAC protocol: You listen before you transmit, and transmit only if you don't hear anything. The paper uses the word "defer".)

  • What is the rest of the MAC protocol?

    (Listen while sending, and if you hear a collision:

        -  stop sending
        -  jam ("consensus enforcement")
        -  wait, then try again

    We will come back to those steps in a moment; first, let's figure out how to detect collisions.)
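
    A minimal Python sketch of that whole transmit rule (the wire object and its four methods are invented stand-ins; a real interface does all of this in analog hardware and gives up after a bounded number of attempts):

        import random

        def csma_cd_send(wire, frame, slot_time):
            attempts = 0
            while True:
                while wire.carrier_sensed():     # defer: listen before sending
                    pass
                collided = wire.transmit(frame)  # listen WHILE sending
                if not collided:
                    return                       # success
                wire.jam(slot_time)              # "consensus enforcement"
                attempts += 1
                # wait a random time before retrying; the interval doubling
                # is the backoff scheme discussed later in these notes
                wire.wait(random.randrange(2 ** min(attempts, 10)) * slot_time)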

  • Collision Detection (The "CD" of CSMA/CD): How is a collision detected? (This is hard to figure out, because the paper gives a very misleading explanation--perhaps because they were filing a patent and didn't want to reveal the trick. Section 3.5.2, on interference detection, says that the transceiver "notices a difference between the value of the bit it is receiving and the value of the bit it is attempting to transmit").

  • Why is this explanation bogus?

    (The transmitter and the receiver are connected to the wire at the same point. The receiver is going to have a very hard time hearing anything but the local transmitter, which will be much stronger than one at the other end of the wire. And, comparing the received waveform with the transmitted waveform is a fragile method--suppose the other transmitter is sending the same bit sequence that you are. They may have tried that technique, but it wouldn't work very well.)

  • So how can they possibly detect collisions reliably?

    (Here is the other half of the reason for using phase encoding. Recall that each data bit is sent as two signal bits of opposite sense: if zeros are represented by "01", then ones are represented by "10". Phase encoding therefore has two features:

        -  something on the wire is always changing, which is what makes carrier sense work, and
        -  each transmitter drives the wire high exactly half the time, so it injects current into the cable at a constant, known average rate.

    Since the rate of current injection is constant and known, the "interference detector" can simply measure the average (DC) voltage on the cable, which has 50-ohm impedance and is properly terminated. If one station is transmitting, the DC voltage will be near an easily predictable value. If two stations are transmitting, the DC voltage will be double what it should be. If the voltage is very far from the one-station value, the detector has sensed a collision.)
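
    A small Python illustration of why the average level gives collisions away (illustrative signal levels, not real voltages):

        # Each phase-encoded transmitter is "high" exactly half the time,
        # so one talker produces a known average level and two talkers
        # produce double that level, whatever data they send.

        def manchester(bits):   # 1 -> (1, 0), 0 -> (0, 1)
            return [s for b in bits for s in ((1, 0) if b else (0, 1))]

        def dc_level(signal):
            return sum(signal) / len(signal)

        a = manchester([1, 0, 1, 1, 0, 0, 1, 0])
        b = manchester([0, 1, 1, 0, 1, 0, 0, 1])
        both = [x + y for x, y in zip(a, b)]   # signals add on the wire

        print(dc_level(a))      # 0.5: one station transmitting
        print(dc_level(both))   # 1.0: double the expected level--collision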

  • So, from an electrical engineering point of view, what is the "carrier"? (This system runs at baseband--the frequency spectrum includes DC, and the carrier is a DC voltage. Draw a frequency spectrum on the board to illustrate.)

  • So we listen while transmitting. Are we certain to detect the other guy's carrier? (Only if we both transmit long enough.)

  • How long is "long enough"? (Long enough for my signal to get to him before he stops transmitting. And vice versa. The worst case is that my signal gets to the most remote station just as it is starting to transmit, and that station's signal has to get back to me before I stop transmitting. So we need to know the worst case end-to-end propagation delay and multiply that by 2.)

  • But 3.5.1 claims that "acquisition" is complete after one end-to-end propagation time, and it is guaranteed that the transmission will complete.

    (Don't believe everything you read. The paper is half right and half wrong. It is true that after one end-to-end propagation time, all stations will be deferring. But a station that is at maximum distance from the first transmitter, and that started transmitting just before the first transmitter's signal arrived at its interference detector, will have fouled things up, and the first transmitter won't discover the foul-up until the second station's signal propagates back to it. So the transmitter isn't assured that the transmission will be interference-free until two end-to-end propagation times, or one round-trip time (RTT), have passed.)

  • How long is that?

    (The Metcalfe and Boggs paper describes the "Experimental Ethernet", which runs at 3 megabits/second, has a maximum end-to-end extent of 1 kilometer, and for which repeaters are mentioned as problematic. Keeping in mind that the velocity of propagation in coaxial cable is about 60% of the speed of light, the maximum end-to-end delay across 1 kilometer of Ethernet is thus about 5 microseconds.

    The commercial standard Ethernet scaled up both the speed (to 10 megabits/sec) and the maximum size (to 2.5 kilometers). As an additional complication, to reach the larger extent and yet maintain adequate signal-to-noise ratios, they had to introduce repeaters. And in order to handle the higher data rate and worst-case attenuation, the repeaters had to be outfitted with phase-locked loops in their receivers, and the PLL's required several bit times of incoming data to synchronize before the measured bit rate was stable enough to use to transmit data on the other side; this adds still more end-to-end delay. 20 microseconds is the official worst-case number. The speed-of-light delay is 12.5 microseconds; the rest is startup delay of up to three PLL's, two in the repeaters and one in the target receiver. We can model this by saying that the speed of light in the commercial Ethernet is only 125 million meters/second.)

  • Back to the largest possible Experimental Ethernet: We have to transmit for at least 2*5 = 10 microseconds to be sure we can detect collisions. How many bits can we pump into the wire in this time?

    (We are sending 3 megabits/second, or one bit every 1/3 microsecond. Looks like we could get 30 bits out. Another way of looking at this is that each bit is 333 nanoseconds in time extent, or 67 meters in physical extent along the wire, so there is room for up to 15 successive bits on the wire. The other guy starts transmitting just as bit 1 gets to him, so the first transmitter doesn't notice the second guy's carrier until 15 bit times after that.)
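
    The arithmetic, spelled out (a sketch in Python using the rounded numbers above):

        C = 3e6              # experimental Ethernet: 3 megabits/second
        one_way = 5e-6       # seconds: 1 km of coax at ~60% of c, rounded
        slot = 2 * one_way   # 10 microseconds: minimum time to transmit

        print(C * slot)      # 30.0 bits sent before a collision is surely heard
        print(1e9 / C)       # 333.3 nanoseconds per bit
        print(2e8 / C)       # 66.7 meters of cable per bit (~15 bits fit in 1 km;
                             # the 67-meter figure implies ~2e8 m/s propagation)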

  • This implies that there is a minimum packet size. Why doesn't the paper mention this?

    (Since the shortest useful packet always has at least five bytes (two address bytes + at least one byte of data + a 2-byte CRC), which is 40 bits--more than the 30-bit minimum--one doesn't have to think about minimum packet sizes.)

  • How about in the commercial (10 megabit/second) Ethernet?

    (In 2*20 = 40 microseconds one can pump 400 bits (50 bytes) into the wire before the first bit is recognized by the most distant possible receiver. A transmitter could launch a complete 50-byte packet without realizing there was a collision. Receivers in the middle would, of course, see the collision, but the collision sense mechanism in the transmitters would fail to alert either transmitter of the problem, and the only way to recover would be with an end-to-end timeout. Recovery using end-to-end timeouts has been extensively studied; one gets a maximum utilization of the channel of a fraction 1/e of its potential capacity. More important, as you try to exceed that capacity (about 1/3) the network goes into congestion collapse, in which colliding retransmissions dominate the traffic and nothing useful gets done. The resulting avalanche of packets is sometimes called the Aloha effect because it was first observed in the Aloha wireless network that connected terminals among the Hawaiian islands.)

  • How does 50 bytes compare with the smallest useful packet size now?

    (The smallest useful packet has two 6-byte addresses, a 2-byte TYPE field, a 4-byte CRC, and some data--at least one byte. That is 19 bytes.)

  • How do you suppose they fixed this?

    (By enforcing a minimum packet length: the transmitter pads short packets out to roughly 500 bits. In addition, every packet carries a preamble on the front, and that is not a complete waste; the receivers use it to synch up and stabilize their phase-locked loops.)

  • The need to specify a minimum packet size is an example of something mentioned in the first lecture. What?

    (A minimum packet size is an example of an emergent property--a surprise--that shows up only when the delay (amplified by propagating effects: the larger extent requires repeaters, and the repeaters' PLLs add still more delay) and the speed of the network are large compared with the smallest packet you can imagine sending.)

  • Now let's push the scaling. What happens if the Ethernet is sped up to 100 megabits/sec? How about 10 gigabits/sec?

    (Incommensurate scaling rears its ugly head. The speed of light doesn't change, so either you have to increase the minimum packet size to 4000 bits (400,000 bits at 10 gigabits/sec), or else you have to reduce the maximum diameter to 250 meters (2.5 meters at 10 gigabits/sec). Or some combination thereof. Although one can imagine the 100 megabits/sec scale, the 10 gigabits/sec scale doesn't look very promising. Fast Ethernet (100 megabits/second) reduces the maximum network diameter to 200 meters. Gigabit Ethernet (1000 megabits/second) couldn't reduce the diameter further and still be useful, so it instead pads out short packets to 512 bytes.)
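
    The incommensurate-scaling arithmetic for several generations (a Python sketch; the real standards picked their numbers somewhat differently, as noted above):

        def min_packet_bits(rate_bps, diameter_m, v=1.25e8):
            # Bits a station must keep transmitting to cover one round trip.
            # v is the text's "model" propagation speed, which folds in
            # repeater and PLL delays.
            return rate_bps * 2 * diameter_m / v

        for name, rate in [("commercial 10 Mb/s", 10e6),
                           ("fast 100 Mb/s", 100e6),
                           ("gigabit", 1e9),
                           ("10 gigabit", 10e9)]:
            print(name, min_packet_bits(rate, 2500))
        # 400, 4000, 40000, 400000 bits: each factor of ten in speed
        # multiplies the minimum packet (or divides the diameter) by ten.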

  • [This question and the next one suggested by Peter Druschel] What is the effect of a large minimum packet size on an Ethernet's efficiency?

    (If many packets need to be padded we waste channel capacity.)

  • Apart from imposing a minimum packet size, in what other way does the worst-case delay affect an Ethernet's performance?

    (Consider the formula given in the paper for the Ethernet's efficiency:

                     P/C
             E = ------------
                  P/C + W*T
    
    where E is the fraction of time the Ethernet is carrying useful data, P is the packet size, C the channel capacity, W the expected number of collisions a station suffers before it can transmit, and T is the worst-case time it can take to detect a collision. As we increase C while keeping P and T constant, the efficiency under high load (when W becomes > 0) diminishes. We could try to avoid this by increasing P, but many applications have only a limited amount of useful data to transmit in a packet. Padding does not help the efficiency either. Fortunately, all this doesn't matter as long as we operate the Ethernet under reasonably low load.)
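
    Plugging illustrative numbers into that formula (a Python sketch; the parameter values are invented, not from the paper):

        def efficiency(P, C, W, T):
            # E = (P/C) / (P/C + W*T)
            return (P / C) / (P / C + W * T)

        P = 4000      # packet size, bits
        T = 40e-6     # worst-case collision-detection time, seconds
        W = 1         # expected collisions per packet under load

        print(efficiency(P, 10e6, W, T))    # ~0.91 at 10 megabits/second
        print(efficiency(P, 100e6, W, T))   # ~0.50 at 100 megabits/second
        # Raising C with P and T fixed shrinks P/C, so efficiency falls.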

  • What is "jamming"

    (The idea is that if you detect a collision, you are supposed to force the signal level high for one round-trip time, to assure that everyone on the wire hears and agrees that there was a collision.)

  • Why is jamming needed?

    (For the case where the two colliding transmitters are not a maximum distance from each other. If they are close to each other, each will hear the other immediately and stop transmitting. More distant receivers will hear various random things depending on where they are located relative to the two colliding transmitters. We want to ensure that they all ignore the collision.)

  • How long should a transmitter wait before retransmitting?

    (The problem is that if the two transmitters both wait the same time, they will collide again. So each chooses a random delay from a preset interval.)

  • What is retransmission backoff?

    (A fundamental idea for congestion control that can be used in a wide variety of system situations: If you run into interference--in the case of the Ethernet, a collision--wait a random time before trying again. If the next try runs into interference again, choose the random delay from an interval twice as large. Keep doubling the interval until you get through. If all participants follow this approach, they will eventually back off enough to clear the congestion.)
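
    A runnable Python sketch of the interval doubling (names invented; real Ethernet caps the interval at 2^10 slots and gives up after 16 tries):

        import random

        def backoff_slots(attempt, cap=10):
            # After the n-th successive collision, choose uniformly from
            # [0, 2**n - 1] slot times; the interval doubles each retry.
            return random.randrange(2 ** min(attempt, cap))

        # Two stations that just collided retry until they pick different
        # slots; the doubling makes repeated agreement ever less likely.
        attempt = 1
        while True:
            a, b = backoff_slots(attempt), backoff_slots(attempt)
            print(f"try {attempt}: A waits {a} slots, B waits {b} slots")
            if a != b:
                break
            attempt += 1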

  • [suggested by Hari Balakrishnan] The modern Ethernet header has a TYPE field, the addresses are 48 bits, and the CRC is longer. Compare this with the header in figure 2 (page 400.) What benefit to mankind comes from the changes?

    (Roughly three benefits. The TYPE field lets many different protocols share one Ethernet, because a receiver can tell which protocol handler should get each packet. The 48-bit addresses are large enough that every interface can be assigned a unique address at the factory, so no address administration is needed when a machine moves to a different network. And the longer, 4-byte CRC provides stronger error detection for the longer packets.)

  • Implications. What is the implication of being a broadcast medium?

    (The energy of a transmitter is divided among the many receivers, so signal-to-noise ratios aren't very high. For maximum sensitivity and minimum noise, the receiver consists of a transistor with its base connected to the center wire and its ground reference connected to the cable shield.)

  • What is the implication of connecting the ground reference to the cable shield?

    (I have to supply DC power to my receiver. If the ground reference is directly connected to the cable shield, my power supply must be grounded to the cable. But that in turn means either

        -  connecting the cable shield to the building's power ground at every station, or
        -  isolating each transceiver's power supply (for example, with a transformer) so that the shield is grounded at only one point.

    The Ethernet chooses the latter design, because the former design would produce building-sized ground loops and there could be many amperes of noise flowing through the coax cable shield.)

  • Beware of Table I. Why?

    (First, it reports numbers with four significant figures, something that one should always consider suspect. Very few computer or communication systems have any parameter that can be measured with that kind of precision. More important, despite appearances, this table is reporting the result of the calculation of section 6.3, not the result of measurements in the field. Predicted performance is interesting, but it should be more clearly identified as such.)


    Comments and suggestions: Saltzer@mit.edu