6.02 - Addressing, Forwarding, Routing without Failures
Lecture #20
Katrina LaCurts, lacurts@mit.edu

======================================================================
Recap of networking so far
======================================================================

So far in 6.02 we have discussed sharing a network.  This came about
in two forms:
1. Sharing a channel, i.e., allowing multiple nodes to access the same
   communication medium at once.  This is generally an issue in
   wireless networks.
2. Sharing a particular link within a network, i.e., multiplexing data
   (frames or packets) from multiple connections onto a single
   outgoing link.  This is generally an issue that switches handle.

In the context of the second point, we looked at two methods:
circuit-switching and packet-switching.  We settled on
packet-switching as the better method for many cases because of the
following:
- Packet-switching can handle traffic that is bursty and/or arrives in
  variable rates
- Because it uses packets, which have headers that indicate their
  destination, packet-switching can allow us to re-route around link
  failures (though we have not actually seen how to do that yet).

To handle bursty-traffic, packet-switched networks use queues.  The
introduction of queues means that packets in these networks can be
dropped, delayed, or re-ordered.  These networks are "best effort"
networks: they try hard, but make no guarantees that packets will be
delivered in any particular amount of time.

The problems that we have left to solve regarding packet-switched
networks are the following:
1. How do nodes determine routes to destinations?
2. How can nodes discover if a link has failed, and route around that
   failure?
3. How can we reliably communicate in a best-effort network?

The next three lectures will solve these problems; today we'll
concentrate on the first.

======================================================================
The role of a switch in slightly more detail
======================================================================

The goal of routing is the following: To allow each switch to know,
for every node in the network, a path to that node (if one exists).
This means that when a switch gets a packet destined for node X, it
will know what to do with said packet.

----------------------------------------------------------------------
Addressing
----------------------------------------------------------------------

We need to solve one problem first: how do we name a node?  Does my
machine send packets with a source address of "Katrina's Laptop" to a
node with a destination address of, e.g., "George's Laptop"?  No;
there are a lot of problems with identifying nodes that way (for
instance, what if I know two people named George who both have
laptops?).

This issue of naming nodes is known as addressing.  We will not go
into much detail in 6.02 about addressing, but you should know the
following:
- Addresses are globally unique.  In 6.02, they'll be things like "A",
  "B", "C", etc.  In the real world, they're more complicated, and
  might look like, e.g., 128.30.76.56.
- We'll also give names to links.  I'll often use the super-clever
  name "A -> B" to name the link from node A to node B, but we could
  use different names.  For instance, we might call the link from A to
  B "L1".

----------------------------------------------------------------------
Link costs
----------------------------------------------------------------------

In last week's lecture I hid one part of networks from you.  Each link
in a network comes with an associated cost, like so:

     4
  A ---- B
2 |      | 3
  C ---- D
     2

These costs represent how "expensive" it is to send a packet along
that link.  Notice that in the above example, A has two possible paths
to D:

   1. A -> B ; B -> D
   2. A -> C ; C -> D

  A note about terminology: In networking, we often abuse terminology
  and use the terms "route" and "path" interchangeably.  I am going to
  try to maintain a difference between the two: The route from A to D
  is the outgoing link that A uses to send packets to D (e.g., A ->
  C).  The path from A to D is the actual path that those packets take
  (e.g., A -> C -> D).

  Aside: If you want to get technical, and allow paths to contain
   cycles, then there are infinitely-many possible paths from A to B.
   We're not going to allow cycles.

Given these two options, A would rather use the one with the lower
cost.  To calculate the cost of a path, we simply sum the cost of the
links along that path:

   1. cost(A -> B ; B -> D) = cost(A -> B) + cost(B -> D) = 4 + 3 = 7
   2. cost(A -> C ; C -> D) = cost(A -> C) + cost(C -> D) = 2 + 2 = 4

The second path is the lower-cost path, so A should use that, i.e.,
its route to D should be A -> C.  (A is interested in finding the
*minimum-cost path* to D.)

You might be wondering what these link costs represent.  They can
represent a number of things: the delay on the link (higher delay =
higher cost), the amount of other traffic using the link (more traffic
= higher cost), the physical distance of the link (longer link =
higher cost), a combination of these things, or other things.  We
won't develop cost metrics in this class, but the following three
properties are important:

   1. The costs cannot be negative.  They can be zero, and they can
      also be non-integers, but they cannot be negative.

   2. The costs do not have to obey the triangle inequality, i.e., we
      may have three nodes such that

         cost(A -> B) + cost (B -> C) <= cost(A -> C)

   3. The cost of a particular link can change.  This especially makes
      sense when the link costs are a function of, e.g., congestion on
      the network.
  
----------------------------------------------------------------------
Routing tables
----------------------------------------------------------------------

Now that we have a good sense of node addresses and link costs, we can
think about routing in more detail.  Consider this network, and node A
in particular:

     4
  A ---- B
2 |      | 3
  C ---- D
     2

We want a routing protocol such that, once it is finished, A knows
that:
  - If A receives a packet destined for B, it sends it on link A -> B
  - If A receives a packet destined for C, it sends it on link A -> C
  - If A receives a packet destined for D, it sends it on link A -> C

You can imagine storing this information in a table at node A, like
such:

  routing_table[B] = A -> B ; 4
  routing_table[C] = A -> C ; 2
  routing_table[D] = A -> C ; 4

There might also be an entry for A itself:

  routing_table[A] = Self ; 0

And indeed, this process is what happens in a switch; we call this
data structure the routing table.  Here, I have included the cost in
the routing table.  It's also fine to keep two separate tables: a
routing table and a cost table.

Note that each node's routing table is not identical.  C's, for
instance, would be this:

  routing_table[A] = C -> A ; 2
  routing_table[B] = C -> D ; 5
  routing_table[C] = Self   ; 0
  routing_table[D] = C -> D ; 2

Now we can state the goal of a routing protocol clearly: For every
node X, after the routing protocol is run, X's routing table should
contain the *minimum-cost path* to every other reachable node.

----------------------------------------------------------------------
Forwarding
----------------------------------------------------------------------

We talked about forwarding last week -- the process wherein a switch
examines a packet's header and decides what to do with it -- but now
we can put the language of addressing and routing to that process:

When a switch receives a packet:
1. Check the packet header for the destination address, dst_addr
2. Look up routing_table[dst_addr], which returns the appropriate
   outgoing link.
3. Add the packet to the queue for that outgoing link
4. Once the packet is at the head of the queue, send it

In real networks, the forwarding step involves a few more steps.  The
four steps above are the core functions of forwarding, and they're all
we need for right now.

======================================================================
Distributed Routing
======================================================================

There are two possible ways we could design a routing protocol: it
could be centralized, or distributed.  A centralized routing protocol
means that a single node would determine all of the routes for all
switches in the network, and then communicate those routes to the
nodes somehow.  A distributed routing protocol means that each node
will run the routing protocol themselves, and determine their own
routes.

We will study distributed routing protocols; they scale better
(remember, whenever we talk about networking, we have to think about
scaling to the billions of nodes that make up the Internet).

We are going to look at two different protocols: distance-vector and
link-state.  These protocols both follow the same general structure:

1. Nodes learn about their neighbors via the HELLO protocol.

   Each node determines its live neighbors, i.e., the nodes to which
   it is directly connected.

   This step is done with a HELLO protocol.  Nodes send HELLO packets
   to each neighbor to let them know who is at the other end of the
   outgoing links.  This gives us a mapping of neighbor addresses to
   links.  E.g.:

       (B, A -> B),  (C, A -> C)

   Or maybe:

       (B, L1),  (C, L2) if we were naming differently.

2. Nodes learn about other reachable nodes via advertisements.

   Nodes send advertisements.  A node's advertisement will contain
   information about some (possibly all) of the other nodes that it
   knows about in the network, as well as cost information associated
   with each of these other nodes.  The exact format of the
   advertisements will depend on the routing protocol.

   As a node begins to receive advertisements from other nodes, it
   will begin to be able to determine connectivity and costs to
   reachable nodes.

3. Nodes determine minimum-cost routes.

   Nodes will compute their routing table using the information from
   the advertisements that they received (this step is commonly known
   as the "integrate" step).  They will also deal with stale data.

All of these steps happen periodically.

In both distance-vector and link-state, step 1 (the HELLO protocol) is
the same; we won't talk about the details, so from now on you can
assume that nodes know their neighbors, and will be able to detect if
a link to their neighbor goes down (this last bit will be important in
the next lecture, when we discuss failures).

The difference between distance-vector and link-state comes in the
second and third steps.  These protocols differ in
- The type of advertisements they send
- The nodes to whom they send advertisements.
- The way nodes integrate advertisements.

======================================================================
Distance-vector routing protocol
======================================================================

1. Advertisement format: Each node's advertisement is a lists of all
   the nodes it knows about, and its current costs to those nodes.
   Initially, this advertisement is just [(self, 0)].

2. Nodes who receive an advertisement:  Node X's advertisement will be
   received only by its neighbors.

3. Integrate step:  When node X receives an advertisement from its
   neighbor Y,this advertisement will be a list of [(dst, cost)]
   pairs.  Each cost represents Y's cost to dst.

   For each (dst, cost) in the advertisement, X needs to check for two
   things:
   - If X is already using Y to get to dst, update the cost
     information (remember, costs can change!)
   - If X is not already using Y to get to dst, see if Y could provide
     a better path; if so, update the routing and cost information

   The formal algorithm is:

   For each (dst, adv_cost) in the advertisement:
	  - calculate X's cost to dst via Y:
           my_cost = (cost to Y) + (Y's cost to dst)
                   =  link_cost  +  adv_cost
        (remember we know link_cost because of the HELLO protocol)
      - check if our current route to dst is through Y
          - If yes, cost_table[dst] = my_cost
          - If no, check if my_cost < cost_table[dst] (i.e., if the
            cost to dst via Y is less than the current cost to dst).
            If so,
               route_table[dst] = X -> Y
               cost_table[dst] = my_cost

We will discuss why this protocol is correct, and how fast it
converges, after we discuss link-state routing protocol.

======================================================================
Link-state routing protocol
======================================================================

1. Advertisement format: Each node's advertisement is a list of its
   neighbors and its link costs to those neighbors.

2. Nodes who receive an advertisement:  Node X's will send its
   advertisement to its neighbors.  However, when X receives an
   advertisement from another node, it will send *that* advertisement
   to *its* neighbors.  In this way, every nodes advertisement is
   flooded through the network.  Even though X only sends directly to
   its neighbors, its advertisement will reach everyone in the
   network.

   The technical details of this process: each advertisement contains
   a sequence number as well as the list of neighbors and costs.
   Nodes increase the sequence number every time they send out a new
   advertisement.  When X receives an advertisement from Y, it
   compares the sequence number of that advertisement to the sequence
   number of the last advertisement it received from Y (which it has
   stored).  If the number is greater, it floods the advertisement,
   and updates its stored sequence number.  This process prevents
   nodes from forwarding around stale data, or forwarding around data
   that it has already sent.

   Once all advertisements are flooded, *each node has a complete map
   of the network.*

3. Integrate step:  After flooding, X has a complete map of the
   network.  It runs Dijkstra's shortest path algorithm over this map
   to determine the correct routes.

----------------------------------------------------------------------
Dijkstra's Shortest-path Algorithm
----------------------------------------------------------------------

In each step of Dijkstra's algorithm, we keep track of W, the set of
nodes we haven't processed yet.  Initially, W is all nodes in the
network.

We also keep track of the current costs and routes to all of the
nodes.  Initially:
   routing_table[self] = Self; routing_table[anyone else] = ?
   cost_table[self] = 0; cost_table[anyone else] = infinity

This is the algorithm:

While W is not empty:
   1. Let u = the node in W we have the minimum cost to so far
   2. Remove u from W
   3. For every neighbor v of u:
         d = cost_table[u] + cost(u, v) # current cost to u plus
                                        # cost of link u -> v
         if d < cost_table[v] # we've found a shorter path to v via u
             cost_table[v] = d
             routing_table[v] = routing_table[u]

======================================================================
Comparison
======================================================================

Now that we know the two protocols, we'll compare them in a few ways

----------------------------------------------------------------------
Protocol behavior
----------------------------------------------------------------------

1. In distance-vector, nodes advertise to their neighbors about all of
   the nodes they know.  In link-state, nodes advertise to everyone
   (via flooding) about their neighbors.

   Aside: I like to make the less formal statement: "In
   distance-vector, nodes talk to their neighbors about everyone.  In
   link-state, nodes talk to everyone about their neighbors."

2. In distance-vector, node X's advertisement will change whenever it
   finds a better path to some node.  In link-state, node X's
   advertisement will change only if its link cost to one of its
   neighbors changes.  Link-state advertisements reflect the topology,
   not the routes.

----------------------------------------------------------------------
Correctness
----------------------------------------------------------------------

The correctness of both of these protocols rests on the following
statement:  Suppose the shortest path from X to Y goes through Z.
Then the sub-path from X to Z must be a shortest path.

That's a very intuitive statement; regardless, here's an informal
argument:

     We're told the shortest path from X to Y is p, which goes through
     Z:

          X ------ Z ------ Y      <-- path p

     We're asked to prove that the path from X to Z used above is a
     shortest path.  Suppose it's not, i.e., there exists another path
     from X to Z that is shorter.  Call that path q.

          X ------ Z ------ Y
          |        |    <-- path q
          ----------     

     Then the path q + Z --- Y, which goes from X to Y, is shorter
     than p.  This contradicts our assumption that p was the shortest
     path from X to Y.  Thus, X -- Z must be a shortest path.

With this statement, you can prove the correctness of distance-vector
via induction on the number of steps the algorithm takes (the number
of cycles in the for/while loops).  Formally, you are proving the
correctness of the Bellman-Ford algorithm; Distance-vector is just an
implementation of this algorithm.

  Aside: The proof easy to do this when all of the costs are greater
  than zero and a synchronous model is used; see the course notes.
  It's hard to do with a distributed, asynchronous model, but we won't
  use that in 6.02.

Proving the correctness of link-state boils down to proving the
correctness of Dijkstra's algorithm, which is pretty easy to do given
the statement I made above.

----------------------------------------------------------------------
Convergence Time
----------------------------------------------------------------------

A routing protocol converges once all nodes have the correct routing
table.  It's important that this happen as quickly as possible;
otherwise, some nodes will be using higher-cost routes than are
necessary (or even worse, will think that they *don't* have a route to
a node that is in fact connected).

For distance-vector, the convergence time is proportional to the
largest number of hops among all minimum-cost paths.  (A "hop" just
means going from one node to another.  X -- Y -- Z is a two-hop path.
X -- Y -- Z -- W is a three-hop path.)

For link-state, the convergence time is proportional to the amount of
time it takes to flood the link-state advertisements, plus the amount
of time it takes for all nodes to run Dijkstra's algorithm.

----------------------------------------------------------------------
Complexity of Dijkstra's Algorithm
----------------------------------------------------------------------

To that end, let's talk about the complexity of Dijkstra's algorithm.
Let N = the number of nodes in the network, and L = the number of
links in the network.

Dijkstra's Algorithm seems fast, but remember that there is a search
operation in each step to pick out u from V.  The complexity is:

	(The number of times the outer while-loop runs)*[(The complexity of finding u)
	+ (the number of times the inner for-loop runs)]

The outer loop runs N times, so:

    N * [(time to find u) + (number of times inner for-loop runs)]

I'm going to move this around:

    N * (time to find u) + N * (number of times inner for-loop runs)

The second term is going to be O(L): over the entire while-loop, we
will inspect 2L neighbors, because each link counts towards two
neighbors (if the link is X -- Y, then we see Y as a neighbor of X and
X as a neighbor of Y).  (It's kind of neat how we calculate the number
of neighbors as a function of L, not N.  So:

    N * (time to find u) + O(L)

The time to find u depends on what data structure we use.  If we just
use linear search, then:

    N * O(N) + O(L) = O(N^2) + O(L) = O(N^2 + L)

If we use a min-heap:

   N * O(log(N)) + O(L) = O(NlogN + L)

----------------------------------------------------------------------
Which protocol should I use?
----------------------------------------------------------------------

At this point, you may be wondering how we choose for a particular
network whether to use distance-vector or link-state?  They are both
good for certain scenarios, and we will talk more about this in the
next lecture when we discuss what happens when links fail.

======================================================================
Layering
======================================================================

At the end of the last lecture, I gave a simple layering mode:

[ Reliable Transport                ]
[ Packets                           ]
[ Bits                              ]
[ Physical layer: signal processing ]

(Note that we haven't covered reliable transport yet)

Where would routing fit in?  Around here:

[ Reliable Transport                ]
[ Routing                           ]
[ Packets                           ]
[ Bits                              ]
[ Physical layer: signal processing ]

We need for the network to be able to send data around before we worry
about reliable transport.  However, another valid model might be:

[ Reliable Transport                ]
[ Packets                           ]
[ Routing                           ]
[ Bits                              ]
[ Physical layer: signal processing ]

After all, we didn't really need packets to do routing, right?  We
could've done all of this with bits or frames.

Indeed, in standard layering models of the network, we say this:

[ Reliable Transport                ]
[ Packets/Addressing/Routing        ]
[ Bits                              ]
[ Physical layer: signal processing ]

This layer is called the "network layer"; its the glue that holds the
network together, and takes us from point-to-point connections to real
networks.