6.02 - Addressing, Forwarding, Routing without Failures Lecture #20 Katrina LaCurts, lacurts@mit.edu ====================================================================== Recap of networking so far ====================================================================== So far in 6.02 we have discussed sharing a network. This came about in two forms: 1. Sharing a channel, i.e., allowing multiple nodes to access the same communication medium at once. This is generally an issue in wireless networks. 2. Sharing a particular link within a network, i.e., multiplexing data (frames or packets) from multiple connections onto a single outgoing link. This is generally an issue that switches handle. In the context of the second point, we looked at two methods: circuit-switching and packet-switching. We settled on packet-switching as the better method for many cases because of the following: - Packet-switching can handle traffic that is bursty and/or arrives in variable rates - Because it uses packets, which have headers that indicate their destination, packet-switching can allow us to re-route around link failures (though we have not actually seen how to do that yet). To handle bursty-traffic, packet-switched networks use queues. The introduction of queues means that packets in these networks can be dropped, delayed, or re-ordered. These networks are "best effort" networks: they try hard, but make no guarantees that packets will be delivered in any particular amount of time. The problems that we have left to solve regarding packet-switched networks are the following: 1. How do nodes determine routes to destinations? 2. How can nodes discover if a link has failed, and route around that failure? 3. How can we reliably communicate in a best-effort network? The next three lectures will solve these problems; today we'll concentrate on the first. ====================================================================== The role of a switch in slightly more detail ====================================================================== The goal of routing is the following: To allow each switch to know, for every node in the network, a path to that node (if one exists). This means that when a switch gets a packet destined for node X, it will know what to do with said packet. ---------------------------------------------------------------------- Addressing ---------------------------------------------------------------------- We need to solve one problem first: how do we name a node? Does my machine send packets with a source address of "Katrina's Laptop" to a node with a destination address of, e.g., "George's Laptop"? No; there are a lot of problems with identifying nodes that way (for instance, what if I know two people named George who both have laptops?). This issue of naming nodes is known as addressing. We will not go into much detail in 6.02 about addressing, but you should know the following: - Addresses are globally unique. In 6.02, they'll be things like "A", "B", "C", etc. In the real world, they're more complicated, and might look like, e.g., 128.30.76.56. - We'll also give names to links. I'll often use the super-clever name "A -> B" to name the link from node A to node B, but we could use different names. For instance, we might call the link from A to B "L1". ---------------------------------------------------------------------- Link costs ---------------------------------------------------------------------- In last week's lecture I hid one part of networks from you. Each link in a network comes with an associated cost, like so: 4 A ---- B 2 | | 3 C ---- D 2 These costs represent how "expensive" it is to send a packet along that link. Notice that in the above example, A has two possible paths to D: 1. A -> B ; B -> D 2. A -> C ; C -> D A note about terminology: In networking, we often abuse terminology and use the terms "route" and "path" interchangeably. I am going to try to maintain a difference between the two: The route from A to D is the outgoing link that A uses to send packets to D (e.g., A -> C). The path from A to D is the actual path that those packets take (e.g., A -> C -> D). Aside: If you want to get technical, and allow paths to contain cycles, then there are infinitely-many possible paths from A to B. We're not going to allow cycles. Given these two options, A would rather use the one with the lower cost. To calculate the cost of a path, we simply sum the cost of the links along that path: 1. cost(A -> B ; B -> D) = cost(A -> B) + cost(B -> D) = 4 + 3 = 7 2. cost(A -> C ; C -> D) = cost(A -> C) + cost(C -> D) = 2 + 2 = 4 The second path is the lower-cost path, so A should use that, i.e., its route to D should be A -> C. (A is interested in finding the *minimum-cost path* to D.) You might be wondering what these link costs represent. They can represent a number of things: the delay on the link (higher delay = higher cost), the amount of other traffic using the link (more traffic = higher cost), the physical distance of the link (longer link = higher cost), a combination of these things, or other things. We won't develop cost metrics in this class, but the following three properties are important: 1. The costs cannot be negative. They can be zero, and they can also be non-integers, but they cannot be negative. 2. The costs do not have to obey the triangle inequality, i.e., we may have three nodes such that cost(A -> B) + cost (B -> C) <= cost(A -> C) 3. The cost of a particular link can change. This especially makes sense when the link costs are a function of, e.g., congestion on the network. ---------------------------------------------------------------------- Routing tables ---------------------------------------------------------------------- Now that we have a good sense of node addresses and link costs, we can think about routing in more detail. Consider this network, and node A in particular: 4 A ---- B 2 | | 3 C ---- D 2 We want a routing protocol such that, once it is finished, A knows that: - If A receives a packet destined for B, it sends it on link A -> B - If A receives a packet destined for C, it sends it on link A -> C - If A receives a packet destined for D, it sends it on link A -> C You can imagine storing this information in a table at node A, like such: routing_table[B] = A -> B ; 4 routing_table[C] = A -> C ; 2 routing_table[D] = A -> C ; 4 There might also be an entry for A itself: routing_table[A] = Self ; 0 And indeed, this process is what happens in a switch; we call this data structure the routing table. Here, I have included the cost in the routing table. It's also fine to keep two separate tables: a routing table and a cost table. Note that each node's routing table is not identical. C's, for instance, would be this: routing_table[A] = C -> A ; 2 routing_table[B] = C -> D ; 5 routing_table[C] = Self ; 0 routing_table[D] = C -> D ; 2 Now we can state the goal of a routing protocol clearly: For every node X, after the routing protocol is run, X's routing table should contain the *minimum-cost path* to every other reachable node. ---------------------------------------------------------------------- Forwarding ---------------------------------------------------------------------- We talked about forwarding last week -- the process wherein a switch examines a packet's header and decides what to do with it -- but now we can put the language of addressing and routing to that process: When a switch receives a packet: 1. Check the packet header for the destination address, dst_addr 2. Look up routing_table[dst_addr], which returns the appropriate outgoing link. 3. Add the packet to the queue for that outgoing link 4. Once the packet is at the head of the queue, send it In real networks, the forwarding step involves a few more steps. The four steps above are the core functions of forwarding, and they're all we need for right now. ====================================================================== Distributed Routing ====================================================================== There are two possible ways we could design a routing protocol: it could be centralized, or distributed. A centralized routing protocol means that a single node would determine all of the routes for all switches in the network, and then communicate those routes to the nodes somehow. A distributed routing protocol means that each node will run the routing protocol themselves, and determine their own routes. We will study distributed routing protocols; they scale better (remember, whenever we talk about networking, we have to think about scaling to the billions of nodes that make up the Internet). We are going to look at two different protocols: distance-vector and link-state. These protocols both follow the same general structure: 1. Nodes learn about their neighbors via the HELLO protocol. Each node determines its live neighbors, i.e., the nodes to which it is directly connected. This step is done with a HELLO protocol. Nodes send HELLO packets to each neighbor to let them know who is at the other end of the outgoing links. This gives us a mapping of neighbor addresses to links. E.g.: (B, A -> B), (C, A -> C) Or maybe: (B, L1), (C, L2) if we were naming differently. 2. Nodes learn about other reachable nodes via advertisements. Nodes send advertisements. A node's advertisement will contain information about some (possibly all) of the other nodes that it knows about in the network, as well as cost information associated with each of these other nodes. The exact format of the advertisements will depend on the routing protocol. As a node begins to receive advertisements from other nodes, it will begin to be able to determine connectivity and costs to reachable nodes. 3. Nodes determine minimum-cost routes. Nodes will compute their routing table using the information from the advertisements that they received (this step is commonly known as the "integrate" step). They will also deal with stale data. All of these steps happen periodically. In both distance-vector and link-state, step 1 (the HELLO protocol) is the same; we won't talk about the details, so from now on you can assume that nodes know their neighbors, and will be able to detect if a link to their neighbor goes down (this last bit will be important in the next lecture, when we discuss failures). The difference between distance-vector and link-state comes in the second and third steps. These protocols differ in - The type of advertisements they send - The nodes to whom they send advertisements. - The way nodes integrate advertisements. ====================================================================== Distance-vector routing protocol ====================================================================== 1. Advertisement format: Each node's advertisement is a lists of all the nodes it knows about, and its current costs to those nodes. Initially, this advertisement is just [(self, 0)]. 2. Nodes who receive an advertisement: Node X's advertisement will be received only by its neighbors. 3. Integrate step: When node X receives an advertisement from its neighbor Y,this advertisement will be a list of [(dst, cost)] pairs. Each cost represents Y's cost to dst. For each (dst, cost) in the advertisement, X needs to check for two things: - If X is already using Y to get to dst, update the cost information (remember, costs can change!) - If X is not already using Y to get to dst, see if Y could provide a better path; if so, update the routing and cost information The formal algorithm is: For each (dst, adv_cost) in the advertisement: - calculate X's cost to dst via Y: my_cost = (cost to Y) + (Y's cost to dst) = link_cost + adv_cost (remember we know link_cost because of the HELLO protocol) - check if our current route to dst is through Y - If yes, cost_table[dst] = my_cost - If no, check if my_cost < cost_table[dst] (i.e., if the cost to dst via Y is less than the current cost to dst). If so, route_table[dst] = X -> Y cost_table[dst] = my_cost We will discuss why this protocol is correct, and how fast it converges, after we discuss link-state routing protocol. ====================================================================== Link-state routing protocol ====================================================================== 1. Advertisement format: Each node's advertisement is a list of its neighbors and its link costs to those neighbors. 2. Nodes who receive an advertisement: Node X's will send its advertisement to its neighbors. However, when X receives an advertisement from another node, it will send *that* advertisement to *its* neighbors. In this way, every nodes advertisement is flooded through the network. Even though X only sends directly to its neighbors, its advertisement will reach everyone in the network. The technical details of this process: each advertisement contains a sequence number as well as the list of neighbors and costs. Nodes increase the sequence number every time they send out a new advertisement. When X receives an advertisement from Y, it compares the sequence number of that advertisement to the sequence number of the last advertisement it received from Y (which it has stored). If the number is greater, it floods the advertisement, and updates its stored sequence number. This process prevents nodes from forwarding around stale data, or forwarding around data that it has already sent. Once all advertisements are flooded, *each node has a complete map of the network.* 3. Integrate step: After flooding, X has a complete map of the network. It runs Dijkstra's shortest path algorithm over this map to determine the correct routes. ---------------------------------------------------------------------- Dijkstra's Shortest-path Algorithm ---------------------------------------------------------------------- In each step of Dijkstra's algorithm, we keep track of W, the set of nodes we haven't processed yet. Initially, W is all nodes in the network. We also keep track of the current costs and routes to all of the nodes. Initially: routing_table[self] = Self; routing_table[anyone else] = ? cost_table[self] = 0; cost_table[anyone else] = infinity This is the algorithm: While W is not empty: 1. Let u = the node in W we have the minimum cost to so far 2. Remove u from W 3. For every neighbor v of u: d = cost_table[u] + cost(u, v) # current cost to u plus # cost of link u -> v if d < cost_table[v] # we've found a shorter path to v via u cost_table[v] = d routing_table[v] = routing_table[u] ====================================================================== Comparison ====================================================================== Now that we know the two protocols, we'll compare them in a few ways ---------------------------------------------------------------------- Protocol behavior ---------------------------------------------------------------------- 1. In distance-vector, nodes advertise to their neighbors about all of the nodes they know. In link-state, nodes advertise to everyone (via flooding) about their neighbors. Aside: I like to make the less formal statement: "In distance-vector, nodes talk to their neighbors about everyone. In link-state, nodes talk to everyone about their neighbors." 2. In distance-vector, node X's advertisement will change whenever it finds a better path to some node. In link-state, node X's advertisement will change only if its link cost to one of its neighbors changes. Link-state advertisements reflect the topology, not the routes. ---------------------------------------------------------------------- Correctness ---------------------------------------------------------------------- The correctness of both of these protocols rests on the following statement: Suppose the shortest path from X to Y goes through Z. Then the sub-path from X to Z must be a shortest path. That's a very intuitive statement; regardless, here's an informal argument: We're told the shortest path from X to Y is p, which goes through Z: X ------ Z ------ Y <-- path p We're asked to prove that the path from X to Z used above is a shortest path. Suppose it's not, i.e., there exists another path from X to Z that is shorter. Call that path q. X ------ Z ------ Y | | <-- path q ---------- Then the path q + Z --- Y, which goes from X to Y, is shorter than p. This contradicts our assumption that p was the shortest path from X to Y. Thus, X -- Z must be a shortest path. With this statement, you can prove the correctness of distance-vector via induction on the number of steps the algorithm takes (the number of cycles in the for/while loops). Formally, you are proving the correctness of the Bellman-Ford algorithm; Distance-vector is just an implementation of this algorithm. Aside: The proof easy to do this when all of the costs are greater than zero and a synchronous model is used; see the course notes. It's hard to do with a distributed, asynchronous model, but we won't use that in 6.02. Proving the correctness of link-state boils down to proving the correctness of Dijkstra's algorithm, which is pretty easy to do given the statement I made above. ---------------------------------------------------------------------- Convergence Time ---------------------------------------------------------------------- A routing protocol converges once all nodes have the correct routing table. It's important that this happen as quickly as possible; otherwise, some nodes will be using higher-cost routes than are necessary (or even worse, will think that they *don't* have a route to a node that is in fact connected). For distance-vector, the convergence time is proportional to the largest number of hops among all minimum-cost paths. (A "hop" just means going from one node to another. X -- Y -- Z is a two-hop path. X -- Y -- Z -- W is a three-hop path.) For link-state, the convergence time is proportional to the amount of time it takes to flood the link-state advertisements, plus the amount of time it takes for all nodes to run Dijkstra's algorithm. ---------------------------------------------------------------------- Complexity of Dijkstra's Algorithm ---------------------------------------------------------------------- To that end, let's talk about the complexity of Dijkstra's algorithm. Let N = the number of nodes in the network, and L = the number of links in the network. Dijkstra's Algorithm seems fast, but remember that there is a search operation in each step to pick out u from V. The complexity is: (The number of times the outer while-loop runs)*[(The complexity of finding u) + (the number of times the inner for-loop runs)] The outer loop runs N times, so: N * [(time to find u) + (number of times inner for-loop runs)] I'm going to move this around: N * (time to find u) + N * (number of times inner for-loop runs) The second term is going to be O(L): over the entire while-loop, we will inspect 2L neighbors, because each link counts towards two neighbors (if the link is X -- Y, then we see Y as a neighbor of X and X as a neighbor of Y). (It's kind of neat how we calculate the number of neighbors as a function of L, not N. So: N * (time to find u) + O(L) The time to find u depends on what data structure we use. If we just use linear search, then: N * O(N) + O(L) = O(N^2) + O(L) = O(N^2 + L) If we use a min-heap: N * O(log(N)) + O(L) = O(NlogN + L) ---------------------------------------------------------------------- Which protocol should I use? ---------------------------------------------------------------------- At this point, you may be wondering how we choose for a particular network whether to use distance-vector or link-state? They are both good for certain scenarios, and we will talk more about this in the next lecture when we discuss what happens when links fail. ====================================================================== Layering ====================================================================== At the end of the last lecture, I gave a simple layering mode: [ Reliable Transport ] [ Packets ] [ Bits ] [ Physical layer: signal processing ] (Note that we haven't covered reliable transport yet) Where would routing fit in? Around here: [ Reliable Transport ] [ Routing ] [ Packets ] [ Bits ] [ Physical layer: signal processing ] We need for the network to be able to send data around before we worry about reliable transport. However, another valid model might be: [ Reliable Transport ] [ Packets ] [ Routing ] [ Bits ] [ Physical layer: signal processing ] After all, we didn't really need packets to do routing, right? We could've done all of this with bits or frames. Indeed, in standard layering models of the network, we say this: [ Reliable Transport ] [ Packets/Addressing/Routing ] [ Bits ] [ Physical layer: signal processing ] This layer is called the "network layer"; its the glue that holds the network together, and takes us from point-to-point connections to real networks.