Accessibility

6.033--Computer System Engineering

Suggestions for classroom discussion


Topic: Paul V. Mockapetris and Kevin J. Dunlap. Development of the Domain Name System. Proceedings of SIGCOMM 1988 Symposium: Communications architectures and protocols, pages 123-133. Also published as Computer Communications Review 18, 4 (August, 1988).

By J. H. Saltzer, March 11, 1998


  1. On page 123, the authors disparage the HOSTS.TXT system, arguing incommensurate scaling. What is the order of growth of the HOSTS.TXT system as the number of hosts grows?

    (If there are N hosts in the net, the central database grows with N, so the time required to transmit it to one other host also grows with N. But every host needs a copy, so it is necessary to transmit it N times. So the time that the central site will spend sending copies will grow with N squared. Unfortunately, that isn't the end of the story. The frequency of changes required in name-to-address mappings will probably be proportional to the number of such mappings, which means that the mean time till a change that requires a redistribution will get smaller as N grows. Thus we are contemplating an N-cubed order of growth of transmission time. This last effect is probably self-limiting, though--once you get to the point of distributing a new copy every day, probably no one cares whether the update has 1, 10, or 100 changes in it, so you don't need to distribute it 100 times a day.)
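    The back-of-the-envelope argument above can be made concrete with a toy cost model. All the constants here--bytes per entry, bandwidth, change rate--are invented for illustration; only the orders of growth matter.

    ```python
    # Hypothetical cost model for centrally distributing HOSTS.TXT.
    # The constants are made up; the point is the N-cubed growth.

    def transfer_time(n_hosts, bytes_per_entry=50, bandwidth=1e4):
        """Time to send one full copy of the table: grows with N."""
        return n_hosts * bytes_per_entry / bandwidth

    def daily_distribution_cost(n_hosts, changes_per_host_per_day=0.01):
        """Redistributions per day (~N) times hosts to send to (N) times
        time per copy (~N): an N-cubed total."""
        redistributions = n_hosts * changes_per_host_per_day
        return redistributions * n_hosts * transfer_time(n_hosts)

    for n in (100, 1000, 10000):
        print(n, daily_distribution_cost(n))
    ```

    Growing N by a factor of 10 grows the daily cost by a factor of 1000, which is the N-cubed behavior described above.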

  2. p. 124: "While the DNS has not replaced the HOSTS.TXT mechanism in many older hosts..." Do you believe this?

    (The paper was written in 1988, ten years ago, five years before the WWW appeared, and seven years before Newsweek began publishing URLs. At that time it was still possible to use HOSTS.TXT (page 127 says there were 5500 names in HOSTS.TXT). The centrally updated HOSTS.TXT fell by the wayside shortly after this paper was written.)

  3. p. 124. "dynamic update...atomicity/voting/backup" Exactly what did they avoid, and how?

    (DNS is basically a read-only database. Once in a while someone sneaks in the back side of the server and replaces its tables with a whole new set, but there isn't a network protocol saying "change the RR for host 3-208.MIT.EDU". As a result you don't need to worry about what happens if two people send such a request at the same time, or about getting the update to the other two servers before the first one crashes. Atomicity is achieved by stuffing a new table in at night when no one is watching.)
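    The "stuff a new table in at night" trick can be sketched as a whole-file replacement. Assuming the table lives in a single file (the JSON format and the function name are invented), an atomic rename gives every reader either the complete old table or the complete new one:

    ```python
    import json
    import os
    import tempfile

    def install_new_table(path, table):
        """Replace the server's name table in one atomic step: a reader
        opening the file sees either the old table or the new one in full,
        never a half-written mixture.  For a read-only database, this is
        the whole atomicity story."""
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(table, f)
        os.replace(tmp, path)  # atomic rename on POSIX filesystems
    ```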

  4. p. 125. Octet? (the ISO standard term for byte.)

  5. p. 125 "The rational[e] for this system is that it allows the sources of information to specify canonical case, but frees users from having to deal with case." Explain?

    (See the sidebar on case sensitivity.)

  6. p. 125: What is IN-ADDR.ARPA all about?

    (A cute hack. Suppose DNS maps the name PDOS.LCS.MIT.EDU to the IP address 18.31.0.14. In that case, there is also an entry made in another domain name table that maps the name 14.0.31.18.IN-ADDR.ARPA to the name PDOS.LCS.MIT.EDU.)

  7. Why are the components of the IP address backwards in that example?

    (To allow the IN-ADDR.ARPA domain to be divided up into zones that can be delegated, starting with the high-order byte of the IP address. If you don't understand this, you haven't figured out how zones and delegation really work.)
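    A sketch of the octet reversal (the order of the name components is the only point here):

    ```python
    def reverse_name(ip):
        """Build the IN-ADDR.ARPA name for a dotted-quad IPv4 address.
        The octets are reversed so that zone delegation can follow the
        high-order byte, just as domain delegation follows the
        rightmost label."""
        octets = ip.split(".")
        return ".".join(reversed(octets)) + ".IN-ADDR.ARPA"

    print(reverse_name("18.31.0.14"))  # -> 14.0.31.18.IN-ADDR.ARPA
    ```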

  8. p. 126 What is this stuff about maximum-size packets?

    (At that time there was a network-wide convention that packets smaller than 576 bytes (512 bytes of payload plus 64 bytes of header) would never be fragmented. DNS wanted to use datagrams, not connections. By designing DNS so that all requests and responses fit in 512 payload bytes, it guaranteed that no implementation would ever have to implement reassembly.)
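    The design consequence for an implementation can be sketched like this; the boolean stands in for the truncation bit a real DNS header carries, and the function name is invented:

    ```python
    MAX_UDP_PAYLOAD = 512  # DNS-over-UDP limit from the 576-byte convention

    def prepare_response(message):
        """Sketch: if a response fits in one never-fragmented datagram,
        send it whole; otherwise truncate it and set a flag telling the
        client to retry some other way.  Either way, no receiver ever
        needs to reassemble fragments."""
        if len(message) <= MAX_UDP_PAYLOAD:
            return message, False
        return message[:MAX_UDP_PAYLOAD], True
    ```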

  9. Zones are mysterious. What are they, really?

    (A zone is the unit of delegation that hasn't been sub-delegated. Every zone must have two or more identical, redundant name servers. MIT runs a zone. strawberry.mit.edu and ATHENA.DIALUP.MIT.EDU are in the MIT zone. *.LCS.MIT.EDU is not in the MIT zone; it is a zone of its own.)

  10. So what are "zone transfers"? What do they transfer between?

    (For reliability, every zone is required to have more than one name server. When you make a change to the database in one of the name servers, you have to also copy the new version of the database to the other name servers for your zone. This copying is given the fancy name "zone transfer".)

  11. p. 127. How can a server support multiple, non-contiguous zones?

    (At this point it might be a good idea to draw a diagram on the board that recaps the lecture describing how a name resolution actually works. The important point is that when the EDU name server says "names ending in MIT.EDU are resolved by the server over there" there is no requirement that "over there" be at MIT. It could be a name server in Dallas that handles the MIT zone. And the Dallas name server could also handle Stanford's zone. All that matters is that it knows how to give out correct answers for computers located at MIT.)
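    The board diagram might look like this toy referral table. The server names, the Dallas arrangement, and the final address are all invented; the point is that a referral names a server, not a location inside the zone:

    ```python
    # Invented delegation table: each server maps a name suffix either to
    # another server (a referral) or to a final answer.  The hypothetical
    # "dallas-server" serves both the MIT and Stanford zones without being
    # in either of them.
    SERVERS = {
        "root": {"EDU": "edu-server"},
        "edu-server": {"MIT.EDU": "dallas-server",
                       "STANFORD.EDU": "dallas-server"},
        "dallas-server": {"PDOS.LCS.MIT.EDU": "18.26.0.36"},  # made-up
    }

    def resolve(name, server="root"):
        """Follow referrals from the root until some server answers."""
        table = SERVERS[server]
        # Use the longest suffix this server knows about.
        for suffix, next_hop in sorted(table.items(),
                                       key=lambda kv: len(kv[0]),
                                       reverse=True):
            if name.endswith(suffix):
                if next_hop in SERVERS:   # a referral: ask the next server
                    return resolve(name, next_hop)
                return next_hop           # a final answer
        raise KeyError(name)
    ```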

  12. Jump to p. 130 for a moment. "5.4...For example, when servers pass back the name of a host, they include its address (if available)..." What is the relation to non-contiguous zones?

    (If the server for a zone is IN the zone, then the higher-level server will have, in addition to a record saying that that server resolves names for the zone, another record that gives the name-to-IP mapping for that server. If the server for a zone is NOT in the zone, then the name-to-IP mapping may not be there, and a whole new search may be required to find it. The statement that this feature cuts query traffic in half suggests that for most zones the name service is handled by a server located in that zone.)

  13. What is the relation of TTL to caching? Why is it of interest?

    (The idea is that cache entries have an expiry time, after which they are defined to be invalid. If there weren't an expiry time, then to change a binding between IP address and host name it would be necessary not only to change the DNS table in the authoritative name server, but also to track down every DNS cache that happens to contain the binding and invalidate it. Explicit invalidation works fine on a two-processor system, because you can easily find the other cache. But every internet host in the universe has a DNS cache and finding all those that contain cached copies of the RR you just changed doesn't look feasible.)
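    The expiry mechanism can be sketched in a few lines. This is a minimal cache for illustration, not what any real resolver does; the class and method names are invented:

    ```python
    import time

    class TTLCache:
        """DNS-style cache sketch: each entry carries an absolute expiry
        time and is treated as absent once that time has passed, so stale
        bindings die out everywhere without any invalidation messages."""
        def __init__(self):
            self._entries = {}

        def put(self, name, value, ttl, now=None):
            now = time.time() if now is None else now
            self._entries[name] = (value, now + ttl)

        def get(self, name, now=None):
            now = time.time() if now is None else now
            entry = self._entries.get(name)
            if entry is None:
                return None
            value, expires = entry
            if now >= expires:           # expired: the binding simply
                del self._entries[name]  # times out on its own
                return None
            return value
    ```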

  14. What "redundancy" are the authors referring to when they say that of the seven root servers, three are running JEEVES and four are running BIND?

    (The redundancy is that of "N-version programming". The two implementations were done by different people, in different languages, on different operating systems. With some luck, a bug that crashes all four of the BIND servers simultaneously won't also crash the three JEEVES servers. But the two implementations were done to the same specifications, and no one prevented the programmers from talking to each other, so they could still have a common bug or two.)

  15. What is the consequence for DNS servers, especially root servers, of the unexpected growth of traffic in the internet?

    (Two-fold. First, more traffic means more name requests. Second, more traffic means more congestion and longer queues, so name requests start to time out and get repeated, which adds to the load on the server and also to the congestion on the net.)

  16. What do the DNS servers do about duplicate requests?

    (DNS is a stateless protocol. That means the server remembers nothing from one request to the next. A duplicate request is handled in the same way as the original: look it up and send back the value in the table. The value should be the same as it was last time.)

  17. Many of the servers perform better as the load increases, due to fewer page faults. This is really mysterious. Explain.

    (To figure this out you have to understand a hidden assumption in this paper: a server is probably running several different services, of which DNS is only one. If DNS requests are rare, it is likely that other services will receive requests in between DNS requests, and if virtual memory is in use, the running of those other services will cause the system to page out the DNS program and its tables. But if DNS requests arrive thick and fast, the next request will get there before all of DNS has been paged out, so fewer page faults will be needed to handle it. If there are enough DNS requests, DNS may see no paging at all.

    Today, it is typical to dedicate a computer to name service, and to configure it with enough RAM that DNS and the local zone tables fit without paging. The MIT.EDU zone has long had at least three such dedicated name servers scattered in different parts of the campus.)

    Although not directly relevant to naming, this point is an opening to discuss independent failure and the other virtues of modular client/server systems. Early name servers were usually implemented as an extra service on some other server. What are the pros and cons?

    (Pro: no need to acquire, configure, operate, and maintain two or three more computers just to run DNS.

    Con: the name service shares fate with everything else on that machine--a crash, overload, or misconfiguration of any other service can take DNS down with it--and, as the paging discussion shows, the services compete with one another for memory and cycles.)

  18. p. 129. Explain negative caching, and why it helps.

    (If someone just asked for the misspelled name "amstredam.lcs.mit.edu" there is a good chance that they will ask for that same name again in short order. Might as well make a note of it in the cache, along with an entry saying "not found". Then, when the second request comes in, you can respond instantly without having to send packets all over the net to reestablish that the name is still misspelled. They noticed that repeated bad names are much more common than expected.)
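    Negative caching is a small addition to an ordinary lookup path. This sketch remembers "not found" results in a plain dictionary (TTLs omitted for brevity; the function names and the resolver's KeyError convention are invented):

    ```python
    cache = {}  # name -> ("hit", value) or ("miss", None)

    def lookup(name, resolve):
        """Sketch of negative caching: remember "not found" answers too, so
        a repeated misspelling is answered locally instead of by sending
        packets all over the net again."""
        if name in cache:
            kind, value = cache[name]
            return value if kind == "hit" else None
        try:
            value = resolve(name)
        except KeyError:
            cache[name] = ("miss", None)   # the negative entry
            return None
        cache[name] = ("hit", value)
        return value
    ```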

  19. p. 129. "DNS and NRS mutually encapsulate each other". What is encapsulation?

    (Suppose that some idiot has created a completely different name system called FOO with its own syntax and conventions--it puts host names in reverse order, such as UK.Cambridge.Computer-Lab.zeus. You can "encapsulate" that name system in DNS by creating a domain name for it, such as "FOO-NAMES.weird-name-systems.net". Then, someone can ask DNS to resolve the name "UK.Cambridge.Computer-Lab.zeus.FOO-NAMES.weird-name-systems.net". DNS will start working from the right, as usual, and finally refer the name to the resolver for the FOO-NAMES zone. That resolver, which knows the conventions for the rest of the name, can begin nibbling on it left to right, and do its own thing.

    This example is real. The British run their host names in reverse order. It may have something to do with driving on the wrong side of the road.)

  20. 130. What is the *security* problem created by caching?

    (This appears to be an overstatement. Caching doesn't create a security problem, but it can make an existing one a lot worse. If I intercept a request that you send to a DNS server and send you back bogus information, I can get you to believe that a different IP address is associated with the name you asked for. Once I have done that, you might unwittingly open a connection to the wrong place. Caching means that once you have the bogus information, you remember it, and if you are an intermediate name server, you may even pass it along to other requesters. So the cache may allow an intruder to create long-lived and widely propagating havoc, whereas without the cache the intruder would have to wreak havoc one request at a time.)

  21. p. 131. Why does transient failure affect upgrading to DNS?

    (For the same reason that RPC doesn't really make programming distributed applications easier. There used to be a call to "lookup-in-HOSTS.TXT". It always returned with either an answer or a not-found. I can't simply replace that call with a "send DNS inquiry"--I also need to revise my program to deal with the possibility that I get no answer at all, because the authoritative name server is on the other side of a network partition. So the upgrade isn't simple; it requires rewriting the program to respond gracefully to this new failure mode.)
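    The structural change to the calling program can be sketched as follows. Here send_query is a stand-in for whatever transport the program uses, assumed to return an answer, raise KeyError for a definitive "no such name", or raise TimeoutError when the server is unreachable:

    ```python
    def lookup_with_retry(query, send_query, retries=3):
        """Sketch of the three-outcome structure a DNS client needs.
        HOSTS.TXT code only ever saw the first two outcomes."""
        for _ in range(retries):
            try:
                return ("answer", send_query(query))
            except KeyError:
                return ("not-found", None)   # definitive, like HOSTS.TXT
            except TimeoutError:
                continue                     # transient: retry, then give up
        return ("no-answer", None)           # the genuinely new case
    ```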

  22. Conclusions, p. 132 has several system-level observations that should not be missed: "maintainers fix things till they work, rather than until they work well". Why should they have required that delegatees demonstrate working name servers, rather than just telling them the names of the name servers?

    (Because DNS lives in the real world. A promise to bring up two name servers is likely to be honored by bringing up one, and then putting implementation of the second one on a list of things we really ought to do someday. Result: lowered reliability for everyone else, and a bad reputation for DNS.)


    Comments and suggestions: Saltzer@mit.edu