Reading 24: Networking
Software in 6.031
Objectives
In this reading we examine client/server communication over the network using the socket abstraction.
Network communication is inherently concurrent, so building clients and servers will require us to reason about their concurrent behavior and to implement them with thread safety. We must also design the wire protocol that clients and servers use to communicate, just as we design the operations that clients of an ADT use to work with it.
We will primarily be using Java in this reading, but at the end we will look at a TypeScript equivalent to these concepts – a higher-level abstraction called web sockets.
Client/server design pattern
In this reading (and in the problem set) we explore the client/server design pattern for communication with message passing.
In this pattern there are two kinds of processes: clients and servers. A client initiates the communication by connecting to a server. The client sends requests to the server, and the server sends replies back. Finally, the client disconnects. A server might handle connections from many clients concurrently, and clients might also connect to multiple servers.
Many Internet applications work this way: web browsers are clients for web servers, an email program like Outlook is a client for a mail server, etc.
On the Internet, client and server processes are often running on different machines, connected only by the network, but it doesn’t have to be that way — the server can be a process running on the same machine as the client.
Addresses and ports
We begin with some important concepts related to network communication.
IP addresses
A network interface is identified by an IP address. IP version 4 addresses are 32-bit numbers written in four 8-bit parts. For example (as of this writing):
- 104.47.42.36 is the address of a Microsoft Outlook email handler.
- 127.0.0.1 is the loopback or localhost address: it always refers to the local machine. Technically, any address whose first octet is 127 is a loopback address, but 127.0.0.1 is standard.
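To make the idea of a 32-bit number written in four 8-bit parts concrete, here is a minimal TypeScript sketch that converts the dotted notation into the underlying number and checks for a loopback address. The names ipv4ToNumber and isLoopback are invented for this illustration and are not part of any library.

```typescript
/**
 * Convert a dotted IPv4 address such as "127.0.0.1" into its 32-bit value.
 * (Hypothetical helper, for illustration only.)
 */
function ipv4ToNumber(address: string): number {
    const parts = address.split('.');
    if (parts.length !== 4) {
        throw new Error(`expected four parts in ${address}`);
    }
    let value = 0;
    for (const part of parts) {
        // each part is one octet: a decimal number from 0 to 255
        if (!/^[0-9]{1,3}$/.test(part) || Number(part) > 255) {
            throw new Error(`invalid octet ${part} in ${address}`);
        }
        // move the octets seen so far up by 8 bits, then add this one
        // (multiplication avoids the signed 32-bit result of the << operator)
        value = value * 256 + Number(part);
    }
    return value;
}

/** True if the first octet is 127, which makes the address a loopback address. */
function isLoopback(address: string): boolean {
    return Math.floor(ipv4ToNumber(address) / 2 ** 24) === 127;
}

console.log(ipv4ToNumber('127.0.0.1'));  // 2130706433
console.log(isLoopback('127.0.0.1'));    // true
console.log(isLoopback('104.47.42.36')); // false
```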
You can ask Google for your current IP address. In general, as you carry around your laptop, every time you connect your machine to the network it can be assigned a new IP address.
Hostnames
Hostnames are names that can be translated into IP addresses. A single hostname can map to different IP addresses at different times; and multiple hostnames can map to the same IP address. For example:
- web.mit.edu is the name for MIT’s web server. You can translate this name to an IP address yourself using dig, host, or nslookup on the command line, e.g.:
  $ dig +short web.mit.edu
  18.9.22.69
- google.com is exactly what you think it is. Try using one of the commands above to find google.com’s IP address. What do you see?
- mit-edu.mail.protection.outlook.com is the name for MIT’s incoming email handler, a spam filtering system hosted by Microsoft.
- localhost is a name for 127.0.0.1. When you want to talk to a server running on your own machine, talk to localhost.
Translation from hostnames to IP addresses is the job of the Domain Name System (DNS). It’s super cool, but not part of our discussion today.
Port numbers
A single machine might have multiple server applications that clients wish to connect to, so we need a way to direct traffic on the same network interface to different processes.
Network interfaces have multiple ports identified by a 16-bit number. Port 0 is reserved, so port numbers effectively run from 1 to 65535.
A server process binds to a particular port — it is now listening on that port. A port can have only one listener at a time, so if some other server process tries to listen to the same port, it will fail.
Clients have to know which port number the server is listening on. There are some well-known ports that are reserved for system-level processes and provide standard ports for certain services. For example:
- Port 22 is the standard SSH port. When you connect to athena.dialup.mit.edu using SSH, the software automatically uses port 22.
- Port 25 is the standard email server port.
- Port 80 is the standard web server port. When you connect to the URL http://web.mit.edu in your web browser, it connects to 18.9.22.69 on port 80.
When the port is not a standard port, it is specified as part of the address.
For example, the URL http://128.2.39.10:9000 refers to port 9000 on the machine at 128.2.39.10.
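As a small sketch of how a client might work out which port a URL refers to, the code below uses the standard URL class to read an explicit port and falls back to a well-known port otherwise. The helper name effectivePort and the table DEFAULT_PORTS are our own, and the table covers only two schemes (443 for https is a well-known port not discussed above).

```typescript
// Well-known default ports for the schemes this sketch understands (an assumption:
// a real client would know about more schemes than these two).
const DEFAULT_PORTS: Record<string, number> = {
    'http:': 80,
    'https:': 443,
};

/** Return the port a client would connect to for the given URL. (Hypothetical helper.) */
function effectivePort(urlText: string): number {
    const parsed = new URL(urlText);
    if (parsed.port !== '') {
        // the port was written explicitly, as in http://128.2.39.10:9000
        return Number(parsed.port);
    }
    const fallback = DEFAULT_PORTS[parsed.protocol];
    if (fallback === undefined) {
        throw new Error(`no default port known for ${parsed.protocol}`);
    }
    return fallback;
}

console.log(effectivePort('http://128.2.39.10:9000')); // 9000
console.log(effectivePort('http://web.mit.edu'));      // 80
```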
* see What if Dr. Seuss Did Technical Writing?, although the issue described in the first stanza is no longer relevant with the obsolescence of floppy disk drives
Wire protocols
When a client and server make a network connection, what do they pass back and forth over that connection? Unlike the in-memory objects sent and received using synchronized queues in Message-Passing, these low-level connections send and receive streams of bytes. Instead of choosing or designing an abstract data type for our messages, we will choose or design a protocol.
A protocol is a set of messages that can be exchanged by two communicating parties.
A wire protocol in particular is a set of messages represented as byte sequences, like hello world and bye (assuming we’ve agreed on a way to encode those characters into bytes).
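One common agreement is UTF-8. As a sketch, the standard TextEncoder and TextDecoder classes turn the messages above into bytes and back. The helper names toBytes and fromBytes are ours, chosen for illustration.

```typescript
/** Encode a message as UTF-8 bytes, ready to be written to a connection. */
function toBytes(message: string): Uint8Array {
    return new TextEncoder().encode(message); // TextEncoder always produces UTF-8
}

/** Decode UTF-8 bytes read from a connection back into a string. */
function fromBytes(bytes: Uint8Array): string {
    return new TextDecoder('utf-8').decode(bytes);
}

const byeBytes = toBytes('bye');
console.log(Array.from(byeBytes).join(' '));      // 98 121 101
console.log(fromBytes(byeBytes));                 // bye
console.log(toBytes('hello world').length);       // 11
```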
Many Internet applications use simple ASCII-based wire protocols. You can use a program called Telnet to check them out.
Telnet client
telnet is a utility that allows you to make a direct network connection to a listening server and communicate with it via a terminal interface.
Windows, Linux, and Mac OS X can all run telnet, although more recent operating systems no longer have it installed by default.
You should first check if telnet is installed by running the command telnet on the command line.
If you don’t have it, then look for instructions about how to install it (Linux, Windows, Mac OS).
On Windows, an alternative telnet client is PuTTY, which has a graphical user interface.
HTTP
Hypertext Transfer Protocol (HTTP) is the language of the World Wide Web. We already know that port 80 is the well-known port for speaking HTTP to web servers, so let’s talk to one on the command line.
Try using your telnet client with the commands below. User input is shown in green, and for input to the telnet connection, newlines (pressing enter) are shown with ↵. (If you are using PuTTY on Windows, you will enter the hostname and port in PuTTY's connection dialog, and you should also select Connection type: Raw, and Close window on exit: Never. The last option will prevent the window from disappearing as soon as the server closes its end of the connection.)
$ telnet www.eecs.mit.edu 80
Trying 18.25.4.17...
Connected to www.eecs.mit.edu.
Escape character is '^]'.
GET /↵
<!DOCTYPE html>
... lots of output ...
<title>Homepage | MIT EECS</title>
... lots more output ...
The GET command gets a web page.
The / is the path of the page you want on the site.
So this command fetches the page at http://www.eecs.mit.edu:80/.
Since 80 is the default port for HTTP, this is equivalent to visiting http://www.eecs.mit.edu/ in your web browser.
The result is HTML code that your browser renders to display the EECS homepage.
Internet protocols are defined by RFC specifications (RFC stands for “request for comments”, and some RFCs are eventually adopted as standards). RFC 1945 defined HTTP version 1.0, and was superseded by HTTP 1.1 in RFC 2616. So for many web sites, you might need to speak HTTP 1.1 if you want to talk to them. For example:
$ telnet www.eecs.mit.edu 80
Trying 18.25.4.17...
Connected to www.eecs.mit.edu.
Escape character is '^]'.
GET / HTTP/1.1↵
Host: www.eecs.mit.edu↵
↵
HTTP/1.1 200 OK
Date: Mon, 25 Jan 2021 17:36:21 GMT
... more headers ...
a105
<!DOCTYPE html>
... more HTML ...
<title>Homepage | MIT EECS</title>
... lots more HTML ...
</html>
0
This time, your request must end with a blank line. HTTP version 1.1 requires the client to specify some extra information (called headers) with the request, and the blank line signals the end of the headers.
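A program speaking HTTP 1.1 has to produce exactly these bytes, including the blank line. Here is a minimal sketch that builds the request text typed above. The function name buildGetRequest is hypothetical, and the sketch sends only the Host header.

```typescript
const CRLF = '\r\n'; // HTTP ends each line with carriage return + line feed

/** Build the text of a minimal HTTP 1.1 GET request. (Hypothetical helper.) */
function buildGetRequest(host: string, path: string): string {
    return `GET ${path} HTTP/1.1${CRLF}`  // request line
         + `Host: ${host}${CRLF}`         // the one header we send
         + CRLF;                          // blank line: end of the headers
}

// JSON.stringify makes the otherwise invisible line endings visible
console.log(JSON.stringify(buildGetRequest('www.eecs.mit.edu', '/')));
// "GET / HTTP/1.1\r\nHost: www.eecs.mit.edu\r\n\r\n"
```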
You will also more than likely find that telnet does not exit after making this request — this time, the server keeps the connection open so you can make another request right away.
To quit Telnet manually, type the escape character (probably Ctrl-]) to bring up the telnet> prompt, and type quit:
... lots more HTML ...
</html>
0
Ctrl-]
telnet> quit↵
Connection closed.
SMTP
Simple Mail Transfer Protocol (SMTP) is the protocol for sending email (different protocols are used for client programs that retrieve email from your inbox).
Because the email system was designed in a time before spam, modern email communication is fraught with traps and heuristics designed to prevent abuse.
But we can still try to speak SMTP.
Recall that the well-known SMTP port is 25, and MIT’s incoming email handler is mit-edu.mail.protection.outlook.com.
You’ll need to fill in athena-IP-address-here and your-username-here; the ↵ symbols mark newlines for clarity. This will only work if you’re on MITnet, and even then your mail might be rejected for looking suspicious. So this example starts by logging into Athena so that you are on MITnet:
$ ssh your-username-here@athena.dialup.mit.edu
... authenticate yourself ...
$ curl -w '\n' whatismyip.akamai.com
athena-IP-address-here
$ telnet mit-edu.mail.protection.outlook.com 25
Trying 104.47.40.36...
Connected to mit-edu.mail.protection.outlook.com.
Escape character is '^]'.
220 ABC123000.mail.protection.outlook.com Microsoft ESMTP MAIL Service
HELO athena-IP-address-here↵
250 ABC123000.mail.protection.outlook.com Hello [your-ip-address]
MAIL FROM: <your-username-here@mit.edu>↵
250 2.1.0 Sender OK
RCPT TO: <your-username-here@mit.edu>↵
250 2.1.5 Recipient OK
DATA↵
354 Start mail input; end with <CRLF>.<CRLF>
From: <your-username-here@mit.edu>↵
To: <your-username-here@mit.edu>↵
Subject: testing↵
↵
This is a hand-crafted artisanal email.↵
.↵
250 2.6.0 <111111-22-33-44-55555555@ABC.eop-123.prod.protection.outlook.com>
QUIT↵
221 2.0.0 Service closing transmission channel
Connection closed by foreign host.
SMTP is quite chatty in comparison to HTTP, even including human-readable instructions to tell the client how to submit their message.
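Every server reply in the exchange above starts with a three-digit code, which is the part a client program acts on; the rest is there for humans. As a rough sketch, a client could split a reply line like this. The names SmtpReply and parseSmtpReply are hypothetical, and the handling of a hyphen after the code (which SMTP uses to mark a reply that continues on the next line) does not appear in the exchange above.

```typescript
/** One line of an SMTP server reply. (Hypothetical type for illustration.) */
interface SmtpReply {
    code: number;     // three-digit status code, e.g. 250
    isLast: boolean;  // false if a hyphen says more lines of this reply follow
    text: string;     // human-readable remainder of the line
}

/** Split one reply line (without its line ending) into code and text. */
function parseSmtpReply(line: string): SmtpReply {
    const match = /^([2-5][0-9]{2})([ -])(.*)$/.exec(line);
    if (match === null) {
        throw new Error(`not an SMTP reply line: ${line}`);
    }
    return {
        code: Number(match[1]),
        isLast: match[2] === ' ',
        text: match[3],
    };
}

console.log(parseSmtpReply('250 2.1.0 Sender OK').code);  // 250
console.log(parseSmtpReply('250 2.1.0 Sender OK').text);  // 2.1.0 Sender OK
```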
Designing a wire protocol
When designing a wire protocol, apply the same rules of thumb you use for designing the operations of an abstract data type:
Keep the number of different messages small. It’s better to have a few commands and responses that can be combined rather than many complex messages.
Each message should have a well-defined purpose and coherent behavior.
The set of messages must be adequate for clients to make the requests they need to make and for servers to deliver the results.
Just as we demand representation independence from our types, we should aim for platform-independence in our protocols. HTTP can be spoken by any web server and any web browser on any operating system. The protocol doesn’t say anything about how web pages are stored on disk, how they are prepared or generated by the server, what algorithms the client will use to render them, etc.
We can also apply the three big ideas in this class:
Safe from bugs:
- The protocol should be easy for clients and servers to generate and parse. Simpler code for reading and writing the protocol (e.g. a parser generated automatically from a grammar, or simple regular expressions with a regular-expression-matching library) will have fewer opportunities for bugs.
- Consider the ways a broken or malicious client or server could stuff garbage data into the protocol to break the process on the other end. Email spam is one example: when we spoke SMTP above, the mail server asked us to say who was sending the email, and there’s nothing in SMTP to prevent us from lying outright. We’ve had to build systems on top of SMTP to try to stop spammers who lie about From: addresses. Security vulnerabilities are a more serious example. For example, protocols that allow a client to send requests with arbitrary amounts of data require careful handling on the server to avoid running out of buffer space, or worse.
Easy to understand: for example, choosing a text-based protocol means that we can debug communication errors by reading the text of the client/server exchange. It even allows us to speak the protocol “by hand” as we saw above.
Ready for change: for example, HTTP includes the ability to specify a version number, so clients and servers can agree with one another which version of the protocol they will use. If we need to make changes to the protocol in the future, older clients or servers can continue to work by announcing the version they will use.
Serialization is the process of transforming data structures in memory into a format that can be easily stored or transmitted. Rather than invent a new format for serializing your data between clients and servers, use an existing one. For example, JSON (JavaScript Object Notation) is a simple, widely-used format for serializing basic values, arrays, and maps with string keys.
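As a brief sketch of JSON serialization, the code below turns an in-memory object into a string that could be sent over a connection, then reconstructs it on the other side. The PlayerRecord type and its fields are invented for this example. Note that JSON.parse returns unchecked data, so the receiving side validates it before trusting the type.

```typescript
/** A made-up record type, used only for this illustration. */
interface PlayerRecord {
    user: string;
    scores: number[];
    active: boolean;
}

/** Serialize a record into JSON text suitable for sending. */
function serialize(record: PlayerRecord): string {
    return JSON.stringify(record);
}

/** Parse JSON text received from the other side, checking its shape. */
function deserialize(text: string): PlayerRecord {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed !== 'object' || parsed === null) {
        throw new Error('expected a JSON object');
    }
    const candidate = parsed as { user?: unknown; scores?: unknown; active?: unknown };
    const scores = candidate.scores;
    if (typeof candidate.user !== 'string'
            || typeof candidate.active !== 'boolean'
            || !Array.isArray(scores)
            || !scores.every((score: unknown) => typeof score === 'number')) {
        throw new Error('JSON object is not a PlayerRecord');
    }
    return { user: candidate.user, scores: scores as number[], active: candidate.active };
}

const wireText = serialize({ user: 'alyssa', scores: [90, 85], active: true });
console.log(wireText); // {"user":"alyssa","scores":[90,85],"active":true}
console.log(deserialize(wireText).scores.length); // 2
```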
Specifying a wire protocol
In order to precisely define for clients & servers what messages are allowed by a protocol, use a grammar.
For example, here is a very small part of the HTTP 1.1 request grammar from RFC 2616 section 5:
request ::= request-line
((general-header | request-header | entity-header) CRLF)*
CRLF
message-body?
request-line ::= method SPACE request-uri SPACE http-version CRLF
method ::= "OPTIONS" | "GET" | "HEAD" | "POST" | ...
...
Using the grammar, we can see that in this example request from earlier:
GET / HTTP/1.1
Host: www.eecs.mit.edu
- GET is the method: we’re asking the server to get a page for us.
- / is the request-uri: the description of what we want to get.
- HTTP/1.1 is the http-version.
- Host: www.eecs.mit.edu is some kind of header. We would have to examine the rules for each of the ...-header options to discover which one.
- And we can see why we had to end the request with a blank line: since a single request can have multiple headers that end in CRLF (newline), we have another CRLF at the end to finish the request.
- We don’t have any message-body, and since the server didn’t wait to see if we would send one, presumably that only applies for other kinds of requests.
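A grammar rule like request-line translates fairly directly into a regular expression. The sketch below parses just that one rule. The names RequestLine and parseRequestLine are ours, and the patterns for method and request-uri are simplifications of the RFC’s definitions (uppercase letters, and any run of non-space characters, respectively).

```typescript
/** The three parts of a request-line, named after the grammar's nonterminals. */
interface RequestLine {
    method: string;
    requestUri: string;
    httpVersion: string;
}

// request-line ::= method SPACE request-uri SPACE http-version CRLF
// (method and request-uri are simplified here; see the lead-in)
const REQUEST_LINE = /^([A-Z]+) (\S+) (HTTP\/[0-9]+\.[0-9]+)\r\n$/;

/** Parse a request-line, including its terminating CRLF. (Hypothetical helper.) */
function parseRequestLine(line: string): RequestLine {
    const match = REQUEST_LINE.exec(line);
    if (match === null) {
        throw new Error('not a valid request-line');
    }
    return { method: match[1], requestUri: match[2], httpVersion: match[3] };
}

const firstLine = parseRequestLine('GET / HTTP/1.1\r\n');
console.log(firstLine.method);      // GET
console.log(firstLine.requestUri);  // /
console.log(firstLine.httpVersion); // HTTP/1.1
```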
The grammar is not enough: it fills a similar role to method signatures when defining an ADT. We still need the specifications:
What are the preconditions of a message? For example, if a particular field in a message is a string of digits, is any number valid? Or must it be the ID number of a record known to the server?
Under what circumstances can a message be sent? Are certain messages only valid when sent in a certain sequence?
What are the postconditions? What action will the server take based on a message? What server-side data will be mutated? What reply will the server send back to the client?
reading exercises
Consider this example wire protocol, specified using two grammars…
Messages from the client to the server
The client can turn lights, identified by numerical IDs, on and off. The client can also request help.
MESSAGE ::= ( ON | OFF | HELP_REQ ) NEWLINE
ON ::= "on " ID
OFF ::= "off " ID
HELP_REQ ::= "help"
NEWLINE ::= "\r"? "\n"
ID ::= [1-9][0-9]*
Messages from the server to the client
The server can report the status of the lights and provides arbitrary help messages.
MESSAGE ::= ( STATUS | HELP ) NEWLINE
STATUS ::= ONE_STATUS ( NEWLINE "and " ONE_STATUS )*
ONE_STATUS ::= ID " is " ( "on" | "off" )
HELP ::= [^\r\n]+
NEWLINE ::= "\r"? "\n"
ID ::= [1-9][0-9]*
We’ll use ↵ to represent a newline.
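To see how grammars like these turn into code, here is a sketch of a server-side parser for the client-to-server messages and a formatter for STATUS replies. The names ClientMessage, parseClientMessage, and formatStatus are hypothetical, and a real server would still need to check the specification-level preconditions, such as whether an ID names a light it knows about.

```typescript
/** A parsed client message. (Hypothetical type for illustration.) */
type ClientMessage =
    | { kind: 'on' | 'off'; id: number }
    | { kind: 'help' };

// MESSAGE ::= ( ON | OFF | HELP_REQ ) NEWLINE, with ID ::= [1-9][0-9]*
const CLIENT_MESSAGE = /^(?:(on|off) ([1-9][0-9]*)|help)\r?\n$/;

/** Parse one complete client message, including its NEWLINE. */
function parseClientMessage(text: string): ClientMessage {
    const match = CLIENT_MESSAGE.exec(text);
    if (match === null) {
        throw new Error('message does not match the grammar');
    }
    const command = match[1];
    if (command === undefined) {
        return { kind: 'help' }; // the HELP_REQ alternative has no groups
    }
    return { kind: command === 'on' ? 'on' : 'off', id: Number(match[2]) };
}

/** Format a STATUS message for one or more lights, including the final NEWLINE. */
function formatStatus(lights: ReadonlyArray<{ id: number; on: boolean }>): string {
    if (lights.length === 0) {
        throw new Error('STATUS requires at least one light');
    }
    return lights
        .map((light) => `${light.id} is ${light.on ? 'on' : 'off'}`)
        .join('\nand ') + '\n';
}

console.log(JSON.stringify(parseClientMessage('on 12\n')));
// {"kind":"on","id":12}
console.log(JSON.stringify(formatStatus([{ id: 1, on: true }, { id: 2, on: false }])));
// "1 is on\nand 2 is off\n"
```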
Web server in TypeScript
Let’s look at the nuts and bolts of writing a simple web server in TypeScript.
For the sake of introduction, we’ll look at a simple EchoServer that just echoes what the client sends it.
You can see the full code for EchoServer.
Some of the code snippets in this reading are simplified for presentation purposes.
Route handling
A web server typically divides up the website it serves into sections, called routes.
For example, the server for web.mit.edu might have routes for:
- /education (accessible by the URL http://web.mit.edu/education)
- /research
- /campus-life
The control flow of a web server is an input loop waiting for incoming connections from web browsers.
When a new connection arrives, the server reads and parses the request (using the HTTP wire protocol).
The request is then routed to the handler registered for the route matching the request.
Typically only a prefix of the request has to match the route, so the request http://web.mit.edu/community/topic/arts.html will be routed to the handler for /community unless there is a more specific (longer prefix) route registered.
The handler is then responsible for constructing the response to the request.
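The longest-prefix rule can be sketched in a few lines. This is an illustration of the general idea, not how any particular router is implemented. The function name matchRoute is ours, and the sketch assumes that a prefix must end at a / boundary, so that /community does not match /communityx.

```typescript
/**
 * Return the registered route that is the longest prefix of the request path,
 * or undefined if no route matches. (Hypothetical helper for illustration.)
 */
function matchRoute(routes: readonly string[], path: string): string | undefined {
    let best: string | undefined = undefined;
    for (const route of routes) {
        // a route matches the path itself, or anything beneath it
        const prefix = route.endsWith('/') ? route : route + '/';
        const matches = path === route || path.startsWith(prefix);
        if (matches && (best === undefined || route.length > best.length)) {
            best = route;
        }
    }
    return best;
}

console.log(matchRoute(['/community', '/education'], '/community/topic/arts.html'));
// /community
console.log(matchRoute(['/community', '/community/topic'], '/community/topic/arts.html'));
// /community/topic
```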
The most popular router for Node is Express.
You can create a new Express Application object like this:
import express from 'express';
const app = express();
Note that express is a factory function that returns Application objects.
Sometimes complex servers have more than one Application, but most need only one.
You can make the application start listening for HTTP connections:
const PORT = 8000; // port on which the server will listen for incoming connections
app.listen(PORT);
and then add routes to the application using get():
app.get('/echo', function(request, response) {
...
});
Here, get refers to the GET method of the HTTP protocol, as described above.
When you type a URL into a web browser’s address bar, you are telling the browser to issue a GET request.
The first argument is the route path, and the second argument is a callback function.
The app object will call the callback function whenever an incoming GET request has the path /echo (the query part of the URL is not part of the path).
The arguments to the callback are a Request object that provides observer methods giving information about the request, and a Response object with mutator methods for generating a response that goes back to the web browser.
The body of the callback function (shown as ... above) should examine the request and generate a response.
As an example of how it might examine the request, if the incoming request was http://localhost:8000/echo?greeting=hello, then:
const greeting = request.query.greeting;
returns "hello", which came from the query part of the URL (the part starting with a question mark, ?greeting=hello).
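Express does this parsing for us, but it can help to see where the value comes from. The sketch below uses the standard URL class, not Express, to pull the same pieces out of the example URL. The helper names pathOf and queryValue are ours.

```typescript
/** The path part of a URL, which is what a route is matched against. */
function pathOf(requestUrl: string): string {
    return new URL(requestUrl).pathname;
}

/** The value of one query parameter, or null if the URL does not have it. */
function queryValue(requestUrl: string, key: string): string | null {
    return new URL(requestUrl).searchParams.get(key);
}

const exampleUrl = 'http://localhost:8000/echo?greeting=hello';
console.log(pathOf(exampleUrl));                  // /echo
console.log(queryValue(exampleUrl, 'greeting'));  // hello
console.log(queryValue(exampleUrl, 'farewell'));  // null
```

The null case is worth noticing: a request to /echo with no greeting parameter is still a request the handler has to deal with.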
As an example of generating a response, the handler should first provide a status code (typically 200 when the request is successful), and the type of response it is returning (HTML, plain text, JSON, or something else):
response.status(200);
response.type('text');
And then write the response using the send method:
response.send(greeting + ' to you too!');
The response is now complete, and the web browser should display the text passed to send.
These mutator operations (status, type, and send) are declared to return the Response object itself rather than just void, so it is common to see them chained together like this:
response.status(200).type('text').send(greeting + ' to you too!');
response
.status(200)
.type('text')
.send(greeting + ' to you too!');
This kind of function-calling syntax – where the return value of a method is immediately used to call another method – is called method call chaining.
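Chaining works only because each mutator returns the object it was called on. Here is a toy class that shows the pattern. SketchResponse is a stand-in we made up for illustration, not Express’s Response type, and its describe method exists only so we can observe the result.

```typescript
/** A toy mutable response that supports method call chaining. */
class SketchResponse {
    private statusCode = 200;
    private contentType = 'html';
    private body = '';

    /** Set the status code, and return this object so calls can be chained. */
    public status(code: number): this {
        this.statusCode = code;
        return this;
    }

    /** Set the content type, and return this object so calls can be chained. */
    public type(contentType: string): this {
        this.contentType = contentType;
        return this;
    }

    /** Set the body, and return this object so calls can be chained. */
    public send(body: string): this {
        this.body = body;
        return this;
    }

    /** Observer used only to inspect the result in this sketch. */
    public describe(): string {
        return `${this.statusCode} ${this.contentType}: ${this.body}`;
    }
}

const sketch = new SketchResponse().status(200).type('text').send('hello to you too!');
console.log(sketch.describe()); // 200 text: hello to you too!
```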
Here is the full code for the /echo route:
app.get('/echo', function(request, response) {
    const greeting = request.query.greeting;
    response
        .status(200)
        .type('text')
        .send(greeting + ' to you too!');
});
Once again, this lambda function is an example of a callback: a piece of code that we as clients are handing to the Application object, for it to call whenever an event occurs, in this case the arrival of a request matching the route.
Clients
Since the Specifications reading, we’ve used the term client to refer to code that uses a class. In this reading, we started using client to mean one party in a network client/server communication. With Express, we need to keep track of both specific meanings of client:
- A client of the web server is a web browser, sending it requests over the network. For example, a client of our Echo web server sends an HTTP request to GET /echo?greeting=hello.
- A client of the Express Application object is code we’ve written against the API that it provides. For example, a client of Application calls get() and passes in a callback function that is ready to handle requests for the /echo route.
Error handling
Although the full topic of error handling in Express is outside the scope of this course, fortunately the simplest kind of error handling is easy: throwing an exception in the callback function will send an error back to the browser showing the stack trace. For example:
app.get('/bad', function(request, response) {
throw new Error('always fail');
});
displays something like this when a web browser visits http://localhost:8000/bad:
Error: always fail
at ./server.ts:30:11
at Layer.handle [as handle_request] (./node_modules/express/lib/router/layer.js:95:5)
at next (./node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (./node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (./node_modules/express/lib/router/layer.js:95:5)
at ./node_modules/express/lib/router/index.js:281:22
at Function.process_params (./node_modules/express/lib/router/index.js:335:12)
at next (./node_modules/express/lib/router/index.js:275:10)
at expressInit (./node_modules/express/lib/middleware/init.js:40:5)
at Layer.handle [as handle_request] (./node_modules/express/lib/router/layer.js:95:5)
Displaying a stack trace to the user wouldn’t be appropriate for a production web server, but it’s a good way to get started.
Using await functions in a callback
A web server often has to do some work in response to a web request: reading files from the filesystem, making queries to a database, writing data somewhere.
If that work involves calling asynchronous functions, then we may need to use await in the callback function, which means in turn that the callback function must itself be defined with the async keyword:
app.get('/lookup', async function(request, response) {
...
const data = await database.lookup(query);
...
});
In Express 4, this code mostly works, except that the automatic exception handling shown in the previous section no longer applies, because Express 4 wasn’t written to catch errors from a Promise returned by the callback function. It only catches exceptions thrown by the initial call of the callback function.
Fortunately, there is a simple module that patches this, by wrapping your async function with a function that handles the promise correctly:
import asyncHandler from 'express-async-handler';
app.get('/lookup', asyncHandler(async function(request, response) {
...
const data = await database.lookup(query);
...
}));
Express 5 fixes this problem: when the promise returned by a callback rejects, Express 5 forwards the error to its error handling automatically.
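To see what handling the promise correctly means, here is a sketch of the idea behind such a wrapper. It is not the source code of express-async-handler. The name wrapAsync and the simplified types are ours, with Next standing in for the function Express passes to a handler for reporting an error.

```typescript
// Simplified stand-ins for Express's types, so the sketch is self-contained.
type Next = (error?: unknown) => void;
type AsyncCallback<Req, Res> = (request: Req, response: Res) => Promise<void>;

/**
 * Wrap an async callback so that a rejected promise is reported through next
 * instead of being silently dropped. (Hypothetical helper for illustration.)
 */
function wrapAsync<Req, Res>(callback: AsyncCallback<Req, Res>) {
    return function (request: Req, response: Res, next: Next): void {
        // if the promise rejects, hand the error to next
        callback(request, response).catch(next);
    };
}

// Usage with fake request and response values:
const alwaysFails = wrapAsync(async (_request: string, _response: string): Promise<void> => {
    throw new Error('lookup failed');
});
alwaysFails('fake request', 'fake response', (error?: unknown) => {
    console.log('reported:', error instanceof Error ? error.message : error);
});
// eventually prints: reported: lookup failed
```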
Summary
In the client/server design pattern, concurrency is inevitable: multiple clients and multiple servers are connected on the network, sending and receiving messages simultaneously, and expecting timely replies. A server that blocks waiting for one slow client when there are other clients waiting to connect to it or to receive replies will not make those clients happy.
All the challenges of making concurrent code safe from bugs, easy to understand, and ready for change apply when we design network clients and servers. These processes run concurrently with one another (often on different machines), and any server that wants to talk to multiple clients concurrently (or a client that wants to talk to multiple servers) must manage that concurrency.