With Microscope and Tweezers: The Program

With Microscope and Tweezers:

An Analysis of the Internet Virus of November 1988

The Program

This Appendix describes the virus program subroutine by subroutine. For reference, the flow of information among the subroutines is shown in Figure [collectfig].


flow chart
The structure of the attacking engine.

Names

The core of the virus is a pair of binary modules, one for the VAX architecture and the other for the Sun architecture. These are linkable modules, and thus have name lists for their internal procedures. Many of the original names are included here with the descriptions of the functions the routines performed.

It is surprising that the names are included, and astonishing that they are meaningful. Some simple techniques, such as randomizing the procedure names, would have removed a number of clues to the function of the virus.

main

The main module, the starting point of any C language program, does some initialization, processes its command line, and then goes off into the loop which organizes all of the real work.

Initialization
The program first takes some steps to hide itself. It changes the ``zeroth'' argument, which is the process name, to sh. Thus, no matter how the program was invoked, it would show up in the process table with the same name as the Bourne Shell, a program which often runs legitimately.

The program also sets the maximum core dump size to zero blocks. If the program crashed it would not leave a core dump behind to help investigators. It also turns off handling of write errors on pipes, which normally cause the program to exit.

The next step is to read the clock, store the current time in a local variable, and use that value to seed the random number generator.

Command line argument processing
The virus program itself takes an optional argument -p which must be followed by a decimal number, which seems to be a process id of the parent which spawned it. It uses this number later to kill that process, probably to ``close the door'' behind it.

The rest of the command line arguments are ``object names''. These are names of files it tries to load into its address space. If it can't load one of them, it quits. If the -p argument is given, it also deletes the object files, and later tries to remove the disk image of running virus, as well as the file /tmp/.dumb. (This file is not referenced anywhere else in the virus, so it is unclear why it is deleted.)

The program then tried a few further steps, exiting (``bailing out'') if any of them failed:

If the ``-p'' argument was given, the program closes all file descriptors, in case there are any connections open to the parent.

The program then erases the text of the argument array, to further obscure how it was started (perhaps to hide anything if one were to get a core image of the running virus.)

It scans all of the network interfaces on the machine, gets the flags and addresses of each interface. It tries to get the point-to-point address of the interface, skipping the loopback address. It also stores the netmask for that network [rfc950].

Finally, it kills off the process id given with the ``-p'' option. It also changes the current process group, so that it doesn't die when the parent exits. Once this is cleaned up, it falls into the doit routine which performs the rest of the work.


Footnotes:

doit routine

This routine is where the program spends most of its time.

Initialization
Like the main routine, it seeds the random number generator with the clock, and stores the clock value to later measure how long the virus has been running on this system.

It then tries hg. If that fails, it tries hl. If that fails, it tries ha.

It then tries to check if there is already a copy of the virus running on this machine. Errors in this code contributed to the large amounts of computer time taken up by the virus. Specifically:

The remainder of the initialization routine seems designed to send a single byte to address 128.32.137.13, which is ernie.berkeley.edu, on port 11357. This never happens, since the author used the sendto function on a TCP stream connection, instead of a UDP datagram socket. We have no explanation for this; it only tries to send this packet with a one in fifteen random chance.

Main loop
An infinite loop comprises the main active component of the virus. It calls the cracksome routine which tries to find some hosts that it can break in to. Then it waits 30 seconds, listening for other virus programs attempting to break in, and tries to break into another batch of machines.

After this round of attacks, it forks, creating two copies of the virus; the original (parent) dies, leaving the fresh copy. The child copy has all of the information the parent had, while not having the accumulated CPU usage of the parent. It also has a new process id, making it hard to find.

Next, the hg, hl, and ha routines search for machines to infect (see Appendix [hroutines]). The program sleeps for 2 minutes, and then checks to see if it has been running for more than 12 hours, cleaning up some of the entries in the host list if it has.

Finally, before repeating, it checks the global variable pleasequit. If it is set, and if it has tried more than 10 words from its own dictionary against existing passwords, it quits. Thus forcing pleasequit to be set in the system libraries will do very little to stem the progress of this virus.


Footnotes:

Cracking routines

This collection of routines is the ``brain'' of the virus. cracksome, the main switch, chooses which of four strategies to execute. It is would be the central point for adding new strategies if the virus were to be further extended. The virus works each strategy through completely, then switches to the next one. Each pass through the cracking routines only performs a small amount of work, but enough state is remembered in each pass to continue the next time around.

cracksome
The cracksome routine is the central switching routine of the cracking code. It decides which of the cracking strategies is actually exercised next. Again, note that this routine was named in the global symbol table. It could have been given a confusing or random name, but it was actually clearly labelled, which lends some credence to the idea that the virus was released prematurely.

Phase 0
The first phase of the cracksome routines reads through the /etc/hosts.equiv file to find machine names that would be likely targets. While this file indicates what hosts the current machine trusts, it is fairly common to find systems where all machines in a cluster trust each other, and at the very least it is likely that people with accounts on this machine will have accounts on the other machines mentioned in /etc/hosts.equiv.

It also reads the /.rhosts file, which lists the set of machines that this machine trusts root access from. Note that it does not take advantage of the trust itself [baldwin] but merely uses the names as a list of additional machines to attack. Often, system managers will deny read access to this file to any user other than root itself, to avoid providing any easy list of secondary targets that could be used to subvert the machine; this practice would have prevented the virus from discovering those names, although /.rhosts is very often a subset of /etc/hosts.equiv.

The program then reads the entire local password file, /etc/passwd. It uses this to find personal .forward files, and reads them in search of names of other machines it can attack. It also records the user name, encrypted password, and GECOS information string, all of which are stored in the /etc/passwd file. Once the program scanned the entire file, it advanced to Phase 1.

Phase 1
This phase of the cracking code attacked passwords on the local machine. It chose several likely passwords for each user, which were then encrypted and compared against the encryptions obtained in Phase 0 from /etc/passwd:

All of these attacks are applied to fifty passwords at a time from those collected in Phase 0. Once it had tried to guess the passwords for all local accounts, it advanced to Phase 2.

Phase 2
Phase 2 takes the internal word list distributed as part of the virus (see Appendix
[dict]) and shuffles it. Then it takes the words one at a time and decodes them (the high bit is set on all of the characters to obscure them) and tries them against all collected passwords. It maintains a global variable nextw as an index into this table. The main loop uses this to prevent pleasequit from causing the virus to exit until at least ten of the words have been checked against all of the encryptions in the collected list.

Again, when the word list is exhausted the virus advances to Phase 3.

Phase 3
Phase 3 looks at the local /usr/dict/words file, a 24474 word list distributed with 4.3BSD (and other UNIX systems) as a spelling dictionary. The words are stored in this file one word per line. One word at a time is tried against all encrypted passwords. If the word begins with an upper case letter, the letter is converted to lower case and the word is tried again.

When the dictionary runs out, the phase counter is again advanced to 4 (thus no more password cracking is attempted).

H routines

The ``h routines'' are a collection of routines with short names, such as hg, ha, hi, and hl, which search for other hosts to attack.

hg
The hg routine calls rt_init (if it has not already been called) to scan the routing table, and records all gateways except the loopback address in a special list. It then tries a generic attack routine to attack via rsh, finger, and SMTP. It returns after the first successful attack.

ha
The ha routine goes through the gateway list and connects to TCP port 23, the telnet port, looking for gateways which are running telnet listeners. It randomizes the order of such gateways and calls hn (our name) with the network number of each gateway. The ha returns after hn reports that it has succeeded broken into a host.

hl
The hl routine iterates through all the addresses for the local machine calling hn with the network number for each one. It returns if hn indicates success in breaking into a host.

hi
The hi routine goes through the internal host list (see section
[phase0]) and tries to attack each host via rsh, finger, and SMTP. It returns if when one host is infected.

hn
The hn routine (our name) followed hi takes a network number as an argument. Surprisingly it returns if the network number supplied is the same as the network number of any of the interfaces on the local machine. For Class A addresses it uses the Arpanet IMP convention to create possible addresses to attack (net.[1-8].0.[1-255]). For all other networks it guesses hosts number one through 255 on that network. It randomizes the order of this list of possible hosts and tries to attack up to twenty of them using rsh, finger, and SMTP. If a host does not accept connections on TCP port 514, the rsh port, hn will not try to attack it. If a host is successfully attacked hn returns.

Usage
The ``h routines'' are called in groups in the main loop; if the first routine succeedes in finding a vulnerable host the remaining routines are not called in the current pass. Each routine returns after it finds one vulnerable host. The hg routine is always called first, which indicates the virus really wanted to infect gateway machines. Next comes hi which tried to infect normal hosts found via cracksome. If hi fails, ha is called, which seemed to try breaking into hosts with randomly guessed addresses on the far side of gateways. This assumes that all the addresses for gateways had been obtained (which is not trivial to verify from the convoluted code in {rt init}), and implies that the virus would prefer to infect a gateway and from there reach out to the gateway's connected networks, rather than trying to hop the gateway directly. If hg, hi, and ha all failed to infect a host, then hl is called which is similar to ha but uses for local interfaces for a source of networks.

It is not clear that ha and hl worked. Because hn returns if the address is local, hl appears to have no chance of succeeding. If alternate addresses for gateways are indeed obtained by other parts of the virus then ha could work. But if only the addresses in the routing table were used it could not work, since by definition these addresses must be on a directly connected network. Also, in our monitoring we never detected an attack on a randomly generated address. These routines do not seem to have been functional.

Attack routines

There are a collection of attack routines, all of which try to obtain a Bourne Shell running on the targeted machine. See Appendix
[hook] for a description of the l1.c program, used by all the attack routines.

hu1
The hu1 routine is called by the Phase 1 and Phase 3 cracksome subroutines. Once a password for user name guessed correctly, this routine is called with a host name read from either the user's .forward or .rhosts files. In order to assume the user's id it then tries to connect to the local machine's rexec server using the guessed name and password. If successful it runs an rsh to the target machine, trying to execute a Bourne Shell, which it uses to send over and compile the l1.c infection program.

Hit SMTP
This routine make a connection to TCP port 25, the SMTP port, of a remote machine and used it to take advantage of the sendmail bug. It attempts to use the debug option to make sendmail run a command (the ``recipient'' of the message), which transfers the l1.c program included in the body of the message.

Hit finger
The ``hit finger'' routine tries to make a connection to TCP port 79, the finger port, of the remote machine. Then it creates a ``magic packet'' which consists of

Note that the piece of code is VAX code, and the stack frame is a VAX frame, in the wrong order for the Sun. Thus, although the Sun finger daemon has the same bug as the VAX one, this piece of code cannot exploit it.


        pushl   $68732f         push '/sh'
        pushl   $6e69622f       push '/bin'
        movl    sp,r10          save address of start of string
        pushl   $0              push 0 (arg 3 to execve)
        pushl   $0              push 0 (arg 2 to execve)
        pushl   r10             push string addr (arg 1 to execve)
        pushl   $3              push argument count
        movl    sp,ap           set argument pointer
        chmk    $3b             do "execve" kernel call.

VAX intructions for the finger attack.

The attack on the finger daemon is clearly a lysogenetic ``viral'' attack (see Section [rose]), since although a worm doesn't modify the host machine at all, the finger attack does modify the running finger daemon process. The ``injected DNA'' component of the virus contained the VAX instructions shown in Figure [vaxcode].

The execve system call causes the current process to be replaced with an invocation of the named program; /bin/sh is the Bourne Shell, a UNIX command interpreter. In this case, the shell winds up running with its input coming from, and its output going to, the network connection. The virus then sends over the l1.c bootstrap program.

Hit rsh
This unlabeled routine tries rsh to the target host (assuming it can get in as the current user). It tries three different names for the rsh binary, If one of them succeeds, it tries to resynchronize (see Appendix [resynch]) the connection; if that doesn't succeed within thirty seconds it kills off the child process. If successful the connection can then be used to launch the l1.c ``grappling hook'' program at the victim.

Note that this infection method doesn't specify a user name to attack; if it gets into the remote account, it is because the user that the virus is running as also has an account on the other machine which trusts the originating machine.

Hit rexec
The hit rexec routine uses the remote execution system which is similar to rsh, but designed for use by programs. It connects and sends the user name, the password, and /bin/sh as the command to execute.

makemagic
This routine tries to make a telnet connection to each of the available addresses for the current victim. It broke the connections immediately, often producing error reports from the telnet daemon, which were recorded, and provide some of the earliest reports of attack attempts.

If it succeedes in reaching the host, it creates a TCP listener on a random port number which the infected machine would eventually connect back to.


Footnotes:

Grappling Hook

A short program, named l1.c, is the common grappling hook that all of the attack routines use to pull over the rest of the virus. It is robustly written, and fairly portable. It ran on a number of machines which were neither VAX or Sun, loading them down as well, but only making them peripheral victims of the virus.

The first thing it does is delete the binary it was running from. It checks that it has three arguments (exiting if there aren't three of them). It closes all file descriptors and then forks, exiting if the fork fails. If it succeeds, the parent exits; this leaves no connection from the child to the infection route.

Next, it creates a TCP connection back to the address given as the first argument, and the port given as the second. Then it sends over the magic number given as the third. The text of each argument is erased immediately after it is used. The stream connection is then reused as the program's standard input and output.

A loop reads in a length (as a network byte order 32-bit integer) and then a filename. The file is unlinked and opened for write, and then the file itself is read in (using the number of bytes read in earlier.) On any error, all of the files are unlinked. If the length read in is -1, the loop exits, and a Bourne Shell is executed (replacing the l1 program, and getting its input from the same source.)

Install Routines

There are a variety of routines used to actually move the virus from one machine to the other. They deal with the ``virus protocol'' connection made by the l1.c injected program or with the shell that it spawns.

resynch
The resynch routine sends commands to a remote shell, requesting that it echo back a specific randomly chosen number. It then waits a certain amount of time for a response. This routine is used to indicate when the various subprograms of the infection procedure have compiled or executed and a Bourne Shell prompt is available again.

waithit
This routine does much of the high level work. It waits (up to 2 minutes) for a return connection from a victim (which has had l1.c injected into it.) It then tries to read a magic number (which had been previously sent to that victim as a command line argument to the l1 program) and gives up after ten seconds.

After the connection is established, all of the current ``objects'' in storage in the virus are fed down the connection into the victim. Then it tries to resynchronize, and if it succeeds, sends down commands to

Thus, to add another machine type, the virus merely needs to be started with a new object binary as a command line option, which will then be propagated to the next infected host and tried.

Note that the path used here was PATH= bin: /usr/bin: /usr/ucb which is certainly reasonable on most systems. This protects systems with ``unusual'' filesystem layouts, and suggests that complete consistency among systems makes them more vulnerable.


Footnotes:

Host modules

These are a set of routines designed to collect names and addresses of target hosts in a master list. Each entry contains up to six addresses, up to twelve names, and a flags field.

Name to host
This routine searches the host list for a given named host, returns the list entry describing it, and optionally adds it to the list if it isn't there already.

Address to host
This routine searches the host list for a given host address, returns the list entry describing it, and optionally adds it to the list if it isn't there already.

Add address/name
These two routines added an address or a name to a host list entry, checking to make sure that the address or name was not already known.

Clean up table
This routine cycles through the host list, and removes any hosts which only have flag bits 1 and 2 set (and clears those bits.) Bit 1 is set when a resynchronize (in waithit) fails, probably indicating that this host ``got lost''. Bit 2 is set when a named host has no addresses, or when several different attack attempts fail. Bit 3 is set when Phase 0 of the crack routines successfully retrieves an address for the host.

Get addresses
This routine takes an entry in the host table and tries to fill in the the gaps. It looks up an address for a name it has, or looks up a name for the addresses it has. It also includes any aliases it can find.

Object routines

These routines are what the system uses to pull all of its pieces into memory when it starts (after the host has been infected) and then to retrieve them to transmit to any host it infects.

Load object
This routine opens a file, determines its length, allocating the appropriate amount of memory, reads it in as one block, decodes the block of memory (with XOR). If the object name contains a comma, it moves past it and starts the name there.

Get object by name
This routine returns a pointer to the requested object. This is used to find the pieces to download when infecting another host.

Other initialization routines

if init
This routine scans the array of network interfaces. It gets the flags for each interface, and makes sure the interface is UP and RUNNING (specific fields of the flag structure). If the entry is a point to point type interface, the remote address is saved and added to the host table. It then tries to enter the router into the list of hosts to attack.

rt init
This routine runs netstat -r -n as a subprocess. This shows the routing table, with the addresses listed numerically. It gives up after finding 500 gateways. It skips the default route, as well as the loopback entry. It checks for redundant entries, and checks to see if this address is already an interface address. If not, it adds it to the list of gateways.

After the gateway list is collected, it shuffles it and enters the addresses in the host table.

Interlock routines

The two routines checkother and othersleep are at the heart of the excessive propagation of the virus. It is clear that the author intended for the virus to detect that a machine was already infected, and if so to skip it. The code is actually fraught with timing flaws and design errors which lead it to permit multiple infections, probably more often than the designer intended.

An active infection uses the othersleep routine for two purposes, first to sleep so that it doesn't use much processor time, and second to listen for requests from ``incoming'' viruses. The virus which is running othersleep is referred to as the ``listener'' and the virus which is running checkother is referred to as the ``tester''.

Checkother
The tester tries to connect to port 23357 on the local machine (using the loopback address, 127.0.0.1) to see if it can connect to a listener. If any errors occur during this check, the virus assumes that no listener is present, and tries to become a listener itself.

If the connection is successful, the checker sends a magic number, and listens (for up to 300 seconds) for a magic number from the listener. If the magic number is wrong, the checker assumes it is being spoofed and continues to run.

The checker then picks a random number, shifts it right by three (throwing away the lower three bits) and sends it to the listener. It expects a number back within ten seconds, which it adds to the one sent. If this sum is even, the sender increments pleasequit, which (as noted in section [pleasequit]) does very little.

Once it has finished communicating (or failing to communicate) with the listener, the checker sleeps for five seconds and tries to become a listener. It creates a TCP stream socket, sets the socket options to indicate that it should allow multiple binds to that address (in case the listener still hasn't exited, perhaps?) and then binds the socket to port 23357, and listens on it (permitting a backlog of up to ten pending connections.)

Othersleep
The othersleep routine is run when the main body of the virus wants to idle for a period of time. This was apparently intended to help the virus ``hide'' so that it wouldn't use enough processor time to be noticed. While the main program sleeps, the listener code waits to see if any checkers have appeared and queried for the existence of a listener, as a simple ``background task'' of the main virus.

The routine first checks to see if it has been set up as a listener; if not, it calls the normal sleep function to sleep for the requested number of seconds, and returns.

If it is set up as a listener, it listens on the checking port with a timeout. If it times out, it returns, otherwise it deals with the connection and subtracts the elapsed real time from the time out value.

The body of the listener ``accepts'' the connection, and sends a magic number to the checker. It then listens (for up to 10 seconds) for the checker's magic number, and picks a random number. It shifts the random number right by three, discarding the lower bits, and sends it up to the checker; it then listens (for up to 10 seconds) for a random number from the checker. If any of these steps fail, the connection is closed and the checker is ignored.

Once the exchanges have occurred, the address of the incoming connection is compared with the loopback address. If it is not from the loopback address, the attempt is ignored. If it is, then if the sum of the exchanged random numbers is odd, the listener increments pleasequit (with little effect, as noted in section [pleasequit]) and closes the listener connection.


Footnotes: