With Microscope and Tweezers: Strategies

With Microscope and Tweezers:

An Analysis of the Internet Virus of November 1988

Strategies

Attacks

This virus attacked several things, directly and indirectly. It picked out some deliberate targets, such as specific network daemons through which to infect the remote host. There were also less direct targets, such as mail service and the flow of information about the virus.

Sendmail Debug Mode
The virus exploited the ``debug'' function of sendmail, which enables debugging mode for the duration of the current connection. Debugging mode has many features, including the ability to send a mail message with a program as the recipient (i.e. the program would run, with all of its input coming from the body of the message). This is inappropriate and rumor[nyt] has it that the author included this feature to allow him to circumvent security on a machine he was using for testing. It certainly exceeds the intended design of the Simple Mail Transfer Protocol (SMTP) [smtp].

Specification of a program to execute when mail is received is normally allowed in the sendmail aliases file or users' .forward files directly, for vacation, mail archive programs, and personal mail sorters. It is not normally allowed for incoming connections. In the virus, the ``recipient'' was a command to strip off the mail headers and pass the remainder of the message to a command interpreter. The body was a script that created a C program, the ``grappling hook,'' which transfered the rest of the modules from the originiating host, and the commands to link and execute them. Both VAX and Sun binaries were transfered and both would be tried in turn, no attempt to determine the machine type was made. On other architectures the programs would not run, but would use resources in the linking process. All other attacks used the same ``grappling hook'' mechanism, but used other flaws to inject the ``grappling hook'' into the target machine.

The fact that debug was enabled by default was reported to Berkeley by several sources during the 4.2BSD release. The 4.3BSD release as well as Sun releases still had this option enabled by default [smb]. The then current release of Ultrix did not have debug mode enabled, but the beta test version of the newest release did have debug enabled (it was disabled before finally being shipped). MIT's Project Athena was among a number of sites which went out of its way to disable debug mode; however, it is unlikely that many binary-only sites were able to be as diligent.

Finger Daemon Bug
The virus hit the finger daemon (fingerd) by overflowing a buffer which was allocated on the stack. The overflow was possible because fingerd used a library function which did not do range checking. Since the buffer was on the stack, the overflow allowed a fake stack frame to be created, which caused a small piece of code to be executed when the procedure returned. The library function in question turns out to be a backward-compatibility routine, which should not have been needed after 1979 [geoff].

Only 4.3BSD VAX machines were attacked this way. The virus did not attempt a Sun specific attack on finger and its VAX attack failed when invoked on a Sun target. Ultrix was not vulnerable to this since production releases did not include a fingerd.

Rexec and Passwords
The virus attacked using the Berkeley remote execution protocol, which required the user name and plaintext password to be passed over the net. The program only used pairs of user names and passwords which it had already tested and found to be correct on the local host. A common, world readable file (/etc/passwd) that contains the user names and encrypted passwords for every user on the system facilitated this search. Specifically: The principle of ``least privilege'' [saltzer] is violated by the existence of this password file. Typical programs have no need for a list of user names and password strings, so this privileged information should not be available to them. For example, Project Athena's network authentication system, Kerberos [kerberos], keeps passwords on a central server which logs authentication requests, thus hiding the list of valid user names. However, once a name is found, the authentication ``ticket'' is still vulnerable to dictionary attack.

Rsh and Trust
The virus attempted to use the Berkeley remote shell program (called rsh) to attack other machines without using passwords. The remote shell utility is similar in function to the remote execution system, although it is ``friendlier'' since the remote end of the connection is a command interpreter, instead of the exec function. For convenience, a file /etc/hosts.equiv can contain a list of hosts trusted by this host. The .rhosts file provides similar functionality on a per-user basis. The remote host can pass the user name from a trusted port (one which can only be opened by root) and the local host will trust that as proof that the connection is being made for the named user.

This system has an important design flaw, which is that the local host only knows the remote host by its network address, which can often be forged. It also trusts the machine, rather than any property of the user, leaving the account open to attack at all times rather than when the user is present [kerberos]. The virus took advantage of the latter flaw to propagate between accounts on trusted machines. Least privilege would also indicate that the lists of trusted machines be only accessible to the daemons who need to decide to whether or not to grant access.

Information Flow
When it became clear that the virus was propagating via sendmail, the first reaction of many sites was to cut off mail service. This turned out to be a serious mistake, since it cut off the information needed to fix the problem. Mailer programs on major forwarding nodes, such as relay.cs.net, were shut down delaying some critical messages by as long as twenty hours. Since the virus had alternate infection channels (rexec and finger), this made the isolated machine a safe haven for the virus, as well as cutting off information from machines further ``downstream'' (thus placing them in greater danger) since no information about the virus could reach them by mail. Thus, by attacking sendmail, the virus indirectly attacked the flow of information that was the only real defense against its spread.


Footnotes:

Self Protection

The virus used a number of techniques to evade detection. It attempted both to cover it tracks and to blend into the normal UNIX environment using camouflage. These techniques had had varying degrees of effectiveness.

Covering Tracks
The program did a number of things to cover its trail. It erased its argument list, once it had finished processing the arguments, so that the process status command would not show how it was invoked.

It also deleted the executing binary, which would leave the data intact but unnamed, and only referenced by the execution of the program. If the machine were rebooted while the virus was actually running, the file system salvager would recover the file after the reboot. Otherwise the program would vanish after exiting.

The program also used resource limit functions to prevent a core dump. Thus, it prevented any bugs in the program from leaving tell-tale traces behind.

Camouflage
It was compiled under the name sh, the same name used by the Bourne Shell, a command interpreter which is often used in shell scripts and automatic commands. Even a diligent system manager would probably not notice a large number of shells running for short periods of time.

The virus forked, splitting into a parent and child, approximately every three minutes. The parent would then exit, leaving the child to continue from the exact same place. This had the effect of ``refreshing'' the process, since the new fork started off with no resources used, such as CPU time or memory usage. It also kept each run of the virus short, making the virus a more difficult to seize, even when it had been noticed.

All the constant strings used by the program were obscured by XOR'ing each character with the constant 81 (base 16). This meant that one could not simply look at the binary to determine what constants the virus refered to (e.g. what files it opened). But it was a weak method of hiding the strings; it delayed efforts to understand the virus, but not for very long.

Flaws

The virus also had a number of flaws, ranging from the subtle to the clumsy. One of the later messages from Berkeley posted fixes for some of the more obvious ones, as a humorous gesture.

Reinfection prevention
The code for preventing reinfection of an actively infected machine harbored some major flaws. These flaws turned out to be critical to the ultimate ``failure'' of the virus, as reinfection drove up the load of many machines, causing it to be noticed and thus counterattacked.

The code had several timing flaws which made it unlikely to work. While written in a ``paranoid'' manner, using weak authentication (exchanging ``magic'' numbers) to determine whether the other end of the connection is indeed a copy of the virus, these routines would often exit with errors (and thus not attempt to quit) if:

Note that ``at once'' means ``within a 5-20 second window'' in most cases, and is sometimes looser.

A critical weakness in the interlocking code is that even when it does decide to quit, all it does is set the variable pleasequit. This variable does not have an effect until the virus has gone through

Since the virus was careful to clean up temporary files, its presence alone didn't interfere with reinfection.

Also, a multiply infected machine would spread the virus faster, perhaps proportionally to the number of infections it was harboring, since

Thus, the virus spread much more quickly than the perpetrator expected, and was noticed for that very reason. The MIT Media Lab, for example, cut themselves completely off from the network because the computer resources absorbed by the virus were detracting from work in progress, while the lack of network service was a minor problem.

Heuristics
One attempt to make the program not waste time on non-UNIX systems was to sometimes try to open a telnet or rsh connection to a host before trying to attack it and skipping that host if it refused the connection. If the host refused telnet or rsh connections, it was likely to refuse other attacks as well. There were several problems with this heuristic:

Vulnerabilities not used
The virus did not exploit a number of obvious opportunities.

Defenses

There were many attempts to stop the virus. They varied in inconvenience to the end users of the vulnerable systems, in the amount of skill required to implement them, and in their effectiveness.

After the virus was analyzed, a tool which duplicated the password attack (including the virus' internal dictionary) was posted to the network. This tool allowed system administrators to analyze the passwords in use on their system. The spread of this virus should be effective in raising the awareness of users (and administrators) to the importance of choosing ``difficult'' passwords. Lawrence Livermore National Laboratories went as far as requiring all passwords be changed, and modifying the password changing program to test new passwords against the lists that include the passwords attacked by the virus [ncsc].