_A _T_o_u_r _o_f _t_h_e _W_o_r_m _D_o_n_n _S_e_e_l_e_y 9 Department of Computer Science University of Utah 9 _A_B_S_T_R_A_C_T On the evening of November 2, 1988, a self-replicating program was released upon the Internet[1]. This program (a _w_o_r_m) invaded VAX and Sun-3 computers running versions of Berkeley UNIX, and used their resources to attack still more computers[2]. Within the space of hours this program had spread across the U.S., infecting hundreds or thousands of computers and making many of them unusable due to the burden of its activity. This paper provides a chronology for the outbreak and presents a detailed description of the internals of the worm, based on a C version produced by decompiling. _1. _I_n_t_r_o_d_u_c_t_i_o_n There is a fine line between helping administrators pro- tect their systems and providing a cookbook for bad guys. [Grampp and Morris, ``UNIX Operating System Security''] November 3, 1988 is already coming to be known as Black Thursday. System administrators around the country came to work on that day and discovered that their networks of computers were laboring under a huge load. If they were able to log in and gen- erate a system status listing, they saw what appeared to be dozens or hundreds of ``shell'' (command interpreter) processes. If they tried to kill the processes, they found that new processes appeared faster than they could kill them. Rebooting ____________________ 9 [1] The Internet is a logical network made up of many physical networks, all running the IP class of network protocols. [2] VAX and Sun-3 are models of computers built by Digital Equipment Corp. and Sun Microsystems Inc., respectively. UNIX is a Registered Bell of AT&T Trademark Laboratories. 9 Tour of the Worm 2 the computer seemed to have no effect--within minutes after starting up again, the machine was overloaded by these mysterious processes. These systems had been invaded by a _w_o_r_m. A worm is a pro- gram that propagates itself across a network, using resources on one machine to attack other machines. (A worm is not quite the same as a _v_i_r_u_s, which is a program fragment that inserts itself into other programs.) The worm had taken advantage of lapses in security on systems that were running 4.2 or 4.3 BSD UNIX or derivatives like SunOS. These lapses allowed it to connect to machines across a network, bypass their login authentication, copy itself and then proceed to attack still more machines. The massive system load was generated by multitudes of worms trying to propagate the epidemic. The Internet had never been attacked in this way before, although there had been plenty of speculation that an attack was in store. Most system administrators were unfamiliar with the concept of worms (as opposed to viruses, which are a major affl- iction of the PC world) and it took some time before they were able to establish what was going on and how to deal with it. This paper is intended to let people know exactly what happened and how it came about, so that they will be better prepared when it happens the next time. The behavior of the worm will be exam- ined in detail, both to show exactly what it did and didn't do, and to show the dangers of future worms. The epigraph above is now ironic, for the author of the worm used information in that paper to attack systems. Since the information is now well known, by virtue of the fact that thousands of computers now have copies of the worm, it seems unlikely that this paper can do similar damage, but it is definitely a troubling thought. Opin- ions on this and other matters will be offered below. _2. _C_h_r_o_n_o_l_o_g_y Remember, when you connect with another computer, you're connecting to every computer that computer has connected to. [Dennis Miller, on NBC's _S_a_t_u_r_d_a_y _N_i_g_h_t _L_i_v_e] 9 Here is the gist of a message I got: I'm sorry. [Andy Sudduth, in an anonymous posting to the TCP-IP list on behalf of the author of the worm, 11/3/88] Many details of the chronology of the attack are not yet available. The following list represents dates and times that we are currently aware of. Times have all been rendered in Pacific Standard Time for convenience. 11/2: 1800 (approx.) This date and time were seen on worm files found on _p_r_e_p._a_i._m_i_t._e_d_u, a VAX 11/750 at the MIT Artificial 9 Tour of the Worm 3 Intelligence Laboratory. The files were removed later, and the precise time was lost. System log- ging on _p_r_e_p had been broken for two weeks. The system doesn't run accounting and the disks aren't backed up to tape: a perfect target. A number of ``tourist'' users (individuals using public accounts) were reported to be active that evening. These users would have appeared in the session log- ging, but see below. 11/2: 1824 First known West Coast infection: _r_a_n_d._o_r_g at Rand Corp. in Santa Monica. 11/2: 1904 _c_s_g_w._b_e_r_k_e_l_e_y._e_d_u is infected. This machine is a major network gateway at UC Berkeley. Mike Karels and Phil Lapsley discover the infection shortly afterward. 11/2: 1954 _m_i_m_s_y._u_m_d._e_d_u is attacked through its _f_i_n_g_e_r server. This machine is at the University of Mary- land College Park Computer Science Department. 11/2: 2000 (approx.) Suns at the MIT AI Lab are attacked. 11/2: 2028 First _s_e_n_d_m_a_i_l attack on mimsy. 11/2: 2040 Berkeley staff figure out the _s_e_n_d_m_a_i_l and _r_s_h attacks, notice _t_e_l_n_e_t and _f_i_n_g_e_r peculiarities, and start shutting these services off. 11/2: 2049 _c_s._u_t_a_h._e_d_u is infected. This VAX 8600 is the cen- tral Computer Science Department machine at the University of Utah. The next several entries fol- low documented events at Utah and are representa- tive of other infections around the country. 11/2: 2109 First _s_e_n_d_m_a_i_l attack at _c_s._u_t_a_h._e_d_u. 11/2: 2121 The load average on _c_s._u_t_a_h._e_d_u reaches 5. The ``load average'' is a system-generated value that represents the average number of jobs in the run queue over the last minute; a load of 5 on a VAX 8600 noticeably degrades response times, while a load over 20 is a drastic degradation. At 9 PM, the load is typically between 0.5 and 2. 11/2: 2141 The load average on _c_s._u_t_a_h._e_d_u reaches 7. 11/2: 2201 The load average on _c_s._u_t_a_h._e_d_u reaches 16. 11/2: 2206 The maximum number of distinct runnable processes (100) is reached on _c_s._u_t_a_h._e_d_u; the system is Tour of the Worm 4 unusable. 11/2: 2220 Jeff Forys at Utah kills off worms on _c_s._u_t_a_h._e_d_u. Utah Sun clusters are infected. 11/2: 2241 Re-infestation causes the load average to reach 27 on _c_s._u_t_a_h._e_d_u. 11/2: 2249 Forys shuts down _c_s._u_t_a_h._e_d_u. 11/3: 2321 Re-infestation causes the load average to reach 37 on _c_s._u_t_a_h._e_d_u, despite continuous efforts by Forys to kill worms. 11/2: 2328 Peter Yee at NASA Ames Research Center posts a warning to the TCP-IP mailing list: ``We are currently under attack from an Internet VIRUS. It has hit UC Berkeley, UC San Diego, Lawrence Liver- more, Stanford, and NASA Ames.'' He suggests turn- ing off _t_e_l_n_e_t, _f_t_p, _f_i_n_g_e_r, _r_s_h and SMTP services. He does not mention _r_e_x_e_c. Yee is actually at Berkeley working with Keith Bostic, Mike Karels and Phil Lapsley. 11/3: 0034 At another's prompting, Andy Sudduth of Harvard anonymously posts a warning to the TCP-IP list: ``There may be a virus loose on the internet.'' This is the first message that (briefly) describes how the _f_i_n_g_e_r attack works, describes how to defeat the SMTP attack by rebuilding _s_e_n_d_m_a_i_l, and explicitly mentions the _r_e_x_e_c attack. Unfor- tunately Sudduth's message is blocked at _r_e_l_a_y._c_s._n_e_t while that gateway is shut down to combat the worm, and it does not get delivered for almost two days. Sudduth acknowledges authorship of the message in a subsequent message to TCP-IP on Nov. 5. 11/3: 0254 Keith Bostic sends a fix for _s_e_n_d_m_a_i_l to the news- group comp.bugs.4bsd.ucb-fixes and to the TCP-IP mailing list. These fixes (and later ones) are also mailed directly to important system adminis- trators around the country. 11/3: early morning The _w_t_m_p session log is mysteriously removed on _p_r_e_p._a_i._m_i_t._e_d_u. 11/3: 0507 Edward Wang at Berkeley figures out and reports the _f_i_n_g_e_r attack, but his message doesn't come to Mike Karels' attention for 12 hours. 9 9 Tour of the Worm 5 11/3: 0900 The annual Berkeley Unix Workshop commences at UC Berkeley. 40 or so important system administrators and hackers are in town to attend, while disaster erupts at home. Several people who had planned to fly in on Thursday morning are trapped by the crisis. Keith Bostic spends much of the day on the phone at the Computer Systems Research Group offices answering calls from panicked system administrators from around the country. 11/3: 1500 (approx.) The team at MIT Athena calls Berkeley with an exam- ple of how the _f_i_n_g_e_r server bug works. 11/3: 1626 Dave Pare arrives at Berkeley CSRG offices; disassembly and decompiling start shortly afterward using Pare's special tools. 11/3: 1800 (approx.) The Berkeley group sends out for calzones. People arrive and leave; the offices are crowded, there's plenty of excitement. Parallel work is in progress at MIT Athena; the two groups swap code. 11/3: 1918 Keith Bostic posts a fix for the _f_i_n_g_e_r server. 11/4: 0600 Members of the Berkeley team, with the worm almost completely disassembled and largely decompiled, finally take off for a couple hours' sleep before returning to the workshop. 11/4: 1236 Theodore Ts'o of Project Athena at MIT publicly announces that MIT and Berkeley have completely disassembled the worm. 11/4: 1700 (approx.) A short presentation on the worm is made at the end of the Berkeley UNIX Workshop. 11/8: National Computer Security Center meeting to dis- cuss the worm. There are about 50 attendees. 11/11: 0038 Fully decompiled and commented worm source is installed at Berkeley. _3. _O_v_e_r_v_i_e_w What exactly did the worm do that led it to cause an epi- demic? The worm consists of a 99-line bootstrap program written in the C language, plus a large relocatable object file that comes in VAX and Sun-3 flavors. Internal evidence showed that the object file was generated from C sources, so it was natural to decompile the binary machine language into C; we now have over Tour of the Worm 6 3200 lines of commented C code which recompiles and is mostly complete. We shall start the tour of the worm with a quick over- view of the basic goals of the worm, followed by discussion in depth of the worm's various behaviors as revealed by decompila- tion. The activities of the worm break down into the categories of attack and defense. Attack consists of locating hosts (and accounts) to penetrate, then exploiting security holes on remote systems to pass across a copy of the worm and run it. The worm obtains host addresses by examining the system tables /_e_t_c/_h_o_s_t_s._e_q_u_i_v and /._r_h_o_s_t_s, user files like ._f_o_r_w_a_r_d and ._r_h_o_s_t_s, dynamic routing information produced by the _n_e_t_s_t_a_t pro- gram, and finally randomly generated host addresses on local net- works. It ranks these by order of preference, trying a file like /_e_t_c/_h_o_s_t_s._e_q_u_i_v first because it contains names of local machines that are likely to permit unauthenticated connections. Penetration of a remote system can be accomplished in any of three ways. The worm can take advantage of a bug in the _f_i_n_g_e_r server that allows it to download code in place of a finger request and trick the server into executing it. The worm can use a ``trap door'' in the _s_e_n_d_m_a_i_l SMTP mail service, exercising a bug in the debugging code that allows it to execute a command interpreter and download code across a mail connection. If the worm can penetrate a local account by guessing its password, it can use the _r_e_x_e_c and _r_s_h remote command interpreter services to attack hosts that share that account. In each case the worm arranges to get a remote command interpreter which it can use to copy over, compile and execute the 99-line bootstrap. The bootstrap sets up its own network connection with the local worm and copies over the other files it needs, and using these pieces a remote worm is built and the infection procedure starts over again. Defense tactics fall into three categories: preventing the detection of intrusion, inhibiting the analysis of the program, and authenticating other worms. The worm's simplest means of hiding itself is to change its name. When it starts up, it clears its argument list and sets its zeroth argument to _s_h, allowing it to masquerade as an innocuous command interpreter. It uses _f_o_r_k() to change its process I.D., never staying too long at one I.D. These two tactics are intended to disguise the worm's presence on system status listings. The worm tries to leave as little trash lying around as it can, so at start-up it reads all its support files into memory and deletes the tell-tale filesystem copies. It turns off the generation of _c_o_r_e files, so if the worm makes a mistake, it doesn't leave evidence behind in the form of core dumps. The latter tactic is also designed to block analysis of the program--it prevents an administrator from sending a software signal to the worm to force it to dump a core file. There are other ways to get a core file, however, so the worm carefully alters character data in memory to prevent it from being extracted easily. Copies of disk files are encoded by Tour of the Worm 7 repeatedly exclusive-or'ing a ten-byte code sequence; static strings are encoded byte-by-byte by exclusive-or'ing with the hexadecimal value 81, except for a private word list which is encoded with hexadecimal 80 instead. If the worm's files are somehow captured before the worm can delete them, the object files have been loaded in such a way as to remove most non- essential symbol table entries, making it harder to guess at the purposes of worm routines from their names. The worm also makes a trivial effort to stop other programs from taking advantage of its communications; in theory a well-prepared site could prevent infection by sending messages to ports that the worm was listen- ing on, so the worm is careful to test connections using a short exchange of random ``magic numbers''. When studying a tricky program like this, it's just as important to establish what the program _d_o_e_s _n_o_t do as what it does do. The worm _d_o_e_s _n_o_t _d_e_l_e_t_e _a _s_y_s_t_e_m'_s _f_i_l_e_s: it only removes files that it created in the process of bootstrapping. The program does not attempt to incapacitate a system by deleting important files, or indeed any files. It does not remove log files or otherwise interfere with normal operation other than by consuming system resources. The worm _d_o_e_s _n_o_t _m_o_d_i_f_y _e_x_i_s_t_i_n_g _f_i_l_e_s: it is not a virus. The worm propagates by copying itself and compiling itself on each system; it does not modify other programs to do its work for it. Due to its method of infection, it can't count on sufficient privileges to be able to modify pro- grams. The worm _d_o_e_s _n_o_t _i_n_s_t_a_l_l _t_r_o_j_a_n _h_o_r_s_e_s: its method of attack is strictly active, it never waits for a user to trip over a trap. Part of the reason for this is that the worm can't afford to waste time waiting for trojan horses--it must reproduce before it is discovered. Finally, the worm _d_o_e_s _n_o_t _r_e_c_o_r_d _o_r _t_r_a_n_s_m_i_t _d_e_c_r_y_p_t_e_d _p_a_s_s_w_o_r_d_s: except for its own static list of favorite passwords, the worm does not propagate cracked passwords on to new worms nor does it transmit them back to some home base. This is not to say that the accounts that the worm penetrated are secure merely because the worm did not tell anyone what their passwords were, of course--if the worm can guess an account's password, certainly others can too. The worm _d_o_e_s _n_o_t _t_r_y _t_o _c_a_p_t_u_r_e _s_u_p_e_r_u_s_e_r _p_r_i_v_i_l_e_g_e_s: while it does try to break into accounts, it doesn't depend on having particular privileges to propagate, and never makes special use of such privileges if it somehow gets them. The worm _d_o_e_s _n_o_t _p_r_o_p_a_g_a_t_e _o_v_e_r _u_u_c_p or X.25 or DECNET or BITNET: it specifically requires TCP/IP. The worm _d_o_e_s _n_o_t _i_n_f_e_c_t _S_y_s_t_e_m _V _s_y_s_t_e_m_s unless they have been modified to use Berkeley network programs like _s_e_n_d_m_a_i_l, _f_i_n_g_e_r_d and _r_e_x_e_c. _4. _I_n_t_e_r_n_a_l_s Now for some details: we shall follow the main thread of control in the worm, then examine some of the worm's data struc- tures before working through each phase of activity. 9 9 Tour of the Worm 8 _4._1. _T_h_e _t_h_r_e_a_d _o_f _c_o_n_t_r_o_l When the worm starts executing in _m_a_i_n(), it takes care of some initializations, some defense and some cleanup. The very first thing it does is to change its name to _s_h. This shrinks the window during which the worm is visible in a system status listing as a process with an odd name like _x_9_8_3_4_7_5_3. It then initializes the random number generator, seeding it with the current time, turns off core dumps, and arranges to die when remote connections fail. With this out of the way, the worm processes its argument list. It first looks for an option -_p $$, where $$ represents the process I.D. of its parent process; this option indicates to the worm that it must take care to clean up after itself. It proceeds to read in each of the files it was given as arguments; if cleaning up, it removes each file after it reads it. If the worm wasn't given the bootstrap source file _l_1._c as an argument, it exits silently; this is perhaps intended to slow down people who are experimenting with the worm. If cleaning up, the worm then closes its file descriptors, tem- porarily cutting itself off from its remote parent worm, and removes some files. (One of these files, /_t_m_p/._d_u_m_b, is never created by the worm and the unlinking seems to be left over from an earlier stage of development.) The worm then zeroes out its argument list, again to foil the system status program _p_s. The next step is to initialize the worm's list of network interfaces; these interfaces are used to find local networks and to check for alternate addresses of the current host. Finally, if cleaning up the worm resets its process group and kills the process that helped to bootstrap it. The worm's last act in _m_a_i_n() is to call a function we named _d_o_i_t(), which contains the main loop of the worm. _d_o_i_t() runs a short prologue before actually starting the main loop. It (redundantly) seeds the random number generator with the current time, saving the time so that it can tell how long it has been running. The worm then attempts its first infection. It initially attacks gateways that it found with the _n_e_t_s_t_a_t network status program; if it can't infect one of these hosts, then it checks random host numbers on local networks, then it tries random host numbers on networks that are on the far side of gateways, in each case stopping if it succeeds. (Note that this sequence of attacks differs from the sequence the worm uses after it has entered the main loop.) After this initial attempt at infection, the worm calls the routine _c_h_e_c_k_o_t_h_e_r() to check for another worm already on the local machine. In this check the worm acts as a client to an existing worm which acts as a server; they may exchange ``popula- tion control'' messages, after which one of the two worms will eventually shut down. One odd routine is called just before entering the main loop. We named this routine _s_e_n_d__m_e_s_s_a_g_e(), but it really Tour of the Worm 9 _________________________________________________________________ 9 doit() { _s_e_e_d _t_h_e _r_a_n_d_o_m _n_u_m_b_e_r _g_e_n_e_r_a_t_o_r _w_i_t_h _t_h_e _t_i_m_e _a_t_t_a_c_k _h_o_s_t_s: _g_a_t_e_w_a_y_s, _l_o_c_a_l _n_e_t_s, _r_e_m_o_t_e _n_e_t_s checkother(); send_message(); for (;;) { cracksome(); other_sleep(30); cracksome(); _c_h_a_n_g_e _o_u_r _p_r_o_c_e_s_s _I_D _a_t_t_a_c_k _h_o_s_t_s: _g_a_t_e_w_a_y_s, _k_n_o_w_n _h_o_s_t_s, _r_e_m_o_t_e _n_e_t_s, _l_o_c_a_l _n_e_t_s other_sleep(120); _i_f _1_2 _h_o_u_r_s _h_a_v_e _p_a_s_s_e_d, _r_e_s_e_t _h_o_s_t_s _t_a_b_l_e if (pleasequit && nextw > 10) exit(0); } } ``_C'' _p_s_e_u_d_o-_c_o_d_e _f_o_r _t_h_e _d_o_i_t() _f_u_n_c_t_i_o_n _________________________________________________________________ doesn't send anything at all. It looks like it was intended to cause 1 in 15 copies of the worm to send a 1-byte datagram to a port on the host _e_r_n_i_e._b_e_r_k_e_l_e_y._e_d_u, which is located in the Com- puter Science Department at UC Berkeley. It has been suggested that this was a feint, designed to draw attention to _e_r_n_i_e and away from the author's real host. Since the routine has a bug (it sets up a TCP socket but tries to send a UDP packet), nothing gets sent at all. It's possible that this was a deeper feint, designed to be uncovered only by decompilers; if so, this wouldn't be the only deliberate impediment that the author put in our way. In any case, administrators at Berkeley never detected any process listening at port 11357 on ernie, and we found no code in the worm that listens at that port, regardless of the host. The main loop begins with a call to a function named _c_r_a_c_k_- _s_o_m_e() for some password cracking. Password cracking is an activity that the worm is constantly working at in an incremental fashion. It takes a break for 30 seconds to look for intruding copies of the worm on the local host, and then goes back to cracking. After this session, it forks (creates a new process running with a copy of the same image) and the old process exits; this serves to turn over process I.D. numbers and makes it harder to track the worm with the system status program _p_s. At this 9 Tour of the Worm 10 point the worm goes back to its infectious stage, trying (in order of preference) gateways, hosts listed in system tables like /_e_t_c/_h_o_s_t_s._e_q_u_i_v, random host numbers on the far side of gateways and random hosts on local networks. As before, if it succeeds in infecting a new host, it marks that host in a list and leaves the infection phase for the time being. After infection, the worm spends two minutes looking for new local copies of the worm again; this is done here because a newly infected remote host may try to reinfect the local host. If 12 hours have passed and the worm is still alive, it assumes that it has had bad luck due to networks or hosts being down, and it reinitializes its table of hosts so that it can start over from scratch. At the end of the main loop the worm checks to see if it is scheduled to die as a result of its population control features, and if it is, and if it has done a sufficient amount of work cracking passwords, it exits. _4._2. _D_a_t_a _s_t_r_u_c_t_u_r_e_s The worm maintains at least four interesting data struc- tures, and each is associated with a set of support routines. The _o_b_j_e_c_t structure is used to hold copies of files. Files are encrypted using the function _x_o_r_b_u_f() while in memory, so that dumps of the worm won't reveal anything interesting. The files are copied to disk on a remote system before starting a new worm, and new worms read the files into memory and delete the disk copies as part of their start-up duties. Each structure contains a name, a length and a pointer to a buffer. The func- tion _g_e_t_o_b_j_e_c_t_b_y_n_a_m_e() retrieves a pointer to a named object structure; for some reason, it is only used to call up the bootstrap source file. The _i_n_t_e_r_f_a_c_e structure contains information about the current host's network interfaces. This is mainly used to check for local attached networks. It contains a name, a network address, a subnet mask and some flags. The interface table is initialized once at start-up time. The _h_o_s_t structure is used to keep track of the status and addresses of hosts. Hosts are added to this list dynamically, as the worm encounters new sources of host names and addresses. The list can be searched for a particular address or name, with an option to insert a new entry if no matching entry is found. Flag bits are used to indicate whether the host is a gateway, whether it was found in a system table like /_e_t_c/_h_o_s_t_s._e_q_u_i_v, whether the worm has found it impossible to attack the host for some reason, and whether the host has already been successfully infected. The bits for ``can't infect'' and ``infected'' are cleared every 12 hours, and low priority hosts are deleted, to be accumulated again later. The structure contains up to 12 names (aliases) and up to 6 distinct network addresses for each host. 9 9 Tour of the Worm 11 In our sources, what we've called the _m_u_c_k structure is used to keep track of accounts for the purpose of password cracking. (It was awarded the name _m_u_c_k for sentimental reasons, although _p_w or _a_c_c_t might be more mnemonic.) Each structure contains an account name, an encrypted password, a decrypted password (if available) plus the home directory and personal information fields from the password file. _4._3. _P_o_p_u_l_a_t_i_o_n _g_r_o_w_t_h The worm contains a mechanism that seems to be designed to limit the number of copies of the worm running on a given system, but beyond that our current understanding of the design goals is itself limited. It clearly does not prevent a system from being overloaded, although it does appear to pace the infection so that early copies can go undetected. It has been suggested that a simulation of the worm's population control features might reveal more about its design, and we are interested writing such a simu- lation. The worm uses a client-and-server technique to control the number of copies executing on the current machine. A routine _c_h_e_c_k_o_t_h_e_r() is run at start-up time. This function tries to connect to a server listening at TCP port 23357. The connection attempt returns immediately if no server is present, but blocks if one is available and busy; a server worm periodically runs its server code during time-consuming operations so that the queue of connections does not grow large. After the client exchanges magic numbers with the server as a trivial form of authentica- tion, the client and the server roll dice to see who gets to sur- vive. If the exclusive-or of the respective low bits of the client's and the server's random numbers is 1, the server wins, otherwise the client wins. The loser sets a flag _p_l_e_a_s_e_q_u_i_t that eventually allows it to exit at the bottom of the main loop. If at any time a problem occurs--a read from the server fails, or the wrong magic number is returned--the client worm returns from the function, becoming a worm that never acts as a server and hence does not engage in population control. Perhaps as a pre- caution against a cataleptic server, a test at the top of the function causes 1 in 7 worms to skip population control. Thus the worm finishes the population game in _c_h_e_c_k_o_t_h_e_r() in one of three states: scheduled to die after some time, with _p_l_e_a_s_e_q_u_i_t set; running as a server, with the possibility of losing the game later; and immortal, safe from the gamble of population control. A complementary routine _o_t_h_e_r__s_l_e_e_p() executes the server function. It is passed a time in seconds, and it uses the Berke- ley _s_e_l_e_c_t() system call to wait for that amount of time accept- ing connections from clients. On entry to the function, it tests to see whether it has a communications port with which to accept connections; if not, it simply sleeps for the specified amount of time and returns. Otherwise it loops on _s_e_l_e_c_t(), decrementing its time remaining after serving a client until no more time is Tour of the Worm 12 left and the function returns. When the server acquires a client, it performs the inverse of the client's protocol, eventu- ally deciding whether to proceed or to quit. _o_t_h_e_r__s_l_e_e_p() is called from many different places in the code, so that clients are not kept waiting too long. Given the worm's elaborate scheme for controlling re- infection, what led it to reproduce so quickly on an individual machine that it could swamp it? One culprit is the 1 in 7 test in _c_h_e_c_k_o_t_h_e_r(): worms that skip the client phase become immor- tal, and thus don't risk being eliminated by a roll of the dice. Another source of system loading is the problem that when a worm decides it has lost, it can still do a lot of work before it actually exits. The client routine isn't even run until the newly born worm has attempted to infect at least one remote host, and even if a worm loses the roll, it continues executing to the bottom of the main loop, and even then it won't exit unless it has gone through the main loop several times, limited by its pro- gress in cracking passwords. Finally, new worms lose all of the history of infection that their parents had, so the children of a worm are constantly trying to re-infect the parent's host, as well as the other children's hosts. Put all of these factors together and it comes as no surprise that within an hour or two after infection, a machine may be entirely devoted to executing worms. _4._4. _L_o_c_a_t_i_n_g _n_e_w _h_o_s_t_s _t_o _i_n_f_e_c_t One of the characteristics of the worm is that all of its attacks are active, never passive. A consequence of this is that the worm can't wait for a user to take it over to another machine like gum on a shoe--it must search out hosts on its own. The worm has a very distinct list of priorities when hunting for hosts. Its favorite hosts are gateways; the _h_g() routine tries to infect each of the hosts it believes to be gateways. Only when all of the gateways are known to be infected or infection-proof does the worm go on to other hosts. _h_g() calls the _r_t__i_n_i_t() function to get a list of gateways; this list is derived by running the _n_e_t_s_t_a_t network status program and parsing its output. The worm is careful to skip the loopback device and any local interfaces (in the event that the current host is a gateway); when it finishes, it randomizes the order of the list and adds the first 20 gateways to the host table to speed up the initial searches. It then tries each gateway in sequence until it finds a host that can be infected, or it runs out of hosts. After taking care of gateways, the worm's next priority is hosts whose names were found in a scan of system files. At the start of password cracking, the files /_e_t_c/_h_o_s_t_s._e_q_u_i_v (which contains names of hosts to which the local host grants user per- missions without authentication) and /._r_h_o_s_t_s (which contains names of hosts from which the local host permits remote Tour of the Worm 13 privileged logins) are examined, as are all users' ._f_o_r_w_a_r_d files (which list hosts to which mail is forwarded from the current host). These hosts are flagged so that they can be scanned ear- lier than the rest. The _h_i() function is then responsible for attacking these hosts. When the most profitable hosts have been used up, the worm starts looking for hosts that aren't recorded in files. The rou- tine _h_l() checks local networks: it runs through the local host's addresses, masking off the host part and substituting a random value. _h_a() does the same job for remote hosts, checking alter- nate addresses of gateways. Special code handles the ARPAnet practice of putting the IMP number in the low host bits and the actual IMP port (representing the host) in the high host bits. The function that runs these random probes, which we named _h_a_c_k__n_e_t_o_f(), seems to have a bug that prevents it from attacking hosts on local networks; this may be due to our own misunder- standing, of course, but in any case the check of hosts from sys- tem files should be sufficient to cover all or nearly all of the local hosts anyway. Password cracking is another generator of host names, but since this is handled separately from the usual host attack scheme presented here, it will be discussed below with the other material on passwords. _4._5. _S_e_c_u_r_i_t_y _h_o_l_e_s The first fact to face is that Unix was not developed with security, in any realistic sense, in mind... [Dennis Ritchie, ``On the Security of Unix''] This section discusses the TCP services used by the worm to penetrate systems. It's a touch unfair to use the quote above when the implementation of the services we're about to discuss was distributed by Berkeley rather than Bell Labs, but the senti- ment is appropriate. For a long time the balance between secu- rity and convenience on Unix systems has been tilted in favor of convenience. As Brian Reid has said about the break-in at Stan- ford two years ago: ``Programmer convenience is the antithesis of security, because it is going to become intruder convenience if the programmer's account is ever compromised.'' The lesson from that experience seems to have been forgotten by most people, but not by the author of the worm. _4._5._1. _R_s_h _a_n_d _r_e_x_e_c These notes describe how the design of TCP/IP and the 4.2BSD implementation allow users on untrusted and possi- bly very distant hosts to masquerade as users on trusted hosts. [Robert T. Morris, ``A Weakness in the 4.2BSD Unix TCP/IP Software''] Tour of the Worm 14 _R_s_h and _r_e_x_e_c are network services which offer remote com- mand interpreters. _R_e_x_e_c uses password authentication; _r_s_h relies on a ``privileged'' originating port and permissions files. Two vulnerabilities are exploited by the worm--the likel- ihood that a remote machine that has an account for a local user will have the same password as the local account, allowing pene- tration through _r_e_x_e_c, and the likelihood that such a remote account will include the local host in its _r_s_h permissions files. Both of these vulnerabilities are really problems with laxness or convenience for users and system administrators rather than actual bugs, but they represent avenues for infection just like inadvertent security bugs. The first use of _r_s_h by the worm is fairly simple: it looks for a remote account with the same name as the one that is (unsuspectingly) running the worm on the local machine. This test is part of the standard menu of hacks conducted for each host; if it fails, the worm falls back upon _f_i_n_g_e_r, then _s_e_n_d_- _m_a_i_l. Many sites, including Utah, already were protected from this trivial attack by not providing remote shells for pseudo- users like _d_a_e_m_o_n or _n_o_b_o_d_y. A more sophisticated use of these services is found in the password cracking routines. After a password is successfully guessed, the worm immediately tries to penetrate remote hosts associated with the broken account. It reads the user's ._f_o_r_w_a_r_d file (which contains an address to which mail is forwarded) and ._r_h_o_s_t_s file (which contains a list of hosts and optionally user names on those hosts which are granted permission to access the local machine with _r_s_h bypassing the usual password authentica- tion), trying these hostnames until it succeeds. Each target host is attacked in two ways. The worm first contacts the remote host's _r_e_x_e_c server and sends it the account name found in the ._f_o_r_w_a_r_d or ._r_h_o_s_t_s files followed by the guessed password. If this fails, the worm connects to the local _r_e_x_e_c server with the local account name and uses that to contact the target's _r_s_h server. The remote _r_s_h server will permit the connection pro- vided the name of the local host appears in either the /_e_t_c/_h_o_s_t_s._e_q_u_i_v file or the user's private ._r_h_o_s_t_s file. Strengthening these network services is far more problematic than fixing _f_i_n_g_e_r and _s_e_n_d_m_a_i_l, unfortunately. Users don't like the inconvenience of typing their password when logging in on a trusted local host, and they don't want to remember different passwords for each of the many hosts they may have to deal with. Some of the solutions may be worse than the disease--for example, a user who is forced to deal with many passwords is more likely to write them down somewhere. _4._5._2. _F_i_n_g_e_r _g_e_t_s was removed from our [C library] a couple days ago. [Bill Cheswick at AT&T Bell Labs Research, private com- Tour of the Worm 15 munication, 11/9/88] Probably the neatest hack in the worm is its co-opting of the TCP _f_i_n_g_e_r service to gain entry to a system. _F_i_n_g_e_r reports information about a user on a host, usually including things like the user's full name, where their office is, the number of their phone extension and so on. The Berkeley[3] version of the _f_i_n_g_e_r server is a really trivial program: it reads a request from the originating host, then runs the local _f_i_n_g_e_r program with the request as an argument and ships the output back. Unfortunately the _f_i_n_g_e_r server reads the remote request with _g_e_t_s(), a stan- dard C library routine that dates from the dawn of time and which does not check for overflow of the server's 512 byte request buffer on the stack. The worm supplies the finger server with a request that is 536 bytes long; the bulk of the request is some VAX machine code that asks the system to execute the command interpreter _s_h, and the extra 24 bytes represent just enough data to write over the server's stack frame for the main routine. When the main routine of the server exits, the calling function's program counter is supposed to be restored from the stack, but the worm wrote over this program counter with one that points to the VAX code in the request buffer. The program jumps to the worm's code and runs the command interpreter, which the worm uses to enter its bootstrap. Not surprisingly, shortly after the worm was reported to use this feature of _g_e_t_s(), a number of people replaced all instances of _g_e_t_s() in system code with sensible code that checks the length of the buffer. Some even went so far as to remove _g_e_t_s() from the library, although the function is apparently mandated by the forthcoming ANSI C standard[4]. So far no one has claimed to have exercised the finger server bug before the worm incident, but in May 1988, students at UC Santa Cruz apparently penetrated security using a different finger server with a similar bug. The system administrator at UCSC noticed that the Berkeley finger server had a similar bug and sent mail to Berkeley, but the seri- ousness of the problem was not appreciated at the time (Jim Haynes, private communication). One final note: the worm is meticulous in some areas but not in others. From what we can tell, there was no Sun-3 version of the _f_i_n_g_e_r intrusion even though the Sun-3 server was just as vulnerable as the VAX one. Perhaps the author had VAX sources ____________________ 9 [3] Actually, like much of the code in the Berkeley distribu- tion, the _f_i_n_g_e_r server was contributed from elsewhere; in this case, it appears that MIT was the source. [4] See for example Appendix B, section 1.4 of the second edi- tion of _T_h_e _C _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e by Kernighan and Ritchie. 9 Tour of the Worm 16 available but not Sun sources? _4._5._3. _S_e_n_d_m_a_i_l [T]he trap door resulted from two distinct `features' that, although innocent by themselves, were deadly when combined (kind of like binary nerve gas). [Eric Allman, personal communication, 11/22/88] The _s_e_n_d_m_a_i_l attack is perhaps the least preferred in the worm's arsenal, but in spite of that one site at Utah was sub- jected to nearly 150 _s_e_n_d_m_a_i_l attacks on Black Thursday. _S_e_n_d_- _m_a_i_l is the program that provides the SMTP mail service on TCP networks for Berkeley UNIX systems. It uses a simple character- oriented protocol to accept mail from remote sites. One feature of _s_e_n_d_m_a_i_l is that it permits mail to be delivered to processes instead of mailbox files; this can be used with (say) the _v_a_c_a_- _t_i_o_n program to notify senders that you are out of town and are temporarily unable to respond to their mail. Normally this feature is only available to recipients. Unfortunately a little loophole was accidentally created when a couple of earlier secu- rity bugs were being fixed--if _s_e_n_d_m_a_i_l is compiled with the _D_E_B_U_G flag, and the sender at runtime asks that _s_e_n_d_m_a_i_l enter debug mode by sending the _d_e_b_u_g command, it permits senders to pass in a command sequence instead of a user name for a reci- pient. Alas, most versions of _s_e_n_d_m_a_i_l are compiled with _D_E_B_U_G, including the one that Sun sends out in its binary distribution. The worm mimics a remote SMTP connection, feeding in /_d_e_v/_n_u_l_l as the name of the sender and a carefully crafted string as the recipient. The string sets up a command that deletes the header of the message and passes the body to a command interpreter. The body contains a copy of the worm bootstrap source plus commands to compile and run it. After the worm finishes the protocol and closes the connection to _s_e_n_d_m_a_i_l, the bootstrap will be built on the remote host and the local worm waits for its connection so that it can complete the process of building a new worm. Of course this is not the first time that an inadvertent loophole or ``trap door'' like this has been found in sendmail, and it may not be the last. In his Turing Award lecture, Ken Thompson said: ``You can't trust code that you did not totally create yourself. (Especially code from companies that employ people like me.)'' In fact, as Eric Allman says, ``[Y]ou can't even trust code that you did totally create yourself.'' The basic problem of trusting system programs is not one that is easy to solve. _4._6. _I_n_f_e_c_t_i_o_n The worm uses two favorite routines when it decides that it wants to infect a host. One routine that we named _i_n_f_e_c_t() is used from host scanning routines like _h_g(). _i_n_f_e_c_t() first Tour of the Worm 17 checks that it isn't infecting the local machine, an already infected machine or a machine previously attacked but not suc- cessfully infected; the ``infected'' and ``immune'' states are marked by flags on a host structure when attacks succeed or fail, respectively. The worm then makes sure that it can get an address for the target host, marking the host immune if it can't. Then comes a series of attacks: first by _r_s_h from the account that the worm is running under, then through _f_i_n_g_e_r, then through _s_e_n_d_m_a_i_l. If _i_n_f_e_c_t() fails, it marks the host as immune. The other infection routine is named _h_u_1() and it is run from the password cracking code after a password has been guessed. _h_u_1(), like _i_n_f_e_c_t(), makes sure that it's not re- infecting a host, then it checks for an address. If a potential remote user name is available from a ._f_o_r_w_a_r_d or ._r_h_o_s_t_s file, the worm checks it to make sure it is reasonable--it must contain no punctuation or control characters. If a remote user name is unavailable the worm uses the local user name. Once the worm has a user name and a password, it contacts the _r_e_x_e_c server on the target host and tries to authenticate itself. If it can, it proceeds to the bootstrap phase; otherwise, it tries a slightly different approach--it connects to the local _r_e_x_e_c server with the local user name and password, then uses this command inter- preter to fire off a command interpreter on the target machine with _r_s_h. This will succeed if the remote host says it trusts the local host in its /_e_t_c/_h_o_s_t_s._e_q_u_i_v file, or the remote account says it trusts the local account in its ._r_h_o_s_t_s file. _h_u_1() ignores _i_n_f_e_c_t()'s ``immune'' flag and does not set this flag itself, since _h_u_1() may find success on a per-account basis that _i_n_f_e_c_t() can't achieve on a per-host basis. Both _i_n_f_e_c_t() and _h_u_1() use a routine we call _s_e_n_d_w_o_r_m() to do their dirty work[5]. _s_e_n_d_w_o_r_m() looks for the _l_1._c bootstrap source file in its objects list, then it uses the _m_a_k_e_m_a_g_i_c() routine to get a communication stream endpoint (a _s_o_c_k_e_t), a ran- dom network port number to rendezvous at, and a magic number for authentication. (There is an interesting side effect to _m_a_k_e_m_a_g_i_c()--it looks for a usable address for the target host by trying to connect to its TCP _t_e_l_n_e_t port; this produces a charac- teristic log message from the _t_e_l_n_e_t server.) If _m_a_k_e_m_a_g_i_c() was successful, the worm begins to send commands to the remote com- mand interpreter that was started up by the immediately preceding attack. It changes its directory to an unprotected place (/_u_s_r/_t_m_p), then it sends across the bootstrap source, using the UNIX stream editor _s_e_d to parse the input stream. The bootstrap ____________________ 9 [5] One minor exception: the _s_e_n_d_m_a_i_l attack doesn't use _s_e_n_d_w_o_r_m() since it needs to handle the SMTP protocol in addition to the command interpreter interface, but the principle is the same. 9 Tour of the Worm 18 source is compiled and run on the remote system, and the worm runs a routine named _w_a_i_t_h_i_t() to wait for the remote bootstrap to call back on the selected port. The bootstrap is quite simple. It is supplied the address of the originating host, a TCP port number and a magic number as arguments. When it starts, it unlinks itself so that it can't be detected in the filesystem, then it calls _f_o_r_k() to create a new process with the same image. The old process exits, permitting the originating worm to continue with its business. The bootstrap reads its arguments then zeroes them out to hide them from the system status program; then it is ready to connect over the network to the parent worm. When the connection is made, the bootstrap sends over the magic number, which the parent will check against its own copy. If the parent accepts the number (which is carefully rendered to be independent of host byte order), it will send over a series of filenames and files which the bootstrap writes to disk. If trouble occurs, the bootstrap removes all these files and exits. Eventually the transaction completes, and the bootstrap calls up a command interpreter to finish the job. In the meantime, the parent in _w_a_i_t_h_i_t() spends up to two minutes waiting for the bootstrap to call back; if the bootstrap fails to call back, or the authentication fails, the worm decides to give up and reports a failure. When a connection is success- ful, the worm ships all of its files across followed by an end- of-file indicator. It pauses four seconds to let a command interpreter start on the remote side, then it issues commands to create a new worm. For each relocatable object file in the list of files, the worm tries to build an executable object; typically each file contains code for a particular make of computer, and the builds will fail until the worm tries the proper computer type. If the parent worm finally gets an executable child worm built, it sets it loose with the -_p option to kill the command interpreter, then shuts down the connection. The target host is marked ``infected''. If none of the objects produces a usable child worm, the parent removes the detritus and _w_a_i_t_h_i_t() returns an error indication. When a system is being swamped by worms, the /_u_s_r/_t_m_p direc- tory can fill with leftover files as a consequence of a bug in _w_a_i_t_h_i_t(). If a worm compile takes more than 30 seconds, resyn- chronization code will report an error but _w_a_i_t_h_i_t() will fail to remove the files it has created. On one of our machines, 13 MB of material representing 86 sets of files accumulated over 5.5 hours. _4._7. _P_a_s_s_w_o_r_d _c_r_a_c_k_i_n_g A password cracking algorithm seems like a slow and bulky item to put in a worm, but the worm makes this work by being per- sistent and efficient. The worm is aided by some unfortunate Tour of the Worm 19 statistics about typical password choices. Here we discuss how the worm goes about choosing passwords to test and how the UNIX password encryption routine was modified. _4._7._1. _G_u_e_s_s_i_n_g _p_a_s_s_w_o_r_d_s For example, if the login name is ``abc'', then ``abc'', ``cba'', and ``abcabc'' are excellent candidates for passwords. [Grampp and Morris, ``UNIX Operating System Security''] The worm's password guessing is driven by a little 4-state machine. The first state gathers password data, while the remaining states represent increasingly less likely sources of potential passwords. The central cracking routine is called _c_r_a_c_k_s_o_m_e(), and it contains a switch on each of the four states. The routine that implements the first state we named _c_r_a_c_k__0(). This routine's job is to collect information about hosts and accounts. It is only run once; the information it gathers persists for the lifetime of the worm. Its implementa- tion is straightforward: it reads the files /_e_t_c/_h_o_s_t_s._e_q_u_i_v and /._r_h_o_s_t_s for hosts to attack, then reads the password file look- ing for accounts. For each account, the worm saves the name, the encrypted password, the home directory and the user information fields. As a quick preliminary check, it looks for a ._f_o_r_w_a_r_d file in each user's home directory and saves any host name it finds in that file, marking it like the previous ones. We unimaginatively called the function for the next state _c_r_a_c_k__1(). _c_r_a_c_k__1() looks for trivially broken passwords. These are passwords which can be guessed merely on the basis of information already contained in the password file. Grampp and Morris report a survey of over 100 password files where between 8 and 30 percent of all passwords were guessed using just the literal account name and a couple of variations. The worm tries a little harder than this: it checks the null password, the account name, the account name concatenated with itself, the first name (extracted from the user information field, with the first letter mapped to lower case), the last name, and the account name reversed. It runs through up to 50 accounts per call to _c_r_a_c_k_s_o_m_e(), saving its place in the list of accounts and advancing to the next state when it runs out of accounts to try. The next state is handled by _c_r_a_c_k__2(). In this state the worm compares a list of favorite passwords, one password per call, with all of the encrypted passwords in the password file. The list contains 432 words, most of which are real English words or proper names; it seems likely that this list was generated by stealing password files and cracking them at leisure on the worm author's home machine. A global variable _n_e_x_t_w is used to count the number of passwords tried, and it is this count (plus a loss Tour of the Worm 20 in the population control game) that controls whether the worm exits at the end of the main loop--_n_e_x_t_w must be greater than 10 before the worm can exit. Since the worm normally spends 2.5 minutes checking for clients over the course of the main loop and calls _c_r_a_c_k_s_o_m_e() twice in that period, it appears that the worm must make a minimum of 7 passes through the main loop, taking more than 15 minutes[6]. It will take at least 9 hours for the worm to scan its built-in password list and proceed to the next state. The last state is handled by _c_r_a_c_k__3(). It opens the UNIX online dictionary /_u_s_r/_d_i_c_t/_w_o_r_d_s and goes through it one word at a time. If a word is capitalized, the worm tries a lower-case version as well. This search can essentially go on forever: it would take something like four weeks for the worm to finish a typical dictionary like ours. When the worm selects a potential password, it passes it to a routine we called _t_r_y__p_a_s_s_w_o_r_d(). This function calls the worm's special version of the UNIX password encryption function _c_r_y_p_t() and compares the result with the target account's actual encrypted password. If they are equal, or if the password and guess are the null string (no password), the worm saves the cleartext password and proceeds to attack hosts that are con- nected to this account. A routine we called _t_r_y__f_o_r_w_a_r_d__a_n_d__r_h_o_s_t_s() reads the user's ._f_o_r_w_a_r_d and ._r_h_o_s_t_s files, calling the previously described _h_u_1() function for each remote account it finds. ____________________ 9 [6] For those mindful of details: The first call to _c_r_a_c_k- _s_o_m_e() is consumed reading system files. The worm must spend at least one call to _c_r_a_c_k_s_o_m_e() in the second state attacking trivial passwords. This accounts for at least one pass through the main loop. In the third state, _c_r_a_c_k_s_o_m_e() tests one pass- word from its list of favorites on each call; the worm will exit if it lost a roll of the dice and more than ten words have been checked, so this accounts for at least six loops, two words on each loop for five loops to reach 10 words, then another loop to pass that number. Altogether this amounts to a minimum of 7 loops. If all 7 loops took the maximum amount of time waiting for clients, this would require a minimum of 17.5 minutes, but the 2-minute check can exit early if a client connects and the server loses the challenge, hence 15.5 minutes of waiting time plus runtime overhead is the minimum lifetime. In this period a worm will attack at least 8 hosts through the host infection rou- tines, and will try about 18 passwords for each account, attack- ing more hosts if accounts are cracked. 9 Tour of the Worm 21 _4._7._2. _F_a_s_t_e_r _p_a_s_s_w_o_r_d _e_n_c_r_y_p_t_i_o_n The use of encrypted passwords appears reasonably secure in the absence of serious attention of experts in the field. [Morris and Thompson, ``Password Security: A Case History''] Unfortunately some experts in the field have been giving serious attention to fast implementations of the UNIX password encryption algorithm. UNIX password authentication works without putting any readable version of the password onto the system, and indeed works without protecting the encrypted password against reading by users on the system. When a user types a password in the clear, the system encrypts it using the standard _c_r_y_p_t() library routine, then compares it against a saved copy of the encrypted password. The encryption algorithm is meant to be basically impossible to invert, preventing the retrieval of pass- words by examining only the encrypted text, and it is meant to be expensive to run, so that testing guesses will take a long time. The UNIX password encryption algorithm is based on the Federal Data Encryption Standard (DES). Currently no one knows how to invert this algorithm in a reasonable amount of time, and while fast DES encoding chips are available, the UNIX version of the algorithm is slightly perturbed so that it is impossible to use a standard DES chip to implement it. Two problems have been mitigating against the UNIX implemen- tation of DES. Computers are continually increasing in speed-- current machines are typically several times faster than the machines that were available when the current password scheme was invented. At the same time, ways have been discovered to make software DES run faster. UNIX passwords are now far more suscep- tible to persistent guessing, particularly if the encrypted pass- words are already known. The worm's version of the UNIX _c_r_y_p_t() routine ran more than 9 times faster than the standard version when we tested it on our VAX 8600. While the standard _c_r_y_p_t() takes 54 seconds to encrypt 271 passwords on our 8600 (the number of passwords actually contained in our password file), the worm's _c_r_y_p_t() takes less than 6 seconds. The worm's _c_r_y_p_t() algorithm appears to be a compromise between time and space: the time needed to encrypt one password guess versus the substantial extra table space needed to squeeze performance out of the algorithm. Curiously, one performance improvement actually saves a little space. The traditional UNIX algorithm stores each bit of the password in a byte, while the worm's algorithm packs the bits into two 32-bit words. This per- mits the worm's algorithm to use bit-field and shift operations on the password data, which is immensely faster. Other speedups include unrolling loops, combining tables, precomputing shifts and masks, and eliminating redundant initial and final permuta- tions when performing the 25 applications of modified DES that Tour of the Worm 22 the password encryption algorithm uses. The biggest performance improvement comes as a result of combining permutations: the worm uses expanded arrays which are indexed by groups of bits rather than the single bits used by the standard algorithm. Matt Bishop's fast version of _c_r_y_p_t() does all of these things and also precomputes even more functions, yielding twice the perfor- mance of the worm's algorithm but requiring nearly 200 KB of ini- tialized data as opposed to the 6 KB used by the worm and the less than 2 KB used by the normal _c_r_y_p_t(). How can system administrators defend against fast implemen- tations of _c_r_y_p_t()? One suggestion that has been introduced for foiling the bad guys is the idea of shadow password files. In this scheme, the encrypted passwords are hidden rather than pub- lic, forcing a cracker to either break a privileged account or use the host's CPU and (slow) encryption algorithm to attack, with the added danger that password test requests could be logged and password cracking discovered. The disadvantage of shadow password files is that if the bad guys somehow get around the protections for the file that contains the actual passwords, all of the passwords must be considered cracked and will need to be replaced. Another suggestion has been to replace the UNIX DES implementation with the fastest available implementation, but run it 1000 times or more instead of the 25 times used in the UNIX _c_r_y_p_t() code. Unless the repeat count is somehow pegged to the fastest available CPU speed, this approach merely postpones the day of reckoning until the cracker finds a faster machine. It's interesting to note that Morris and Thompson measured the time to compute the old M-209 (non-DES) password encryption algorithm used in early versions of UNIX on the PDP-11/70 and found that a good implementation took only 1.25 milliseconds per encryption, which they deemed insufficient; currently the VAX 8600 using Matt Bishop's DES-based algorithm needs 11.5 milliseconds per encryp- tion, and machines 10 times faster than the VAX 8600 at a cheaper price will be available soon (if they aren't already!). _5. _O_p_i_n_i_o_n_s The act of breaking into a computer system has to have the same social stigma as breaking into a neighbor's house. It should not matter that the neighbor's door is unlocked. [Ken Thompson, 1983 Turing Award Lecture] 9 [Creators of viruses are] stealing a car for the purpose of joyriding. [R H Morris, in 1983 Capitol Hill tes- timony, cited in the New York Times 11/11/88] I don't propose to offer definitive statements on the moral- ity of the worm's author, the ethics of publishing security information or the security needs of the UNIX computing commun- ity, since people better (and less) qualified than I are still copiously flaming on these topics in the various network 9 Tour of the Worm 23 newsgroups and mailing lists. For the sake of the mythical ordi- nary system administrator who might have been confused by all the information and misinformation, I will try to answer a few of the most relevant questions in a narrow but useful way. _D_i_d _t_h_e _w_o_r_m _c_a_u_s_e _d_a_m_a_g_e? The worm did not destroy files, intercept private mail, reveal passwords, corrupt databases or plant trojan horses. It did compete for CPU time with, and even- tually overwhelm, ordinary user processes. It used up limited system resources such as the open file table and the process text table, causing user processes to fail for lack of same. It caused some machines to crash by operating them close to the lim- its of their capacity, exercising bugs that do not appear under normal loads. It forced administrators to perform one or more reboots to clear worms from the system, terminating user sessions and long-running jobs. It forced administrators to shut down network gateways, including gateways between important nation- wide research networks, in an effort to isolate the worm; this led to delays of up to several days in the exchange of electronic mail, causing some projects to miss deadlines and others to lose valuable research time. It made systems staff across the country drop their ongoing hacks and work 24-hour days trying to corner and kill worms. It caused members of management in at least one institution to become so frightened that they scrubbed all the disks at their facility that were online at the time of the infection, and limited reloading of files to data that was verif- iably unmodified by a foreign agent. It caused bandwidth through gateways that were still running after the infection started to become substantially degraded--the gateways were using much of their capacity just shipping the worm from one network to another. It penetrated user accounts and caused it to appear that a given user was disturbing a system when in fact they were not responsible. It's true that the worm could have been far more harmful that it actually turned out to be: in the last few weeks, several security bugs have come to light which the worm could have used to thoroughly destroy a system. Perhaps we should be grateful that we escaped incredibly awful consequences, and perhaps we should also be grateful that we have learned so much about the weaknesses in our systems' defenses, but I think we should share our gratefulness with someone other than the worm's author. _W_a_s _t_h_e _w_o_r_m _m_a_l_i_c_i_o_u_s? Some people have suggested that the worm was an innocent experiment that got out of hand, and that it was never intended to spread so fast or so widely. We can find evidence in the worm to support and to contradict this hypothesis. There are a number of bugs in the worm that appear to be the result of hasty or careless programming. For example, in the worm's _i_f__i_n_i_t() routine, there is a call to the block zero function _b_z_e_r_o() that incorrectly uses the block itself rather than the block's address as an argument. It's also possi- ble that a bug was responsible for the ineffectiveness of the population control measures used by the worm. This could be seen Tour of the Worm 24 as evidence that a development version of the worm ``got loose'' accidentally, and perhaps the author originally intended to test the final version under controlled conditions, in an environment from which it would not escape. On the other hand, there is con- siderable evidence that the worm was designed to reproduce quickly and spread itself over great distances. It can be argued that the population control hacks in the worm are anemic by design: they are a compromise between spreading the worm as quickly as possible and raising the load enough to be detected and defeated. A worm will exist for a substantial amount of time and will perform a substantial amount of work even if it loses the roll of the (imaginary) dice; moreover, 1 in 7 worms become immortal and can't be killed by dice rolls. There is ample evi- dence that the worm was designed to hamper efforts to stop it even after it was identified and captured. It certainly suc- ceeded in this, since it took almost a day before the last mode of infection (the _f_i_n_g_e_r server) was identified, analyzed and reported widely; the worm was very successful in propagating itself during this time even on systems which had fixed the _s_e_n_d_- _m_a_i_l debug problem and had turned off _r_e_x_e_c. Finally, there is evidence that the worm's author deliberately introduced the worm to a foreign site that was left open and welcome to casual out- side users, rather ungraciously abusing this hospitality. He apparently further abused this trust by deleting a log file that might have revealed information that could link his home site with the infection. I think the innocence lies in the research community rather than with the worm's author. _W_i_l_l _p_u_b_l_i_c_a_t_i_o_n _o_f _w_o_r_m _d_e_t_a_i_l_s _f_u_r_t_h_e_r _h_a_r_m _s_e_c_u_r_i_t_y? In a sense, the worm itself has solved that problem: it has pub- lished itself by sending copies to hundreds or thousands of machines around the world. Of course a bad guy who wants to use the worm's tricks would have to go through the same effort that we went through in order to understand the program, but then it only took us a week to completely decompile the program, so while it takes fortitude to hack the worm, it clearly is not greatly difficult for a decent programmer. One of the worm's most effec- tive tricks was advertised when it entered--the bulk of the _s_e_n_d_- _m_a_i_l hack is visible in the log file, and a few minutes' work with the sources will reveal the rest of the trick. The worm's fast password algorithm could be useful to the bad guys, but at least two other faster implementations have been available for a year or more, so it isn't very secret, or even very original. Finally, the details of the worm have been well enough sketched out on various newsgroups and mailing lists that the principal hacks are common knowledge. I think it's more important that we understand what happened, so that we can make it less likely to happen again, than that we spend time in a futile effort to cover up the issue from everyone but the bad guys. Fixes for both source and binary distributions are widely available, and anyone who runs a system with these vulnerabilities needs to look into these fixes immediately, if they haven't done so already. 9 9 Tour of the Worm 25 _6. _C_o_n_c_l_u_s_i_o_n It has raised the public awareness to a considerable de- gree. [R H Morris, quoted in the New York Times 11/5/88] This quote is one of the understatements of the year. The worm story was on the front page of the New York Times and other newspapers for days. It was the subject of television and radio features. Even the _B_l_o_o_m _C_o_u_n_t_y comic strip poked fun at it. Our community has never before been in the limelight in this way, and judging by the response, it has scared us. I won't offer any fancy platitudes about how the experience is going to change us, but I will say that I think these issues have been ignored for much longer than was safe, and I feel that a better understanding of the crisis just past will help us cope better with the next one. Let's hope we're as lucky next time as we were this time. _A_c_k_n_o_w_l_e_d_g_m_e_n_t_s No one is to blame for the inaccuracies herein except me, but there are plenty of people to thank for helping to decompile the worm and for helping to document the epidemic. Dave Pare and Chris Torek were at the center of the action during the late night session at Berkeley, and they had help and kibitzing from Keith Bostic, Phil Lapsley, Peter Yee, Jay Lepreau and a cast of thousands. Glenn Adams and Dave Siegel provided good information on the MIT AI Lab attack, while Steve Miller gave me details on Maryland, Jeff Forys on Utah, and Phil Lapsley, Peter Yee and Keith Bostic on Berkeley. Bill Cheswick sent me a couple of fun anecdotes from AT&T Bell Labs. Jim Haynes gave me the run-down on the security problems turned up by his busy little undergrads at UC Santa Cruz. Eric Allman, Keith Bostic, Bill Cheswick, Mike Hibler, Jay Lepreau, Chris Torek and Mike Zeleznik provided many useful review comments. Thank you all, and everyone else I for- got to mention. Matt Bishop's paper ``A Fast Version of the DES and a Pass- word Encryption Algorithm'', 8c91987 by Matt Bishop and the Univer- sities Space Research Association, was helpful in (slightly) parting the mysteries of DES for me. Anyone wishing to under- stand the worm's DES hacking had better look here first. The paper is available with Bishop's _d_e_s_z_i_p distribution of software for fast DES encryption. The latter was produced while Bishop was with the Research Institute for Advanced Computer Science at NASA Ames Research Center; Bishop is now at Dartmouth College (_b_i_s_h_o_p@_b_e_a_r._d_a_r_t_m_o_u_t_h._e_d_u). He sent me a very helpful note on the worm's implementation of _c_r_y_p_t() which I leaned on heavily when discussing the algorithm above. 9 9 Tour of the Worm 26 The following documents were also referenced above for quotes or for other material: _D_a_t_a _E_n_c_r_y_p_t_i_o_n _S_t_a_n_d_a_r_d, FIPS PUB 46, National Bureau of Stan- dards, Washington D.C., January 15, 1977. F. T. Grampp and R. H. Morris, ``UNIX Operating System Secu- rity,'' in the _A_T&_T _B_e_l_l _L_a_b_o_r_a_t_o_r_i_e_s _T_e_c_h_n_i_c_a_l _J_o_u_r_n_a_l, October 1984, Vol. 63, No. 8, Part 2, p. 1649. Brian W. Kernighan and Dennis Ritchie, _T_h_e _C _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e, Second Edition, Prentice Hall: Englewood Cliffs, NJ, 8c91988. John Markoff, ``Author of computer `virus' is son of U.S. Elec- tronic Security Expert,'' p. 1 of the _N_e_w _Y_o_r_k _T_i_m_e_s, November 5, 1988. John Markoff, ``A family's passion for computers, gone sour,'' p. 1 of the _N_e_w _Y_o_r_k _T_i_m_e_s, November 11, 1988. Robert Morris and Ken Thompson, ``Password Security: A Case His- tory,'' dated April 3, 1978, in the _U_N_I_X _P_r_o_g_r_a_m_m_e_r'_s _M_a_n_u_a_l, in the _S_u_p_p_l_e_m_e_n_t_a_r_y _D_o_c_u_m_e_n_t_s or the _S_y_s_t_e_m _M_a_n_a_g_e_r'_s _M_a_n_u_a_l, depending on where and when you got your manuals. Robert T. Morris, ``A Weakness in the 4.2BSD Unix TCP/IP Software,'' AT&T Bell Laboratories Computing Science Technical Report #117, February 25, 1985. This paper actually describes a way of spoofing TCP/IP so that an untrusted host can make use of the _r_s_h server on any 4.2 BSD UNIX system, rather than an attack based on breaking into accounts on trusted hosts, which is what the worm uses. Brian Reid, ``Massive UNIX breakins at Stanford,'' RISKS-FORUM Digest, Vol. 3, Issue 56, September 16, 1986. Dennis Ritchie, ``On the Security of UNIX,'' dated June 10, 1977, in the same manual you found the Morris and Thompson paper in. Ken Thompson, ``Reflections on Trusting Trust,'' 1983 ACM Turing Award Lecture, in the _C_o_m_m_u_n_i_c_a_t_i_o_n_s _o_f _t_h_e _A_C_M, Vol. 27, No. 8, p. 761, August 1984. 9 9