Security Topics

CSC 150 Chapter 13: Selected Security Topics and Case Studies
information resources: An Invitation to Computer Science Third Edition, G. Michael Schneider and Judith L. Gersting, Thompson Course Technology, 2007.
Computer Networking: A Top-Down Approach Featuring the Internet, James Kurose and Keith Ross, Addison-Wesley, 2000.
Reflections on Trusting Trust, Ken Thompson and Dennis Ritchie, Communications of the ACM, August 1984

[ previous | schedule | next ]

Topic Index
[ cryptography and authentication | virus | worm | Trojan horse ]

Network Security: Cryptography

Cryptography: designing ciphers
Cipher: system for transforming plain text into secret code and back through use of an algorithm and key

Symmetric cryptography

a.k.a secret key

Traditional methods fall into this category, which include:

Transposition (re-order symbols)
Substitutions (disguise symbols)
Combination of the two

Requirements:

decryption algorithm exists and is symmetric to encryption
shared and secret key (must be known by both sender and receiver but no one else!)

Example: Caesar cipher

Earliest known encryption.
Substitution method, where letter substituted is x letters further along in the alphabet.
E.g. if 2 letters further on, then a is encrypted as c, b is encrypted as d, ... x as z, y as a and z as b. The key in this case is 2.
The symmetry of decryption is obvious.
How secure is this? How many attempts required to break it?

Example: DES (Data Encryption Standard)

An example of a block cipher -- data are operated on as blocks of bits, regardless of character boundaries.
Developed by IBM, and widely used.
Each 64-bit block of text is encrypted using a 64 bit key (effectively 56 bits because each byte contains a parity bit).
Works by repeated transposition (permutation) of data.
Encryption process is two permutations sandwiched around 16 rounds of applying a specific function, each using a different transposition of the key.
Cracking it requires brute force but this was accomplished in 4 months using a network of computers.
DES has since been strengthened into AES (Advanced)

Asymmetric cryptography

a.k.a public key

most such methods are variation of public key cryptography

revolutionary, eliminates need for both parties to know same key.
Originally developed by Hellman and Diffie at Stanford (1976).
Innovated by Rivest, Shamir and Adelman (RSA), the 2002 Turing award winners.
each party has two keys, one publicly known and one privately held.
Key comes from large number which is the product of two large primes
Can only be broken by brute force, factoring of a very large number into its prime components is very time consuming

Essence of the codes:

Encryption key E is public
Decryption key D is private
Plaintext message to be securely transmitted is M.
D is "very difficult" to deduce from E
E cannot be broken by plaintext attack
D( E(M) ) = M

Procedure for secure transmission:
1. Sender encrypts M using receiver's algorithm and public key E_R
2. Sender transmits E_R(M)
3. Receiver decrypts using its private key: D_R
4. Result is D_R(E_R(M)) = M

Authentication

Guaranteeing that an entity really is who it claims to be.
passwords are simple example of authentication for purposes of gaining system access

digital signature is a type of authentication which uses cryptography

goal: assure message comes from who it says it does
can be accomplished using public key encryption
requires one additional property: E( D(M) ) = M

Example procedure (there are other methods):
1. sender applies its private D_S to M
2. sender transmits D_S(M)
3. receiver applies sender's public E_S to get E_S(D_S(M)) = M
If the message had come from a different sender, last step would fail. Such signatures are legally recognized.

Application

Public key cryptography is very computationally expensive.

A typical application will use public key cryptography for the exchange of DES keys, followed by data communication using DES

These technologies are utilized in:

Digital Certificates, which are issued by trusted Certificate Authorities.
PGP - Pretty Good Privacy, used for secure e-mail transmission
SSL (secure sockets layer) -- a secure TCP transport service used for secure HTTP transmission (https)
many, many more

Network Security: Viruses and Worms

Viruses, Worms and Trojan Horses

Virus requires host, and uses host facilities to replicate itself. Prior to mid-90s, most were spread by diskette. Now, most are spread by email.
Worm is stand-alone and actively seeks out its victims.
Distinction between the two is not clearly drawn in the literature. Generally a worm is autonomous, i.e., does not require any action by its victims in order to replicate and spread.
Trojan horse is trusted program which contains hidden malicious feature. This is not necessarily a network security threat but is a real system security threat.

Following are 4 case studies:

The “I Love You” virus of May 2000.
The Morris Internet Worm of November 1988.
The Slammer Internet Worm of January 2003.
Ken Thompson’s C compiler Trojan Horse described in 1984 lecture.

The “I Love You” virus of May 2000.

McAfee calls it a “worm with virus qualities”. It is a VBScript virus which arrives attached to an email message with this format:

Subject "ILOVEYOU"
Message "kindly check the attached LOVELETTER coming from me."
Attachment "LOVE-LETTER-FOR-YOU.TXT.vbs"

Opening the attachment will cause it to run. Here is a summary of what it does:
1. It makes a few copies of itself in Windows system folders.
2. It also modifies the Windows system registry in order to run the virus at system startup.
3. It then searches all drives connected to the system and :
a. replaces JPEG files with copies of itself and it adds the extension .VBS to the original filename.
b. replaces files having certain other file name extensions with copies of itself and changes the extension to .VBS.
c. makes MP3 files hidden without destroying them, and makes copies of itself with the same filenames only adding extension .VBS to the original filename.
4. It then sends copies of itself to everyone in the system’s Microsoft Outlook’s address book. This is how it propagates.
5. Tries to download and install a password stealing program called WIN-BUGSFIX.EXE, This program will email captured passwords to an address in the Philippines.

Worms

Here are two classic Internet Worm cases. The first occurred in 1988 and the second in 2003. The 1988 Morris Internet Worm was the first great Internet worm. It epitomizes worm behavior and exploitation through buffer overflow. The 2003 Slammer Internet worm is more recent but no more interesting. It demonstrates that, sadly, we still haven't learned our lessons and Internet services are still susceptible to buffer overflow attacks. The Morris worm will be covered in greater detail.

Morris' Internet Worm of November 1988.

This account is digested from several web pages, including one containing the source code for the worm! Do a google search on Morris Internet Worm and you'll get lots of information. There is an especially comprehensive account at http://www.snowplow.org/tom/worm/worm.html

On November 2, 1988, Robert Morris, Jr., a graduate student in Computer Science at Cornell, wrote an experimental self-propagating program called a worm and injected it into the Internet.

It spread very rapidly. Due to a bug in the worm, it also spawned many new processes on each infected system, slowing system response to a crawl. The Internet bogged down, and many system administrators simply disconnected.

Researchers at U.C. Berkeley and other institutions “reverse engineered” the worm’s code and developed a fix within a few hours. The damage, however, was already widespread.

Morris was convicted of violating the computer Fraud and Abuse Act, and sentenced to three years of probation, 400 hours of community service, and a fine of $10,000. Ironically, his father was chief scientist at the National Computer Security Center at the time of the worm attack.

Significance: first wake-up call for Internet users that security measures need to be taken.

The worm attacked several “holes” in BSD Unix, which was developed at U.C. Berkeley and distributed free to educational institutions. Most machines connected to the Internet at that time ran BSD Unix.

I will describe how the worm attacked one hole in particular.

One mode of attack was through a bug in BSD Unix fingerd

finger is a Unix command to request public information about a user
e.g. finger pete for info about user pete on local system
e.g. finger pete@csc.smsu.edu for info about user pete on remote system
the second example triggers connection between local (client) and remote (server)
fingerd is the name of the server program that handles finger requests. “d” is short for daemon, a process that awakens periodically to handle service requests.

When finger for remote user is issued, client-server connection established.
Client follows this sequence:
1. sends user name (e.g. pete) to server.
2. Waits for user info from server.
3. When user info received, closes connection.

Server (fingerd) follows this sequence:
1. Waits for user name from client.
2. When user name received, stores it into a 512-byte buffer (array)
3. runs its finger program to get user info
4. sends user info back to the client
5. closes connection

Here’s what Morris’ worm did:
1. issued a finger command directed to a targeted remote machine.
2. This established a client-server connection.
3. Instead of sending user name, it sent a 536- byte long string especially crafted to overflow the server’s 512-byte buffer.
4. The string was crafted so the overflow replaced the code to run finger (server step 3) with the code to run sh, the command interpreter.
5. Thus, when the server got to its step 3, it started up a command interpreter.
6. Because the client-server connection was still active, the command interpreter expected to get its input from that connection instead of from a keyboard.
7. The client (worm) issued the necessary commands to transmit the worm files to the server (targeted) machine then run the worm on that machine. Self-propagation!

Postscript: the possibility of "buffer overflow" attack remains to this day...just read on.

The Slammer Internet Worm of January 2003

My source for this account is the July 2003 Wired magazine article "Slammed! An inside view of the worm that crashed the Internet in 15 minutes." They caught some grief for publishing the disassembled code of the worm but it is instructive.

Just after midnight on January 25, 2003, the Internet was overwhelmed in 15 minutes by UDP packets targeting Microsoft's SQL Server 2000. The worm took advantage of a buffer overflow vulnerability.

Here's how the worm was constructed and operated

The vulnerability: SQL Server offers a SQL Server Monitor service on UDP port 1434. This service expects a request packet to begin with a one byte service number and the "04" service in question then expects a maximum 16 byte long database name. Since names can vary in length, the server expects it to be "null terminated" -- the name must end with a byte of 0's.
The Buffer Overflow: Instead of a null-terminated database name, Slammer provides a long sequence of "01" bytes (00000001 in binary) followed by a sequence of binary-encoded instructions. By the time the server sees the null byte the packet contents have overflowed the buffer and into the adjacent runtime stack entry.
Taking Control: The overflow part of the packet contains not data but instructions. The runtime stack entry into which they flow, controls what code the computer will execute next. After the server reads the packet, it then looks to the runtime stack entry to determine what to do next. But this has been corrupted by the overflow, which directs the computer to execute its own instructions. The Worm now has control of the computer.
Replication: The Worm reads the 32-bit system clock and uses that as a 32-bit IP address. It generates a UDP packet replicating the packet it arrived in and sends it off to the host at that IP address. It then repeatedly permutes the 32-bit IP address and with each permutation fires off another worm packet. I do not know what, if anything, causes it to stop doing this. After a few minutes the Internet was so congested the packets could no longer be transmitted anyway.

UDP is a prime carrier for such an attack because each packet travels the Internet on its own and arrives at the receiving host unexpected. By contrast, the more robust TCP protocol establishes a client-server connection prior to sending any data packets. TCP has its own vulnerabilities, but that's a different story...

As far as I've been able to determine, the author and source of Slammer have never been identified.

Ken Thompson's C compiler Trojan Horse

Ken Thompson and Dennis Ritchie developed the Unix operating system at Bell Labs. They were later given an ACM Turing Award for this accomplishment. Ken Thompson presented this lecture at the award ceremony. In it, he describes how he planted a Trojan Horse into an early version of the C compiler (which he helped develop in conjunction with Unix). The lecture is published as "Reflections on Trusting Trust" by Dennis Ritchie and Ken Thompson, Communications of the ACM, August 1984.

Since Unix was written mostly in C, the operating system itself needed to be compiled during the installation process.
He bugged the C compiler so it would recognize the section of Unix source code that processes logins. When this code was detected, the compiler would generate machine code which allowed him (Thompson) to log into the system as any user -- a Trojan Horse.
Yet if one inspected the source code of the compiler itself there would be no trace of the Trojan Horse!

How did he do this? Hang onto your hats….

Three concepts are required.
1. Compiler is written in the same language that it compiles.
2. learning compiler.
3. self-reproducing code.

1. Compiler is written in the same language that it compiles.
C compiler is written in C. Classic chicken-and-egg, right? Kinda.

Start with a bare-bones compiler (a bootstrap compiler, if you will) written in a different language, which compiles a small subset of C.
Write a small C compiler using the subset of C that this bootstrap compiler recognizes.
Compile it using the bootstrap compiler and now you have a small C compiler written in C (version 0.1).
Add to Version 0.1 the code to recognize and compile a couple additional features. Compile it using itself to produce Version 0.2.
Version 0.2 recognizes the additional features, so you can use them when writing Version 0.3 of the compiler, which will compile a couple more additional features.
Continue in this fashion until the entire language is compilable.

2. Learning compiler
This is described in general by steps 4 and 5 above. The example given in the lecture, along with code, is this: the compiler recognizes ‘\n’ as the newline character but does not recognize ‘\t’ as tab character. Teach it to do so. The example source code is in the C compiler. I will take some poetic license with C syntax to make the code more understandable.

// Version X. recognizes ‘\\’ (backslash) and ‘\n’ (newline)
char c;
c = next( ); // get next character from program being compiled
if ( c == ‘\’) { // first character of ‘\n’ or ‘\\’
    c = next( );
    if (c == ‘\’) // it is ‘\\’ which compiles into backslash
       return (‘\\’);
    if (c == ‘n’) // it is ‘\n’ which compiles into newline
       return (‘\n’);
}

// Version X+1. Teach to recognize ‘\t’ (tab) ASCII code 9
char c;
c = next( );
if ( c == ‘\’) {
    c = next( );
    if (c == ‘\’) // it is ‘\\’ which compiles into backslash
       return (‘\\’);
    if (c == ‘n’) // it is ‘\n’ which compiles into newline
       return (‘\n’);
    if (c == ‘t’) // it is ‘\t’ which compiles into ASCII 9
       return (9);
}

// Version X+2. Recognizes ‘\\’, ‘\n’ and ‘\t’
char c;
c = next( ); // get next character from program being compiled
if ( c == ‘\’) { // first character of ‘\n’ or ‘\\’ or ‘\t’
    c = next( );
    if (c == ‘\’) // it is ‘\\’ which compiles into backslash
       return (‘\\’);
    if (c == ‘n’) // it is ‘\n’ which compiles into newline
       return (‘\n’);
    if (c == ‘t’) // it is ‘\t’ which compiles into tab
       return (‘\t’);
}

The remarkable thing is this: once the compiler is taught, in Version X+1, that ‘\t’ compiles into ASCII code 9, we can remove that fact from the compiler in Version X+2 yet the compiler will forever after know that ‘\t’ compiles into 9!

3. Self-reproducing code
This is a program whose output, when run, is an exact copy of its own source code! Trust me that this can be done, The lecture includes an example.

How are these 3 concepts applied to plant this untraceable Trojan Horse?

Again, I will take some poetic license with C syntax to make the code more understandable.

1. Suppose the compiler includes a function which compiles one line of code. Its definition would look something like this:

void compile(String lineOfCode) {
. . .
}

2. Bug the C compiler to recognize a specific line of code and replace it with something else.

void compile(String lineOfCode) {
   if (lineOfCode == “pattern”) {
      compile(“bug”);
      return;
   }
. . .
}

where “pattern” represents a specific line of source code from the Unix login command, and “bug” represents source code of the Trojan Horse, which allows Thompson to login as anyone. The match can only occur when compiling the Unix operating system. This is not too hard to do, but cannot be left in the compiler because it will be discovered. We have to go further.

3. Incorporate a second similar bug into the C compiler, shown here as the second IF statement.

void compile(String lineOfCode) {
   if (lineOfCode == “pattern”) {
      compile(“bug”);
      return;
   }
   if (lineOfCode == “pattern2”) {
      compile(“bug2”);
      return;
   }
. . .
}

where “pattern” and “bug” are the same as before. “pattern2” is a line of code that exists only in the C compiler (thus this match only kicks in when compiling the compiler). “bug2” is a self-reproducing program which will insert both Trojan Horses into the compiled code.

4. The initial C compiler machine code is clean; use it to compile the “dirty” source version from #3. Afterward, both the source code and the machine code of the C compiler are “dirty”.

5. Edit the C compiler source code to remove the two IF statements. The source code is once again clean, as in #1.

6. Compile the newly-cleansed C compiler source code using the dirty C compiler. The dirty C compiler will still match the “pattern2” code because we did not remove it. When this match occurs, the dirty compiler will compile “bug2”, which again inserts both Trojan Horses into the compiler machine code! The machine code of the compiler is forever bugged, unless someone “reverse engineers” it and discovers the skullduggery.

Even if you don’t understand all the nuances of the steps above, understand this: the code that inserted the Trojan Horse into the C compiler is written into the compiler source code one time and then removed after compiling it. Thereafter, any time the compiler is compiled, the Trojan Horse will be propagated to the new version because the code was self-reproducing!

Thompson’s moral of the story: You can’t trust code that you did not totally create yourself.

[ C SC 150 | Peter Sanderson | Math Sciences server | Math Sciences home page | Otterbein ]

Last updated:
Peter Sanderson (PSanderson@otterbein.edu)