Chapter 7 Hidden Files and Antiforensics

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

It should be no surprise to you that criminals wish to hide evidence. There are several ways they can do this. One of the more obvious methods involves encrypting data, photos, or other evidence. There are also techniques for hiding evidence in other files (steganography) and simply altering logs to hide the evidence of a breach in security. Some criminals will even opt to completely wipe the suspect drive to prevent evidence from being gathered. It is important for any forensic analyst to have a good understanding of these methods.

Cryptography

The first thing to learn about cryptography is the terminology. Many people mistakenly use terms like cryptography and cryptology interchangeably. Cryptography is the study of methods for encrypting and decrypting a message. Cryptanalysis is the study of methods to break cryptography. Cryptology includes both cryptography and cryptanalysis.

An algorithm is the process or steps needed to accomplish some task. In cryptography, another term for a cryptographic algorithm is a cipher. The text you wish to encrypt is called plain text; the numeric input needed to make the cipher work is called the key. And finally, the output of a cryptographic algorithm is called cipher text.

Many people avoid studying cryptography, believing it is too hard…too much math. Well, yes, there is some math. And if you want to become a cryptologist, then yes, you will need to have a very good understanding of number theory, discrete math, and a few other mathematical disciplines. However, it does not take very much math at all to gain a fundamental understanding of how cryptography works. I have found the best way to teach cryptography is to begin with historical methods that are simple to grasp. That is how we will start here.

The History of Encryption

Secure communication is not a new concept. From ancient generals to conspirators in the Middle Ages, people have had a need to communicate in such a way that others could not read what they wrote. While in modern times, cryptography means mathematics, and usually computers, in ancient times, the methods were simpler.

Before we begin exploring a few of these methods, keep a few facts in mind. First of all, in ancient times, a significant portion of the population was illiterate. Simply writing something down kept it from most people. The second fact to keep in mind, which is related to that first fact, is that historical methods are not secure today. You cannot use these methods in modern times. A computer would crack any of these within seconds. However, these methods contain the elements of cryptography and make an excellent place to begin our study of cryptography.

The Caesar Cipher

One of the oldest recorded ciphers is the Caesar cipher. This is also one of the most widely known ancient ciphers. It is featured in a number of security-related certifications. Historians claim this was used by Julius Caesar. The method is simple: You choose some number by which to shift each letter of a text. For example, if the text is

A cat

And you choose to shift by three letters, then the message becomes

D gdw

Or, if you choose to shift by one letter to the left, it becomes

Z bzs

In this example, you can choose any shifting pattern you wish. You can shift either to the right or left by any number of spaces you like. Because this is a very simple method to understand, it makes a good place to start our study of encryption. It is, however, extremely easy to crack. You see, any language has a certain letter and word frequency, meaning that some letters are used more frequently than others. In the English language, the most common single-letter word is a, with I a close second. The most common three-letter word is the, with and being a close second. The most common two-letter combinations are oo and ll . Those three rules alone could help you decrypt a Caesar cipher. For example, if you saw a string of seemingly nonsense letters and noticed that a three-letter word was frequently repeated in the message, you might easily surmise that this word was the, and use that to begin decrypting the message.

Furthermore, if you frequently noticed a single-letter word in the text, it is most likely the letter a . You now have found the substitution scheme for a, t, h, and e . You can now either translate all of those letters in the message and attempt to surmise the rest or simply analyze the substitute letters used for a, t, h, and e and derive the substitution cipher that was used for this message. Decrypting a message of this type does not even require a computer. It could be done in less than ten minutes using pen and paper by someone with no background in cryptography.

Images
NOTE When I teach cryptography or security classes in person, I actually have the class do exactly that. Create a message, encrypt it with a Caesar cipher, and then pass it to a classmate to see how long it takes them to break it. Usually, about half the class breaks the cipher in about ten minutes.

Caesar ciphers belong to a class of ciphers known as substitution ciphers. The name simply means that each letter of the plain text is substituted for one letter in the cipher text. The particular substitution scheme used (a becomes c, b becomes d, etc.) is called a substitution alphabet. Because one letter always substitutes for one other letter, the Caesar cipher is sometimes called a monoalphabet substitution method, meaning that it uses a single substitution for the encryption.

ROT 13

This is another single-alphabet substitution cipher. It is very much like the Caesar cipher, except it has a fixed shift. All characters are rotated 13 characters through the alphabet.

The phrase

A CAT

Becomes

N PNG

ROT 13 is a single-substitution cipher.

Scytale

This was a cylinder used by ancient Greeks, particularly the Spartans. You would wrap a leather strap around a cylinder and create the message. The recipient needed to have a cylinder of the exact same diameter to read the message.

Atbash Cipher

This cipher was used by ancient Hebrew scholars. It involves substituting the first letter of the alphabet for the last and the second letter for the second to the last, etc. It simply reverses the alphabet. While they used Hebrew, we can see how this would work in English:

A becomes Z, B becomes Y, C becomes X, etc.

This is a single-substitution cipher, just like ROT 13 and Caesar.

Multialphabet Substitution

Eventually, an improvement in single-substitution ciphers was developed, called multialphabet substitution. In this scheme, you select multiple numbers by which to shift letters (i.e., multiple substitution alphabets). For example, if you select three substitution alphabets (+1, –1, +2), this means you shift the first letter right one, the second letter left one, then the third letter right two, and then repeat. The fourth letter is shifted right one, the fifth left one, and the sixth shifted right by two. Thus

A CAT

becomes

B BCU

Notice that this disrupts the letter and word frequency. At one point in our short message, A becomes B; in another, it becomes C; and since we have three substitution alphabets, it could, in a longer message, become Z. This makes it more difficult to decipher the underlying text.

Vigenere Cipher

One of the most widely known multialphabet ciphers was the Vigenere cipher. This cipher was invented in 1553 by Giovan Battista Bellaso. It is a method of encrypting text by using a series of different monoalphabet ciphers selected based on the letters of a keyword. This algorithm was later misattributed to Blaise de Vigenère, and so it is now known as the “Vigenére cipher,” even though Vigenére did not really invent it.

You use the table shown in Figure 7-1, along with a keyword you have selected. Match the letter of your keyword on the top with the letter of your plain text on the left to find the cipher text. For example if the plain text letter is E and the keyword letter is F, the cipher text will be J, as shown in Figure 7-1.

Images

Figure 7-1 The Vigenere cipher

Using the previous chart, if you are encrypting the word “cat” and your keyword is “horse,” then the cipher text is “jok.”

Some sources view the Vigenere cipher as just a series of Caesar ciphers, and that is a reasonably accurate view.

Images
EXAM TIP The CCFP (as well as many other security certification tests) is most likely to ask you about Caesar and Vigenere, if you get any questions about historical ciphers. So make sure you fully understand these two.

Transposition Ciphers

In addition to substituting, one can use transposition. With transposition, you simply move the text around. For example, you might decide to take every three-letter block and switch it with the next block. So the phrase “I like single malt scotch” becomes

Kes ili gle sin tso mal tch

The most widely known transposition cipher was the rail fence cipher. With this cipher, you would write out your plain text on two lines, then combine the lines to make the cipher text. For example, the plain text message “attack at dawn” is written out on two rows

And the cipher text becomes

Atcadwtaktan

Enigma Machine

Most readers have heard of this machine. It was a polyalphabet cipher typewriter. It had 26 different substitution alphabets, and each time a user typed a key, the machine would rotate to the next alphabet. This was the height of cryptographic technology during World War II.

Images
NOTE There are many more historical ciphers, and more to learn about cryptology in general, than we can cover in this book. If you are fascinated by this topic, you might find www.CryptoCorner.com to be a good place to start.

Modern Cryptography

After reading the first portion of this chapter, you should have a pretty good understanding of historical cryptography. It is now time for us to move on to discuss modern cryptography. Modern cryptography is split into two main branches: symmetric and asymmetric. Symmetric cryptography means that the same key is used to decrypt a message as was used to encrypt it. With asymmetric cryptography, the key used to encrypt a message cannot decrypt it; you need a second key.

Symmetric cryptography can be further broken down into two subgroups block ciphers and stream ciphers. With block ciphers, the plain text is divided into blocks (usually 64 or 128 bits) and each block is encrypted. With stream ciphers, the plain text is encrypted in a stream, one bit at a time. All modern block ciphers include binary operations.

Binary Operations

If you have a background in computer science, particularly programming, you should be familiar with binary numbers and binary operations. But for those readers not familiar with them, a brief explanation follows.

First, a binary number is just a number written in base 2. You are familiar with base 10 numbers. In base 10, you don’t really have a number 10—you have 0 through 9. After 9, the next number is really saying “a 1 in the tens place and a 0 in the ones place.” The same is true with base 2. You don’t really have a number 2. Instead, 10 means “a one in the twos place and a 0 in the ones place.” This chart might clarify that a bit:

Images

When working with binary numbers, there are three operations not found in normal math: AND, OR, and XOR operations. Each is illustrated next:

• AND To perform the AND operation, you take two binary numbers and compare them one place at a time. If both numbers have a 1 in both places, then the resultant number is a 1. If not, then the resultant number is a 0, as you see here:

Images

• OR The OR operation checks to see whether there is a 1 in either or both numbers in a given place. If so, then the resultant number is 1. If not, the resultant number is 0, as you see here:

Images

• XOR The XOR operation affects your study of encryption the most. It checks to see whether there is a 1 in a number in a given place, but not in both numbers at that place. If it is in one number but not the other, then the resultant number is 1. If not, the resultant number is 0, as you see here:

Images

XORing has a very interesting property in that it is reversible. If you XOR the resultant number with the second number, you get back the first number. And if you XOR the resultant number with the first number, you get the second number.

Images

The fact that the XOR operation is reversible makes it useful in cryptography. One could argue that simply converting text to binary format (first converting it to ASCII codes and then the binary equivalents) and XORing it with a random stream of bits would constitute encryption, and technically, that would be correct. However, that XOR alone would be very weak. But it is actually a part of many modern symmetric ciphers.

Symmetric Encryption

As we mentioned previously, symmetric encryption refers to those methods where the same key is used to encrypt and decrypt the plain text.

Data Encryption Standard

Data Encryption Standard (DES) was developed by IBM in the early 1970s. DES uses a symmetric key system. Recall from our earlier discussion that this means the same key is used to encrypt and decrypt the message. DES uses short keys and relies on complex procedures to protect its information. The actual algorithm is quite complex. The basic concept, however, is as follows (Federal Information Processing Standards, 1993):

1. The data is divided into 64-bit blocks, and those blocks are then transposed.

2. Transposed data is then manipulated by 16 separate steps of encryption, involving substitutions, bit-shifting, and logical operations using a 56-bit key.

3. The data is then further scrambled using a swapping algorithm.

4. Finally, the data is transposed one last time.

More information about DES is available at the Federal Information Processing Standards website at www.itl.nist.gov/fipspubs/fip46-2.htm.

DES is no longer considered secure due to the short key. A computer can attempt a brute-force attack, which means simply trying every possible key. With a 56-bit key, that means 2^56 possible keys. That means 72,057,594,037,927,936 possible keys. That may seem impossible, but remember that computers can attempt a lot of keys, very fast, and it is unlikely you will have to try all of them before you get a match. But assuming you do, and assuming the computer trying to break DES can try 1 billion keys a second, it would take 2.2 years to try all the possible keys. There is a principle in cryptography based on the Birthday paradox. The math behind this is a bit more than we can cover here, but put simply, the Birthday paradox asks how many people would need to be in a room to have a strong likelihood (not a guarantee, just a strong probability) of two of those people having the same birth month/day (but not year). You might think that it would take 365, and for a guaranteed match, you would be right. But just to have a good chance of a match, you would only need 23, or the square root of 365. Applying that to DES, you have a good chance of having a match on the key after trying only 84,886,744 random keys. And as you can see, if our computer can try 1 billion keys a second, then it is likely to get a match in only one second!

One variation of DES is called triple-DES or 3-DES. In 3-DES, you get three separate keys and encrypt the plain text three times with DES, each time using a different key. First you encrypt the plain text with key 1, then encrypt that result with key 2, and then encrypt that result with key 3. There are variations of 3-DES that actually use only two keys.

Images
EXAM TIP Though DES is outdated, CCFP and other security certifications like to ask questions about it.

Feistel Ciphers

DES is part of a class of ciphers called Feistel ciphers. This function is named after its inventor, the German-born physicist and cryptographer Horst Feistel.

At the heart of many block ciphers is a Feistel function. This makes it one of the most influential developments in symmetric block ciphers. It is also known as a Feistel network. Figure 7-2 shows how it works.

Images

Figure 7-2 The Feistel cipher

1. This function starts by splitting the block of plain text data (often 64 bits) into two parts (traditionally termed L₀ and R₀).

2. The round function F is applied to one of the halves. The term “round function” simply means a function performed with each iteration, or round, of the Feistel cipher. The details of the round function F can vary with different implementations. Usually, these are relatively simple functions to allow for increased speed of the algorithm.

3. The output of each round function F is then XORed with the other half. What this means is that, for example, you take L₀, pass it through the round function F, then take the result and XOR it with R₀.

4. Then the halves are transposed. L₀ is moved to the right and R₀ is moved to the left.

5. This process is repeated a given number of times. The main difference between cryptography algorithms is the exact nature of the round function F and the number of iterations.

Images
NOTE Many symmetric ciphers, including DES, GOST, and AES, use what is called an s-box. That is short for “substitution box.” Basically, a certain number of bits are input into the box and substituted for something else. So, for example, 1101 might come into the box and the output be 0110. There are many variations of the s-box, such as p-boxes (or permutation boxes).

Blowfish

Blowfish is a symmetric block cipher. This means that it uses a single key to both encrypt and decrypt the message and works on “blocks” of the message at a time. It uses a variable-length key ranging from 32 to 448 bits. Blowfish was designed in 1993 by Bruce Schneier. It has been analyzed extensively by the cryptography community and has gained wide acceptance. It is also a noncommercial (i.e., free of charge) product, thus making it attractive to budget-conscious organizations.

AES

AES stands for Advanced Encryption Standard. This standard uses the Rijndael algorithm. This algorithm was part of a contest seeking a replacement for DES. Fifteen different algorithms made it to the second-to-last round, then five in the last round, and ultimately, Rijndael won out.

The AES specifies three key sizes: 128, 192, and 256 bits. By comparison, DES keys are 56 bits long, and Blowfish allows varying lengths up to 448 bits. AES uses a block cipher. Interested readers can find detailed specifications for this algorithm, including a detailed discussion of the mathematics, at http://csrc.nist.gov/archive/aes/rijndael/Rijndael-ammended.pdf.

This algorithm is widely used and is considered very secure, and therefore, is a good choice for many encryption scenarios.

IDEA Encryption

IDEA is another block cipher. The acronym IDEA stands for International Data Encryption Algorithm. This particular algorithm works with 64-bit blocks of data, two at a time, and uses a 128-bit key. The procedure is fairly complicated and uses subkeys generated from the key to carry out a series of modular arithmetic and XOR operations on segments of the 64-bit plain text block. The encryption scheme uses a total of 52 16-bit subkeys. These are generated from the 128-bit subkey with the following procedure:

1. The 128-bit key is split into eight 16-bit keys, which are the first eight subkeys.

2. The digits of the 128-bit key are shifted 25 bits to the left to make a new key, which is then split into the next eight 16-bit subkeys.

3. The second step is repeated until the 52 subkeys have been generated. The encryption consists of eight rounds.

GOST

GOST is a DES-like algorithm developed by the Soviets in the 1970s. It was originally classified, but was released to the public in 1994. It uses a 64-bit block and a key of 256 bits. It is a 32-round Feistel cipher.

The round function is as follows:

1. Add the subkey modulo 2.

2. Put the result through s-boxes.

3. Rotate the result 11 bits.

The key schedule involves the following:

1. Divide the 256-bit key into eight 32-bit subkeys.

2. Each subkey is used four times.

This cipher uses s-boxes that take in 4 bits and output 4 bits.

Serpent

Serpent has a block size of 128 bits and can have a key size of 128, 192, or 256 bits, much like AES. The algorithm is also a substitution-permutation network like AES. It uses 32 rounds working with a block of four 32-bit words. Each round applies one of eight 4-bit to 4-bit c-boxes 32 times in parallel. Serpent was designed so that all operations can be executed in parallel.

Skipjack

This algorithm was developed by the National Security Agency (NSA) and was designed for the clipper chip. It was originally classified. The clipper chip was a chip with built-in encryption; however, the decryption key would be kept in a key escrow in case law enforcement needed to decrypt data without the computer owner’s cooperation. This feature made the process highly controversial. Skipjack uses an 80-bit key to encrypt or decrypt 64-bit data blocks. It is an unbalanced Feistel network with 32 rounds. Unbalanced Feistel simply means a Feistel cipher wherein the two halves of plain text for each block are not the same size. For example, a 64-bit block might be divided into a 48-bit half and a 16-bit half rather than two 32-bit halves.

RC 4

Ron Rivest created this algorithm in 1987. The RC stands for Ron’s Cipher. It is the most widely used software stream cipher. The algorithm is used identically for encryption and decryption, as the data stream is simply XORed with the key.

RC 4 uses a variable-length key, from 1 to 256 bytes. That key constitutes a state table that is used for subsequent generation of pseudo-random bytes and then to generate a pseudo-random stream, which is XORed with the plain text to produce the cipher text.

The permutation is initialized with a variable-length key, typically between 40 and 256 bits, using the key-scheduling algorithm (KSA). Once this has been completed, the stream of bits is generated using the pseudo-random generation algorithm (PRGA).

Asymmetric Cryptography

Asymmetric cryptography (also called public key cryptography) is essentially the opposite of single-key encryption. With any public key encryption algorithm, one key (called the public key) is used to encrypt a message and another (called the private key) is used to decrypt the message. You can freely distribute your public key so that anyone can encrypt a message to send to you, but only you have the private key and only you can decrypt the message. This is why asymmetric cryptography is sometimes called public key cryptography.

For some reason, all security books and cryptography books like to use the fictitious characters Alice, Bob, and Eve to explain how asymmetric cryptography works, and I will continue that tradition.

Let’s assume Alice would like to send Bob a message. But Alice is concerned that Eve might eavesdrop (thus her name!) on the communication. With public key/asymmetric cryptography, Alice will get Bob’s public key and use that to encrypt the message she sends to Bob. Now should Eve intercept the message and have access to Bob’s public key, it is okay. That key won’t decrypt the message. Only Bob’s private key will, and this he safeguards. You can see this in Figure 7-3.

Images

Figure 7-3 Alice and Bob use asymmetric crypto

Images
EXAM TIP Expect questions regarding what key Alice or Bob will need to use to encrypt a message and to decrypt it. This is also true for many other certification tests, such as the CISSP.

RSA

The RSA is perhaps the most widely known asymmetric algorithm. This is a public key method developed in 1977 by three mathematicians: Ron Rivest, Adi Shamir, and Len Adleman. The name RSA is derived from the first letter of each mathematician’s last name. The algorithm is based on prime numbers and the difficulty of factoring a large number into its prime factors.

Images
EXAM TIP The CCFP and other security certifications will definitely ask you about RSA. However, you just need to know the basic facts already listed. You don’t need to know the details of the algorithm. In fact, no certification test asks about the details of any of the cryptography algorithms.

While not critical for the CCFP exam, it might help your understanding to look at the RSA algorithm. First, to create the key, you start by generating two large random primes, p and q, of approximately equal size. Now you need to pick two numbers that, when multiplied together, create a product that is the size you want (i.e., 2048 bits, 4096 bits, etc.)

Now multiply p and q to get n.

Let n = pq.

The next step is to multiply Euler’s totient for each of these primes. Now you are probably asking, what is Euler’s totient? Don’t let the name scare you—the concept is pretty simple. First, let’s define the term co-prime. Two numbers are considered co-prime if they have no common factors. For example, if the original number is 8, then 8 and 9 would be co-prime. 8’s factors are 2 and 4; 9’s factors are 3—there is nothing in common (1 is not considered). Well, the famous mathematician Euler (pronounced “oiler”) asked a simple question: Given a number X, how many numbers smaller than X are co-prime to X? We call that number Euler’s totient, or just the totient. It just so happens that for prime numbers, this is always the number minus 1. For example, 7 has 6 numbers that are co-prime to it.

When you multiply two primes together, you get a composite number, and there is no easy way to determine the Euler’s totient of a composite number. Euler noticed something else interesting. If you multiply any two prime numbers together, the Euler’s totient of that product is the Euler’s totient of each prime multiplied together. So our next step is:

Let m = (p – 1)(q – 1)

So basically, m is the Euler’s totient of n.

Now we are going to select another number—we will call this number e . We want to pick e so that it is co-prime to m.

We are almost done generating a key. Now we just find a number d that, when multiplied by e and modulo m, would yield a 1 (Note: Modulo means to divide two numbers and return the remainder. For example, 8 modulo 3 would be 2). In other words:

Find d, such that de % m = 1.

Now you will publish e and n as the public key and keep d and n as the secret key. To encrypt, you simply take your message raised to the e power and modulo n :

= M ^e % n

To decrypt, you take the cipher text and raise it to the d power modulo n :

P = C ^d % n

Now don’t panic. Most people without a background in number theory have to study this for some time before they are comfortable with it. And you don’t need to know the RSA algorithm in depth to pass the CCFP. I presented it here so you would understand how public key/asymmetric cryptography works.

Let’s look at an example that might help you understand. Of course, RSA would be done with very large integers. To make the math easy to follow, we will use small integers in this example, which is from Wikipedia¹:

1. Choose two distinct prime numbers, such as p = 61 and q = 53.

2. Compute n = pq giving n = 61 × 53 = 3233.

3. Compute the totient of the product as φ(n) = (p − 1)(q − 1) giving φ(3233) = (61 − 1)(53 − 1) = 3120.

4. Choose any number 1 <e < 3120 that is co-prime to 3120. Choosing a prime number for e leaves us only to check that e is not a divisor of 3120. Let e = 17.

5. Compute d , the modular multiplicative inverse of yielding d = 2753.

The public key is (n = 3233, e = 17). For a padded plain text message m, the encryption function is m ¹⁷ (mod 3233).

The private key is (n = 3233, d = 2753). For an encrypted cipher text c, the decryption function is c ²⁷⁵³ (mod 3233).

RSA is based on large prime numbers. Now you might think, couldn’t someone take the public key and use factoring to derive the private key? Well, hypothetically, yes. However, it turns out that factoring really large numbers into their prime factors is pretty difficult. There is no efficient algorithm for doing it. And when we say large numbers, RSA can use 1024-, 2048-, 4096-bit, and larger keys. Those make for some huge numbers. Of course, should anyone ever invent an efficient algorithm that will factor a large number into its prime factors, RSA will be dead.

Diffie-Hellman

Diffie-Hellman, which was the first publicly described asymmetric algorithm, is a cryptographic protocol that allows two parties to establish a shared key over an insecure channel. In other words, Diffie-Hellman is often used to allow parties to exchange a symmetric key through some unsecure medium, such as the Internet. It was developed by Whitfield Diffie and Martin Hellman in 1976.

One problem with working in cryptology is that much of the work is classified. You could labor away and create something wonderful that you cannot tell anyone about. Then, to make matters worse, years later, someone else might develop something similar and release it, getting all the credit. This is exactly the situation with Diffie-Hellman. It turns out that a similar method had been developed a few years earlier by Malcolm J. Williamson of the British Intelligence Service, but it was classified.

Images
EXAM TIP CCFP, like so many security certifications, has a few algorithms it focuses on. If you get public key/asymmetric questions on your test, there is a good chance they will be about RSA or Diffie-Hellman.

DSA

Digital Signature Algorithm (DSA) is a patented algorithm (U.S. Patent 5,231,668) invented by David W. Kravitz. It was adopted by the U.S. government in 1993, with FIPS (Federal Information Processing Standard) 186 as the preferred standard for digital signatures. While any asymmetric algorithm can be used for digital signatures, this algorithm was designed specifically for that purpose. This, of course, begs the question of what is a digital signature.

Remember that cryptographic algorithms are about protecting confidentiality, ensuring that only the intended recipient can read the message. Well, digital signatures take asymmetric cryptography and reverse it so that they can protect integrity. Put another way, assume your boss sends you an e-mail telling you that you have done such a great job, he thinks you should take the next week off with pay. It would probably be a good thing to verify that this is legitimate e-mail, really sent from him, and not a prank. What a digital signature does is take the sender’s private key and encrypt either the entire message or a portion of it (like the signature block). Now anyone with the sender’s public key can decrypt that. So let us return to Alice and Bob to see how this works.

Bob wants to send Alice a message and make certain she knows it’s from him. So he signs it with his private key. When Alice uses Bob’s public key, the message decrypts and she can read it. Now suppose that Bob did not really send this. Instead, Eve sent it, pretending to be Bob. Since Eve does not have Bob’s private key, she had to use some other key to sign this message. When Alice tries to decrypt it (i.e., verify the signature) with Bob’s public key, she will get back gibberish, nonsense. This is shown in Figure 7-4.

Images

Figure 7-4 Digital signatures

Elliptic Curve

This algorithm was first described in 1985 by Victor Miller (IBM) and Neal Koblitz (University of Washington). Elliptic curve cryptography (ECC) is based on the fact that finding the discrete logarithm of a random elliptic curve element with respect to a publicly known base point is difficult to the point of being impractical to do. There are a number of variations, such as ECC-DH (ECC Diffie-Hellman) and ECC-DSA (ECC Digital Signature Algorithm). The real strength of ECC base systems is that you can get just as much security with a smaller key than with other systems, like RSA. For example, a 384-bit ECC key is as strong as a 2048-bit RSA key.

Cryptographic Hash

A cryptographic hash function has three properties. The first is that it is one-way. That means it cannot be “unhashed.” The second is that a variable-length input produces a fixed-length output. That means that no matter what size of input you have, you will get the same size output. Finally, there should be few or no collisions. That means if you hash two different inputs, you should not get the same output.

In previous chapters, we have discussed hashing an image of a drive to ensure that nothing was missed or altered when imaging the original drive. This is where the “variable-length input produces fixed-length output” and “few or no collisions” comes in. If you have a terabyte drive and you image it, you want to make sure your image is an exact copy. If you make a hash of the original and the image and compare them, you can verify that everything is copied exactly into the image. If it were not the case that a variable-length input produced a fixed-length output, then you would have hashes that were humongous and took forever to compute. Also, if two different inputs could produce the same output, then you could not verify the image was an exact copy.

MD5

This is a 128-bit hash that is specified by RFC 1321. It was designed by Ron Rivest in 1991 to replace an earlier hash function, MD4. MD5 produces a 128-bit hash or digest. It has been found to not be as collision resistant as SHA.

SHA

The Secure Hash Algorithm (SHA) is perhaps the most widely used hash algorithm today. There are now several versions of it, all of which are considered secure and collision free:

• SHA-1 This is a 160-bit hash function that resembles the earlier MD5 algorithm. This was designed by the NSA to be part of the DSA.

• SHA-2 This is actually two similar hash functions, with different block sizes, known as SHA-256 and SHA-512. They differ in the word size: SHA-256 uses 32-byte (256-bit) words, whereas SHA-512 uses 64-byte (512-bit) words. There are also truncated versions of each standardized known as SHA-224 and SHA-384. These were also designed by the NSA.

• SHA-3 This is the latest version of SHA. It was adopted in October 2012.

RipeMD

RACE Integrity Primitives Evaluation Message Digest (RipeMD) is a 160-bit hash algorithm developed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel. There exist 128-, 256- and 320-bit versions of this algorithm, called RIPEMD-128, RIPEMD-256, and RIPEMD-320, respectively. These all replace the original RIPEMD, which was found to have collision issues. The larger bit sizes make this far more secure than MD5 or RipeMD.

GOST

This hash algorithm was initially defined in the Russian national standard GOST R 34.11-94 Information Technology–Cryptographic Information Security–Hash Function. It produces a fixed-length output of 256 bits. The input message is broken up into chunks of 256-bit blocks. If a block is less than 256 bits, then the message is padded by appending as many zeros to it as are required to bring the length of the message up to 256 bits. The remaining bits are filled up with a 256-bit integer arithmetic sum of all previously hashed blocks, and then a 256-bit integer representing the length of the original message, in bits, is produced. It is based on the GOST block cipher.

Windows Passwords

Verifying images of drives is not the only purpose for hashes. Hashing is how Windows stores passwords. For example, if your password is “password,” then Windows will first hash it, producing something like:

0BD181063899C9239016320B50D3E896693A96DF

It will then store that in the Security Accounts Manager (SAM) file in the Windows System directory. When you log on, Windows cannot “unhash” your password, so what Windows does is take whatever password you type in, hash it, and then compare the result with what is in the SAM file. If they match exactly, then you can log in.

Getting Around Windows Passwords

Now how does this relate to forensics? Well, knowledge of hashing provides a means to circumvent Windows passwords. You may encounter a Windows machine that is suspected of having evidence, and the suspect will not reveal the password. Once you have obtained a warrant to search the drive, you can use the method described here to circumvent the password.

In 1980, Martin Hellman described a cryptanalytic technique that reduces the time of cryptanalysis by using precalculated data stored in memory. This technique was improved by Rivest before 1982. Basically, these types of password crackers are working with precalculated hashes of all passwords available within a certain character space, be that a–z or a–zA–z or a–zA–Z0–9, etc. This is called a rainbow table. If you search a rainbow table for a given hash, whatever plain text you find must be the text that was input into the hashing algorithm to produce that specific hash.

Now this still does not solve the issue of cracking Windows passwords. When Windows boots up, it seizes the SAM file and you cannot copy it off the machine. (Besides, you would need to log in to get the SAM file.) Tools like OphCrack can be used to boot to Linux, grab the SAM file, and take the hashes from that file. Then they search their rainbow tables to see if they can find a match. If a match is found, that must be the user’s password.

Salt

In terms of hashing, salt refers to random bits that are used as one of the inputs to the hash. Essentially, the salt is intermixed with the message that is to be hashed. Salt data makes rainbow tables harder to implement. The rainbow table must account for the salting algorithm as well as the hashing algorithm. Thus, the method is effective against rainbow table attacks.

Steganography

Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message—a form of security through obscurity. Often, the message is hidden in some other file, such as a digital picture or audio file, so as to defy detection.

The advantage of steganography over cryptography alone is that messages do not attract attention to themselves. If no one is aware the message is even there, then they won’t even try to decipher it. In many cases, messages are encrypted and hidden via steganography.

There are some basic steganography terms you should know:

• Payload is the data to be covertly communicated. In other words, it is the message you wish to hide.

• The carrier is the signal, stream, or data file into which the payload is hidden.

• The channel is the type of medium used. This may be still photos, video, or sound files.

The most common way steganography is accomplished today is via least significant bits (LSB). In every file, there are a certain number of bits per unit. For example, an image file in Windows is 24 bits per pixel. If you change the least significant of those bits, then the change is not noticeable with the naked eye. And one can hide information in the LSB of an image file. With LSB replacement, certain bits in the carrier file are replaced.

Historical Steganography

In modern times, steganography means digital manipulation of files to hide messages. However, the concept of hiding messages is not new. As with cryptography, many methods have been used throughout history:

• The ancient Chinese wrapped notes in wax and swallowed them for transport. This was a crude but effective method of hiding messages.

• In ancient Greece, a messenger’s head might be shaved, a message written on his head, and then his hair was allowed to grow back. Obviously, this method required some time to be available.

• In 1518, Johannes Trithemius wrote a book on cryptography and described a technique where a message was hidden by having each letter taken as a word from a specific column.

• During WWII, the French Resistance sent messages written on the backs of couriers using invisible ink.

• Microdots are images/undeveloped film the size of a typewriter period embedded in innocuous documents. Spies supposedly used these during the Cold War.

• Also during the Cold War, the U.S. Central intelligence Agency used various devices to hide messages. For example, they developed a tobacco pipe that had a small space to hide microfilm, but could still be smoked.

Methods and Tools

Steganophony is a term for hiding messages in sound files. This can be done with the LSB method or other methods, such as echo hiding. This method adds extra sound to an echo inside an audio file, and the extra sound conceals information.

Information can also be hidden in video files. There are various methods to accomplish this. Discrete cosine transform is often used for video steganography. This method alters values of certain parts of the individual frames. The usual method is to round up the values.

A number of tools are available for implementing steganography. Many are free or at least have a free trial version. A few of these tools are listed here. Two are discussed in detail in the sections that follow.

• QuickStego Very easy to use, but very limited.

• Invisible Secrets Much more robust, with both a free and commercial version.

• MP3Stego Specifically for hiding payload in MP3 files.

• Stealth Files 4 This works with sound files, video files, and image files.

• Snow Hides data in whitespace.

• StegVideo Hides data in a video sequence.

Invisible Secrets

Invisible Secrets is a popular tool for steganography. It is important that a forensic examiner be familiar with the more widely known forensic tools and be able to utilize them. You can download Invisible Secrets from www.invisiblesecrets.com/download.html. Let’s walk through the process of hiding a message using this tool. First, select an image—anything you like. I am going to use an image of the cover of this book. Then take a text editor, such as Notepad, and put some text in it. Now we will hide that message in the image file.

First select Hide/Unhide Files. This is shown in Figure 7-5 with the option you should select circled.

Images

Figure 7-5 Invisible Secrets Hide/Unhide

Then hide the text file you created (I named mine “stegtest.txt”) as shown in Figure 7-6.

Images

Figure 7-6 Hiding a test file in Invisible Secrets

Now select the image file in which you wish to hide this text. This is shown in Figure 7-7.

Images

Figure 7-7 Choosing the carrier file in Invisible Secrets

Incidentally, the current version of Invisible Secrets allows you to hide data in .wav files as well as images. This is shown in Figure 7-8.

Images

Figure 7-8 Invisible Secrets different carrier types

The next screen has you select a password for this hidden data, and has you (optionally) encrypt the data as well. For this demonstration, we will leave it unencrypted and use a trivial password. You can see this in Figure 7-9.

Images

Figure 7-9 Invisible Secrets final steps

In the final step, you simply pick a location and name for the output file. This will include your hidden message (the payload) and the file you wish to hide it in (the carrier file). You can see this step in Figure 7-10.

Images

Figure 7-10 The last step in Invisible Secrets

The resulting image, shown in Figure 7-11, is indistinguishable from the original carrier file. You can enlarge it with Photoshop, examine it carefully, and you won’t be able to see that it contains hidden information.

Images

Figure 7-11 The resulting image in Invisible Secrets

MP3Stego

This is another tool, one you can download for free from www.petitcolas.net/fabien/steganography/mp3stego. This tool is used to hide data in MP3 files. From the MP3 Stego readme file are these instructions:

encode -E data.txt -P pass sound.wav sound.mp3

compresses sound.wav and hides data.txt. This produces the output called sound.mp3. The text in data.txt is encrypted using “pass”.

decode -X -P pass sound.mp3

uncompresses sound.mp3 into sound.mp3.pcm and attempts to extract hidden information. The hidden message is decrypted, uncompressed and saved into sound.mp3.

You can see this is a very simple program to use.

Steganalysis

If you can hide data in images or other carrier files, there must be some way to detect it, right? One of the most common methods is to analyze close color pairs. By analyzing changes in an image’s close color pairs, the steganalyst can determine if LSB substitution was used. Close color pairs consist of two colors whose binary values differ only in the LSB. You would expect a certain number of pixels to vary only in the LSB, but if the number of such pixels is greater than one would expect, that might indicate steganographically hidden data.

There are several methods for analyzing an image to detect hidden messages. The Raw Quick Pair (RQP) method is one. The RQP method is essentially an implementation of the close color pair concept. This method is based on statistics of the numbers of unique colors and close color pairs in a 24-bit image. RQP analyzes the pairs of colors created by LSB embedding.

Another option uses the chi-squared method from statistics. Chi-square analysis calculates the average LSB and builds a table of frequencies and pairs of values. Then it performs a chi-square test on these two tables. Essentially, it measures the theoretical versus calculated population difference.

Images
NOTE Steganography is an important concept in forensic analysis. Technically sophisticated child pornographers frequently hide their illicit images in innocuous images. It is also a popular technique with terrorists. During the raid on Osama bin Laden’s compound, which resulted in his death, a number of computer hard drives were found. It was discovered that Osama bin Laden was communicating with Al-Qaeda members using messages hidden in pornographic images.

Cryptanalysis

Cryptanalysis is a daunting task. It is essentially the search for some means to break through some encryption. And, unlike what you see in the movies, it is a very time-consuming task that frequently leads to only partial success. Cryptanalysis involves using any method to decrypt the message that is more efficient than simple brute-force attempts. (Remember that brute force is simply trying every possible key.)

A cryptanalysis success is not necessarily breaking the target cipher. In fact, finding any information about the target cipher or key is considered a success. There are several types of cryptographic success:

• Total break The attacker deduces the secret key.

• Global deduction The attacker discovers a functionally equivalent algorithm for encryption and decryption, but without learning the key.

• Instance (local) deduction The attacker discovers additional plain texts (or cipher texts) not previously known.

• Information deduction The attacker gains some Shannon information about plain texts (or cipher texts) not previously known.

• Distinguishing algorithm The attacker can distinguish the cipher from a random permutation.

In this section, we will look at some common methods for cryptanalysis.

Frequency Analysis

This is the basic tool for breaking most classical ciphers—it is not useful against modern symmetric or asymmetric cryptography. It is based on the fact that some letters and letter combinations are more common than others. In all languages, certain letters of the alphabet appear more frequently than others. By examining those frequencies, you can derive some information about the key that was used. Remember in English the words the and and are the two most common three-letter words. The most common single-letter words are I and a . If you see two of the same letters together in a word, it is most likely ee or oo.

Kasiski

Kasiski examination was developed by Friedrich Kasiski in 1863. It is a method of attacking polyalphabetic substitution ciphers such as the Vigenére cipher. The idea is to first examine the cipher text (or, even better, multiple cipher texts) just to see if you can determine how long the keyword is. Once the length of the keyword is discovered, you line up the cipher text in columns, with the number of columns being equal to the length of the keyword. Then, each column can be treated as a monoalphabetic substitution cipher. Then, each column can be cracked with simple frequency analysis. The method simply involves looking for repeated strings in the cipher text. The longer the cipher text, the more effective this method will be. This is sometimes also called Kasiski’s test or Kasiski’s method.

Modern Methods

As mentioned, the level of success when cracking modern cryptographic methods depends on a combination of resources. Those resources are computational power, time, and data. If you had an infinite amount of any of these, you could crack any modern cipher. But you won’t have an infinite amount.

Known Plain Text Attack

This method is based on having a sample of known plain texts and their resulting cipher texts. Then you use this information to try to ascertain something about the key used. It is easier to obtain known plain text samples than you might think. Consider e-mail. Many people, myself included, use a standard signature block. If you have ever received an e-mail from me, you know what my signature block is. Then, if you intercept encrypted e-mails I send, you can compare the known signature block to the end of the encrypted e-mail. You would then have a known plain text and the matching cipher text to work with. However, this requires many thousands of known plain text samples to be successful.

Chosen Plain Text Attack

This is closely related to the known plain text attack, with the difference being that the attacker has found a method to get the target to encrypt messages the attacker chooses. This can allow the attacker to attempt to derive the key used and thus decrypt other messages encrypted with that key. This can be difficult, but is not impossible. Again, this requires many thousands of chosen plain text samples to be successful.

Cipher Text Only

The attacker only has access to a collection of cipher texts. This is much more likely than known plain text, but also the most difficult. The attack is completely successful if the corresponding plain texts can be deduced, or even better, the key. The ability to obtain any information at all about the underlying plain text is still considered a success.

Related-Key Attack

This is like a chosen plain text attack, except the attacker can obtain cipher texts encrypted under two different keys. This is actually a very useful attack if you can obtain the plain text and matching cipher text.

These are the basic approaches used to attack block ciphers. There are other methods that are beyond the scope of this book and that the CCFP won’t mention, but I will give you a brief overview here.

Differential Cryptanalysis

Differential cryptanalysis is a form of cryptanalysis applicable to symmetric key algorithms. This was invented by Eli Biham and Adi Shamir. Essentially, it is the examination of differences in an input and how that affects the resultant difference in the output. It originally worked only with chosen plain text. There is a pretty good tutorial on this method at www.theamazingking.com/crypto-diff.php.

Linear Cryptanalysis

Linear cryptanalysis is based on finding affine approximations to the action of a cipher. Note that an affine function is a specific mathematical function that uses some linear function and some constant. The details on the mathematics are beyond the scope of this book. It is commonly used on block ciphers. This technique was invented by Mitsuru Matsui. It is a known plain text attack and uses a linear approximation to describe the behavior of the block cipher. Given enough pairs of plain text and corresponding cipher text, bits of information about the key can be obtained. Obviously, the more pairs of plain text and cipher text one has, the greater the chance of success. There is a pretty good tutorial on this method at http://www.theamazingking.com/crypto-linear.php.

Log Tampering

In addition to technically advanced techniques such as steganography and cryptography, there are simpler methods for an attacker to hide their tracks. The most obvious is log tampering. In any system that an attacker is interacting with, there is likely to be some logging of events. These will leave evidence of the attack. The savvy attacker will want to eliminate such evidence.

Log Deletion

One can easily delete any log, in Windows or in Linux. However, it will then be obvious that someone tampered with the logs. If you are examining the logs for a database server and see that all log entries prior to 2:30 A.M. are gone, that is a good indication that someone deleted the log. You may not know what they did, since the log is now gone, but you will know something happened.

Auditpol

This is a Windows tool² that is a bit more subtle. It will allow the attacker to turn off auditing, do whatever they intend to do, then turn auditing back on again. However, a savvy forensic analyst can still find evidence that this was done. If you see a server log that has entries approximately every 2 minutes, but there is a gap of 15 minutes with no entries, that could indicate someone used Auditpol.

Winzapper

This tool³ is the attacker’s friend. It allows the attacker to simply delete, or zap, specific log entries. It is difficult, if not impossible, to detect. You need to have local administrator privileges to use Winzapper.

Other Techniques

There are a variety of techniques an attacker can use to hide his or her tracks. Some of these involve masking or removing evidence that a crime has occurred; others involve making it difficult to track the criminal.

Onion Routing

Onion routing was originally developed by the U.S. Department of the Navy. The concept is this: Each packet is encrypted and a header is put on it. That header has the destination address of the next onion router in the network and the source address of the next onion router in the network. At each router, the packet is decrypted, but only showing the next “hop” in the destination. Until the final destination is reached, the packet is not fully decrypted. This means that if the packet is intercepted en route, you cannot tell its origin or ultimate destination. You can see this in Figure 7-12.

Images

Figure 7-12 Onion routing

The concept has been modified by civilians into what are called Tor networks. A Tor network uses a series of servers, each functioning like an onion router.

Spoofing

When a criminal is remotely accessing the victim, they frequently don’t want to be traced to their actual computer. There are two primary ways to prevent this. The first is IP spoofing, which mean simply using a different IP address. There are tools to do this, and it can be done manually.

Another technique is MAC spoofing. The MAC address of a network card is set at the factory and cannot be changed. But you can trick your machine into broadcasting a different MAC address than the one actually on the network card. Techniques like this are difficult for a forensic analysis to counter.

Wiping

The ultimate in evidence hiding is to wipe the drive. As we mentioned previously in this book, the U.S. Department of Defense recommends overwriting data seven times to truly wipe it. So there is a good chance that a criminal who simply formatted a drive has still left some evidence. However, a technically skilled criminal might employ methods that are more thorough:

• dd If you recall from earlier chapters, the Linux command dd can be used to truly forensically wipe a drive.

• Degaussing Recall that a hard drive stores data magnetically. Thus, exposing a hard drive to a strong magnetic force can permanently wipe the data.

Tunneling

Another technique that can be used to hide data is tunneling. Tunneling is the process of encrypting network traffic. This is done for many legitimate reasons. For example, when you log in to your bank account online, the traffic is encrypted. When you connect to your workplace network from a remote site, the traffic is encrypted. Sometimes attackers will use a VPN (Virtual Private Network) to encrypt their traffic to keep it from being analyzed by packet sniffers, intrusion detection systems, or other security measures.

Chapter Review

In this chapter we have examined a variety of ways that one can hide evidence. These antiforensic techniques are likely to be employed by technically skilled criminals. For the CCFP exam, the most important ciphers to know are Caesar, Vigenere, DES, AES, Blowfish, RSA, and Diffie-Hellman. The concepts of hashing and rainbow tables are also important. If you would like to learn more about cryptography, steganography, and hashing, visit www.CryptoCorner.com.

We also discussed steganography and even used a widely available tool to hide data in an image. Remember the most common modern steganographic technique is the LSB method. Also keep in mind that there are a variety of ways an attacker can tamper with logs or simply wipe the drive of all evidence on it.

Questions

1. Which of the following is the oldest known encryption method?

A. PGP

B. Multialphabet

C. Caesar cipher

D. Cryptic cipher

2. What type of encryption uses a different key to encrypt the message than it uses to decrypt the message?

A. Private key

B. Asymmetric

C. Symmetric

D. Secure

3. Which of the following encryption algorithms uses three key ciphers in a block system and uses the Rijndael algorithm?

A. DES

B. RSA

C. AES

D. NSA

4. The most common way steganography is accomplished is via _________.

A. MSB

B. ASB

C. RSB

D. LSB

5. Hiding data in sound files is called _________.

A. Steganography

B. Steganohiding

C. Cryptography

D. Steganophony

Answers

1. C. The Caesar cipher is the oldest known cipher and is purported to have been used by Julius Caesar.

2. B. Asymmetric or public key cryptography uses two different keys. RSA is the most widely used asymmetric cipher today.

3. C. AES uses 128-, 192-, and 256-bit keys.

4. D. The most common way to hide data is called least significant bit (LSB).

5. D. Steganophony is essentially steganography done with sound files.

References

1. http://en.wikipedia.org/wiki/RSA_(cryptosystem).

2. http://technet.microsoft.com/en-us/library/cc731451.aspx.

3. http://ntsecurity.nu/toolbox/winzapper/.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 7 Hidden Files and Antiforensics

Create new playlist

Sign In

Sign Up

CHAPTER 7

Hidden Files and Antiforensics

Cryptography

The History of Encryption

The Caesar Cipher

ROT 13

Scytale

Atbash Cipher

Multialphabet Substitution

Vigenere Cipher

Transposition Ciphers

Enigma Machine

Modern Cryptography

Binary Operations

Symmetric Encryption

Data Encryption Standard

Feistel Ciphers

Blowfish

AES

IDEA Encryption

GOST

Serpent

Skipjack

RC 4

Asymmetric Cryptography

RSA

Diffie-Hellman

DSA

Elliptic Curve

Cryptographic Hash

MD5

SHA

RipeMD

GOST

Windows Passwords

Getting Around Windows Passwords

Salt

Steganography

Historical Steganography

Methods and Tools

Invisible Secrets

MP3Stego

Steganalysis

Cryptanalysis

Frequency Analysis

Kasiski

Modern Methods

Known Plain Text Attack

Chosen Plain Text Attack

Cipher Text Only

Related-Key Attack

Differential Cryptanalysis

Linear Cryptanalysis

Log Tampering

Log Deletion

Auditpol

Winzapper

Other Techniques

Onion Routing

Spoofing

Wiping

Tunneling

Chapter Review

Questions

Answers

References

Table of Contents for
Chapter 7 Hidden Files and Antiforensics