Chapter 8. Extracting Hidden Data

In This Chapter

  • Avoiding being fooled by covert operations

  • Driving through digital roadblocks

  • Beating the odds

  • Combating camouflage

  • Passing through passwords

  • Breaking in

As a computer forensic investigator, you eventually run into evidence whose very existence is hidden (unseen) or that has been hidden in plain sight (disguised). That is, you're confronted with invisible electronic evidence. Criminals may hide their files so that you don't even know that the files exist, at least not without effort. When insidious camouflage tactics are in play, you're not only doing detective work; you're also engaged in combat.

Your challenge is to discover covert attempts and break through them to extract hidden information. This area of computer forensics is arguably the most intriguing. You're matching wits with a criminal mind and playing mental chess games using digital pieces. Outsmarting someone who has gone to great lengths to hide data feels good, but you have to pay a price for this excitement. You also face the dull wait for software to come back with a clue to help you break the code — a password or hidden piece of data. You may experience the agony of defeat if cracking the password or defeating the encryption is beyond the technical means at your disposal. Then you might need to use alternative means of extracting the evidence. In this chapter, you find out how data can become hidden or disguised and how to extract it.

Recognizing Attempts to Blind the Investigator

Cyberspace is also, in part, criminal space. It's a medium where criminals apply ancient methods in digital disguise to remain undetectable.

Criminals and governments have hidden data for thousands of years, from the wrap-around cipher used by Julius Caesar (shifting the alphabet so that, for example, A becomes B, B becomes C, and so forth, with Z wrapping around to A) to today's stego (covered writing) data hiding techniques. The Greeks used to tattoo a message on a person's shaved head and place a decoy message in his hand. When the inked person's hair grew back, he was sent out, and the recipient shaved his head again to reveal the message.
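To see how little protection a wrap-around cipher offers, here's a minimal Python sketch. The function names caesar and all_shifts are our own, chosen only for illustration. Because only 26 shifts are possible, you can simply print every one of them and pick out the line that reads as plain language.

```python
import string

def caesar(text, shift):
    """Shift every letter by `shift` places, wrapping around at Z."""
    shift %= 26
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    table = str.maketrans(
        upper + lower,
        upper[shift:] + upper[:shift] + lower[shift:] + lower[:shift],
    )
    return text.translate(table)

def all_shifts(ciphertext):
    """Return every possible decoding, one per shift value."""
    return [(shift, caesar(ciphertext, -shift)) for shift in range(26)]

if __name__ == "__main__":
    secret = caesar("Meet me at the bridge", 1)
    for shift, guess in all_shifts(secret):
        print(shift, guess)
```

Non-letter characters pass through unchanged, which is one more clue that gives this kind of cipher away.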

The goal of data disguise is to hide the message. Hiding is done using one or more of the following three tactics. For simplicity, the term hiding refers to all of them. You can hide a message by making it

  • Invisible: Make the message unseen to hide its very existence.

  • Disguised: Hide the message in an object or item that looks innocuous so that the message isn't detected, such as in the image of a book cover.

  • Unreadable: Use techniques to make the information undecipherable to anyone except the intended recipient without attempting to hide its existence or disguise what it is. An example is the use of encryption.

The recipient would know of the scheme, be able to locate the message, and have the code key or ability to convert the message into readable form. In drastic cases, the message or messenger could get destroyed if the data was tampered with.

If the data is hidden, how do you know it's there? You don't know unless you try. (Now intrigue comes into play!) No magic formula or marker exists to guide you in the detection of hidden data. Fortunately, detection and cracking tools can analyze images for signs, such as overly large files and uneven bit mapping. You need to know when and where to use these tools. There are so many ways to hide data that you need to use various tools and techniques to ferret out hidden data.

As the computer forensic investigator, you have to look for signs that data hiding techniques are being used. For example, an engineering firm suspected that an employee was stealing valuable intellectual property (IP) by transmitting it from the firm's network. Investigators began looking for e-evidence on the local hard drives, but didn't find any. The next logical items to check were the company's e-mail logs. Investigators found two e-mails with harmless-looking image attachments sent by the employee of interest, the suspect. (When steganography is used to hide content in image files, the size of the files can become huge.) Sending huge file attachments creates suspicion. Using stego detection software, the investigators revealed that the images were hiding two of the company's high-value IP engineering specifications. The suspect had used stego to hide the IP within image files.

Encryption and compression

Cryptography is the science of writing in secret codes. The formal definition for cryptography is the practice and study of hiding information to protect it from being read or understood by anyone except the intended recipient. In computer forensics, you deal with two kinds of content-altering transformations: encryption, which is cryptography proper, and compression, which isn't but is often confused with it.

Both encryption and compression make use of an algorithm to rewrite the initial data. They differ in the uses they're designed for and how they're analyzed:

  • Encryption: Readable plain text (data, a message, or any type of file) is scrambled by applying an algorithm (the cipher) to it to convert it into unreadable ciphertext. Applying the correct key to the ciphertext converts it back to its original, readable form. Encryption has one and only one purpose: to make information unreadable to anyone other than the intended recipient. Encrypted files are fairly easy to spot because they usually have common file structures or extensions.

  • Compression: This technique is related to encryption in that a content-altering algorithm is applied to the data or message. But compression has a different purpose: to shrink the size of the file. The result is a file that's unrecognizable from its original form, although the reason is compression itself and not any form of data hiding. Compression adds a layer of complexity to forensics, but compressed files aren't themselves suspicious.

Note

Don't confuse compression with encryption. The nontechnical difference is the intent of the user. Other differences are described in this list:

  • Compression saves space by reducing file size.

    Encryption doesn't shrink a file and usually adds a little overhead.

  • Compression software packages may put a password on a compressed file, but that alone doesn't make it a strongly encrypted file.

    In fact, compressing and encrypting pull in different directions: Well-encrypted data looks random and barely compresses, so compressing a file after it's encrypted makes the compression moot! The password protection on most compressed files is so weak that shareware password crackers are usually sufficient to crack it.

  • Compressed files can be uncompressed without any keys.

    All you need is the software and — voilà — it's uncompressed.
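The differences in this list can be sketched in a few lines of Python using the standard zlib module. The sample text and the variable names are made up for illustration, and the random bytes are only a stand-in for ciphertext (an assumption, because well-encrypted data looks statistically random).

```python
import os
import zlib

# Repetitive plain text compresses very well.
text = b"The quick brown fox jumps over the lazy dog. " * 200

packed = zlib.compress(text)
restored = zlib.decompress(packed)   # no key or password needed

# Random bytes stand in for encrypted data (assumption for this sketch).
random_like = os.urandom(len(text))
repacked = zlib.compress(random_like)

if __name__ == "__main__":
    print("original size:        ", len(text))
    print("compressed size:      ", len(packed))
    print("round trip identical: ", restored == text)
    print("random data size:     ", len(random_like))
    print("random data compressed:", len(repacked))
```

The plain text shrinks dramatically and comes back intact without any key, whereas the random-looking data doesn't shrink at all.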

Two encryption methods affect computer forensic investigators:

  • Asymmetric: This two-key system uses a public key and a private key. As shown in Figure 8-1, the public key is used to encrypt the data. The recipient's private key is the only key that can decrypt the data. Both keys are produced together, before any encryption takes place, by generating a key pair. The genius of this system is that the user gives half of the key pair to the world by way of the public key but keeps the private key private. No one else can decrypt the data easily, or at all, without the private key.

    This method is a bit more complicated to implement, but after the asymmetric system is in place, it is — unfortunately for forensics — one of the more secure methods of encrypting data.

    Figure 8.1. The asymmetric encryption process uses a public key and a private key.

  • Symmetric: In this one-key system, the single key is shared by the sender and receiver, as shown in Figure 8-2. The same key is used to encrypt and decrypt the data, which makes the key harder to protect.

    In addition to the one-key design making it harder to keep the key from falling into the wrong hands, symmetric keys tend to be shorter, and older short-key ciphers (those with 40-bit keys, for example) can be cracked with the right equipment.

    Figure 8.2. Symmetric encryption uses a single shared key.
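The two designs can be contrasted with a toy Python sketch. Everything here is for illustration only: The tiny primes are textbook values, the XOR routine is a stand-in for a real symmetric cipher, and the names rsa_apply and xor_cipher are our own. Nothing in this sketch is strong enough for real use.

```python
from itertools import cycle

# Asymmetric (toy RSA): the key pair is generated first, before any encryption.
p, q = 61, 53                 # tiny textbook primes, illustration only
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

PUBLIC_KEY = (e, n)           # handed out to the world
PRIVATE_KEY = (d, n)          # kept private by the recipient

def rsa_apply(number, key):
    """Encrypt with the public key or decrypt with the private key."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

# Symmetric (toy XOR): one shared key both scrambles and unscrambles.
def xor_cipher(data, key):
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

if __name__ == "__main__":
    ciphertext = rsa_apply(65, PUBLIC_KEY)
    print("asymmetric:", ciphertext, "->", rsa_apply(ciphertext, PRIVATE_KEY))

    shared_key = b"k3y"
    scrambled = xor_cipher(b"meet at noon", shared_key)
    print("symmetric: ", scrambled, "->", xor_cipher(scrambled, shared_key))
```

Notice that the asymmetric side never needs to share the private key, whereas the symmetric side is only as safe as the one key that both parties hold.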

Data hiding techniques

So many methods of hiding data exist that even an entire book on the subject would be incomplete because a new technique would come to light before the print was dry — or the last word typed. In this section, we show you the basic techniques used by most people who try to hide data. This list isn't complete by any means, but it gives you a fighting chance by giving you an idea where to start looking.

Each of these tactics (with the exception of steganography) is fairly simple to spot, and can be defeated with specialized tools when used individually. Real problems occur when savvy criminals use a combination of data hiding techniques to obliterate their tracks. For example, someone could encrypt a file using asymmetric encryption such as PGP, and then embed the file in an audio file with the stego program S-Tools.

File extensions

A widely used method of disguising a file type is to simply change the extension at the end of a filename. Try it:

  1. Change the .doc extension on an unimportant Word document to .xls. Click Yes when the warning message appears.

    The icon changes from a Word icon to an Excel icon.

  2. Double-click the file to try to open it.

    Because the extension indicates that the file is an Excel file, Excel opens. But the file fails to open because Excel can't open Word files.

  3. Launch Word and then open the file with the .xls extension.

    The file opens.

  4. Change the .xls extension back to .doc and notice that the icon changes too.

  5. Double-click the file to open it.

    It opens!

To find out whether an extension has been changed, you need to compare the file header to the file extension to make sure that they match. The file header is a sequence of bytes at the beginning of a file and is used by programs to determine whether they can open the file. Chapter 11 covers file headers in greater detail.

Even when the file extension is changed (as you just did), the appropriate program still opens the file. On the other hand, when the file header is changed, the program no longer recognizes the file.
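A header-versus-extension comparison can be sketched in Python as follows. The signature list is a small sample of well-known file signatures, not a complete database, and the function names identify and check are our own. Be aware that legacy .doc and .xls files share the same container signature, so a header check alone can't tell those two apart; forensic suites look deeper into the file for that.

```python
import os

# (leading bytes, description, extensions normally seen with that header)
SIGNATURES = [
    (b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", "OLE2 compound file", {".doc", ".xls", ".ppt"}),
    (b"PK\x03\x04", "ZIP container", {".zip", ".docx", ".xlsx", ".pptx"}),
    (b"\x89PNG\r\n\x1a\n", "PNG image", {".png"}),
    (b"\xFF\xD8\xFF", "JPEG image", {".jpg", ".jpeg"}),
    (b"%PDF", "PDF document", {".pdf"}),
]

def identify(header):
    """Return (description, expected extensions) or None if unrecognized."""
    for magic, description, extensions in SIGNATURES:
        if header.startswith(magic):
            return description, extensions
    return None

def check(filename, header):
    """Classify a file as 'match', 'mismatch', or 'unknown'."""
    found = identify(header)
    if found is None:
        return "unknown"
    extension = os.path.splitext(filename)[1].lower()
    return "match" if extension in found[1] else "mismatch"

def check_file(path):
    """Read the first bytes of a file on disk and classify it."""
    with open(path, "rb") as handle:
        return check(path, handle.read(16))
```

A result of mismatch doesn't prove wrongdoing, but it tells you which files deserve a closer look.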

Note

Advanced users can change the file header easily by using a hex editor to make the file readable or unreadable. A hex editor is a program that can access data directly where it is stored without the need to know what type of format it is. Hex editors literally read data byte by byte and have the ability to change files at the byte level.

Hidden files

All operating systems assign attributes to files. One particular type of attribute is the ability to hide files, or more precisely, to mark files as hidden, which is comparable to files being marked for deletion. Hidden files are no more hidden than deleted files are deleted.

If you use Windows XP or Vista, you can show any hidden files by selecting the Show Hidden Files and Folders option in the Folder Options dialog box (see Figure 8-3). If you have an older operating system, such as the Microsoft Disk Operating System (MS-DOS), use the Attrib command to either hide the file or make the file viewable.
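You can also list hidden files with a short script rather than through a dialog box. The following Python sketch reads the hidden attribute bit that Windows reports for a file. As an assumption beyond this chapter, it also treats names that begin with a dot as hidden, which is the convention on Unix-like systems. The function names are our own.

```python
import os

FILE_ATTRIBUTE_HIDDEN = 0x2   # the Windows "hidden" attribute bit

def is_hidden(path):
    """True if the file is marked hidden (Windows) or is a dot-file (Unix)."""
    # st_file_attributes exists only on Windows; fall back to 0 elsewhere.
    attributes = getattr(os.stat(path), "st_file_attributes", 0)
    if attributes & FILE_ATTRIBUTE_HIDDEN:
        return True
    name = os.path.basename(os.path.normpath(path))
    return name.startswith(".")

def list_hidden(folder):
    """Return the sorted names of hidden entries directly inside a folder."""
    return sorted(entry.name for entry in os.scandir(folder) if is_hidden(entry.path))

if __name__ == "__main__":
    print(list_hidden("."))
```

Run it against a working copy of the evidence, never against the original media.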

Hidden shares

Hidden shares are shared areas on a network where files are stored but the shares are hidden. Hidden shares can be found on a local computer, but with networks everywhere, savvy criminals can use hidden shares on remote computers rather than risk using their own machines. Finding hidden shares is a bit more difficult than finding hidden files, but if you have the proper software, such as Legion V2.1 (www.packetstormsecurity.org), the process is straightforward. In addition to hiding shares, users sometimes also put passwords on hidden shares to protect them in depth.

Note

Adding a dollar sign symbol ($) to the end of a share name hides the share so that it isn't visible from a network browser.

Figure 8.3. Windows shows files marked as hidden.

Alternate data streams

The uncommon data storage concept of alternate data streams (ADS) started with Windows NT version 3.51 and was introduced as a compatibility fix for the Macintosh HFS system. The implication of this fix is that you can piggyback data onto an existing file without changing the attributes of the first file — with the exception of the time stamp.

These data streams allow multiple forms of data to be associated with a file. A clever user can hide nefarious files in this manner because the files don't show up using a DIR (directory) command, nor do they appear in Windows Explorer. A few antivirus programs can pick up ADS information, but for the most part the majority of the computer world is oblivious to the existence of ADS. One ADS scanner you can try — it's free — is from Pointstone (www.pointstone.com).

Layers

The simplest example to demonstrate the use of a layer is to overlay a picture on text in a desktop publishing program. At first glance, you can see only the picture. After you move the picture, however, the text underneath is revealed. Another simple example is to change the font color of a document to the same color as its background. Open the file and all you see is what appears to be a blank page.

Tip

If you come across a blank file (a file which appears empty when you open it such as a blank Microsoft Word page), print it. Hidden text may appear on the hard copy.

Steganography

Steganography (or stego), a complex version of layering and data hiding, is a modern-day version of an ancient communication method. Stego refers to covered writing, such as invisible ink. In the digital world, this technique involves hiding a message inside an innocuous image, music file, or video that is posted on a Web site, e-mailed, or stored on a hard drive.

Imagine downloading an image of the Brooklyn Bridge from the Internet. As a suspicious investigator, you use your stego-detecting software to extract the message it's hiding. The problem is that, because many algorithms are used in stego, and without knowing which one was used, extracting the hidden information — or even knowing that it's there — is quite difficult.

Defeating Algorithms, Hashes, and Keys

When you encounter evidence that has been hidden in some way, your first task is to determine how it was hidden. Was steganography used, or was the suspect using Windows Encrypting File System (EFS)? Much depends on what you find because if you use the wrong tools to attempt an extraction, you waste a lot of time and might even accidentally destroy your evidence.

Note

Always work with copies of the evidence and not the originals. If you destroy a copy, you can always make another one from your backup copy.
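Before you work on a copy, it's worth confirming that the copy is a faithful duplicate. One common way is to compare hash values, sketched here in Python with the standard hashlib module. The choice of SHA-256 and the function names are assumptions made for this example; use whatever hash your lab's procedures call for.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large evidence images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copies_match(original_path, copy_path):
    """True when both files have the same SHA-256 hash value."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

If the two hash values differ, the copy has changed and shouldn't be trusted as a working copy.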

You can use several methods to defeat data hiding, and each one has its pros and cons. Often, the only way to find the key is to get it from the suspect! When you can't do that, you have to circumvent the crucial password by using one of these methods:

  • Brute force: Be brutal. In this procedure, you try every possible combination until you find the right one and crack the password. It involves trial and error. For simple hashes or algorithms, brute force works fairly well. As the key length increases, so does the number of possibilities. As you can tell from the following table, the number of combinations for a 512-bit key runs to 155 digits!

    Key Length in Bits    Number of Possible Combinations
    8                     256
    40                    1,099,511,627,776
    128                   3.4028 × 10^38
    256                   1.15792 × 10^77
    512                   1.3408 × 10^154

    Note

    With the advances in cryptography algorithms and long key lengths, finding a key by brute force is often impractical. Treat it as your last resort for password cracking.

  • Dictionary attack: Throw the book at them. This word-based trial-and-error method uses a dictionary of passwords or hashes that are compared to the hash value stored in the suspect's password file. Dictionaries contain not only standard words but also the names of celebrities, sports teams, TV shows, and Klingons (for Star Trek fans). Despite how often people are told to use good passwords, they don't. The most common passwords found in the field are password, letmein, 123456, and qwerty. Other popular passwords are the user's first name, the names of children or pets, addresses, phone numbers, and even Social Security numbers.

    Using a dictionary doesn't mean that you're limited to words or even letters. Most password cracking software uses letters, numbers, and even special characters as part of their dictionary attacks. In a good password-cracking software program using a decent dictionary, the word hello and the character substitution h3110 are cracked in less than a second.

  • Rainbow tables: These extensions of dictionaries are large, precomputed databases that map hash values back to the passwords that produce them. They reside either on the Internet or with a private party and let you use a larger database of possibilities than could be stored on a forensic computer.

  • Keystroke logger: Sometimes the best solution isn't to try to crack the encryption but, rather, to resort to sleuthing — when it's legal to do so, of course. Use a keylogger to capture the encryption keystrokes when the suspect types them. This method works well when you know that the person you're watching in a case is using some form of encryption. Keylogger features vary, but they all record the keystrokes typed on a computer keyboard. You can install keyloggers manually or use Trojan software (software that looks like it's for one purpose, such as playing a game, but in reality inserts another program on the computer).

    In addition to software keyloggers, physical keyloggers are installed between the keyboard and the back of a computer. This type of device is more difficult to install but cannot be detected by antivirus, anti-spyware, or anti-malware software.

  • Snooper software: This type of software is used in the same fashion as software keyloggers except that snooper software logs not only keystrokes but also almost any activity that occurs on the computer. Everything from screen shots to printouts, to chat sessions to e-mails, and even how many times you turned on the computer is archived. As you might imagine, this type of software takes up quite a bit of room on the storage device, but can be extremely useful when re-creating passwords or passphrases on a suspect's computer. This method works well in a situation where you know ahead of time that the suspect is using a computer for illegal activities.

  • Suspect questioning: The suspect may be your only option to gain access to a password or passphrase. Although most people don't initially supply their passwords, after some legal arm-twisting, they sometimes do. In serious crime cases, though, don't count on a suspect helping you out!

  • Application specific integrated circuit (ASIC): This type of computer chip is specifically programmed to perform a task. The sole purpose of programming an ASIC decrypting system is to crack a specific type of encryption. Most computer forensic investigators don't have access to computers of this type, but government agencies do, and they can chew through a 40-bit encryption key in only seconds!

  • Cache checking: Certain applications and operating systems may put passwords in a cache temporarily — it's a smart place to search. Users who allow their systems to save their passwords so that they don't have to type them repeatedly are often saving their passwords in plain text mode in a cache area.
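The dictionary attack in the preceding list can be sketched in a few lines of Python. This is a simplified illustration, not how any particular cracking product works: It assumes the stored value is a plain, unsalted SHA-256 hash, and the substitution table and function names are our own.

```python
import hashlib
from itertools import product

# Common character substitutions to try for each letter (illustrative sample).
SUBSTITUTIONS = {"a": "a4@", "e": "e3", "i": "i1!", "l": "l1", "o": "o0", "s": "s5$"}

def variants(word):
    """Yield the word plus every combination of character substitutions."""
    pools = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for combination in product(*pools):
        yield "".join(combination)

def dictionary_attack(target_hash, wordlist):
    """Return the first candidate whose hash matches, or None."""
    for word in wordlist:
        for candidate in variants(word):
            if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hash:
                return candidate
    return None

if __name__ == "__main__":
    stored = hashlib.sha256(b"h3110").hexdigest()   # stand-in for a recovered hash
    print(dictionary_attack(stored, ["password", "letmein", "hello"]))
```

Even with substitutions, the three-word dictionary produces only a few dozen candidates, which is why a dictionary attack finishes so much faster than brute force.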

Finding Out-of-Sight Bytes

To hide information, criminals use special software programs to identify the least significant bits (LSBs) in a file and change them to contain hidden content without altering the file in a detectable way (in the background color of an image, for example). The best candidates for steganography (described earlier in this chapter) are byte-intensive digital pictures and audio files because they have a good supply of insignificant bits. Even a plain text document can hide content within the structure of the file. Certain areas in files (depending on whether they're video or audio or some other type) can be modified without compromising the quality of the file to the human eye or ear. The major forensic issue is exposing the presence of hidden data.
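The idea behind LSB hiding is easier to grasp when you see it on raw bytes. In the following Python sketch, the cover is just a sequence of byte values standing in for pixel or audio data; real stego programs also deal with file formats, passwords, and scattering the bits around, none of which is shown. The function names embed and extract are our own.

```python
def embed(cover, message):
    """Hide each bit of the message in the lowest bit of one cover byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit   # overwrite only the LSB
    return stego

def extract(stego, message_length):
    """Rebuild message_length bytes from the lowest bits of the stego data."""
    message = bytearray()
    for byte_index in range(message_length):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (stego[byte_index * 8 + bit_index] & 1)
        message.append(value)
    return bytes(message)

if __name__ == "__main__":
    cover = bytes(range(256))                 # stand-in for image data
    stego = embed(cover, b"hi")
    print(extract(stego, 2))
    print(max(abs(a - b) for a, b in zip(cover, stego)))   # largest change per byte
```

No cover byte changes by more than 1, which is why the altered picture or sound looks and sounds the same to a person.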

You have several methods to find clues to whether a file might have a hidden message in it:

  • Look for steganography software on the suspect's computer.

    A blatant clue is finding stego-creating software on the suspect's computer. The trick is to recognize the different types (experience is needed here) or known hash values of stego software using hash analysis. Many investigators have no clue how many steganographic software packages exist and may overlook the software as being "just part of the system." Figure 8-4 shows the steganography software JPHS for Windows. Notice that the software gives you details about the original file, the hidden file, and, toward the bottom, the new file with the stego.

  • Look for duplicate files.

    When you're making a forensic analysis and find a huge number of duplicate files, it's a glaring red flag. Stego often produces duplicate files because the original file is often left behind by sloppy criminals. When you find two files that look the same or are named the same, you have some major clues to work with. The types of files you find indicate the type of steganographic software that's used. Certain types of steganographic software work with only specific file types, such as video or audio files. Using forensics software, compare the files on a bit-for-bit scale with a hexadecimal editor to find the differences and further narrow the possibilities of which steganographic software was used.

    Tip

    Because you now have two files to work with, you can also run a statistical analysis to see which file falls outside the expected digital signatures of a typical file of its type.

  • Use stego detection software.

    Software such as Gargoyle (www.tucofs.com) can be used to detect files that have steganographic signatures. It may not always detect them, though, if a new algorithm was used or the algorithm is so good that it escapes detection.

    Figure 8.4. Stego software found on a computer.

    You use these basic tools to find files that have been used to hide data, and to discover the stego software that was used. Unless you use the same software, the chances of extracting the hidden data are close to zero.
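When you find two files that look the same, a byte-for-byte comparison tells you quickly whether they really are. The following Python sketch counts the bytes that differ and notes how many of those differences touch only the least significant bit, which is a pattern worth a closer look. The function name diff_report and the layout of its result are our own; a forensic suite or hex editor gives you far more detail.

```python
def diff_report(first, second):
    """Summarize the byte-level differences between two byte strings."""
    changed = [
        (offset, a, b)
        for offset, (a, b) in enumerate(zip(first, second))
        if a != b
    ]
    return {
        "same_length": len(first) == len(second),
        "changed_bytes": len(changed),
        # a XOR of 1 means the two bytes differ only in their lowest bit
        "lsb_only_changes": sum(1 for _, a, b in changed if a ^ b == 1),
        "first_offsets": [offset for offset, _, _ in changed[:10]],
    }

def diff_files(first_path, second_path):
    """Read two files from disk and compare them."""
    with open(first_path, "rb") as f1, open(second_path, "rb") as f2:
        return diff_report(f1.read(), f2.read())
```

Two files that have the same length but differ in many bytes by the lowest bit only are good candidates for your stego detection software.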

Cracking Passwords

Passwords aren't of equal strength, and a password may be only one part of authenticating a person who's trying to gain access to a protected computer or file. From a user's perspective, a good password is easy to remember but hard to guess. It can be a word, phrase, hash, or even biometric (something unique about someone biologically, such as a fingerprint or voice print). From a computer forensic investigator's perspective, a password is a barrier to get past to complete the investigation.

In most password applications, the password itself isn't stored or compared; rather, a hash value is used. A hash value (or simply hash) is the result of applying a one-way algorithm to a password. The reason for the one-way algorithm is to keep would-be intruders from reverse-engineering the hash back into the password. In other words, when you type a password, the computer hashes the data you typed and compares the result to the hashed password that's already saved. If both hashes match, the password is the same one that was entered originally.
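The hash-and-compare step can be sketched in Python with the standard hashlib module. This is a bare-bones illustration that assumes a plain SHA-256 hash; real systems differ in the algorithm they use and usually add extra protections. The function names are our own.

```python
import hashlib

def hash_password(password):
    """Apply a one-way hash to the password text."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def verify(attempt, stored_hash):
    """Hash what was typed and compare it to the saved hash."""
    return hash_password(attempt) == stored_hash

if __name__ == "__main__":
    saved = hash_password("letmein")      # what the system keeps on disk
    print(saved)
    print(verify("letmein", saved))
    print(verify("letmeout", saved))
```

Notice that the saved value never contains the password itself, which is why you have to guess and hash candidates rather than simply read the password from the file.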

Why use a hash in the first place? The most obvious reason is that storing plain text passwords isn't secure. Replace plain text passwords with a one-way hash value, and you greatly increase the security of your passwords. To put this concept into perspective, suppose that an MD5 hash is used to hide a password. An 8-character password drawn from the 95 printable keyboard characters has roughly 6.6 quadrillion (6.6 × 10^15) possible combinations. On modest hardware, years could pass before you hit all those combinations!
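The number of possible passwords is simply the size of the character set raised to the power of the password length. Here's a short Python sketch of that arithmetic; the character-set sizes are the usual counts for lowercase letters, letters plus digits, and all printable ASCII characters, and the function name is our own.

```python
def password_combinations(alphabet_size, length):
    """Count every possible password of exactly this length."""
    return alphabet_size ** length

CHARACTER_SETS = {
    "lowercase letters": 26,
    "letters and digits": 62,
    "printable ASCII": 95,
}

if __name__ == "__main__":
    for label, size in CHARACTER_SETS.items():
        print(f"{label:20} {password_combinations(size, 8):,}")
```

Adding character types or extra length multiplies the work for a brute force attack, which is why clues that narrow the character set are so valuable.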

An even more secure version of a password is a passphrase, a phrase or short sentence that increases the number of possible combinations to strengthen the cryptographic hash. PGP (Pretty Good Privacy), a type of encryption software, is famous for the use of a passphrase and the difficulty of cracking the PGP hash. MD5 produces only a 128-bit hash value, but PGP with passphrases can use, for example, a 2048-bit key size. Simply put, cracking the encrypted data or even the passphrase by using a brute force method is almost impossible.

Knowing when to crack and when not to crack

As in other areas of life, time and money determine the choices that are available to you. Whether you decide to crack a password or try other means to obtain data depends on how much time remains on the meter and how much money is on the table. The biggest obstacle to cracking encrypted passwords is the time it takes to crack a well-defended password. Money plays a role because it determines how many toys you have in your arsenal, and how big they are! For example, using a standard home computer, cracking a 40-bit key cipher takes from a day to several weeks. The deep-pocketed and well-equipped NSA needs anywhere from less than one second for a simple 40-bit key cipher to several seconds for a well-defended one. If you have neither time nor money to waste and need to crack a password, be sure to read the rest of this chapter.
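You can rough out this kind of estimate yourself: Divide the number of possible keys by the number of guesses your equipment can make per second. In the following Python sketch, the guessing rates are assumptions picked purely for illustration, not measurements of any real machine, and the function names are our own.

```python
SECONDS_PER_DAY = 86_400

def keyspace(bits):
    """Number of possible keys for a key of this many bits."""
    return 2 ** bits

def days_to_exhaust(bits, guesses_per_second):
    """Worst-case days needed to try every key at a given guessing rate."""
    return keyspace(bits) / guesses_per_second / SECONDS_PER_DAY

if __name__ == "__main__":
    # Hypothetical rates: one million and one trillion guesses per second.
    for label, rate in (("modest computer", 1e6), ("specialized hardware", 1e12)):
        days = days_to_exhaust(40, rate)
        print(f"40-bit key, {label}: {days:.6f} days ({days * SECONDS_PER_DAY:.1f} seconds)")
    print(f"128-bit key, specialized hardware: {days_to_exhaust(128, 1e12):.3e} days")
```

On average you find the key after searching about half the keyspace, so the typical time is roughly half the worst case.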

Disarming passwords to get in

You might have tried to no avail to obtain a password from a suspect, and the e-evidence of the crime is sitting in the file you need to access. Because time and money are always an issue, start with simple solutions first and save the most time-consuming and expensive solution for last. Use these guidelines not only as directions but also to inspire ways to work "outside the box":

  • Crack the easy passwords first.

    Human nature dictates that few people use a different password for every file or account they are trying to protect. Most people simply reuse their passwords repeatedly, perhaps changing them slightly every time. This situation can work to your advantage because some applications are much easier to crack than others. Cracking a password in a word processing or spreadsheet program is so easy that certain shareware programs can handle the task. After you have one of these passwords, try the password you cracked on the more difficult algorithms to see whether you have a winner. You might be surprised at how often this technique works. If it doesn't work, try substituting characters or variations of the password.

  • Grab clues.

    When a user asks the browser to remember a site password to avoid having to type it repeatedly, you catch a break. Look in the cache for the passwords. Usually they're not the ones you want, but they can give you a clue to the target password or hints to how the user thinks. In Figure 8-5, the Cain & Abel software shows a typical password cache dump. Pay attention to the line that reads Default Password: It shows you the password to access the Windows operating system.

  • Bring on the brute force crackers.

    If all else fails, you have to use password cracking software, such as Cain & Abel (www.oxid.it) or John the Ripper (www.openwall.com/john). They can crack a password by brute force or use a dictionary, depending on which clues you picked up during your search. Any hints you find to reduce the number of possibilities save you processing time in spades! If necessary, create a custom dictionary just for this case with all possible passwords that this particular user may have used. Be sure to check pet names and favorite teams.

    Figure 8.5. Cain's Secrets Dumper reveals passwords.
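Building a custom dictionary for a case can be as simple as combining the clues you've gathered with a few common variations. The following Python sketch shows one way to do it. The clue values, the suffix list, and the function name build_wordlist are all made up for illustration; tailor them to what you actually know about the user.

```python
def build_wordlist(clues, suffixes=("", "1", "123", "!")):
    """Expand each clue into case variations with common endings."""
    seen = set()
    candidates = []
    for clue in clues:
        base = clue.strip().replace(" ", "")
        if not base:
            continue
        for form in (base.lower(), base.capitalize(), base.upper()):
            for suffix in suffixes:
                candidate = form + suffix
                if candidate not in seen:      # keep order, drop duplicates
                    seen.add(candidate)
                    candidates.append(candidate)
    return candidates

if __name__ == "__main__":
    # Hypothetical clues: a pet name and a favorite team.
    wordlist = build_wordlist(["Rex", "Red Sox"])
    print(len(wordlist))
    print(wordlist[:8])
```

You can save the resulting list to a text file and load it as a custom dictionary in the password cracking software of your choice.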

Circumventing passwords to sneak in

Getting around a password is either straightforward, because you took preventive measures ahead of time, or a nightmare. Usually, nothing exists between those two extremes.

Tip

If you can, install a keylogger or snooper software before a computer is seized and while the suspect is still using the computer.

If you have a bunch of evidence files sitting on your desk with passwords protecting them, you might be able to peek into them depending on the type of application. Applications such as word processors, databases, and spreadsheets often save their data in formats that can be read with a hex editor. For example, you can view the file contents in raw form using a hex editor such as WinHex and not even have to break the password. Keep in mind that the formatting disappears and you see strange characters, but some of the data is in human-readable format.
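If you don't have a hex editor handy, a few lines of Python can pull the human-readable runs out of a file's raw bytes. This sketch looks only for runs of printable ASCII characters; text stored in other encodings won't show up, and a file whose contents are truly encrypted yields nothing readable. The function names and the minimum length of 4 are our own choices.

```python
import re

def printable_strings(data, min_length=4):
    """Return runs of printable ASCII characters found in raw bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

def strings_in_file(path, min_length=4):
    """Read a file in binary mode and extract its readable strings."""
    with open(path, "rb") as handle:
        return printable_strings(handle.read(), min_length)

if __name__ == "__main__":
    sample = b"\x00\x01Secret plan\x00\xff\x02ab\x00Budget 2008\x00"
    print(printable_strings(sample))
```

Short runs are ignored because random binary data produces plenty of two- and three-character accidents.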

Other extremely technical methods exist for attacking a file and working around a password. The cost in time and money, however, often isn't worth the effort unless your organization's initials are in the three-letter formats FBI, DHS, CIA, or NSA.

Decrypting the Encrypted

In many ways, trying to decrypt a file involves Hollywood hype more than it involves reality. Most cryptographers agree that a better solution is to break the key and use the "cracked" key rather than try to decrypt an entire file.

A good way to look at this quandary is to take a look at this chapter. This chapter alone has more than 34,000 characters in it, and trying to decrypt every single one without the key, when a strong key cipher was used, would take literally thousands of years! Suppose, however, that the suspect created a key that's strong enough to withstand only a couple of months of analysis or was careless in storing the key. The bottom-line question is whether to attack a single key or an entire document. No clear-cut answer exists. Answers are based on a diagnosis of the situation and an educated guess at probability. Or, you might find a careless criminal.

Sloppiness cracks PGP

Another factor to consider in cracking encryption is that even heavily armored encryption algorithms, such as PGP, have been cracked at the key level. In the case of PGP, it wasn't the PGP system that was faulty — the users' careless use of the keys was their undoing. A chain is only as strong as its weakest link, which in this case happened to be the human link.

You can crack the key by using a keylogger. The user may actually leave a key stored on the computer allowing you easy access to cracking it, or (as is often the case) a user may not understand how the key really works and creates a weak or faulty key. You could even try tricking the user into revealing the key!

Desperate measures

One issue that most computer forensic analysts have no experience in handling is the self-destruct mechanism. Software self-destruct mechanisms are harder to detect than physical threats and are even harder to prevent. (After you pull the trigger, you can't call back the bullet.) A self-destruct system is usually a software program that destroys all evidence if a set of conditions is met, such as too many wrong passwords or incorrect usernames.

Tip

If the sophistication of a suspect indicates that they may have installed a piece of code or a password fail-safe, make a backup copy of the backup copy and call in a professional who deals with software coding or security issues of this type. The last thing you need to happen to your evidence is to watch it disappear because the password fail-safe was set to wipe any data if you missed the password three times!

Just as in steganography, this type of defense mechanism is hard to spot if you aren't looking for it. You might receive a warning, and you might not, but much depends on how your procedures are set up to handle this contingency. If you follow the proper protocol of using a copy of the copy of the e-evidence, the payload can self-destruct and you can just reload and try again. If and when this happens to you, have a professional handle the "defusing" of the logic bomb.
