Chapter 8. Understanding Information-Hiding Techniques

MOST USERS ARE UNAWARE THAT COMPUTERS CONTAIN large volumes of hidden data. In some cases, normal system use hides this data. In other cases, people deliberately conceal it using various techniques. As discussed in earlier chapters, hidden data includes fragments of deleted e-mail messages, backup copies of word processing files, deleted directory structures, and files reflecting a computer user's Internet browsing history. A careful examination of hidden data may tell a compelling story about document destruction or theft of intellectual property.

You can use a number of techniques to locate and retrieve hidden information. One method is to scan and evaluate alternate data streams. Another is to detect and work around rootkits, which attackers use to conceal their activity. Finally, you can use steganalysis to find data hidden through steganography. This chapter looks at each of these techniques, as well as the history of data hiding.

History of Data Hiding

Data hiding has been used for centuries. Before the invention of the computer, people hid data using physical techniques. With the advent of electronic, stored-program computers, people began hiding data using digital approaches.

In the late 1970s, it was possible to hide data on 5¼-inch double-sided, double-density floppy disks. Microsoft's Disk Operating System (DOS) recognized only the first 80 tracks of a disk. Therefore, data could be hidden above track 80. This became one of the most common data hiding techniques in the early microcomputer era. Individuals and companies used this technique to protect copyrighted software. However, its effectiveness was short-lived because application programs could access the tracks beyond track 80. They did so by bypassing the operating system function calls and accessing disk controllers directly. To defeat this form of copy protection, individuals learned how to pirate software. More sophisticated approaches to data hiding have since been developed.

Just as it's possible to hide data on a floppy disk, it's possible to hide data on a network. Attackers can use a covert channel to pass information between computers on a network. The covert channel slips information through a firewall and past an intrusion detection system. It uses ports that most firewalls permit through. In addition, the covert channel conceals data in a packet header but appears to be an innocuous packet carrying ordinary information. Two popular covert channel techniques are protocol bending and packet crafting.

Protocol bending is a covert channel technique that embeds data in packet headers. It uses a network protocol for some unintended purpose. Typically, it involves embedding data in Transmission Control Protocol/Internet Protocol (TCP/IP) packets in unexpected places. This is very similar to hiding data in higher-level DOS tracks. In addition, someone who wants to hide data can "bend" Internet Control Message Protocol (ICMP) to establish a covert channel between network endpoints. It's possible to use the data field of an ICMP packet, such as an echo request or reply, to convey application-layer covert data. ICMP packets are not expected to carry application-layer data, so most firewalls and intrusion detection systems don't inspect these packets.

Loki is one of the most widely known ICMP covert channel tools. In the absence of exhaustive packet analysis, Loki traffic looks like any other routine ICMP request-reply pattern for pings, source quenching, and so on. In fact, however, these ICMP packets transmit covert data. Another popular protocol bender is Reverse WWW Shell, which uses a form of protocol bending called shell shoveling over Hypertext Transfer Protocol (HTTP).

ICMP packets normally contain very short messages, and few legitimate reasons exist for large ICMP packets. If an ICMP packet is larger than 1,024 bytes, it may mean that a hacker is using ICMP as a channel for transmitting covert messages (see Figure 8-1). The presence of large ICMP packets might indicate a compromised machine or some other kind of suspicious activity.
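The size heuristic above can be sketched in a few lines of Python. This is a minimal illustration rather than a production detector. It assumes you already have raw IPv4 packets as bytes (for example, read from a capture file), it treats the IPv4 total-length field as the packet size, and the function and helper names are hypothetical.

```python
import struct

ICMP_PROTOCOL = 1        # IPv4 protocol number assigned to ICMP
SIZE_THRESHOLD = 1024    # bytes; the cutoff suggested in the text


def is_large_icmp(ip_packet: bytes, threshold: int = SIZE_THRESHOLD) -> bool:
    """Return True if a raw IPv4 packet is ICMP and longer than threshold."""
    if len(ip_packet) < 20:
        return False                       # too short to hold an IPv4 header
    if ip_packet[0] >> 4 != 4:
        return False                       # not IPv4
    total_length = struct.unpack("!H", ip_packet[2:4])[0]
    protocol = ip_packet[9]
    return protocol == ICMP_PROTOCOL and total_length > threshold


def build_test_packet(payload_len: int, protocol: int = ICMP_PROTOCOL) -> bytes:
    """Build a minimal IPv4 packet for testing (checksums are left at zero)."""
    total = 20 + 8 + payload_len           # IP header + ICMP header + payload
    ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total, 0, 0, 64,
                            protocol, 0, bytes(4), bytes(4))
    icmp_header = struct.pack("!BBHHH", 8, 0, 0, 1, 1)   # echo request
    return ip_header + icmp_header + bytes(payload_len)
```

A flagged packet is only a lead. Some diagnostic tools send large echo requests on purpose, so an investigator would still need to inspect the payload before drawing conclusions.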

Packet crafting is a technique for embedding data in packet headers. Covert_TCP is a packet crafter that uses active channeling, generating its own packet train to create the channel. NUSHU is a passive channeler that piggybacks on packets transmitted to the TCP/IP stack by other applications.

Figure 8-1. A large ICMP packet that might carry a covert message.

Alternate Data Streams (ADS)

A data stream is a sequence of bytes. One or more streams make up a file. A stream is created when a file is created. The stream stores the file's contents. Extra streams within a file can also be created. These are called alternate data streams (ADS). Developers invented multi-stream files to allow a file to store metadata about the file itself outside the normal file system structures.

In a computer file, metadata provides information about a file. This information includes the means of creation, the purpose of the data, the time and date of creation, the creator or author of data, where the data was created, and what standards were used. Metadata is a technical description of a data element. It is not the data itself. Metadata is important in forensic investigations. For example, metadata may describe data that is relevant to a case. Or if data does not correspond to its metadata, there may be hidden data.

A fork is metadata associated with a file system object, such as a data element. A fork's size is arbitrary. In some cases, a fork may be even larger than the file's data. A data element may have one or more forks. In Microsoft's NTFS (NT File System), forks are known as ADS.

In Windows 2000, Microsoft began using ADS in NTFS to store information such as author or title file attributes and image thumbnails. ADS can store anything, so they are good hiding places. With Windows XP, Service Pack 2, Microsoft introduced the Attachment Execution Service, which stores details on the origin of downloaded files in ADS. Microsoft did this to help protect users from downloaded files that may present a risk. By using Attachment Execution Service, a user can automatically include or exclude files from specific origins.

Note

Microsoft makes no guarantee that it will support ADS in any newer Windows file systems. The metadata it contains, however, will carry forward in some form. Thus, forensic specialists must review operating system characteristics to determine how metadata is created and maintained and what data that metadata provides.

With ADS, each stream is unique, and multiple streams may be associated with a single file or directory. Streams can store any type of information that a normal file in an NTFS volume can store. Other file systems, such as File Allocation Table (FAT), the Windows predecessor to NTFS, allow access to only one unnamed stream of data, which they treat as the file's actual contents.

If a file or directory with alternate stream data moves to a non-NTFS volume, the stream data and other non-supported attributes become unrecoverable. Windows operating systems currently support ADS only on NTFS volumes. They do not provide a means for disabling ADS, which is a plus for forensic examinations.

Risks Associated with ADS

ADS are not inherently a security risk. However, the lack of native Windows support for locating, editing, and removing them presents opportunities for potential abuse. How a system's virus scanner and other security mechanisms deal with streams, if at all, plays a large part in mitigating potential risks associated with using NTFS volumes. Ultimately, the benefits of NTFS volumes far outweigh the potential risks of ADS, as long as system administrators are aware of streams and have the proper security tools to handle them.

Using ADS to Hide Data

Attackers have many methods to hide data in ADS. For example, kernel space filter drivers such as kdl use ADS by attaching their log files to system files or directories. Extensions such as StrmExt.dll do not show the existence of a stream used by kdl while active. However, other tools, such as ADSspy, LADS, and Streams, do reveal the stream. kdl holds a lock to ensure that external tools cannot delete or alter the stream until the lock is released. kdl is set up by default to use Registry keys for configuration. Therefore, changing the log file to be stored as an ADS is trivial.
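Stream-listing tools can be complemented by simple scripting. The sketch below assumes the text output of the Windows command dir /R, which on recent Windows versions is understood to list named streams as name:stream:$DATA beneath the owning file. That output layout, the sample listing, and the stream name kdl.log are assumptions made for illustration, not details taken from any particular tool.

```python
import re

# Matches a size followed by "name:stream:$DATA", the way `dir /R` is
# assumed to print named streams.
STREAM_LINE = re.compile(r"^\s*([\d,.]+)\s+(.+?):([^:\\/]+):\$DATA\s*$")


def find_named_streams(dir_output: str):
    """Return (file, stream, size_text) tuples for each named stream listed."""
    hits = []
    for line in dir_output.splitlines():
        match = STREAM_LINE.match(line)
        if match:
            size, name, stream = match.groups()
            hits.append((name, stream, size))
    return hits


# Hypothetical listing used only to exercise the parser.
SAMPLE = """\
10/10/2010  10:00 AM                12 notes.txt
                                 4,096 notes.txt:kdl.log:$DATA
10/10/2010  10:01 AM    <DIR>          reports
"""
```

Calling find_named_streams(SAMPLE) yields one entry for notes.txt. A dedicated tool remains preferable in practice, because a text parser depends on the locale and layout of the listing.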

Note

Attaching an ADS to a critical file or directory has a high probability of triggering a security mechanism.

The System File Checker (sfc.exe) verifies versions of protected system files. However, it ignores any ADS associated with those files. Any user who has the appropriate permissions can attach stream data to protected system files, and sfc.exe can't detect it.

The following are a few other methods attackers may use to avoid detection of hidden data:

  • Using stream names such as encrypted, archive, or other common Windows terms

  • Creating streams that have no extension identifier

  • Creating streams attached to obscure system files for data dumps, log files, and so on—for example, packager.exe and sqlsodbc.chm

  • Storing encrypted data in single streams or across multiple streams

  • Storing binary data across multiple streams to be reassembled and executed at the time of use to avoid detection

  • Storing device drivers as streams

Destructive and Other Uses for ADS

Train system administrators to identify, prevent, and repair the results of destructive behavior. Attackers make dangerous use of ADS. The following are a few harmful ADS techniques:

  • Flooding a guaranteed available critical file such as ntoskrnl.exe with useless stream data to use all available disk space

  • Attaching Trojans, worms, viruses, spyware, and other malware as streams

  • Embedding trade secrets in streamed files

Executing Code from ADS

Computer criminals sometimes modify ADS to embed executable malware. These files execute with the start command from the command shell. ADS that contain executable code do not run simply because you access the default stream. You must call ADS code directly. For example, to execute a binary data stream from a file, use the following command:

start .\ADSFile.txt:notepad.exe

To execute a binary data stream from a directory, use this command:

start %systemroot%:notepad.exe

Next, use regedit to add the executable stream to the Windows startup process by executing from the Registry, as follows:

  1. Choose Start, Run and enter regedit.

  2. Navigate to the key HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run.

  3. Create a REG_SZ string value.

  4. Right-click on the new entry and select Modify.

  5. In the Value Data box, enter (PATH TO FILE)\ADSFile.txt:notepad.exe.

You can also use wscript to run Windows Script Host (WSH) scripts stored in streams, as in the following command:

wscript ADSFile.txt:wsh.vbs

The next command forces execution of a script from a stream with an incorrect extension:

wscript //E:vbs ADSFile.txt:wsh.bob

The //E: engine switch explicitly defines which engine to use when executing the script.
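From the investigator's side, the file:stream syntax used in the commands above is itself a useful indicator. The following sketch scans a command line or Registry value for tokens that look like stream references. It is a heuristic under stated assumptions: it splits on whitespace (so paths containing spaces are not handled), it treats a single leading letter followed by a colon as a drive letter, and the function name is hypothetical.

```python
import re

# A colon is legal after a drive letter ("C:\...") and in the "//E:" engine
# switch; any other "name:stream" pattern is treated as a possible ADS.
ADS_REFERENCE = re.compile(r'[^\s:\\/"]+:[^\s:\\/"]+')
DRIVE_PREFIX = re.compile(r"^[A-Za-z]:")


def ads_references(command: str):
    """Return substrings of a command line that look like file:stream."""
    hits = []
    for token in command.replace('"', " ").split():
        if token.upper().startswith("//E:"):
            continue                      # wscript engine switch, not a stream
        # Drop a leading drive letter such as "C:" before looking for colons.
        rest = token[2:] if DRIVE_PREFIX.match(token) else token
        match = ADS_REFERENCE.search(rest)
        if match:
            hits.append(match.group(0))
    return hits
```

Applied to the values found under a Run key, a non-empty result flags an entry for closer review. It does not prove that a stream exists.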

Rootkits

A rootkit also supports data hiding. A rootkit is a program or a combination of several programs that attackers typically use to hide their presence, cover their tracks, and obscure the fact that a system has been compromised. A rootkit does not itself actually hide data. Instead, it subverts the tools an investigator might use to find the data.

Note

Contrary to what its name may imply, a rootkit does not grant a user administrator privileges.

Rootkits originated as applications for taking control of a failing or unresponsive system. In recent years, attackers have begun using rootkits for malicious purposes. Intruders employ them to conceal malware and gain undetected access to systems. Rootkits exist for a variety of operating systems, including Microsoft Windows, Linux, Mac OS, and Solaris.

The manner in which an attacker installs a rootkit varies, depending on the operating system. Rootkits often modify parts of the operating system or install themselves as drivers or kernel modules. An attacker may use a rootkit to replace or corrupt system executables. The rootkit then hides installed processes and files. The software hidden by the rootkit allows the attacker to establish the Internet footprint of the targeted organization. Footprinting is the process of collecting data about a specific network environment, usually for the purpose of finding ways to attack the target. Once the computer criminal has established the Internet footprint, he or she enumerates live hosts and identifies active network services with a tool such as Nmap.

Note

A forensic analyst can often find and work around rootkits by making a bit-level copy of the media and performing an analysis on a forensic workstation with known tools. For more information about rootkits and actual rootkit examples, see http://www.rootkit.com.

Rootkits sometimes install a backdoor in a system. A backdoor provides a difficult-to-detect way to bypass normal authentication, gain remote access to a computer, obtain access to plain text, and so on. To install a backdoor, a rootkit may replace the logon mechanism with an executable that accepts a secret logon combination. The backdoor allows an attacker to access the system, regardless of changes to system accounts or other access control techniques.

A kernel module rootkit is a type of rootkit that installs itself into the application programming interface (API). The rootkit then intercepts system calls that programs send to request low-level functions. Because of its position in the stack, the rootkit has the ability to act as a man-in-the-middle, deciding what information and programs the user does and does not see. For example, someone may perform a directory listing and see all files except those prohibited by the rootkit. Well-written rootkits also control what the user sees in the Registry and the process list.
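Because a rootkit filters what the running system reports, one way to expose it is to compare two views of the same volume: the listing produced on the live system and the listing produced from a bit-level copy on a trusted forensic workstation. The sketch below shows only the comparison step. How each listing is gathered is outside its scope, and the function name and sample file names are hypothetical.

```python
def hidden_entries(live_view, trusted_view):
    """Return names present in the trusted view but missing from the live view."""
    return sorted(set(trusted_view) - set(live_view))


# Hypothetical listings of the same directory.
live = ["boot.ini", "notes.txt"]
trusted = ["boot.ini", "notes.txt", "rk_driver.sys"]
suspicious = hidden_entries(live, trusted)   # names the live system did not report
```

Entries that appear only in the trusted view deserve attention, although files created or deleted between the two snapshots can also produce differences.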

Steganography Concepts and Tools

The term steganography comes from the Greek for "concealed writing." In computer science, steganography is the science of hiding secret data within nonsecret data. It is based on the fact that data files can be slightly altered without losing their original functionality and without being detected by the human senses. The following are some important terms related to steganography:

  • The data that hides the secret is called the carrier file, or cover file. Today, multimedia files, such as pictures or sound, are the most common carrier files, but attackers use other types of carrier files as well.

  • Data that is to be kept a secret is called the embedded file, or embedded message.

  • The process of hiding is called embedding, or running the steganography algorithm.

  • The embedding process results in the stego message.

  • The recovering of the embedded message is called extraction.

Steganography's covert nature appeals to criminals. Therefore, a forensic investigator is likely to encounter steganography during a digital investigation. By searching for hints of steganography, an investigator might be able to collect otherwise undiscovered evidence from various digital media. Fortunately, an investigator can use automatic routine procedures to do this.

Types of Steganography

Simple steganography, also called pure steganography, is based on keeping the embedding method secret. However, it is unwise to rely only on the secrecy of the steganography algorithm. Forensic specialists know about many of these techniques and test for them.

With public key steganography (PKS), the sender and receiver share a secret key called the stego key. Only someone who possesses the stego key can detect the presence of an embedded message. Without the stego key, an adversary can't even see that there is an embedded message. PKS works like this (see also Figure 8-2):

  1. The recipient's public key encrypts the covert message. This creates a pseudorandom bit string.

  2. The stego key decides which bits to use to possibly alter the carrier file.

  3. The steganography algorithm embeds the message.

  4. The owner of the correct stego key extracts the message.

Figure 8-2. Public key steganography.

Steganography Algorithms

According to Andreas Furuseth in Digital Forensics: Methods and Tools for Retrieval and Analysis of Security Credentials and Hidden Data, there is a variety of steganography algorithms. An algorithm is a step-by-step procedure that a computer follows to solve a problem. As shown in Table 8-1, each one uses different locations in digital data to hide secret messages. Simple embedding methods do not alter the carrier file's perceptive properties, but you can easily detect them. Slightly more sophisticated steganography software does the embedding in the least significant bit.

Table 8-1. Steganographic embedding methods.

METHOD: DESCRIPTION

Simple Embedding

Data appending: Data appending is a simple form of steganography. This method relies on the algorithm's secrecy. It embeds the message by adding it to the end of the carrier file. This works, for instance, for some image file formats, such as Joint Photographic Experts Group (JPEG) and bitmap (BMP), because the file header contains a field indicating the total amount of data or because the format defines an "end of image" marker, and most image viewers ignore any data that follows. The appended message therefore remains hidden from casual viewing.

Addition of comments: Many file formats allow for optional comments. Many programming languages allow comments to aid the understanding of the code, and interpreters ignore them. For example, Hypertext Markup Language (HTML) files have a comment tag that browsers ignore. Such comments can therefore serve as hiding places for information, although you can easily view them in most browsers by selecting an option to view the HTML source code.

Use of file headers: Various data structures have header information, and some fields in the header are not mandatory or have values that are insignificant. Attackers can use such fields to communicate covertly. TCP/IP packets, for example, have unused space in the packet headers.

Least Significant Bit Embedding

Use of the least significant bit: Intruders can embed data in the least significant bits of an image in a format such as GIF or BMP, because changes to these bits are typically perceived as random noise. Attackers can do the actual embedding in several ways, from sequential changes to random walks with a pseudorandom generator.

Frequency Domain Embedding

Frequency embedding: A common type of embedding targets the frequency domain. Attackers use this type of embedding with JPEG compression and with the MPEG Layer 3 (MP3) encoding of Waveform Audio File Format (WAV) files.

Preservation of the Original Statistical Properties

Statistics-aware embedding: Most embedding methods alter the carrier file's statistical properties. For example, with least significant bit embedding in images, the frequency of colors changes. Statistics-aware embedding uses a model of the carrier file to preserve these characteristics.

Pseudorandom embedding: Some steganography software falls into the category of secret key steganography. This software uses a pseudorandom generator to select locations for the actual embedding.

Frequency domain methods are quite robust against perceptive inspection. However, the embedding introduces statistical changes to the carrier, resulting in stego messages with statistical properties that differ from those of cover messages. To prevent statistical attacks, the next generation of steganography algorithms preserves the carrier's original statistical properties. Some steganography software even does spatial and frequency domain embedding by utilizing pseudorandom techniques.
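To make the least significant bit method in Table 8-1 concrete, the following sketch embeds a message sequentially into the low bit of each carrier sample and reads it back. It is a minimal illustration that assumes the carrier is a plain sequence of 8-bit samples (such as uncompressed pixel values). It is not the algorithm of any specific tool, and the function names are hypothetical.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Write message bits, most significant first, into carrier byte LSBs."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit    # clear the low bit, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read `length` bytes back out of the LSBs of the first length*8 samples."""
    out = bytearray()
    for i in range(length):
        value = 0
        for sample in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)
```

Each sample changes by at most 1, which is why the alteration is hard to see, yet the changes still shift the statistics of the low-order bits, which is what statistical detection exploits.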

Steganography Software

Many tools hide information, including many types of steganography software. Most often, the information is hidden in some form of multimedia, such as an image, an audio file, or a video file. Many people have legitimate needs for these tools, such as those in government, military, law enforcement, and the academic community. However, criminals and terrorists also use steganography.

The following sections describe several steganography software tools.

EzStego

EzStego is an open source tool that uses least significant bit embedding in Graphics Interchange Format (GIF) images. GIF supports bitmap images and gets wide exposure on the Internet. It operates on the red-green-blue (RGB) color model, where each color is a combination of those primary colors. GIF uses eight bits per pixel, and each image can reference a palette of up to 256 distinct colors. A GIF image is a grid, with each cell—or pixel—pointing to the palette's appropriate position. Therefore, when rendering the picture, the color for each pixel is looked up in the palette. EzStego embeds data in GIF images by altering the colors of pixels.

EzStego adds a message bit to the least significant bit of a pixel's palette index, following these steps:

  1. Create a copy of the palette and rearrange it so that colors close to each other in the RGB color model are close in the palette.

  2. For each pixel, while bits of the message remain to be embedded, do the following:

    • Find the index, i, of this pixel's RGB color in the sorted palette.

    • Replace the least significant bit of i with a bit from the message, creating i*.

    • Find the RGB color that i* points to in the sorted palette, and assign that color to the pixel.

The changes to the carrier file won't be apparent. The palette stays the same as the original, and the EzStego tool adds no other strange artifacts to the stego file with the embedding. Recovering the hidden message from the stego file involves finding the index of each pixel's color in the sorted palette and reading that index's least significant bit.
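The palette-based steps described above can be sketched as follows. This is an illustration under stated assumptions, not EzStego's actual code: the palette is sorted by approximate luminance as a stand-in for the tool's own color ordering, the palette is assumed to hold an even number of unique colors, pixels are represented as palette indices, and all function names are hypothetical.

```python
def luminance(rgb):
    """Approximate brightness; a stand-in for EzStego's own color ordering."""
    red, green, blue = rgb
    return 0.299 * red + 0.587 * green + 0.114 * blue


def _sorted_view(palette):
    if len(palette) % 2 or len(set(palette)) != len(palette):
        raise ValueError("sketch assumes an even number of unique colors")
    ordered = sorted(palette, key=luminance)
    return ordered, {color: i for i, color in enumerate(ordered)}


def embed_bits(pixels, palette, bits):
    """Return new pixel indices carrying one message bit per pixel."""
    ordered, position = _sorted_view(palette)
    if len(bits) > len(pixels):
        raise ValueError("message longer than image")
    stego = list(pixels)
    for n, bit in enumerate(bits):
        i = position[palette[stego[n]]]            # index in the sorted palette
        i_star = (i & ~1) | bit                    # replace least significant bit
        stego[n] = palette.index(ordered[i_star])  # back to an original index
    return stego


def extract_bits(pixels, palette, count):
    """Read one bit per pixel from the sorted-palette index."""
    _, position = _sorted_view(palette)
    return [position[palette[p]] & 1 for p in pixels[:count]]
```

Because neighbors in the sorted palette are similar colors, swapping a pixel between index i and i* changes its appearance only slightly, and the palette itself is never modified.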

One method to detect messages embedded with EzStego is to run all files through EzStego with the -unsteg option and check the result. If the extracted message is readable plaintext, or if it carries header information that indicates encryption, you have found an embedded message. If the encrypted message has been stripped of header information, however, the extracted message is a pseudorandom bit string. These bits do not seem different from a bit stream you would extract from a carrier image that contains no message.

MandelSteg

The MandelSteg tool is freeware. It differs from other steganography tools in that it does not use an existing carrier. Instead, it creates its own stego images, based on a mathematical concept called Mandelbrot fractals. MandelSteg creates a Mandelbrot image and stores the hidden message in the specified bit of the image pixels. The tool GIFExtract pulls the message from the stego image.

MandelSteg is, from the forensic viewpoint, not a good alternative for hiding messages. The use of a Mandelbrot fractal image is unusual. Therefore, if you inspect seized data and detect the presence of Mandelbrot fractal images, it's safe to suspect the use of steganography.

An adversary can extract a MandelSteg message by running GIFExtract. A brute-force attack is possible because the different command-line options provide only 164 possibilities. If the embedded message is not encrypted or if known encryption headers are identified, you can detect the presence of steganography and defeat the tool. When the message is encrypted and stripped of headers, it appears pseudorandom and cannot be differentiated from other extracted messages.

Spam Mimic

Spam Mimic is a freeware steganography tool that uses spam as stego media. Spam Mimic works similarly to MandelSteg in that it does not need an existing carrier. The output from Spam Mimic is real text that looks like spam. The idea behind Spam Mimic is that spammers send junk mail all the time, so users who receive these messages don't suspect steganography.

Spam Mimic is simple steganography. You can run the algorithm in reverse and use the freely available tool to get the embedded message.

One challenge with Spam Mimic is that it is always inbound. Perhaps William is monitoring traffic coming to and from Alice. Inbound spam should not raise suspicion, but spam originating from Alice probably would. Spam Mimic is freely available, so William considers applying the algorithm for decoding and encoding. A forensic investigation monitoring traffic to and from a suspect can detect traffic going to the Spam Mimic home page, alert the investigator, and defeat the steganography.

Snow

Snow—which stands for steganographic nature of whitespace—is simple open source steganography software. It appends whitespace to lines in American Standard Code for Information Interchange (ASCII) text files, where the embedded message is encoded as space and tab characters. This whitespace at the ends of lines does not change the file's appearance in normal text viewers. Therefore, the resulting stego message is not visibly different from the original carrier. But Snow sometimes appends new lines to the carrier file, which clearly alters the stego message's visual characteristics.

Snow is not very sophisticated, and it leaves clear indicators of its presence that a diligent investigator can easily find. The trailing spaces are not normal and are a give-away. You can detect Snow with a visual inspection of text files or by using a detection algorithm that automates this process.
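A detection algorithm of the kind mentioned above can be very short. The sketch below reports every line that ends in spaces or tabs. It is a generic trailing-whitespace check, not a decoder for Snow's encoding, and the function name is hypothetical.

```python
def trailing_whitespace_lines(text: str):
    """Return (line_number, trailing_run) for lines ending in spaces or tabs."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.rstrip(" \t")
        if stripped != line:
            findings.append((number, line[len(stripped):]))
    return findings
```

A few stray trailing spaces are common in ordinary text, so long or frequent runs that mix spaces and tabs are the stronger indicator.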

OutGuess

OutGuess is an open source tool that serves as a framework for information hiding, regardless of the carrier data type. But an attacker has to create a handler for each type, and OutGuess typically uses JPEG images. Using OutGuess efficiently on several images requires the computer criminal to create a small script.

You cannot visually detect OutGuess. The academic community created this open source tool. This same community has also attacked and broken OutGuess. Stegdetect can detect OutGuess 0.13b. It does this by using a statistical technique. OutGuess 0.2 outsmarts Stegdetect by preserving the statistics from the carrier image.

An investigator who knows the correct key can easily extract a message embedded with OutGuess. However, without the key, message extraction is more problematic.

appendX

appendX is a simple open source steganography tool. Its embedding method is to append data to the end of the carrier file. The carrier file is generally an image file such as a Portable Network Graphics (PNG), JPEG, or GIF.

The appendX tool supports Pretty Good Privacy (PGP) header stripping. When an attacker encrypts a message with PGP, the PGP package adds a header that clearly identifies the hidden data as ciphertext. But the tool appendX strips this header, so the appended data looks like noise.

appendX adds hidden data to the end of the carrier. Following the embedded message, appendX pads the embedded message, from the left, until it has appended 10 characters. This can help you identify the software an attacker has used.

The appendX tool is classified as weak because you can relatively easily detect the presence of embedded data. After you have identified appendX, you can easily extract the additional data in the file.
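Appended data is easy to find when the carrier format marks its own end. The sketch below walks the chunks of a PNG file and returns whatever follows the final IEND chunk. It is a generic check for trailing data and does not interpret appendX's padding. It also does not validate chunk checksums, and the function name is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_data(data: bytes) -> bytes:
    """Return any bytes that follow the IEND chunk of a PNG file."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack("!I4s", data[offset:offset + 8])
        offset += 8 + length + 4
        if chunk_type == b"IEND":
            return data[offset:]
    raise ValueError("IEND chunk not found")
```

A non-empty result means the file carries data that an image viewer would ignore, which is worth extracting and examining.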

Invisible Secrets

Invisible Secrets is a commercial product from Neobyte Solutions (see http://www.neobytesolutions.com). This software security package provides encryption, safe deletion, and steganography. The steganography part of the package can hide information inside JPEG, BMP, PNG, HTML, and Waveform Audio File Format (WAV) data files. It uses least significant bit embedding, comment insertion, and whitespace appending. The software is easy to use, is well documented, and integrates into the Windows shell and Start menu by default. Invisible Secrets' embedding techniques are not more sophisticated than freely available steganography tools. However, this software is more user friendly.

Invisible Secrets is not difficult to detect because the methods used to embed a message are well known. An interesting feature of Invisible Secrets is the possibility of creating bogus stego messages. Doing so could increase the difficulty of detecting a specific hidden message. However, more stego messages would increase the possibility of detecting the use of steganography. Therefore, such a move would defeat the covert channel. This is especially true when using simple steganography software, which is fragile to algorithm exposure.

Defeating Steganography

System forensics investigators are experts at finding information from a digital crime scene. While searching for digital evidence, an investigator might find hints of steganography or, for some other reason, suspect its use. An absolute indication of the use of steganography is the discovery of steganography software. This software could be on the suspect's disk drive or hidden on a memory stick, compact disk (CD), or other removable media.

Search automatically for known steganography software. To do this, maintain a database of cryptographic hash values for known components of such software. The National Institute of Standards and Technology (NIST) maintains a collection of digital signatures of software applications, including steganography software, called the National Software Reference Library (NSRL). For more information, see http://www.nsrl.nist.gov.
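The hash lookup can be sketched as follows. The example assumes SHA-1 as the hash function and represents the database as a simple in-memory mapping from hash value to product name. It does not model the actual NSRL file format, and the function names and sample data are hypothetical.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the SHA-1 digest of data as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def match_known_software(files, known_hashes):
    """Map each file name whose hash is in known_hashes to its product label.

    files: mapping of file name -> file contents (bytes)
    known_hashes: mapping of SHA-1 hex digest -> product label
    """
    hits = {}
    for name, content in files.items():
        label = known_hashes.get(sha1_of(content))
        if label is not None:
            hits[name] = label
    return hits
```

A hash match identifies a file regardless of its name or location, so renaming a steganography tool does not hide it. Any change to the file's contents, however, produces a different hash.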

Detecting the Use of Steganography Software

The following sections discuss some of the ways a forensic investigator can find evidence of the use of steganography.

Traces of Steganography Software

If you aren't able to find any steganography software during an investigation, you might discover traces of its use in various locations. The Windows Registry, a recently used file list, the Web browser history, or tools used for software extraction may show signs of steganography. For example, the list of recently used files in WinZip or WinRAR can present evidence of a recent extraction of EzStego.zip or some other software file.

Location of Pairs of Carrier/Stego Files

As discussed earlier in this chapter, steganography software often creates stego messages based on an original carrier file. A pair of files that have different hash values but the same perceptional properties—that is, two images that look similar but have slightly different least significant bit planes—is a potential carrier/stego file pair. Even if the carrier file were deleted, you could in some cases undelete it by using forensic tools.
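A simple test for a potential carrier/stego pair is to check whether two files' decoded samples are identical once the least significant bit of each sample is ignored. The sketch below assumes both images have already been decoded to raw 8-bit samples of equal length. The decoding step is not shown, and the function name is hypothetical.

```python
def is_candidate_pair(samples_a: bytes, samples_b: bytes) -> bool:
    """True if two sample sequences differ, but only in least significant bits."""
    if len(samples_a) != len(samples_b) or samples_a == samples_b:
        return False
    return all((a & 0xFE) == (b & 0xFE) for a, b in zip(samples_a, samples_b))
```

This check targets least significant bit embedding only. Pairs produced by frequency domain methods would need a comparison of the decoded coefficients instead.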

Some forms of steganography hide information within an image by using mathematical algorithms. These procedures replace unused or little-used bits within an image with bits representing the data an attacker aims to conceal. The algorithms change only a few of the many bits that make up an image and do not damage the image's appearance. Thus, the image looks the same even though it now contains covert data. Use software to analyze images and detect the small variations imposed by the attacker's steganographic program. Then extract the hidden data for analysis. To help analyze images, collect characteristics of images such as child pornography images and use those characteristics to aid in the automatic detection of steganography. Also, it is possible to apply the same principle to other media types used for steganography.

Keyword Search and Activity Monitoring

In addition to having a database of hash signatures for steganography software, compile a dictionary of key terms, as discussed in Chapter 7, "Collecting, Seizing, and Protecting Evidence." You can perform searches on seized data to try to locate the listed terms. Keyword searching is not limited to steganography. The contents of the dictionary determine what the search targets.

The rate of false positives depends on the words used. Good candidates are software names such as OutGuess and words such as carrier and cover. To create part of the database of keywords, use the strings utility to extract words from steganography software binaries.
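The idea of pulling candidate keywords out of binaries, and then searching seized data for them, can be sketched in Python. The example mimics the basic behavior of a strings-style extraction by collecting runs of at least four printable ASCII characters. The minimum length, the function names, and the sample data are assumptions for illustration.

```python
import re

# Runs of four or more printable ASCII characters.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(binary: bytes):
    """Return printable ASCII runs found in binary data, in order."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(binary)]


def keyword_hits(data: bytes, keywords):
    """Return the keywords that occur (case-insensitively) in data's strings."""
    lowered = [s.lower() for s in extract_strings(data)]
    return sorted({k for k in keywords if any(k.lower() in s for s in lowered)})
```

Short, common words such as cover produce many false positives, so distinctive software names usually make better dictionary entries.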

In addition to keyword searches, history logs in a suspect's Web browser might show visits to steganography Web sites. Have your system forensics team keep a list of such Web sites to compare against browser histories.

Suspect's Computer Knowledge

Most steganography software is fairly easy to use, and a suspect need not have a lot of computer knowledge to use it. However, if a suspect has advanced computer skills, you might speculate on the suspect's use of homemade steganography tools or algorithms. Examine any unknown software you encounter during an investigation to discover its functionality.

Unlikely Files

Some steganography software uses or creates uncommon file types. The MandelSteg tool described earlier in this chapter uses no existing carrier. Instead, it creates a Mandelbrot image that carries the hidden message. Such images clearly stand out among vacation photos and other typical images on a PC. Consider what use a suspect might have for such images. Similarly, look for inconsistencies between the file types you find and a suspect's religion, interests, and so on. Consider the speculation about al-Qaeda hiding information in pornographic images. Such images are against the Muslim religion, so treat findings of this kind as unlikely and suspicious.

Location of Steganography Keys

Some steganography software uses steganography keys, called stego keys, to seed a pseudorandom number generator that selects the locations for bit manipulations. Because a stego key is typically a password the user chooses, you can attack weak passwords in a brute-force manner. Software is available to support such attacks on steganography.

During the physical crime scene investigation phases, forensic specialists collect handwritten notes or markings. You should determine whether any of these are passwords or other key information. In addition, try to guess passwords with dictionary attacks and other brute-force methods, and combine the results with cryptanalysis and steganalysis.
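To see why the stego key matters, consider how a key can determine embedding locations. The sketch below is an assumption for illustration: it uses Python's general-purpose random module and a hypothetical function name, whereas real tools use their own generators and key derivation.

```python
import random


def embedding_positions(stego_key: str, carrier_len: int, n_bits: int) -> list[int]:
    """Derive n_bits distinct carrier positions from the stego key.

    The key seeds a pseudorandom shuffle of all positions, so sender and
    receiver obtain the same order only if they use the same key.
    """
    if n_bits > carrier_len:
        raise ValueError("message does not fit in the carrier")
    order = list(range(carrier_len))
    random.Random(stego_key).shuffle(order)
    return order[:n_bits]


sender = embedding_positions("tiger", 1000, 16)
receiver = embedding_positions("tiger", 1000, 16)
guess = embedding_positions("dragon", 1000, 16)

print(sender == receiver)  # same key, same positions
print(sender == guess)     # a wrong key selects other positions
```

Without the key, an examiner does not know which bits to read, which is why recovering or guessing the key is central to extraction.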

Steganography software sometimes encrypts a message prior to embedding it. Analysis of steganography involves separating the embedded message from the stego message. After extracting a hidden encrypted message, attack this ciphertext by using cryptanalysis techniques or other forensic techniques.

Encryption schemes often add a header to an encrypted message. This header can contain data such as the encryption algorithm used. This plaintext header can aid steganalysis because the embedded message is then no longer just random-looking data. You can also use it to identify a successful brute-force attack on a secret key steganography system. For instance, Stegbreak relies on detecting known headers in the extracted message to signal success. Sometimes, however, an embedded encrypted message has been stripped of headers. The extracted message is then a random-looking bit string, which makes it difficult to assess whether it is valid ciphertext you need to treat further or just noise.

Strengths and Weaknesses of Today's Detection Methods

The methods described so far for detecting steganography software can tell you that an attacker has probably hidden data, but they don't help you decipher what data has been hidden or what works as evidence. Skilled investigators must put the pieces together into a cohesive description of the event.

Some of the techniques just described are minor expansions of already existing forensic procedures. Adding steganography keywords to the dictionary for the keyword search is one such example. Adaptation of methods requires little work and little expense, yet it can be very effective. Some of the tools used for detecting steganography or extracting data are quite expensive and, with large data volumes, can be quite time-consuming. One option is to automate these tasks.

Steganalysis

Steganalysis is the process of detecting messages hidden with steganography. In other words, steganalysis is about separating cover messages from stego messages. It tries to defeat a steganography algorithm by looking for weaknesses. Steganalysis often involves the use of statistical properties to look for abnormalities in files, such as strange palettes in GIF images or other known signatures in stego messages.

According to Andreas Furuseth, a system forensics specialist must have a good grasp of steganalysis and its use when attacking steganography. However, system forensics offers more ways to attack steganography than steganalysis alone. For example, items from the physical scene can provide security credentials in the form of written passwords. Well-known forensic methods include searching for keys, passwords, and known keywords and recovering deleted data. Locating such artifacts helps an investigator defeat steganography.

Steganalysis is limited to the detection of an embedded message, not message extraction. Steganalysis detects stego messages among cover messages. The detection of a hidden message can identify the embedding method. After identifying a tool or method, you might be able to extract the message.

Most steganalysis algorithms and tools target specific steganography software. Such tools rely on specific signatures left in the stego message. Embedding a message can also change the statistical properties of a file, which provides another way to detect stego messages.
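One well-known statistical property involves pairs of values. Least significant bit embedding tends to equalize how often the two values in each pair (2k, 2k+1) occur. The chapter does not name a specific test, so treat the following chi-square-style statistic as an illustrative assumption rather than the method any particular tool uses.

```python
from collections import Counter


def pair_chi_square(values: bytes) -> float:
    """Sum, over each value pair (2k, 2k+1), the squared deviation of the
    even count from the pair average, divided by that average.

    A statistic near zero means the pair counts are almost equal, which
    is what heavy LSB embedding tends to produce.
    """
    counts = Counter(values)
    stat = 0.0
    for even in range(0, 256, 2):
        n_even, n_odd = counts[even], counts[even + 1]
        expected = (n_even + n_odd) / 2
        if expected > 0:
            stat += (n_even - expected) ** 2 / expected
    return stat


# Hypothetical sample values: an unmodified image with uneven pair counts,
# and the same image after its LSBs were overwritten with message bits.
natural = bytes([10] * 90 + [11] * 10 + [20] * 80 + [21] * 20)
embedded = bytes([10] * 50 + [11] * 50 + [20] * 52 + [21] * 48)

print(pair_chi_square(natural))
print(pair_chi_square(embedded))
```

A complete test would convert the statistic into a probability and examine the file in growing segments. This sketch only shows that embedding leaves a measurable trace.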

The following sections discuss some of the common steganalysis methods.

File Signatures

Some steganography software adds specific signatures to stego messages. For example, the string "CDN" is always present when using a tool called Hiderman. Use these file signatures to spot stego messages.
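A signature scan amounts to a byte search. The sketch below uses only the Hiderman signature mentioned above. The table layout and function name are hypothetical, and the sample bytes are made up.

```python
# Signature table; only the "CDN" entry comes from the text above.
KNOWN_SIGNATURES = {"Hiderman": b"CDN"}


def find_signatures(data: bytes, signatures=KNOWN_SIGNATURES) -> dict[str, int]:
    """Map each tool whose signature occurs in data to the offset of the
    last occurrence of that signature."""
    hits = {}
    for tool, signature in signatures.items():
        offset = data.rfind(signature)
        if offset != -1:
            hits[tool] = offset
    return hits


# Hypothetical file contents: image bytes, hidden payload, then signature.
suspect = b"\xff\xd8image-bytes\xff\xd9" + b"payload" + b"CDN"
clean = b"\xff\xd8image-bytes\xff\xd9"

print(find_signatures(suspect))
print(find_signatures(clean))
```

A three-byte signature can occur by chance in large files, so treat a hit as a lead to verify rather than as proof.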

File Anomalies

Some simple steganography software embeds messages by appending data to the end of the carrier file. appendX and Invisible Secrets, discussed earlier in this chapter, are examples of this type of steganography software. When software, such as image viewers, reads these files, the amount of data read depends on the file length defined in the file header. Therefore, the appended data is not read, and the changes to the carrier are not perceptible.
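You can detect appended data by comparing the length declared in the header with the actual file length. The sketch below does this for BMP files, whose header stores the file size as a 4-byte little-endian integer at offset 2. It assumes the embedding tool left that field untouched, and the function name is hypothetical.

```python
import struct


def bmp_trailing_bytes(data: bytes) -> int:
    """Return how many bytes follow the end of the BMP as declared in its header."""
    if len(data) < 6 or data[:2] != b"BM":
        raise ValueError("not a BMP file")
    (declared_size,) = struct.unpack_from("<I", data, 2)
    return max(0, len(data) - declared_size)


# Hypothetical 64-byte BMP: magic, declared size, then filler bytes
# standing in for the rest of the header and the pixel data.
clean_bmp = b"BM" + struct.pack("<I", 64) + bytes(58)
stego_bmp = clean_bmp + b"secret"

print(bmp_trailing_bytes(clean_bmp))  # no bytes beyond the declared size
print(bmp_trailing_bytes(stego_bmp))  # appended bytes show up as a surplus
```

Some legitimate software writes an inaccurate size field, so confirm a surplus by inspecting the trailing bytes themselves.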

When Invisible Secrets uses least significant bit embedding in BMP images, it sets all the bits it hasn't used for embedding to 0 or to 1. You can detect such file anomalies when examining suspect files during steganalysis.

Visual Attacks

It is possible to remove all parts of an image that cover a hidden message, for example by displaying only the least significant bit plane, and visually determine whether what remains contains a potential message. Visual attacks succeed only when the cover image has clearly structured contents. Image textures typically withstand this type of attack.
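A visual attack on least significant bit embedding can be sketched as follows. The example assumes grayscale sample values that are already decoded, and it renders the bit plane as text so that it needs no imaging library. Both function names are hypothetical.

```python
def lsb_plane(pixels: bytes) -> bytes:
    """Map each sample to 255 if its least significant bit is set, else 0."""
    return bytes(255 if pixel & 1 else 0 for pixel in pixels)


def render_rows(plane: bytes, width: int) -> list[str]:
    """Render the plane as rows of '#' (bit set) and '.' (bit clear)."""
    return [
        "".join("#" if value else "." for value in plane[start:start + width])
        for start in range(0, len(plane), width)
    ]


# Hypothetical 4x2 grayscale image.
pixels = bytes([200, 201, 54, 55, 13, 12, 97, 96])

for row in render_rows(lsb_plane(pixels), width=4):
    print(row)
```

In an unmodified image with clear structure, the bit plane often still shows outlines of the picture. Regions overwritten with message bits tend to look like uniform noise instead.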

Extracting Hidden Information

After you detect hidden information, you can extract the embedded message. In some situations, this means running the identified steganography software. With simple steganography, this is relatively easy. If the software uses a stego key, you need that key to succeed. In this case, perform a brute-force or dictionary attack on the stego key.

Even though not all users of steganography encrypt their message, it's wise to expect encryption. Stripping headers from encrypted files results in an embedded message being indistinguishable from noise. PGP Stealth is a tool that strips all headers from a PGP encrypted message. The complexity of a brute-force search is then much greater. For each stego key, try all encryption keys. Only a successful cryptanalysis of the embedded message will show whether the currently tried stego key is correct.
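The following toy example shows the logic of a dictionary attack on a stego key. Everything in it is an assumption for illustration: the embedding scheme is a simple keyed least significant bit method, not that of any real tool, and the attack recognizes success by finding a known file header (here the PDF magic bytes) in the extracted data.

```python
import random

MAGIC = b"%PDF"  # assumed known header of the hidden file type


def positions(key: str, carrier_len: int, n_bits: int) -> list[int]:
    """Key-dependent embedding order: a seeded shuffle of all positions."""
    if n_bits > carrier_len:
        raise ValueError("message does not fit in the carrier")
    order = list(range(carrier_len))
    random.Random(key).shuffle(order)
    return order[:n_bits]


def embed(carrier: bytes, message: bytes, key: str) -> bytes:
    """Write the message bits, most significant first, into keyed LSBs."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    out = bytearray(carrier)
    for pos, bit in zip(positions(key, len(carrier), len(bits)), bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def extract(stego: bytes, n_bytes: int, key: str) -> bytes:
    """Read n_bytes back from the LSBs that the key selects."""
    bits = [stego[pos] & 1 for pos in positions(key, len(stego), n_bytes * 8)]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )


def dictionary_attack(stego: bytes, words, magic: bytes = MAGIC):
    """Return the first word whose extraction starts with the known header."""
    for word in words:
        if extract(stego, len(magic), word) == magic:
            return word
    return None


carrier = bytes(range(256)) * 4
secret = b"%PDF-1.4 hidden report"
stego = embed(carrier, secret, "tiger")

print(dictionary_attack(stego, ["password", "letmein", "tiger", "dragon"]))
```

The header check is what makes the attack practical. If the embedded message has no recognizable header, every candidate key yields random-looking bits, and you cannot tell a correct guess from a wrong one.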

You can disable hidden information in several ways. A drastic step is to disallow all communication—that is, intercept the communicated image. When this is not possible or wanted, change the assumed stego message by giving it a different file format, performing image processing such as blurring or cropping, or adding noise to the least significant bit of a GIF image. You will probably lose the original embedded message because of these changes. However, note the existence of watermarks, which are quite robust and can withstand multiple changes. There is a trade-off between robustness and the amount of hidden information. Adding error correction to the message reduces the amount of information the message can carry.
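Adding noise to the least significant bits can be sketched as follows. The example assumes decoded sample values rather than a real GIF file, and the function name is hypothetical.

```python
import random


def scrub_lsb(samples: bytes, rng=None) -> bytes:
    """Replace every least significant bit with a random bit.

    This destroys a message stored in the LSB plane while changing each
    sample value by at most 1.
    """
    rng = rng or random.Random()
    return bytes((sample & 0xFE) | rng.getrandbits(1) for sample in samples)


# Hypothetical sample values of a suspected stego image.
suspected = bytes([201, 13, 55, 96, 128, 254, 7, 42])
scrubbed = scrub_lsb(suspected, random.Random(0))

print(max(abs(a - b) for a, b in zip(suspected, scrubbed)))  # at most 1
```

Work on a copy, never on the evidence itself. In palette-based formats such as GIF, the sample values are palette indexes, so changing them by 1 can alter colors visibly unless the palette is sorted.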

Steganalysis Software

Certain tools, called steganalysis software, can detect the presence of steganography. Some of the tools are open source, whereas others are quite expensive. This section describes steganalysis software that you can use to detect steganography and to mount dictionary attacks against stego keys.

StegSpy

StegSpy v2.1 is freely available steganalysis software to spot Hiderman, JPHideandSeek, Masker, JPegX, and Invisible Secrets. It has a graphical interface that allows you to manually select one file at a time to examine.

StegSpy does signature-based steganalysis. It looks for particular strings to detect steganography software used for message embedding. For example, it looks for the string CDN, which is always present with Hiderman. Also, after discovering a known signature, StegSpy locates the embedded data position by detecting the end of the image, based on header information.

Because StegSpy can examine only one file at a time and because it examines signatures but not file anomalies, this tool's forensic value is low. However, the knowledge it contains—the signatures—is quite useful.

Stegdetect

Stegdetect is open source steganalysis software developed by the academic community. It unearths the presence of messages embedded with JSteg, JPHide, Invisible Secrets, OutGuess 0.13b, F5, appendX, and Camouflage. Newer theoretical steganalysis algorithms have been suggested, but no publicly known steganalysis software supports those algorithms at this time.

Stegdetect is not limited to the systems listed above. Using linear discriminant analysis, it can be trained on a set of known carrier images and a set of stego images produced by another JPEG-based steganography system. It can then classify new images as stego images or carrier images.

Stegbreak

Stegbreak is open source software developed by the same author who developed Stegdetect. This is not software that detects the presence of steganography. Rather, it is for message extraction and attempted dictionary attacks against JSteg-Shell, JPHide, and OutGuess.

A forensic investigation can dig up password clues. Add these clues, such as name, birthday, pet's name, and so on, to the dictionary that Stegbreak uses. Stegbreak's success is closely related to the quality of the password and the dictionary. The rules to permute words in the dictionary also determine Stegbreak's success.
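Permutation rules expand a short list of clues into many candidate passwords. The rules below (case changes, reversal, and a few common suffixes) are illustrative assumptions and are not Stegbreak's actual rule set.

```python
SUFFIXES = ("", "1", "123", "!")  # assumed common password endings


def permute(clues) -> list[str]:
    """Expand password clues into unique candidates, keeping first-seen order."""
    seen = set()
    candidates = []
    for clue in clues:
        word = clue.strip()
        forms = (word, word.lower(), word.upper(), word.capitalize(), word[::-1])
        for form in forms:
            for suffix in SUFFIXES:
                candidate = form + suffix
                if candidate and candidate not in seen:
                    seen.add(candidate)
                    candidates.append(candidate)
    return candidates


# Hypothetical clues found at the scene: a pet's name and a birth year.
wordlist = permute(["Rex", "1984"])

print(len(wordlist))
print(wordlist[:5])
```

More rules produce more candidates and a better chance of success, but each added rule multiplies the time a dictionary attack takes.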

Stegbreak tries to verify that the extracted bit string is an embedded message and not just noise. It does this by identifying file headers in the extracted bit string.

Stego Suite

WetStone Technologies (see http://www.wetstonetech.com) offers Stego Suite. This commercial package consists of the detection tools Stego Hunter, Stego Watch, and Stego Analyst, and a password cracker called Stego Break. WetStone also offers training in using these tools. These tools can scan audio files, JPG, BMP, GIF, PNG, and more to identify more than 500 known steganography programs.

StegAlyzer

The Steganography Analysis and Research Center (see http://www.sarc-wv.com/products.aspx) sells three steganalysis products:

  • StegAlyzerAS searches file systems for traces of known steganography software.

  • StegAlyzerSS includes the functionality to detect known stego file signatures.

  • StegAlyzerRTS detects steganography artifacts and signatures in real-time over a network.

These tools are expensive but are available on a free 30-day trial basis. Freely available tools such as The Sleuth Kit and Autopsy Browser support the detection of file hash signatures and known stego file signatures. However, the StegAlyzer products have a large database of software and stego file signatures.

CHAPTER SUMMARY

It isn't a question of whether an attacker has hidden covert data on the hard drives of unsuspecting users but what data, for what purposes, and where. Well-funded hackers, criminals, and terrorists hide data, and forensic investigators and law enforcement try to catch up with the latest tactics.

You'll be flushing out a number of data-hiding techniques, such as ADS, rootkits, and steganography, in your future investigations. To detect their embedded messages, it's important to be aware of attackers' methods. In turn, many methods and programs are available to help you defeat steganography, including steganalysis.

KEY CONCEPTS AND TERMS

  • Algorithm

  • Alternate data streams (ADS)

  • Backdoor

  • Carrier file

  • Covert channel

  • Digital watermarking

  • Embedded file

  • Embedding

  • Extraction

  • Footprinting

  • Kerckhoffs' principle

  • Kernel module rootkit

  • Metadata

  • Packet crafting

  • Pirate

  • Protocol bending

  • Public key steganography (PKS)

  • Rootkit

  • Security through obscurity

  • Simple steganography

  • Steganalysis

  • Steganalysis software

  • Steganography

  • Stego key

  • Stego message

CHAPTER 8 ASSESSMENT

  1. Packet crafting and protocol bending are two ________ techniques.

  2. _________ can be modified to embed executable malware.

  3. The benefits of NTFS volumes far outweigh the potential risks of ADS, as long as system administrators are aware of streams and have the security tools to handle them.

    1. True

    2. False

  4. A _______ is software that prevents users from seeing all items or directories on a computer.

  5. Steganography is the science of extracting secret data from nonsecret data.

    1. True

    2. False

  6. Which of the following do attackers use to hide secret data in steganography?

    1. Metadata

    2. Embedded file

    3. Carrier file

    4. Stego message

  7. Which of the following is another name for running a steganography algorithm?

    1. Embedding

    2. Digital watermarking

    3. Extraction

    4. Packet crafting

  8. With PKS, the sender and receiver share a secret key called the stego key. Only a possessor of the stego key can detect the presence of an embedded message.

    1. True

    2. False

  9. Steganography is another name for cryptography.

    1. True

    2. False

  10. Which of the following is not a steganographic embedding method?

    1. Simple embedding

    2. Least significant bit embedding

    3. Frequency domain embedding

    4. Digital watermarking

    5. Preservation of the original statistical properties

  11. Which of the following are examples of steganography software? (Select three.)

    1. StegAlyzerAS

    2. MandelSteg

    3. EzStego

    4. Stegdetect

    5. Snow

    6. Stegbreak

  12. The only way to defeat steganography is to use steganalysis.

    1. True

    2. False

  13. Which of the following is not true of steganalysis?

    1. It involves separating cover messages from stego messages.

    2. It hides secret data within nonsecret data.

    3. It is the process of detecting messages hidden using steganography.

    4. It often involves the use of statistical properties to look for abnormalities in files.

  14. _________ is the name for separating an embedded message from a stego message.

  15. Which of the following are examples of steganalysis software? (Select three.)

    1. StegAlyzerAS

    2. MandelSteg

    3. EzStego

    4. Stegdetect

    5. Snow

    6. Stegbreak
