Chapter 6

High-tech investigations of cyber crime

Emlyn Butterfield

Abstract

Digital devices and information are now ingrained in today’s society, from computers used to perform everyday tasks to mobile phones used to send and receive calls and messages. The proliferation of these devices not only aids an individual’s productivity and communication; it also means they can be used to perpetrate crime. Digital devices can be used as part of, or be the “victim” of, a criminal act. This has led to the requirement for high-tech investigations that analyze digital information to identify what has happened and who was responsible. This chapter introduces some of the key steps and fundamental concepts required to successfully perform a high-tech investigation and maintain the chain of custody of digital evidence from identification to seizure; this includes the initial steps within a high-tech analysis.

Keywords

High-tech investigation

Data capture

Analysis

Digital evidence

Data interpretation

Introduction

Digital information has become ubiquitous in today’s world; there is an increasing reliance on digital information to maintain a “normal” life, communication and general socialization. The prolific use of networked devices allows anybody from any country to attack, or utilize a digital device to attack, their next-door neighbor or someone on the other side of the world with only a few clicks of a button. Ignorance among users, criminals included, of what information a device stores and the amount of data it generates means that a properly trained and equipped expert can recover and make use of information from almost any digital device. This same digital information can now provide evidence and intelligence that can be critical to criminal and civil investigations. Understanding the threat of cyber-crime and cyber-terrorism allows us to put into context the current technical situation; however, identifying the potential for attack is only the start. Within this chapter we will be looking at defining high-tech investigations and evidential processes that are applicable to all investigations. The description and processes will aid in the investigation of a crime, or malicious action, after the event has occurred.

High-Tech Investigations and Forensics

The term “forensics” can bring to mind popular American television series. Television shows that glorify forensic analysis such as these have both helped and, to some extent, hindered forensic science. It has helped by bringing the concept of forensic capabilities to a wider audience so that a heightened level of awareness exists. Conversely, it has also hindered it by exaggerating the technical capabilities of forensic scientists: no matter what the television shows suggest it is entirely possible for data to be beyond recovery by even the most eminent experts. However, surprisingly, many users remain ignorant of the kind of data that can be scavenged from various digital sources that will underpin an investigation. Digital devices are essentially a part of any investigation in one way or another:

 Used to conduct the activity under investigation: the device is the main focus of the activity, such as the main storage and distribution device in a case of indecent images of children.

 Target of the activity under investigation: the device is the “victim,” such as in an incident of hacking.

 Supports the activity under investigation: the device is used to facilitate the activity, such as mobile phones used for communication.

High-tech investigations relate to the analysis and interpretation of data from digital devices, and are often called upon when an incident (criminal or civil) has occurred. They are not purely about using the most advanced technology to perform the work. A good proportion of what is done is actually “low-tech” in the sense that it is the investigator’s mind doing the work and interpreting the data that is available. The primary objective of a high-tech investigation is to identify what happened and who was responsible.

Core Concepts of High-Tech Investigations

It is generally agreed that a high-tech investigation encompasses four main distinct components, all of which are important toward the successful completion of an investigation:

1. Collection: the implementation of a forensic process to preserve the data contained on the digital evidence while following accepted guidelines and procedures. If performed incorrectly any data produced at a later date may not stand up in a court of law.

2. Examination: systematic review of the data utilizing forensic methodologies and tools whilst maintaining its integrity.

3. Analysis: evaluating the data to determine the relevance of the information to the requirements of the investigation, including that of any mitigating circumstances.

4. Reporting: applying appropriate methods of visualization and documentation to report on what was found on the digital evidence that is relevant to the investigation.

These four main components underpin the entire investigative process allowing high-tech investigators and reviewers of the final product to be confident of its authenticity, validity, and accuracy (also see Chapter 4).

An important consideration throughout a high-tech investigation is to maintain the “chain of custody” of the exhibit, so that it can be accounted for at all stages of an investigation and its integrity maintained. With a physical exhibit this is achieved, in part, through the use of an evidence bag and a tamper-evident seal. The integrity of digital information is maintained through one-way hash functions, such as MD5, SHA-1, and SHA-256. A one-way hash function creates a digital fingerprint of the data that is, for practical purposes, unique; when implemented correctly, even a small change to the data will result in a completely different fingerprint. If the physical and digital integrity of an exhibit is maintained, a third party can verify the process performed. This is an important factor in improving the chances of the evidence being accepted in legal proceedings.
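The fingerprinting idea can be illustrated with a minimal sketch using Python’s standard hashlib module. The function names hash_stream and hash_file and the sample exhibit text are hypothetical and chosen only for illustration; the sketch assumes SHA-256, although the other algorithms named above could be passed instead.

```python
import hashlib


def hash_stream(chunks, algorithm="sha256"):
    """Return the hex digest of an iterable of byte chunks."""
    digest = hashlib.new(algorithm)
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so a large image need not fit in memory."""
    with open(path, "rb") as handle:
        return hash_stream(iter(lambda: handle.read(chunk_size), b""), algorithm)


# Hypothetical sample data: the two values differ by a single character.
original = b"Exhibit AB/1: sector data"
altered = b"Exhibit AB/1: sector datb"

print(hash_stream([original]))
print(hash_stream([altered]))  # a completely different fingerprint
```

Reading in chunks gives the same digest as hashing the data in one pass, which is why the approach scales to full disk images.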

Whilst each country may have its own guidelines or best practice in relation to handling digital evidence, the general essence is almost always the same. The UK has the Association of Chief Police Officers (ACPO) Good Practice Guide for Digital Evidence (Williams, 2012) and in the US there is the Forensic Examination of Digital Evidence: A Guide for Law Enforcement (National Institute of Justice, 2004). These documents do not go into technical detail on how to perform analysis of the digital data; they are more focused on the best practices involved in the seizure and preservation of evidence. The ability to correctly acquire or process digital evidence is extremely important for anyone working in high-tech investigations. The acquisition of exhibits provides the basis for a solid investigation. If the acquisition is not done correctly and the integrity, or the continuity, of the exhibit is questionable then an entire case may fail. The salient points will be discussed in the next sections.

Digital Landscapes

Traditionally digital forensics focused on the single home computer or a business’ local area network (LAN). But in a world where networks are prolific, with the advent of the Internet and the mass market of portable devices, digital evidence can come from almost any device used on a daily basis. It is therefore important to consider different technical routes and peculiarities when dealing with the digital evidence.

The advent of advanced technology within business and at home now means that more advanced techniques of data capture are required. This has led to the development of live, online and offline data capture techniques. The remit of the investigation and the technology to be investigated will determine the data capture technique performed. However, the key to the data capture phase is that the captured data’s integrity can be confirmed and verified.

The “Crime Scene”

As with any kind of investigation, it is important to plan before starting, particularly where physical attendance is required at a “crime scene”; such attendance will not always be possible, or required. Digital devices can appear within any investigation and it is easy to overlook the significance of a digital device to a particular investigation type.

Before attending a “crime scene” pre-search intelligence is key in identifying the layout of the scene, the potential number of people or devices, and the type of digital information relevant to the investigation. This information allows the organization of equipment and resources that may be required to seize or capture the data. Early consideration should be made as to whether digital devices can be removed from the “crime scene” or whether data needs to be captured and then brought back to the laboratory for analysis.

If attendance at a “crime scene” is required then the overarching rule is to preserve the evidence. This, however, cannot come before the safety of those on site. Once personal safety is assured then evidential preservation can commence. At the first opportunity everyone not involved in the investigation should be removed from the vicinity of all keyboards or mice (or other input device) so that no interaction can be made with any digital device. If left, people can cause untold damage to the digital data making the later stage of the investigation much harder, if not impossible.

The physical “crime scene” should be recorded using photographs, video recordings, and sketches. This makes it possible to identify the location of devices at a later date, and also allows a third party to see the layout and the devices in situ. It may be that these images are reviewed at a later date and, following analysis, important points found in the digital data allow inferences to be drawn from what was physically present; such as the connection of a USB DVD writer.

With the sheer number of digital devices that may be present at a “crime scene”, consideration must be given to the likelihood that a device contains information in relation to the investigation. It is no longer feasible to go on site and seize every single digital item; budgets and time constraints will not allow this. Consideration must be given to the investigation type, the owner of the device and any intelligence and background information available to determine whether the device is suitable for seizure. Such a decision should be made in conjunction with the lead investigator and within legal and procedural restrictions.

If a device requires seizing, it should first be determined if the device is on or off. If on, then consideration should be given to live data capture and a record made of all visible running programs and processes. Once a decision has been made and any live data captured, the power should then be removed from the device. If the device is a server, or similar device, running critical systems and databases, then the correct shutdown procedure should be followed. It is possible that an unscrupulous individual has “rigged” a system to run certain programs, or scripts, when it is shut down, such as wiping data or modifying certain information; however, the risk of losing critical business information through a corrupted database or system needs to be considered fully. Generally a normal home laptop or computer can simply have the power removed. Once taken offline, or if it is already off, the device should be placed into an evidence bag with a tamper-evident seal and the chain of custody maintained. Each device should be given a reference number to aid identification, and these numbers should be unique within each high-tech investigation.

Once the crime scene is physically secure, attention needs to be given to the devices to be seized and how, technically, to achieve that; this is detailed in the following sections.

Live and Online Data Capture

Live data capture is utilized when the device is not taken offline: that is, it is decided not to turn it off. For example if a critical business server is taken offline it may cause disruption or loss of revenue for the business. If a program is running, it may mean critical data will be lost or it will not be possible to recover that information if the power is removed. This can also be the case when dealing with encryption: if the power is turned off, the data is no longer in a format that is accessible without the correct password.

A high-tech investigation should enable someone to follow the steps performed and produce exactly the same results. However, the problem with live data is that it is in a constant state of change and therefore can never be fully replicated. Although these issues exist, it is now accepted practice to perform some well-defined and documented live analysis as part of an investigation. Generating hashes of the captured data at the time of collection does not stop the source from changing, but it does allow any later alteration of the capture itself to be detected.

Traditionally, when looking for evidence in relation to website access, data would be captured from the local machine in the form of temporary internet files. However, as advanced Internet coding technologies leave fewer scattered remnants on a local machine, techniques must now be used to log onto an actual webpage and grab the contents that can be seen by a user. Alternatively, requests can be made of the service provider to produce the information. This process requires detailed recording of the actions performed and a hash of the file at the conclusion. An example of online data capture is the capture of evidence from social networks, which is becoming increasingly prominent in high-tech investigations, including those related to cyberbullying.

With live data, even with the securing of a physical crime scene, it is still possible that an outside influence can be applied to the digital data, such as remote access. It is very important therefore that this information is seized digitally as soon as possible. If possible the data on the device should be reviewed and once satisfied that data will not be lost, the device should be isolated from network communication, mobile signals or any other form of communication that could allow data to be removed or accessed remotely. In large organizations support should be sought from the system administrators to help in the identification and isolation of digital devices, to prevent unwanted corruption of important data. The devices can then be removed or the data captured using appropriate tools.

Offline (Dead) Data Capture

This is the traditional method of data capture, through the removal of the main storage unit, typically a hard drive: an exact replica is made of the data on the device and later analyzed.

An essential principle of forensics is that the original data, which might be used as evidence, is not modified. Therefore, when processing physical evidence, it is imperative that a write-blocker is used; a write-blocker intercepts and blocks any requests to write to the evidence. It sits in line between the evidence device and the analysis machine. There are numerous write-blockers available that can protect various kinds of physical devices from being modified by the investigator. There are physical write-blockers, which are physically connected to the digital evidence and the analysis machine. There are also software-based write-blockers, which intercept write requests at the driver level of the operating system.

Verification of the Data

Having captured the data the first step, as a high-tech investigator, is to confirm that the data has not been altered. To facilitate this, the data capture has its hash value recalculated; this is then compared against the original hash. If these do not match then no further steps are performed until the senior investigating officer, or manager, is contacted and the situation discussed. Such an error may undermine even the most concrete evidence found on the exhibit. Hash mismatches can occur if the data was not copied correctly or there may have been a fault with the original device. The original exhibit may need to be revisited and a new image created. This may not be possible if an online, or live data capture, was performed as the data may no longer be available. The second the capture is made, new data may be added to the device and any old data may be overwritten, meaning the device will never be in the same state again.
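The verification step described above amounts to recalculating the hash and comparing it with the value recorded at acquisition. The sketch below is illustrative only: verify_capture and the sample bytes are hypothetical names, SHA-256 is assumed, and a real image would be read from disk in chunks rather than held in memory.

```python
import hashlib
import hmac


def verify_capture(data, recorded_hash, algorithm="sha256"):
    """Recalculate the hash of captured data and compare it with the recorded value.

    Returns (matches, recalculated_hash) so that both can be written into the case notes.
    """
    recalculated = hashlib.new(algorithm, data).hexdigest()
    matches = hmac.compare_digest(recalculated, recorded_hash.strip().lower())
    return matches, recalculated


# Hypothetical capture and the hash noted at the time of acquisition.
capture = b"forensic image contents"
recorded = hashlib.sha256(capture).hexdigest()

ok, _ = verify_capture(capture, recorded)
damaged_ok, _ = verify_capture(capture + b"\x00", recorded)

print("original verifies:", ok)
print("altered copy verifies:", damaged_ok)
```

A False result is the point at which, as the text says, work stops and the mismatch is escalated rather than investigated around.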

Reviewing the Requirements

The requirements of the investigation, or remit, provide the specific questions that need to be answered. This can be used to identify possible routes for analysis. It is important to ensure a thorough analysis of the requirements is made early on in the investigation to ensure that time and money are not wasted. From the remit, alongside any background information provided, the following need to be identified as a minimum:

1. Number and type of exhibits: so it is known what data is to be investigated

2. Individuals/business involved: so it is known who is to be investigated

3. Date and times: so it is known when the incident occurred, which will provide a time window to be investigated

4. Keywords: terms that may be of interest if found during the investigation, such as names or bank account numbers

5. Supplied data: if a particular file or document on the data is to be looked for—it is useful if a copy of this is provided

Starting the Analysis

There may be a wealth of information gleaned from the captured evidence, some of which may not be relevant. No one process or method will necessarily answer all the questions posed. It is important to remember the following points when reviewing information to ensure nothing is missed or misinterpreted:

1. False Positives: files that are not relevant to an investigation but are picked up because they contain a keyword that is important

2. Positives: files/data that are relevant to the investigation

3. False Negatives: files that are not picked up but are relevant—they may be in an unreadable format (for example, compressed or encrypted)

The actual analysis of the data will vary depending on the type of investigation that needs to be carried out. Therefore at the beginning of the investigation consideration and a careful analysis must be made of the actual questions that are being asked.

The analysis of data can be broken up into two stages:

1. Pre-Analysis: if this is done incorrectly it can have a major impact on the rest of the investigation. It is the process of getting the data ready to make the actual analysis as smooth as possible. This process is all about preparing the data through the recovery of deleted files and partitions, and the mounting of compressed files and folders and encrypted files (so they then become searchable and have context).

2. Analysis: this is the review of the data to find information that will assist in the investigation, through the identification of evidence that proves, or disproves, a point.

A high-tech investigation should not be dependent upon the tool used; a tool is simply a means to an end. However, it is important that the investigator is comfortable and sufficiently qualified and experienced in using the chosen digital analysis tool. The ability to click a button in a forensic tool or to follow a predefined process is not forensics—this is evidential data recovery. A high-tech investigator must be able to review what is in front of them and interpret that information to form a conclusion, and if appropriate, an opinion. The location of evidence can be as important as the evidence itself; therefore careful consideration must be made as to the context of what is seen. If a file resides in a user’s personal documents folder, it does not mean that they put it there. It is the investigator’s role to identify its provenance and provide context as to how it got there, when, and whether it has been opened. The interpretation and production of such information may help in proving, or disproving, an avenue of investigation.

There is no correct way to begin the actual analysis of the data; there is no rule book which will state exactly what to do and what to look at. Depending on any legal restrictions, the investigator may be limited to only reviewing certain files and data. If there is any uncertainty on this issue the investigator must discuss this with their manager or the senior investigator. If all data can be accessed then the investigator can browse through the folders and files. If anything stands out as “unusual” or of interest it may provide direction and focus to the technical analysis steps. To some extent this may depend on the operating system under review.

At the start of an investigation a check should be made to ensure that all the expected data in the capture is accounted for. It is very easy for partitions on a disc to be modified so that they are not seen straight away or for a partition to be deleted and a new one created. In terms of a physical disc this may involve the review of the number of sectors available on the disc compared to those currently used.
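The sector accounting described above can be sketched as a simple gap check. This is an illustration under stated assumptions: unaccounted_ranges is a hypothetical helper, and the partition list is assumed to have already been read from the partition table as (start sector, sector count) pairs; the figures are invented.

```python
def unaccounted_ranges(total_sectors, partitions):
    """Return (start, length) ranges of sectors not covered by any listed partition.

    partitions is an iterable of (start_sector, sector_count) pairs.
    """
    gaps = []
    cursor = 0
    for start, count in sorted(partitions):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + count)
    if cursor < total_sectors:
        gaps.append((cursor, total_sectors - cursor))
    return gaps


# Invented example: a 1,000-sector disc with two partitions listed in its table.
gaps = unaccounted_ranges(1000, [(63, 400), (463, 200)])
print(gaps)  # [(0, 63), (663, 337)]
```

A large unexplained range at the end of a disc, as in this example, is the kind of result that would prompt a search for a deleted or hidden partition.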

Signature Analysis

It is easy to obscure a file’s true nature, and it is useful to identify whether all the files are what they purport to be; this can be a simple way of highlighting notable files. Operating systems use a process of application binding to link a file type to an application. Windows, for example, uses file extensions and maintains a record of which application should open which file: for example .doc files are opened in Microsoft Word. The fact that Windows uses file extensions gives rise to a data-hiding technique whereby a user can change the extension of the file to obscure its contents. If a file named MyContraband.jpg was changed to lansys.dll and moved to a system folder, the casual observer would probably never find it.

Linux uses a file’s header (or signature) to identify which application should open the file (the file can be viewed in hex to see this). It is therefore harder to obscure a file’s contents or true type, as with a broken header the file will often not open. Linux (and Mac) have a built-in Terminal command that identifies a file’s type from its signature: file [filename]. Adding the -i option on Linux (-I on a Mac) reports the result as a MIME type.

Most forensic tools have the capability to check a file’s signature and report whether this is different from that expected from the extension. The file’s signature is checked against a precompiled database. If the signature exists, the tool then checks the extension associated with it. One of the following results for each file will then be obtained (certain forensic tools may give more specific results but all align to the same two concepts):

 Match - the signature and extension match with what is stored

 Mismatch - the signature and extension do not match and therefore the file should be checked to identify evidence of manipulation
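A signature check of this kind can be sketched in a few lines. The table below is a tiny hand-picked set of well-known signatures, not the database of any particular forensic tool, and check_signature is a hypothetical name. The extra “Unknown” result is an addition for headers that are not in the table.

```python
import os

# A small illustrative signature table: leading bytes -> extensions normally seen with them.
SIGNATURES = {
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"%PDF-": {".pdf"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx"},
    b"MZ": {".exe", ".dll"},
}


def check_signature(filename, header):
    """Compare a file's leading bytes with the extension in its name."""
    extension = os.path.splitext(filename)[1].lower()
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            return "Match" if extension in extensions else "Mismatch"
    return "Unknown"


jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF"

print(check_signature("MyContraband.jpg", jpeg_header))  # Match
print(check_signature("lansys.dll", jpeg_header))        # Mismatch
```

The renamed file from the earlier example is flagged because its leading bytes still identify it as a JPEG, whatever its extension claims.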

Filtering Evidence

It is well known that a hash value is an important tool within any high-tech investigation. Hash values are intrinsic to a forensic investigation; they are initially utilized to verify and confirm the integrity of the evidence received. They can then be used to confirm the integrity of any, and all, evidence produced. An investigator can also use hash values to reduce the amount of data under review - through the use of what is referred to as hash sets, which are simply a grouping of known hash values. An investigator can maintain a vast hash set which can significantly cut down on the files to be reviewed; removing what is “known good” can vastly reduce what needs to be investigated, thus speeding up the entire investigation.

It is also possible to create custom hash sets of notable files which can be run against a case to quickly identify what is present. If a file or data are provided at the start of the investigation, for example an image that is of interest, a hash can be created of the image and then searched for across the exhibit - based purely on the hash value. This is a quick way to identify notable files and will allow the investigator to focus on data that contains information definitely related to the investigation.
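Both uses of hash sets, discarding “known good” files and flagging notable ones, can be sketched together. The function triage, the file names and the contents are all hypothetical, and SHA-256 is assumed; real hash sets would be loaded from a maintained reference collection rather than built inline.

```python
import hashlib


def triage(files, known_good, notable):
    """Split files into notable hits and files still needing review.

    files maps a file name to its bytes; known_good and notable are sets of hex digests.
    Files whose hash is in known_good are dropped from review entirely.
    """
    hits, review = [], []
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in notable:
            hits.append(name)
        elif digest not in known_good:
            review.append(name)
    return hits, review


# Invented exhibit contents for illustration.
files = {
    "system.dll": b"standard operating system file",
    "holiday.jpg": b"image supplied with the remit",
    "notes.txt": b"something nobody has seen before",
}
known_good = {hashlib.sha256(b"standard operating system file").hexdigest()}
notable = {hashlib.sha256(b"image supplied with the remit").hexdigest()}

hits, review = triage(files, known_good, notable)
print("notable:", hits)
print("to review:", review)
```

Because the comparison is on content hashes alone, renaming or moving a notable file does not hide it from this kind of search.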

Keyword Searching

Keyword searching allows the quick identification of notable terms and information, typically retrieved from the remit or the background information. An ability to identify keywords that are relevant to an investigation is an extremely important skill. The wrong keyword choice may take several days to run and months to review. There are generally two ways to conduct a search:

1. An index search: the tool used may be able to index all data, essentially recording every word present, so that it can be searched. This type of search is comprehensive as it does not generally care about the compression used, such as in PDFs or ZIPs, where a real-time search would not be able to identify all relevant keywords. Whilst this search is generally very slow to set up, once completed all results are almost instantaneous (Windows performs a similar action on your local computer).

2. A real-time search: a keyword can be created and run at any point in an investigation—the search can take some time to complete. Typically a real-time search is unable to search files that are compressed or in unusual formats, unless they are first uncompressed.
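The difference between the two approaches can be sketched with plain text standing in for exhibit data. This is a toy illustration, not how any particular forensic tool builds its index: build_index, index_search and realtime_search are hypothetical names, and the documents are invented.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Slow, one-off step: record which documents contain each word."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def index_search(index, keyword):
    """Fast lookup against the prebuilt index (whole words only)."""
    return sorted(index.get(keyword.lower(), set()))


def realtime_search(documents, keyword):
    """Scan every document each time a keyword is run."""
    needle = keyword.lower()
    return sorted(name for name, text in documents.items() if needle in text.lower())


documents = {
    "a.txt": "Invoice 42 sent to client",
    "b.txt": "nothing of interest here",
    "c.txt": "invoice paid in full",
}
index = build_index(documents)

print(index_search(index, "invoice"))
print(realtime_search(documents, "invoice"))
```

The index pays its cost once and then answers each keyword with a lookup, while the real-time search repeats a full scan for every new term.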

Regular expressions (regexp) can be utilized to make a more specific keyword search. Regexp is a way of defining a search pattern that utilizes wildcards and special characters to offer more flexibility and power than a simple keyword search. Suppose 1234-1234 was provided as the serial number of a device, but it was not known whether the stored value included the hyphen, used another special character in its place, or had no separator at all. Multiple search terms would then need to be created (also see Chapter 7).

Rather than attempt to write every possible search term, a simple regexp search could be created that covers them all: for example 1234.?1234

The expression states that the character before the ? can be found zero or one time. That character is a . (dot), a regexp character that matches any single character, so anything (or nothing) can sit between the two groups of numbers. Note that in most regexp dialects a dot placed inside square brackets, as in [.], is treated as a literal full stop, so 1234[.]?1234 would miss 1234-1234. It is good practice to test a regexp before launching it on a case; because it is a more complex string than a simple keyword, it can take more time to complete.
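Testing an expression before running it on a case can be as simple as trying it against a handful of known strings. The sketch below uses Python’s re module; other tools may use a slightly different regexp dialect, so the behavior should be confirmed in the tool actually used. In Python’s syntax a dot outside brackets matches any single character, whereas [.] matches only a literal full stop.

```python
import re

any_separator = re.compile(r"1234.?1234")     # optional single character of any kind
literal_dot = re.compile(r"1234[.]?1234")     # optional literal full stop only
listed_only = re.compile(r"1234[-/. ]?1234")  # optional separator from a fixed list

candidates = ["1234-1234", "12341234", "1234/1234", "1234--1234", "1234 5678"]

matches = [c for c in candidates if any_separator.fullmatch(c)]
print(matches)  # ['1234-1234', '12341234', '1234/1234']

print(bool(literal_dot.fullmatch("1234-1234")))  # False
print(bool(literal_dot.fullmatch("1234.1234")))  # True
```

The third pattern is a tighter alternative that reduces false positives when the possible separators are known in advance.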

Core Evidence

It is impossible to detail the core evidence available on the various operating and file systems available within a single chapter; however there are several core evidential areas that are typically applicable in a high-tech investigation:

 File Slack: the way that files are stored on a device means that there is a significant amount of storage space that is unused but is allocated to a file. This is referred to as file slack and is simply the space between the end of a file and the space it was allocated on a device. This slack space can contain information from old files, which may be fragments of important data related to the investigation. It is also possible for a user to hide information in this space so that it is not easily recoverable.

 Temporary Files: many applications utilize temporary files when performing a function, such as when a user is working on a document or printing a file. These files are typically deleted when the task is complete. However, if there is an improper shutdown of the device, or it loses power, they may remain on the device, making it possible to recover them and identify user actions.

 Deleted Files: the way in which digital data is deleted means that in a lot of instances it is possible to recover the data. In most cases of deletion all that is actually removed is the pointer to the file; the actual data is still resident on the exhibit and can be recovered relatively easily until it is overwritten.
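The size of the file slack described in the list above follows from simple arithmetic: storage is allocated in whole units, so the slack is the allocated space minus the file’s actual size. The sketch below assumes a 4,096-byte allocation unit and ignores file-system specifics; file_slack is a hypothetical helper.

```python
def file_slack(file_size, cluster_size=4096):
    """Return (allocated_bytes, slack_bytes) for a file stored in whole clusters."""
    if file_size == 0:
        return 0, 0
    clusters = -(-file_size // cluster_size)  # ceiling division
    allocated = clusters * cluster_size
    return allocated, allocated - file_size


allocated, slack = file_slack(10_000)
print(allocated, slack)  # 12288 2288
```

In this example a 10,000-byte file occupies three clusters, leaving 2,288 bytes that may still hold fragments of whatever was stored there before.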

As Windows is still the most common operating system, the following sections will briefly describe some of the core artifacts that may be of use during an investigation, including the significance of this information (also see Chapter 7).

Windows LNK Files

Windows uses shortcuts to provide links to files in other locations. This could be to an application on a desktop or to a document on a network store. These files are referred to as LNK (or link) files as they have the file extension .lnk. Of particular interest to a high-tech investigator are the LNK files found within a user’s “Recent” folder. These files are created when a user opens a document, and each is a reference to the original document. LNK files are persistent, which means they remain even after the target file is removed or no longer available. The “Recent” folder and LNK files are one of the first places an investigator will check when looking for user activity on a Windows-based system. They provide information about user activity, whether any external or remote drives were in use, and any notable filenames. LNK files include:

 The complete path to the original file

 Volume serial number: this is a unique reference to a partition (or volume)

 The size of the file that the LNK is pointing to

 MAC (modified, accessed, created) time stamps of the file the LNK is pointing to
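Some of the fields listed above sit in the fixed 76-byte header at the start of a LNK file, and reading them can be sketched with Python’s struct module. The offsets used here follow Microsoft’s published shell link format as the editor understands it, and should be checked against that specification before being relied upon. The path and volume serial number live in later, variable-length structures that this sketch does not parse. The names parse_lnk_header and filetime_to_datetime are hypothetical, and the header bytes are synthetic.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(value):
    """Convert a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC)."""
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def parse_lnk_header(data):
    """Read the target's time stamps and size from a shell link header."""
    if len(data) < 76 or struct.unpack_from("<I", data, 0)[0] != 0x4C:
        raise ValueError("not a shell link header")
    # Assumed layout: three FILETIMEs from offset 28, then a 32-bit file size.
    created, accessed, modified, size = struct.unpack_from("<QQQI", data, 28)
    return {
        "created": filetime_to_datetime(created),
        "accessed": filetime_to_datetime(accessed),
        "modified": filetime_to_datetime(modified),
        "target_size": size,
    }


# Synthetic header for illustration: all three time stamps set to 1970-01-01 UTC.
UNIX_EPOCH_FILETIME = 116444736000000000
header = struct.pack(
    "<I16sIIQQQI", 0x4C, b"\x00" * 16, 0, 0,
    UNIX_EPOCH_FILETIME, UNIX_EPOCH_FILETIME, UNIX_EPOCH_FILETIME, 1234,
) + b"\x00" * 20

info = parse_lnk_header(header)
print(info["modified"].isoformat(), info["target_size"])
```

In practice a dedicated, validated tool would be used for this; the sketch only shows why a LNK file can report the target’s time stamps and size even when the target itself is gone.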

Windows Prefetch Files

Windows Prefetch files are designed to speed up the application start-up process and contain the name of the application; the number of times it has been run; and a timestamp indicating the last time it was run. This can give a solid indication as to the applications a user has run, and even malware that was run. These can be found within the folder %SystemRoot%\Prefetch.

Windows Event Logs

Windows maintains a record of application and system activities within event logs. These are entries created automatically by the operating system and can provide significant information about chronological actions performed by users and the system. This will include user logons and logoffs; file access; account creation; services that are running; and the installation of drivers. They are typically used to perform troubleshooting activities on the computer. On Windows XP they can be found in %SystemRoot%\System32\config; from Windows Vista onward they are stored in %SystemRoot%\System32\winevt\Logs.

Windows Registry

The Windows registry is a database storing settings for a computer. It defines all the users, applications, and hardware installed on the system, together with any associated settings, allowing the system to be configured correctly at boot-up. The registry is stored in a format that requires decoding to be read; there are numerous tools that can do this. Once opened it provides a wealth of information including, but in no way limited to, evidence of the applications and files a user has opened; what devices were connected; and the IP addresses used.

Restore Points

Microsoft Windows provides a service known as restore points; the version of Windows determines what these actually contain. The simple purpose of restore points is to snapshot the computer, at a predefined date and time, or when an event occurs (such as the installation of software), so that it can be restored by the user if an error occurs. The restore points contain snapshots of the Windows Registry; system files; LNK files; and, with later versions of Windows, they can also include incremental backups of user files. This can provide an invaluable resource for an investigation, as it will provide historic information such as applications that are no longer installed. Windows XP has a default retention period of 90 days for restore points, whereas later versions are only limited by the amount of disk space permitted to be used by them.

Case Study

Following reports of customers being mis-sold legal-based documentation, a high-tech investigation was requested by a legal practice. Arrangements were made to attend the premises of the organization under investigation; legal proceedings meant that the organization had no idea that this was to happen, which prevented malicious data destruction. A legal stipulation was enforced, intended to reduce loss of revenue for the business, which meant that digital devices could not be removed from the premises. Pre-search intelligence identified that up to 20 staff worked at the premises at any one time, and the access that was available for the building, including vehicle access routes. Neither information nor time was available to identify what digital devices might be present.

The following day the premises were attended by both a legal team and a team of high-tech investigators. The scene was initially secured by removing all occupants from the vicinity of all digital devices. A full recording of the site was conducted using digital cameras and sketches and each digital device was identified. A review was made of the potential digital sources to determine their current state: in the main the devices were computers or laptops which had nothing significant running, and were therefore disconnected from power. A server was identified that was currently running; a capture was made of the memory to ensure running processes and connections were recorded, and then the server was shut down.

Forensic data captures were made of all devices onsite, which in itself took over 12 hours. These captures were then placed into tamper-evident evidence bags, returned to the laboratory and analyzed. The background to the investigation provided relevant keywords and file types. These were used to analyze the data, which subsequently identified a number of files, emails and documents that were relevant to the investigation; these allowed the legal team to progress their legal proceedings.

Summary

This chapter looked at the technical side of high-tech investigations and how they are conducted. Included were key concepts associated with investigations of digital data as well as the tools, processes, and techniques pertinent to the process, from collecting the evidence through to its analysis. These concepts are important for any investigator to know so that the correct procedures and processes can be implemented and the decisions made by others are also understood. It is important to remember that no two investigations will be the same; there is simply too much variation in the types of data storage and capabilities of devices for this ever to be the case. An investigation will almost always come down to the investigator and their ability to interpret and understand what they are seeing. It is important that even those who are not involved with the high-tech investigation are aware of the processes involved, as they have such a significant impact on any investigation into cyber-crime and cyber-terrorism. Such knowledge may assist in the identification of previously unthought-of digital devices or areas of investigation.
