This chapter covers the following exam topics:
Introduction to cybersecurity forensics
The role of attribution in a cybersecurity investigation
Fundamentals of Microsoft Windows forensics
Fundamentals of Linux forensics
This chapter introduces cybersecurity forensics and defines the role of attribution in a cybersecurity investigation. You will also learn about the use of digital evidence as well as the fundamentals of Microsoft Windows and Linux forensics.
The “Do I Know This Already?” quiz helps you identify your strengths and deficiencies in this chapter’s topics. The 10-question quiz, derived from the major sections in the “Foundation Topics” portion of the chapter, helps you determine how to spend your limited study time. Table 2-1 outlines the major topics discussed in this chapter and the “Do I Know This Already?” quiz questions that correspond to those topics.
1. Which of the following are the three broad categories of cybersecurity investigations?
a. Public, private, and individual investigations
b. Judiciary, private, and individual investigations
c. Public, private, and corporate investigations
d. Government, corporate, and private investigations
2. In addition to cybercrime and attacks, evidence found on a system or network may be presented in a court of law to support accusations of crime or civil action, including which of the following?
a. Fraud, money laundering, and theft
b. Drug-related crime
c. Murder and acts of violence
d. All of the above
3. Which of the following is true about attribution in a cybersecurity investigation?
a. A suspect-led approach is often accepted in supreme courts.
b. A suspect-led approach is pejorative and often biased to the disadvantage of those being investigated.
c. A suspect-led approach is mostly used in corporate investigations.
d. A suspect-led approach is mostly used in private investigations.
4. Which of the following is not true regarding the use of digital evidence?
a. Digital forensics evidence provides implications and extrapolations that may assist in proving some key fact of the case.
b. Digital evidence helps legal teams and the court develop reliable hypotheses or theories as to the committer of the crime or threat actor.
c. The reliability of the digital evidence is vital to supporting or refuting any hypothesis put forward, including the attribution of threat actors.
d. The reliability of the digital evidence is not as important as someone’s testimony to supporting or refuting any hypothesis put forward, including the attribution of threat actors.
5. Which of the following statements is true about processes and threads?
a. Each thread starts with a single process, known as the primary process, but can also create additional processes from any of its services.
b. Each service starts with a single hive, known as the primary hive, but can also create additional threads from any of its hives.
c. Each process starts with a single thread, known as the primary thread, but can also create additional threads from any of its threads.
d. Each hive starts with a single thread, known as the primary thread, but can also create additional threads from any of its threads.
6. What is a job in Microsoft Windows?
a. A job is a group of threads.
b. A job is a group of hives.
c. A job is a group of services.
d. A job is a group of processes.
7. Which of the following file systems is more secure, scalable, and advanced?
a. FAT32
b. FAT64
c. uFAT
d. NTFS
8. Which of the following Linux file systems not only supports journaling but also modifies important data structures of the file system, such as the ones destined to store the file data for better performance and reliability?
a. GRUB
b. LILO
c. Ext4
d. FAT32
9. Which of the following are examples of Linux boot loaders?
a. GRUB
b. ILOS
c. LILO
d. Ubuntu BootPro
10. Which of the following is true about journaling?
a. The journal is the least used part of the disk, making the blocks that form part of it more prone to hardware failure.
b. The journal is the most used part of the disk, making the blocks that form part of it less prone to hardware failure.
c. The journal is the most used part of the disk, making the blocks that form part of it more prone to hardware failure.
d. The journal is the least used part of the disk, making the blocks that form part of it less prone to hardware failure.
Cybersecurity forensics (or digital forensics) has been of growing interest among many organizations and individuals due to the large number of breaches during the last few years. Many folks choose digital forensics as a career path in law enforcement and corporate investigations. During the last few years, many technologies and forensic processes have been designed to meet the growing number of cases relying on digital evidence. There is a shortage of well-trained, experienced personnel who are experts in cybersecurity forensics.
Cybersecurity forensic practitioners are at a crossroads in terms of changes affecting evidence recovery and management. Forensic evidence is often used in a court of law. This is why it is extremely important for digital forensic experts to perform an excellent analysis and collect and maintain reliable evidence. Also, the huge increase in cybercrime has accelerated the need for enhanced information security management. It also requires forensics experts to help remediate the network and affected systems and try to reveal the responsible threat actor. This is often called threat actor attribution. Desktops, laptops, mobile devices, servers, firewall logs, and logs from network infrastructure devices are rich in information of evidentiary value that can assist forensics experts in reconstructing the attack and gaining a better understanding of the threat actor responsible for the attack.
There are three broad categories of cybersecurity investigations:
Public investigations: These investigations are resolved in the court of law.
Private investigations: These are corporate investigations.
Individual investigations: These investigations often take the form of e-discovery.
In addition to cybercrime and attacks, evidence found on a system or network may be presented in a court of law to support accusations of crime or civil action, including but not limited to the following:
Extortion
Domestic violence
Fraud, money laundering, and theft
Drug-related crime
Murder and acts of violence
Pedophilia and cyber stalking
Sabotage
Terrorism
Usually, criminal investigations and prosecutions involve government agencies that work within the framework of criminal law. Cybersecurity forensic practitioners are expected to provide evidence that may help the court make its decision in the investigated case. Also, practitioners must constantly be aware of and comply with regulations and laws during case examination and evidence presentation. It is important to know that factors detrimental to the disclosure of digital evidence include the knowledge of exculpatory evidence that would challenge the evidence.
One of the key topics in cybersecurity forensics is attribution of assets and threat actors. There is undeniable motivation to support an evidence-led approach to cybersecurity forensics to achieve good attribution. A suspect-led approach is pejorative and often biased to the disadvantage of those being investigated. Due to the large number of technical complexities, it is often impractical for cybersecurity forensics experts to determine fully the reliability of endpoints, servers, or network infrastructure devices and provide assurances to the court about the soundness of the processes involved and the complete attribution to a threat actor.
The forensics expert needs to ensure that not one part of the examination process is overlooked or repetitive. In addition, cybersecurity forensic experts are often confronted with the inefficacy of traditional security processes in systems and networks designed to preserve documents and network functionality—especially because most systems are not designed to enhance digital evidence recovery. There is a need for appropriate cybersecurity forensic tools, including software imaging and the indexing of increasingly large datasets in order to successfully reconstruct an attack and attribute the attack to an asset or threat actor. One thing to keep in mind is that traditional digital forensics tools are typically designed to obtain the “lowest-hanging fruit” and encourage security professionals to look for the evidence that is easiest to identify and recover. Often, these tools do not have the capability to even recognize other, less-obvious evidence.
During cybersecurity investigations, the forensics expert may revisit portions of the evidence to determine its validity. As a result, additional investigation might be required, which often can be a tedious process. In some cases, the complexity of the network and the time required for the investigation can affect the ability of the cybersecurity forensics professional to reconstruct the events and provide an accurate interpretation of the evidence. From a practical and realistic perspective, the amount of time and effort involved in the digital forensic process should pass an acceptable “reasonableness test.” In other words, all imaginable effort shouldn’t be put into finding all conceivable traces of evidence and then seizing and analyzing it. This is becoming especially challenging for the cybersecurity forensics expert as the volume of data to be analyzed keeps growing.
Evidence in cybersecurity investigations that go to court is used to prove (or disprove) facts that are in dispute, as well as to prove the credibility of disputed facts (in particular, circumstantial evidence or indirect evidence). Digital forensics evidence provides implications and extrapolations that may assist in proving some key fact of the case. Such evidence helps legal teams and the court develop reliable hypotheses or theories as to the committer of the crime (threat actor). The reliability of the evidence is vital to supporting or refuting any hypothesis put forward, including the attribution of threat actors.
Digital forensic evidence is information in digital form found on a wide range of endpoint, server, and network devices—basically, any information that can be processed by a computing device or stored on other media. Evidence tendered in legal cases, such as criminal trials, is classified as witness testimony or direct evidence, or as indirect evidence in the form of an object, such as a physical document, the property owned by a person, and so forth. Cybersecurity forensic evidence can take many forms, depending on the conditions of each case and the devices from which the evidence was collected.
There are three general types of evidence:
Best evidence
Corroborating evidence
Indirect or circumstantial evidence
Historically, the term best evidence refers to evidence that can be presented in court in the original form (for example, an exact copy of a hard disk drive). However, in cyber forensics, what is the original when it comes to digital photography, copier machines, computer storage, and cloud storage? Typically, properly collected system images and appropriate copies of files can be used in court.
Corroborating evidence (or corroboration) is evidence that tends to support a theory or an assumption deduced by some initial evidence. This corroborating evidence confirms the proposition.
Indirect or circumstantial evidence relies on an extrapolation to a conclusion of fact (such as fingerprints, DNA evidence, and so on). This is, of course, different from direct evidence. Direct evidence supports the truth of a proclamation without need for any additional evidence or interpretation. Forensic evidence provided by an expert witness is typically considered circumstantial evidence. Indirect or circumstantial evidence is often used in civil and criminal cases that lack direct evidence.
Digital information that is stored in electronic databases and computer-generated audit logs and does not contain information generated by humans has been challenged in some court trials. Law enforcement and courts can also demand proof that the creation and storage of evidence records are part of the organization’s business activities.
Again, cybersecurity forensic evidence can take many forms, depending on the conditions of each case and the devices from which the evidence was collected. To prevent or minimize contamination of the suspect’s source device, you can use different tools, such as a piece of hardware called a write blocker, on the specific device so you can copy all the data (or an image of the system).
The imaging process is intended to copy all blocks of data from the computing device to the forensics professional’s evidentiary system. This is sometimes referred to as a “physical copy” of all data, as distinct from a logical copy, which only copies what a user would normally see. Logical copies do not capture all the data, and the process will alter some file metadata to the extent that its forensic value is greatly diminished, resulting in a possible legal challenge by the opposing legal team. Therefore, a full bit-for-bit copy is the preferred forensic process. The file created on the target device is called a forensic image file. The following are the most common file types for forensic images:
.AFF
.ASB
.E01
.DD or raw image files
Virtual image formats such as .VMDK and .VDI
The benefit of being able to make an exact copy of the data is that the data can be copied and the original device can be returned to the owner or stored for trial, normally without having to be examined repeatedly. This reduces the likelihood of drive failure or evidence contamination.
SANS has a good resource that goes over disk imaging tools in cyber forensics at https://www.sans.org/reading-room/whitepapers/incident/overview-disk-imaging-tool-computer-forensics-643.
In short, imaging or disk imaging is the process of making a forensically sound copy to media that can retain the data for an extended amount of time. One of the things to be careful about is to make sure that the disk imaging does not alter the layout of the copy or even omit free and deleted space. It is very important to have a forensically sound copy of the original evidence and only work from that copy to avoid making changes or altering the original image. In addition, you must use appropriate media to avoid any alteration or contamination of the evidence. The original copy should be placed in secure storage or a safe.
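The imaging workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a replacement for a validated imaging tool: the function names image_stream and hash_stream, the 1 MiB chunk size, and the choice of SHA-256 are all made up for the example, and an in-memory buffer stands in for a source device that would normally be read through a write blocker.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024  # assumed read size: 1 MiB per chunk


def image_stream(source, destination, chunk_size=CHUNK_SIZE):
    """Copy every byte from source to destination and return the SHA-256
    hex digest of the data that was read from the source."""
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        destination.write(chunk)
    return digest.hexdigest()


def hash_stream(stream, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of everything left in stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Image an in-memory stand-in for a disk and verify the copy."""
    original = io.BytesIO(b"\x00" * 4096 + b"deleted-file-remnant" + b"\x00" * 4096)
    image = io.BytesIO()
    source_hash = image_stream(original, image)
    image.seek(0)  # rewind so the image is hashed from its first byte
    return source_hash == hash_stream(image)
```

Hashing the source while it is read, and then hashing the finished image separately, gives two independent digests. If they match, the working copy can be shown to be identical to what was acquired.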
There is also the process of file deletion and its degradation and eventual erasure through system operation. This results in many files being partly stored in the unallocated area of a system’s hard disk drive. Typically, such fragments of files can only be located and “carved out” manually using a hex editor that’s able to identify file headers, footers, and segments held in the image. This is because the file system allocation information is not typically available, which results in a very labor-intensive and challenging operation for the forensics professional. File carving continues to be an important process that’s used in many cases where the recovery of allegedly deleted files is required. Different forensic tools are available, such as ILookIX, EnCase, and others. These tools provide features that allow you to locate blocks and sectors of hard disk drives that could contain deleted information that’s important. Recovering files from unallocated space is usually referred to as data carving.
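To make the header-and-footer idea concrete, the following Python sketch carves candidate JPEG fragments out of a raw byte buffer. It assumes the common JPEG start marker (FF D8 FF) and end marker (FF D9). The function name carve_jpegs and the size limit are hypothetical. The sketch does not handle fragmented files or embedded thumbnails, which real carving tools have to deal with.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus first marker byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw, max_size=10 * 1024 * 1024):
    """Return a list of (offset, data) tuples for byte runs in raw that
    start with a JPEG header and end at the next JPEG footer."""
    carved = []
    position = 0
    while True:
        start = raw.find(JPEG_HEADER, position)
        if start == -1:
            break  # no more headers in the buffer
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # header without a footer: nothing complete left to carve
        end += len(JPEG_FOOTER)
        if end - start <= max_size:
            carved.append((start, raw[start:end]))
        position = end  # continue searching after this candidate
    return carved
```

Each result keeps its offset into the image so the examiner can document where in unallocated space the fragment was found.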
It is very important to make sure that the timestamps of all files on a system being analyzed during any cyber forensics investigation are reliable. This is critical for making a valid reconstruction of key events of the attack or security incident.
Mobile devices such as cell phones, wearables, and tablets are not imaged in the same way as desktops. Also, today’s Internet of Things (IoT) world is very different from just a few years ago. Now we have to worry about collecting evidence from low-power and low-resource devices (including sensors, fog edge devices, and so on). The hardware and interfaces of these devices, from a forensic perspective, are very different. For example, an iPhone cannot be accessed unless you know the manufacturing password from Apple. Apple uses a series of encrypted sectors located on microchips, making it difficult to access the raw data inside the phone. Newer Android versions similarly prevent more than a backup being taken of the device and no longer allow physical dumps to be recovered.
In some cases, not only does evidence need to be collected from mobile devices, but also from mobile device management (MDM) applications and solutions.
You can collect a lot of information from network infrastructure devices, such as routers, switches, wireless LAN controllers, load balancers, firewalls, and many others that can be very beneficial for cybersecurity forensics investigations. Collecting all this data can be easier said than done, which is why it is important to have one or more systems as a central log repository and to configure all your network devices to forward events to this central log analysis tool. You should also make sure it can hold several months’ worth of events. As you learned during your preparation for the SECFND exam, syslog is often used to centralize events. You should also increase the types of events that are logged—for example, DHCP events, NetFlow, VPN logs, and so on.
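As a small illustration of what a central log repository does with forwarded events, the Python sketch below decodes the priority value that prefixes a traditional syslog message. The priority is the facility number multiplied by 8 plus the severity number. The function name decode_priority and the severity labels chosen here are assumptions for the example.

```python
import re

# Severity names indexed by their numeric syslog severity (0 through 7).
SEVERITIES = (
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
)

PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def decode_priority(message):
    """Return (facility_number, severity_name) from a syslog message that
    starts with a <PRI> header; raise ValueError if the header is missing
    or out of range."""
    match = PRI_PATTERN.match(message)
    if match is None:
        raise ValueError("message has no <PRI> header")
    pri = int(match.group(1))
    if pri > 191:  # 23 facilities * 8 + severity 7 is the largest valid value
        raise ValueError("priority value out of range")
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]
```

For a message beginning with <134>, the function returns facility 16 and the severity name informational, because 134 equals 16 times 8 plus 6.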
Another important thing to keep in mind is that network devices can also be compromised by threat actors. Subsequently, the data generated by these devices can also be assumed to be compromised and manipulated by the attacker. Finding forensic evidence for these incidents can become much harder.
Network infrastructure devices can be compromised by different attack methods, including the following:
Leftover troubleshooting commands
Manipulating Cisco IOS images
Security vulnerabilities
Cisco has several good resources that go over device integrity assurance and verification. These resources can be found at the following links:
Cisco IOS Software Integrity Assurance
http://www.cisco.com/c/en/us/about/security-center/integrity-assurance.html
Cisco IOS XE Software Integrity Assurance
http://www.cisco.com/web/about/security/intelligence/ios-xe-integrity-assurance.html
Cisco Guide to Harden Cisco IOS Devices
http://www.cisco.com/c/en/us/support/docs/ip/access-lists/13608-21.html
http://www.cisco.com/web/about/security/intelligence/iosimage.html
Offline Analysis of IOS Image Integrity Blog
http://blogs.cisco.com/security/offline-analysis-of-ios-image-integrity
Securing Tool Command Language on Cisco IOS
http://www.cisco.com/web/about/security/intelligence/securetcl.html
Cisco Security Vulnerability Policy
http://www.cisco.com/web/about/security/psirt/security_vulnerability_policy.html
Use of the Configuration Register on All Cisco Routers
Digitally Signed Cisco Software
Cisco IOS Software Checker
http://tools.cisco.com/security/center/selectIOSVersion.x
Creating Core Dumps
http://www.cisco.com/en/US/docs/internetworking/troubleshooting/guide/tr19aa.html
Cisco IOS Configuration Guide
MD5 File Validation
Image Verification
Telemetry-Based Infrastructure Device Integrity Monitoring
http://www.cisco.com/web/about/security/intelligence/network-integrity-monitoring.html
Cisco Supply Chain Security
These documents go over numerous identification techniques, including the following:
Image file verification using the Message Digest 5 file validation feature
Using the image verification feature
Using offline image file hashes
Verifying authenticity for digitally signed images
Cisco IOS runtime memory integrity verification using core dumps
Creating a known-good text region
Text memory section export
Cisco address space layout randomization considerations
Different indicators of compromise
Unusual and suspicious commands
Checking that Cisco IOS software call stacks are within the text section boundaries
Checking command history in the Cisco IOS core dump
Checking the command history
Checking external accounting logs
Checking external syslog logs
Checking booting information
Checking the ROM monitor variable
Checking the ROM monitor information
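The offline image hash technique in the list above can be sketched in Python as follows. The example computes a digest of an image that has been copied off the device and compares it to a hash published by the vendor. The function names stream_digest and matches_published_hash are hypothetical, and MD5 is used only because the text refers to MD5 file validation. Where the vendor also publishes a stronger hash such as SHA-512, that one is preferable.

```python
import hashlib
import io


def stream_digest(stream, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a binary stream, read in chunks so that
    large image files do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(stream, published_hash, algorithm="md5"):
    """Compare the computed digest with a published value, ignoring case
    and surrounding whitespace in the published value."""
    return stream_digest(stream, algorithm) == published_hash.strip().lower()


def demo():
    """Check a stand-in 'image' against its known MD5 value."""
    image = io.BytesIO(b"abc")
    return matches_published_hash(image, "900150983CD24FB0D6963F7D28E17F72")
```

In practice the stream would come from opening the exported image file in binary mode, and the published value would be taken from the vendor's download page for that exact release.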
You can take several preventive steps to facilitate a forensic investigation of network devices, including the following security best practices:
Maintaining Cisco IOS image file integrity
Implementing change control
Hardening the software distribution server
Keeping Cisco IOS software updated
Deploying digitally signed Cisco IOS images
Using Cisco Secure Boot
Using Cisco Supply Chain Security
Leveraging the Latest Cisco IOS security protection features
Using authentication, authorization, and accounting
Using TACACS+ authorization to restrict commands
Implementing credentials management
Implementing configuration controls
Protecting interactive access to devices
Gaining traffic visibility with NetFlow
Using centralized and comprehensive logging
Chain of custody is the way you document and preserve evidence from the time that you started the cyber forensics investigation to the time the evidence is presented in court. It is extremely important to be able to show clear documentation of the following:
How the evidence was collected
When it was collected
How it was tracked
How it was stored
Who had access to the evidence and how it was accessed
If you fail to maintain proper chain of custody, it is likely you will not be able to use the evidence in court. It is also important to know how to dispose of evidence after an investigation.
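One way to picture the documentation requirements above is as an append-only record attached to each piece of evidence. The Python sketch below is a hypothetical data model, not a standard or a legal template. The class names, field names, and the rule that events must be recorded in time order are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEvent:
    """One entry in the custody record: who did what, when, and where."""
    timestamp: datetime
    handler: str
    action: str      # for example "collected", "transferred", "accessed"
    location: str    # where the evidence is stored after this event
    notes: str = ""


@dataclass
class EvidenceItem:
    evidence_id: str
    description: str
    events: list = field(default_factory=list)

    def record(self, handler, action, location, notes="", timestamp=None):
        """Append a custody event; timestamps should be timezone-aware
        and must not go backward in time."""
        when = timestamp or datetime.now(timezone.utc)
        if self.events and when < self.events[-1].timestamp:
            raise ValueError("custody events must be recorded in time order")
        event = CustodyEvent(when, handler, action, location, notes)
        self.events.append(event)
        return event

    def handlers(self):
        """Return everyone who handled the evidence, in order of first contact."""
        seen = []
        for event in self.events:
            if event.handler not in seen:
                seen.append(event.handler)
        return seen
```

Making each event immutable and refusing out-of-order entries mirrors the idea that the custody record is written once as things happen and is never edited afterward.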
When you collect evidence, you must protect its integrity. This involves making sure that nothing is added to the evidence and that nothing is deleted or destroyed (this is known as evidence preservation).
A method often used for evidence preservation is to only work with a copy of the evidence—in other words, you do not want to work directly with the evidence itself. This involves creating an image of any hard drive or any storage device.
Several forensics tools are available on the market. The following are two of the most popular:
Guidance Software’s EnCase (https://www.guidancesoftware.com/)
AccessData’s Forensic Toolkit (http://accessdata.com/)
Another methodology used in evidence preservation is to use write-protected storage devices. In other words, the storage device you are investigating should immediately be write-protected before it is imaged and should be labeled to include the following:
Investigator’s name
The date when the image was created
Case name and number (if applicable)
Additionally, you must prevent electrostatic discharge (ESD) and other electrical discharges from damaging or erasing evidentiary data, so digital devices should be stored in special antistatic evidence bags. Some organizations even have cyber forensic labs that restrict access to authorized users and investigators. One method often used involves constructing what is called a Faraday cage. This “cage” is often built out of a mesh of conducting material that prevents electromagnetic energy from entering into or escaping from the cage. This also prevents devices inside the cage from communicating via Wi-Fi or cellular signals.
What’s more, transporting the evidence to the forensics lab or any other place, including the courthouse, has to be done very carefully. It is critical that the chain of custody be maintained during this transport. When you transport the evidence, you should strive to secure it in a lockable container. It is also recommended that the responsible person stay with the evidence at all times during transportation.
This section covers the fundamentals of Windows forensics and related topics.
While preparing for the CCNA Cyber Ops SECFND exam, you learned that a process is a program that the system is running. Each process provides the required resources to execute a program. A process is made up of one or more threads, which are the basic unit to which an operating system allocates processor time. A thread can execute any part of the application code, including parts currently being executed by another thread. Each process starts with a single thread, known as the primary thread, but can also create additional threads from any of its threads.
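The relationship between a process, its primary thread, and additional threads can be illustrated with a short Python sketch. It uses Python's standard threading module rather than the Windows API, and the function and thread names are made up for the example. The point is only that the thread the program starts with can create a thread, and that thread can create another.

```python
import threading


def spawn_nested_threads():
    """Record which thread runs each step: the starting thread creates a
    worker, and that worker creates a second worker of its own."""
    events = []
    lock = threading.Lock()  # protects the shared list of events

    def record(label):
        with lock:
            events.append((label, threading.current_thread().name))

    def grandchild():
        record("grandchild")

    def child():
        record("child")
        worker = threading.Thread(target=grandchild, name="worker-2")
        worker.start()
        worker.join()  # wait for the thread this thread created

    record("primary")
    worker = threading.Thread(target=child, name="worker-1")
    worker.start()
    worker.join()
    return events
```

All three threads belong to the same process and share its memory, which is why the shared list is guarded with a lock.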
Processes can be grouped together and managed as a unit called a job object, which can be used to control attributes of the processes associated with it. Grouping processes this way simplifies managing them, because any operation performed on a job object affects all of its associated processes. A thread pool is a group of worker threads that efficiently execute asynchronous callbacks for the application. This is done to reduce the number of application threads and to manage the worker threads. A fiber is a unit of execution that is manually scheduled by an application. Threads can schedule multiple fibers; however, fibers do not outperform properly designed multithreaded applications.
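The thread pool concept can be sketched with Python's standard concurrent.futures module. This is an analogy to the Windows thread pool and not the same API. The function name run_in_pool and the worker limit are assumptions for the example. The sketch shows many callbacks being serviced by a small, fixed set of worker threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_in_pool(callback, items, max_workers=3):
    """Run callback on every item using a small pool of worker threads.
    Returns the results in input order and the names of the workers used."""
    workers_seen = set()
    lock = threading.Lock()

    def tracked(item):
        # Note which pool thread picked up this piece of work.
        with lock:
            workers_seen.add(threading.current_thread().name)
        return callback(item)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(tracked, items))
    return results, workers_seen
```

Submitting 20 callbacks to a pool limited to three workers never uses more than three distinct threads, which is the reduction in application threads that the text describes.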
It is important to understand how these components all work together when developing applications and later securing them. There are many threats to applications (known as vulnerabilities) that could be abused to change the intended outcome of an application. This is why it is critical to include security in all stages of developing applications to ensure these and other application components are not abused.
Windows services are long-running executable applications that run in their own Windows session. Basically, they are applications that run in the background. Services can start automatically when a computer boots. Services are ideal for running things within a specific user security context, starting applications that should always be run for a specific user, and long-running functionality that doesn’t interfere with other users who are working on the same computer. An example would be an application that monitors whether storage consumption has passed a certain threshold. The programmer would create a Windows service application that monitors storage space and set it to start automatically at boot, so it continuously monitors for the critical condition. If users choose not to monitor their system, they could open the Services window and change the Startup type to Manual, meaning the service must be started manually, or they could simply stop the service. The services inside the service control manager can be started, stopped, or triggered by an event. Because services operate under their own user account, they can operate when a user is not logged in to the system, meaning the storage-monitoring example could be set to run automatically for a specific user or for any user, including when no user is logged in.
Windows administrators can manage services using the Services snap-in, Sc.exe, or Windows PowerShell. The Services snap-in is built into the services management console and can connect to a local or remote computer on a network, enabling the administrator to perform actions such as the following:
View installed services
Start, stop, or restart services
Change the startup type for a service
Specify service parameters when available
Change the user account context where the service operates
Configure recovery actions in the event a service fails
Inspect service dependencies for troubleshooting
Export the list of services
Sc.exe, also known as the Service Control utility, is a command-line version of the Services snap-in. This means it can do everything the Services snap-in can do, as well as install and uninstall services. Windows PowerShell can also manage Windows services using the following commands, also called cmdlets:
Get-Service: Gets the services on a local or remote computer
New-Service: Creates a new Windows service
Restart-Service: Stops and then starts one or more services
Resume-Service: Resumes one or more suspended (paused) services
Set-Service: Starts, stops, and suspends a service and changes its properties
Start-Service: Starts one or more stopped services
Stop-Service: Stops one or more running services
Suspend-Service: Suspends (pauses) one or more running services
Other tools that can manage Windows services are Net.exe, Windows Task Manager, and MSConfig; however, their capabilities are limited compared to the other tools mentioned. For example, MSConfig can enable or disable Windows services, while Windows Task Manager can show a list of installed services as well as start or stop them.
Like other aspects of Windows, services are targeted by attackers. Microsoft has improved the security of services in later versions of the operating system after various attack methods were found to compromise and completely own older versions of Windows. Windows, however, is not perfect, so best practice dictates securing services, for example by disabling the following services unless they are needed:
TCP 53: DNS Zone Transfer
TCP 135: RPC Endpoint Mapper
TCP 139: NetBIOS Session Service
TCP 445: SMB Over TCP
TCP 3389: Terminal Services
UDP 137: NetBIOS Name Service
UDP 161: Simple Network Management Protocol
TCP/UDP 389: Lightweight Directory Access Protocol
In addition, you should enable host security solutions such as Windows Firewall, which filters access to services from outsiders. Enforcing least-privilege access, using restricted tokens, and applying access control can reduce the damage that could occur if an attacker successfully compromises a Windows system’s services. Basically, the best practices you apply to secure hosts and your network also reduce the risk of attacks against Microsoft Windows system services.
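The list of ports above lends itself to a simple lookup table. The Python sketch below takes a list of listening endpoints, which an investigator might have gathered from a host by any means, and flags those that match the listed services and have not been explicitly approved. The function name, the input format, and the idea of an approved list are assumptions for illustration. The port numbers and labels are taken from the list above.

```python
# Ports and labels as given in the list above, keyed by (protocol, port).
SENSITIVE_SERVICES = {
    ("tcp", 53): "DNS Zone Transfer",
    ("tcp", 135): "RPC Endpoint Mapper",
    ("tcp", 139): "NetBIOS Session Service",
    ("tcp", 445): "SMB Over TCP",
    ("tcp", 3389): "Terminal Services",
    ("udp", 137): "NetBIOS Name Service",
    ("udp", 161): "Simple Network Management Protocol",
    ("tcp", 389): "Lightweight Directory Access Protocol",
    ("udp", 389): "Lightweight Directory Access Protocol",
}


def flag_sensitive_listeners(listeners, approved=()):
    """Given (protocol, port) pairs observed listening on a host, return a
    sorted list of (protocol, port, service) for those that appear in the
    table and are not in the approved set."""
    approved_set = {(proto.lower(), port) for proto, port in approved}
    findings = []
    for protocol, port in listeners:
        key = (protocol.lower(), port)
        if key in SENSITIVE_SERVICES and key not in approved_set:
            findings.append((key[0], port, SENSITIVE_SERVICES[key]))
    return sorted(findings)
```

A listener that is needed for business reasons, such as Terminal Services on an administration host, can be passed in the approved list so that only unexpected exposures are reported.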
The list that follows highlights the key concepts concerning processes and threads:
A process is a program that the system is running and is made of one or more threads.
A thread is the basic unit to which an operating system allocates processor time.
A job is a group of processes.
A thread pool is a group of worker threads that efficiently execute asynchronous callbacks for the application.
Microsoft Windows services are long-running executable applications that run in their own Windows session.
Services are ideal for running things within a specific user security context, starting applications that should always be run for a specific user, and long-running functionality that doesn’t interfere with other users who are working on the same computer.
Windows administrators can manage services using the Services snap-in, Sc.exe, or Windows PowerShell.
When performing forensics investigations in Windows or any other operating system, you should look for orphan and suspicious processes and services on the system. Malware could create processes running in your system.
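As a rough illustration of looking for orphan processes, the Python sketch below works on a snapshot of process records and reports every process whose parent process ID does not appear in the snapshot. The record format, the function name find_orphans, and the treatment of parent ID 0 as a root are assumptions for the example. Some legitimate processes also outlive their parents, so a result is a lead to investigate and not proof of malware.

```python
def find_orphans(processes, root_pids=(0,)):
    """Return the records whose parent PID is neither present in the
    snapshot nor listed as a root. Each record is a dict with at least
    the keys 'pid' and 'ppid'."""
    known_pids = {proc["pid"] for proc in processes}
    roots = set(root_pids)
    orphans = []
    for proc in processes:
        if proc["pid"] in roots:
            continue  # root entries have no real parent to look up
        if proc["ppid"] not in known_pids and proc["ppid"] not in roots:
            orphans.append(proc)
    return orphans
```

Given a hypothetical snapshot in which a process claims a parent ID that is no longer running, only that process is returned. Its own children are not flagged because their parent is still present.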
Memory can be managed in different ways, which is referred to as memory allocation or memory management. Static memory allocation is when a program allocates memory at compile time. Dynamic memory allocation is when a program allocates memory at runtime. Memory can be assigned in blocks representing portions of allocated memory dedicated to a running program. A program will request a block of memory, which the memory manager will assign to the program. When the program completes whatever it’s doing, the allocated memory blocks are released and become available for other uses.
A stack is the memory set aside as scratch space for a thread of execution. A heap is memory set aside for dynamic allocation, meaning where you put data on the fly. Unlike a stack, there isn’t an enforced pattern to the allocation and deallocation of blocks from the heap. With heaps, you can allocate a block at any time and free it at any time. Stacks are best when you know how much memory is needed, whereas heaps are better for when you don’t know how much data you will need at runtime or if you need to allocate a lot of data. Memory allocation happens in hardware, in the operating system, and in programs and applications.
There are various approaches to how Windows allocates memory. The ultimate result is the same; however, the approaches are slightly different. VirtualAlloc is a specialized allocation of the OS virtual memory system, meaning it is allocated straight into virtual memory via reserved blocks of memory. Typically, it is used for special-purpose type allocation because the allocation has to be very large, needs to be shared, needs a specific value, and so on. Allocating memory in the virtual memory system is the most basic form of memory allocation. Typically, VirtualAlloc manages pages in the Windows virtual memory system.
HeapAlloc allocates any size of memory that is requested dynamically. It is designed to be very fast and is used for general-purpose allocation. Heaps are set up by using VirtualAlloc to initially reserve allocation space from the OS. Once the memory space is initialized by VirtualAlloc, various tables, lists, and other data structures are built to maintain operation of the heap. Heaps are great for smaller objects; however, because access to the heap is serialized across threads by default, they can cause performance issues. HeapAlloc is a Windows API function.
The next memory examples are more programming focused and not Windows dependent. Malloc is a standard C and C++ library function that allocates memory to a process using the C runtime heap. Malloc will usually require one of the operating system APIs to create a pool of memory when the application starts running and then allocate from that pool as there are Malloc requests for memory. Malloc therefore has the disadvantage of being runtime dependent.
It is important to note that Malloc is part of a standard, meaning it is portable, whereas HeapAlloc is not portable, meaning it’s a Windows API function.
Another programming-based memory allocator is New, which is a standard C++ operator that allocates memory and then calls constructors on that memory. New has the disadvantage of being compiler dependent and language dependent, meaning other programming languages may not support New. One final programming-based memory allocator is CoTaskMemAlloc, which has the advantage of working well in C, C++, or Visual Basic. It is not important for the SECFND exam to know the details of how each memory allocator functions. The goal is to have a general understanding of memory allocation.
The list that follows highlights the key memory allocation concepts:
Volatile memory is memory that loses its contents when the computer or hardware storage device loses power.
Nonvolatile memory, or NVRAM, holds data with or without power.
Static memory allocation is when a program allocates memory at compile time.
Dynamic memory allocation is when a program allocates memory at runtime.
A heap is memory set aside for dynamic allocation.
A stack is memory set aside as scratch space for a thread of execution.
VirtualAlloc is a specialized allocation of the OS virtual memory system, meaning it’s allocated straight into virtual memory via reserved blocks of memory.
HeapAlloc allocates any size of memory that is requested.
Malloc is a standard C and C++ library function that allocates memory to a process using the C runtime heap.
New and CoTaskMemAlloc are also programming-based memory allocators.
Pretty much anything performed in Windows refers to or is recorded into the registry, meaning any actions taken by a user reference the Windows registry. Therefore, a definition for the Windows registry could be a hierarchical database used to store information necessary to configure the system for one or more users, applications, and hardware devices.
Some functions of the registry are to load device drivers, run startup programs, set environment variables, and store user settings and operating system parameters. You can view the Windows registry by typing the command regedit in the Run window.
The Windows registry can contain very valuable information that is useful to cyber forensic professionals. It can contain information about recently run programs, programs that have been installed or uninstalled, users who perhaps have been removed or created by a threat actor, and much more.
The Windows subsystem that manages the registry is called the Configuration Manager. The Windows registry appears as a single hierarchy in tools such as regedit; however, it is actually composed of a number of different binary files, called hives, on disk. The hive files themselves are broken into fixed-sized bins of 0x1000 bytes (4,096 bytes), and each bin contains variable-length cells. These cells hold the actual registry data. References in hive files are made by the cell index. The cell index is a value that can be used to determine the location of the cell containing the referenced data. The structure of the registry data is typically composed of two distinct data types: key nodes and value data.
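The hive layout described above can be illustrated with a short Python sketch. It relies on details that are widely documented for the on-disk hive format but are not stated in this chapter, so treat them as assumptions: a 0x1000-byte base block that starts with the ASCII signature regf, a first bin that starts with hbin, and cell indexes that are offsets measured from the start of the first bin. The function names are hypothetical.

```python
BASE_BLOCK_SIZE = 0x1000  # assumed size of the header block that precedes the first bin
BIN_SIZE = 0x1000         # bin size given in the text (4,096 bytes)


def looks_like_hive(data):
    """Check the assumed 'regf' file signature and the 'hbin' signature of the first bin."""
    return (
        len(data) >= BASE_BLOCK_SIZE + 4
        and data[:4] == b"regf"
        and data[BASE_BLOCK_SIZE:BASE_BLOCK_SIZE + 4] == b"hbin"
    )


def cell_file_offset(cell_index):
    """Translate a cell index to an offset in the hive file (assumed on-disk layout)."""
    return BASE_BLOCK_SIZE + cell_index


if __name__ == "__main__":
    # Synthetic two-block buffer standing in for a real hive file.
    fake_hive = (
        b"regf" + bytes(BASE_BLOCK_SIZE - 4)
        + b"hbin" + bytes(BIN_SIZE - 4)
    )
    print(looks_like_hive(fake_hive))
    print(hex(cell_file_offset(0x20)))
```

Hives loaded in memory are mapped differently, so this translation applies only to hive files acquired from disk.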
The structure of the registry is similar to a file system. The key nodes are similar to directories or folders, and the values can be compared to files. On the other hand, data in the registry always has an unequivocal associated type, unlike data on a file system. To work with registry data in memory, it is necessary to find out where in memory the hives have been loaded and know how to translate cell indexes to memory addresses. It will also be helpful to understand how the Windows Configuration Manager works with the registry internally, and how we can make use of its data structures to tell us what the operating system itself maintains about the state of the registry.
The folders listed on the left start with the five hierarchical folders called hives, each beginning with the term HKEY (meaning “handle to a key”). Two of the hives are real locations: HKEY_USERS (HKU) and HKEY_LOCAL_MACHINE (HKLM). The remaining three are shortcuts to other elements within the HKU and HKLM hives. Each of these main five hives is composed of keys, which contain values and subkeys. Values are the names of specific values pertaining to the operating system or applications within a key. One way to think of the Windows registry is to compare it to an application containing folders. Inside an application, folders hold files. Inside the Windows registry, the hives hold values.
The following list defines the function of the five hives within the Windows registry:
HKEY_CLASSES_ROOT (HKCR): HKCR information ensures that the correct programs open when executed in Windows Explorer. HKCR also contains further details on drag-and-drop rules, shortcuts, and information on the user interface. The reference location is HKLM\Software\Classes.
HKEY_CURRENT_USER (HKCU): HKCU contains configuration information for any user who is currently logged in to the system, including the user’s folders, screen colors, and Control Panel settings. The reference location for a specific user is HKEY_USERS. The reference for a general user is HKU\.DEFAULT.
HKEY_CURRENT_CONFIG (HKCC): HKCC stores information about the system’s current configuration. The reference for HKCC is HKLM\Config\profile.
HKEY_LOCAL_MACHINE (HKLM): HKLM contains machine hardware-specific information that the operating system runs on. This includes a list of drives mounted on the system and generic configurations of installed hardware and applications. HKLM is a hive that isn’t referenced from within another hive.
HKEY_USERS (HKU): HKU contains configuration information of all user profiles on the system. This includes application configurations and visual settings. HKU is a hive that isn’t referenced from within another hive.
Some interesting data points can be extracted from analyzing the Windows registry. All registry keys contain a value called the LastWrite time, which is the last modification time of the key. This can be used to identify the approximate date and time an event occurred. Autorun locations are registry keys that launch programs or applications during the boot process. These locations are extremely important to protect because Autorun could be used by an attacker for executing malicious applications. The most recently used (MRU) list contains entries made due to actions performed by the user. The purpose of an MRU is to contain a list of items in the event the user returns to them in the future. Think of an MRU as similar to how a cookie is used in a web browser. The UserAssist key contains information about what resources the user has accessed.
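LastWrite times are commonly handled as Windows FILETIME values, which count 100-nanosecond intervals since January 1, 1601 (UTC). The chapter does not describe that encoding, so the following Python sketch should be read as an illustration under that assumption. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this epoch (assumption stated in the lead-in).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME tick count to a timezone-aware UTC datetime."""
    # Ten ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


if __name__ == "__main__":
    # 116444736000000000 ticks separate 1601-01-01 from the Unix epoch.
    print(filetime_to_datetime(116444736000000000).isoformat())
```

Converting to UTC first and applying a local time zone afterward helps keep timelines from different systems comparable.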
Many other things, such as network settings, USB devices, and mounted devices, have registry keys that can be pulled up to identify activity within the operating system.
The registry can specify whether applications start automatically when the system is booted or when a user logs in. A good reference about this is the following Microsoft Sysinternals document: https://technet.microsoft.com/en-us/sysinternals/bb963902. Malware can change the registry to automatically start a program when the system is booted or when a user logs in.
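As a rough sketch of how an analyst might review one autorun location on a live Windows system, the following Python example reads the values of the HKLM Run key with the standard-library winreg module and flags commands that launch from temporary or per-user directories. The choice of key, the marker strings, and the function names are illustrative assumptions, not a complete list of autorun locations or a detection rule.

```python
import sys

# One well-known autorun location under HKLM; many others exist.
RUN_KEY = r"Software\Microsoft\Windows\CurrentVersion\Run"

# Hypothetical markers: path fragments an analyst might consider unusual for autoruns.
SUSPICIOUS_MARKERS = ("\\temp\\", "\\appdata\\")


def read_run_entries():
    """Return (name, command) pairs from the HKLM Run key. Works only on Windows."""
    import winreg  # standard library, but present only on Windows

    entries = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, RUN_KEY) as key:
        _subkeys, value_count, _last_write = winreg.QueryInfoKey(key)
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            entries.append((name, str(data)))
    return entries


def flag_entries(entries, markers=SUSPICIOUS_MARKERS):
    """Return the entries whose command contains any marker (case-insensitive)."""
    return [
        (name, command)
        for name, command in entries
        if any(marker in command.lower() for marker in markers)
    ]


if __name__ == "__main__" and sys.platform == "win32":
    for name, command in flag_entries(read_run_entries()):
        print(f"review autorun entry {name!r}: {command}")
```

The third item returned by QueryInfoKey is the key's last-modified time, which can be compared against the suspected time of compromise.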
A good example of Windows registry categories related to program execution and other functions can be found at https://blogs.sans.org/computer-forensics/files/2012/06/SANS-Digital-Forensics-and-Incident-Response-Poster-2012.pdf.
The Security hive is one of the Windows registry hives that includes information that is related to the running and operations of the system. The information available in this and other hives is all about the system, rather than specific users on the system. The Windows Registry Security hive contains useful information regarding the system configuration and settings.
Information about local users on a system is maintained in the SAM “database” or hive file. In corporate environments, the SAM hive may not have a great deal of useful information. User information may be found on a domain controller or LDAP server. However, in environments where the users access their system using local accounts, this hive file can provide great information.
In some cases, during investigations you may need to crack a user’s password—for instance, a user created by a threat actor and used in malware. There are several free password cracking tools available, including Cain & Abel (http://www.oxid.it/cain.html), OphCrack (http://ophcrack.sourceforge.net), and John the Ripper (http://www.openwall.com/john).
The System hive contains a great deal of configuration information about the system and devices that were included in it and have been attached to it.
Before learning the different file system structures, you need to understand the different parts in a partitioned hard drive.
The master boot record (MBR) is the first sector (512 bytes) of the hard drive. It contains the boot code and information about the hard drive itself. The MBR contains the partition table, which includes information about the partition structure in the hard disk drive. The MBR can tell where each partition starts, its size, and the type of partition. While performing forensics analysis, you can verify the existing partition with the information in the MBR and the printed size of the hard drive for a match. If there is some missing space, you can assume a potential compromise or corruption of the system.
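The partition check described above can be sketched in Python. The example assumes the classic MBR layout: four 16-byte partition entries starting at byte offset 446 and the boot signature 0x55 0xAA in the last two bytes of the sector. The function names and the synthetic test sector are hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # partition table starts after the boot code
ENTRY_SIZE = 16      # four entries of 16 bytes each


def parse_mbr(sector):
    """Return the non-empty partition entries found in a 512-byte MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        # status, CHS start, type, CHS end, starting LBA, sector count (little-endian)
        status, _chs_first, ptype, _chs_last, lba_start, sectors = struct.unpack(
            "<B3sB3sII", sector[start:start + ENTRY_SIZE]
        )
        if ptype == 0:  # type 0 marks an unused entry
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "start_lba": lba_start,
            "sectors": sectors,
        })
    return partitions


def unaccounted_sectors(partitions, disk_sectors):
    """Sectors on the disk that no partition entry claims."""
    return disk_sectors - sum(p["sectors"] for p in partitions)


if __name__ == "__main__":
    # Build a synthetic MBR with one bootable partition of type 0x07.
    sector = bytearray(SECTOR_SIZE)
    sector[446:462] = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x07, b"\0\0\0", 2048, 204800)
    sector[510:512] = b"\x55\xaa"
    parts = parse_mbr(bytes(sector))
    print(parts)
    print(unaccounted_sectors(parts, 409600))
```

A small gap, such as alignment space before the first partition, is normal. A large unexplained gap is what deserves a closer look.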
The first sector (512 bytes) of each partition contains information, such as the type of the file system, the booting code location, the sector size, and the cluster size in reference to the sector.
If you formatted the partition with NTFS, some sectors at the beginning of the partition will be reserved for the master file table (MFT), which is the location that contains the metadata about the files in the system (a FAT partition reserves this area for its file allocation tables instead). Each MFT entry is 1 KB in size, and when a user deletes a file, the file’s entry in the MFT is marked as unallocated. However, the file’s information still exists until another file uses this MFT entry and overwrites the previous file’s information.
The rest of the partition space after the file system’s area has been reserved will be available for data. Each unit of the data area is called a cluster or block. If files are deleted from the hard drive, the clusters that contain data related to this file will be marked as unallocated. Subsequently, the data will exist until new data that is related to a new file overwrites it.
The following are a few facts about clusters:
Allocated cluster: Holds data that is related to a file that exists and has an entry in the file system’s MFT area.
Unallocated cluster: A cluster that has not been connected to an existing file and may be empty or “not empty,” thus containing data that is related to a deleted file and still hasn’t been overwritten with a new file’s data.
When you run a backup tool for the system, it backs up only the files that exist in the current file system’s MFT area and identifies their related clusters in the data area as allocated. Typically, when you back up your hard drive, the backup software compresses the data. In contrast, when you are collecting a forensic image, the size of the collected image must be exactly equal to the size of the source.
The File Allocation Table (FAT) was the default file system of the Microsoft DOS operating system back in the 1980s. Then other versions were introduced, including FAT12, FAT16, FAT32, and exFAT. Each version overcame some of the limitations of the file system until the introduction of the New Technology File System (NTFS).
FAT partitions include the following main areas:
Boot sector, which is the first sector of the partition that is loaded in memory. The boot sector includes the following information:
Jump code, which is the location of the bootstrap and the operating system initialization code
Sector size
Cluster size
The total number of sectors in the partition
Number of root entries (FAT12 and FAT16 only)
The File Allocation Table (FAT), which is the actual file system
Another copy of the FAT table if the first FAT table has been corrupted
Root directory entries
The address of the first cluster, which contains the file’s data
The data area
One of FAT’s limitations is that no modern properties can be added to the file, such as compression, permissions, and encryption.
The number after each FAT version, such as FAT12, FAT16, or FAT32, represents the number of bits that are assigned to address clusters in the FAT table:
FAT12: This is a maximum of 2^12 = 4,096 clusters.
FAT16: This is a maximum of 2^16 = 65,536 clusters.
FAT32: This is a maximum of 2^32 = 4,294,967,296 clusters, but it has 4 reserved bits, so it is actually 28 bits, which means a maximum of 2^28 = 268,435,456.
exFAT: This uses the whole 32 bits for addressing.
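The arithmetic in the preceding list can be reproduced with a short Python sketch. The bit widths come from the list itself, with FAT32 using 28 effective bits. The dictionary and function names are hypothetical.

```python
# Effective number of address bits per variant, as given in the list above.
FAT_ADDRESS_BITS = {"FAT12": 12, "FAT16": 16, "FAT32": 28, "exFAT": 32}


def max_clusters(variant):
    """Upper bound on addressable clusters: 2 raised to the number of address bits."""
    return 2 ** FAT_ADDRESS_BITS[variant]


if __name__ == "__main__":
    for variant in FAT_ADDRESS_BITS:
        print(f"{variant}: {max_clusters(variant):,} clusters")
```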
NTFS is the default file system in Microsoft Windows since Windows NT and is a more secure, scalable, and advanced file system compared to FAT. NTFS has several components. The boot sector is the first sector in the partition, and it contains information about the file system itself, such as the start code, sector size, cluster size in sectors, and the number of reserved sectors. The file system area contains many files, including the master file table (MFT). The MFT includes metadata of the files and directories in the partition. The data area holds the actual contents of the files, and it is divided in clusters with a size assigned during formatting and recorded in the boot sector.
NTFS has a file called $MFT. In this file is an entry for each file in the partition. This entry is 1,024 bytes in size. It even has an entry for itself. Each entry has a header of 42 bytes at the beginning and a signature of 0x46494C45, which is equivalent to FILE in ASCII. The signature can also be BAAD, which indicates that an error has occurred. After the header, another 982 bytes are left to store the file metadata. If there is space left to store the file contents, the file’s data is stored in the entry itself and no space in the data area is used by this file. The MFT uses attributes to store the metadata of the file. Different attribute types can be used in a single MFT entry and are assigned to store different information.
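A minimal Python sketch of a signature check on a single MFT entry follows. It assumes 1,024-byte entries and the four-byte ASCII signatures FILE (0x46 0x49 0x4C 0x45) for a normal entry and BAAD for an entry flagged with an error. The function name and return labels are hypothetical.

```python
MFT_ENTRY_SIZE = 1024  # entry size given in the text


def classify_mft_entry(entry):
    """Classify a raw MFT entry by its first four bytes."""
    if len(entry) != MFT_ENTRY_SIZE:
        raise ValueError("an MFT entry is expected to be 1,024 bytes")
    signature = entry[:4]
    if signature == b"FILE":
        return "valid"
    if signature == b"BAAD":
        return "error"
    return "unknown"


if __name__ == "__main__":
    # The ASCII string FILE read as a big-endian integer.
    print(hex(int.from_bytes(b"FILE", "big")))
    print(classify_mft_entry(b"FILE" + bytes(MFT_ENTRY_SIZE - 4)))
```

When carving $MFT from an image, an analyst can step through the file in 1,024-byte strides and apply a check like this to each entry.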
NTFS keeps track of lots of timestamps. Each file has a timestamp for Modify, Access, Create, and Entry Modified (commonly referred to as the MACE values).
NTFS includes a feature referred to as Alternate Data Streams (ADS). This feature has also been referred to as “multiple data streams” as well as “alternative data streams.” ADS exists with the goal of supporting the resource forks used by the Hierarchical File System (HFS) on Apple Macintosh systems.
Microsoft File System Resource Manager (FSRM) also uses ADS as part of “file classification.”
Cybersecurity forensics experts use tools such as EnCase and ProDiscover to collect evidence from systems. These tools display the ADS found in acquired images in red.
The EFI system partition (ESP) is a partition on a hard disk drive or solid-state drive whose main purpose is to interact with the Unified Extensible Firmware Interface (UEFI). UEFI firmware loads files stored on the EFI system partition to start the operating system and different utilities. An EFI system partition needs to be formatted with a file system whose specification is based on the FAT file system and maintained as part of the UEFI specification. The EFI system partition specification is independent from the original FAT specification. It includes the boot loaders or kernel images for all installed operating systems that are present in other partitions. It also includes device driver files for hardware devices present in a system and used by the firmware at boot time, as well as system utility programs that run before an operating system is loaded. The EFI system partition also contains data files, including error logs.
The Unified Extensible Firmware Interface Forum at http://www.uefi.org has a lot of great information about Secure Boot, UEFI operations, specifications, tools, and much more.
This section covers cyber forensics fundamentals of Linux-based systems. Most of these concepts also apply to the Mac OS X operating system.
In Linux, there are two methods for starting a process: starting it in the foreground and starting it in the background. You can see all the processes on a Linux or UNIX system by using the ps command in a terminal window, also known as a shell. The options that follow ps specify which processes should be displayed. Example 2-1 includes the output of the ps command in a Linux system.
omar@odin:~$ ps awux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 120416 6432 ? Ss Oct27 0:30 /lib/systemd/systemd --system --deserialize 20
daemon 867 0.0 0.0 26044 1928 ? Ss Oct27 0:00 /usr/sbin/atd -f
root 938 0.0 0.0 19472 252 ? Ss Oct27 3:22 /usr/sbin/irqbalance --pid=/var/run/irqbalance.pid
root 1027 0.0 0.1 65520 5760 ? Ss Oct27 0:00 /usr/sbin/sshd -D
root 1040 0.0 0.4 362036 16752 ? Ssl Oct27 33:00 /usr/bin/dockerd -H fd://
redis 1110 0.0 0.1 40136 6196 ? Ssl Oct27 63:44 /usr/bin/redis-server 127.0.0.1:6379
mysql 1117 0.0 3.2 1300012 127632 ? Ssl Oct27 41:24 /usr/sbin/mysqld
root 1153 0.0 0.0 4244 580 ? Ss Oct27 0:00 runsv nginx
root 1230 0.0 0.0 15056 1860 ? Ss Oct27 0:00 /usr/sbin/xinetd -pidfile /run/xinetd.pid -stayalive -inetd_compat -
root 1237 0.0 0.1 142672 4396 ? Ssl Oct27 3:01 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-con
root 1573 0.0 0.0 65408 3668 ? Ss Oct27 0:42 /usr/lib/postfix/sbin/master
postfix 1578 0.0 0.0 67644 3852 ? S Oct27 0:15 qmgr -l -t unix -u
root 4039 0.0 0.0 0 0 ? S 19:08 0:00 [kworker/0:1]
root 4478 0.0 0.0 43976 3544 ? Ss Nov27 0:02 /lib/systemd/systemd-udevd
root 4570 0.0 0.1 275876 6348 ? Ssl Nov27 0:55 /usr/lib/accountsservice/accounts-daemon
root 5477 0.0 0.0 0 0 ? S 19:29 0:00 [kworker/u8:1]
bind 6202 0.0 1.5 327604 59748 ? Ssl Nov02 17:04 /usr/sbin/named -f -u bind
postfix 7371 0.0 0.1 67476 4524 ? S 19:57 0:00 pickup -l -t unix -u -c
root 7413 0.0 0.0 0 0 ? S 19:58 0:00 [kworker/u8:0]
root 7580 0.0 0.0 4508 700 ? Ss 20:00 0:00 /bin/sh /opt/gitlab/embedded/bin/gitlab-logrotate-wrapper
root 8267 0.0 0.0 4380 660 ? S 20:10 0:00 sleep 3000
root 8346 0.0 0.1 111776 7496 ? Ss 20:11 0:00 sshd: omar [priv]
omar 8358 0.0 0.0 118364 1640 ? S 20:12 0:00 sshd: omar [priv]
omar 8359 0.0 0.1 45368 5084 ? Ss 20:12 0:00 /lib/systemd/systemd --user
root 8362 0.0 0.0 0 0 ? S 20:12 0:00 [kworker/1:0]
root 8364 0.0 0.0 0 0 ? S 20:12 0:00 [kworker/0:0]
omar 8365 0.0 0.0 162192 2860 ? S 20:12 0:00 (sd-pam)
omar 8456 0.0 0.0 111776 3492 ? R 20:12 0:00 sshd: omar@pts/0
omar 8457 0.1 0.1 22576 5136 pts/0 Ss 20:12 0:00 -bash
root 8497 0.0 0.0 0 0 ? S 20:12 0:00 [kworker/u8:2]
git 8545 0.0 0.0 4380 672 ? S 20:13 0:00 sleep 1
omar 8546 0.0 0.0 37364 3324 pts/0 R+ 20:13 0:00 ps awux
gitlab-+ 13342 1.2 0.2 39720 9320 ? Ssl Nov27 580:31 /opt/gitlab/embedded/bin/redis-server 127.0.0.1:0
gitlab-+ 13353 0.0 1.2 1053648 50132 ? Ss Nov27 0:32 /opt/gitlab/embedded/bin/postgres -D /var/opt/gitlab/postgresql/data
gitlab-+ 13355 0.0 0.3 1054128 11908 ? Ss Nov27 0:00 postgres: checkpointer process
gitlab-+ 13356 0.0 0.2 1054128 9788 ? Ss Nov27 0:16 postgres: writer process
gitlab-+ 13357 0.0 0.1 1054128 4092 ? Ss Nov27 0:15 postgres: wal writer process
gitlab-+ 13358 0.0 0.1 1055100 4884 ? Ss Nov27 0:53 postgres: autovacuum launcher process
systemd+ 32717 0.0 0.0 100324 2280 ? Ssl Nov27 0:02 /lib/systemd/systemd-timesyncd
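Output in the format of Example 2-1 can be turned into structured records for filtering and comparison. The following Python sketch assumes the 11-column header shown above and that no column before COMMAND contains spaces. The function name is hypothetical.

```python
def parse_ps(output):
    """Parse `ps aux`-style text into a list of dictionaries keyed by the header names."""
    lines = output.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split on whitespace, but keep the command line (last column) intact.
        fields = line.strip().split(None, len(header) - 1)
        if len(fields) != len(header):
            continue  # skip lines that do not match the expected layout
        rows.append(dict(zip(header, fields)))
    return rows


if __name__ == "__main__":
    sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 120416 6432 ? Ss Oct27 0:30 /lib/systemd/systemd --system --deserialize 20
omar 8546 0.0 0.0 37364 3324 pts/0 R+ 20:13 0:00 ps awux
"""
    for row in parse_ps(sample):
        print(row["USER"], row["PID"], row["COMMAND"])
```

On a live system the text would come from running the ps command itself. In an investigation it is often safer to parse output that was captured and preserved earlier.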
Several other tools are great for displaying not only the processes running in the system but also the resource consumption (CPU, memory, network, and so on). Two widely used tools are top and htop. Example 2-2 shows the output of top, and Example 2-3 shows the output of htop.
top - 20:20:25 up 64 days, 5:17, 1 user, load average: 0.09, 0.06, 0.01
Tasks: 197 total, 2 running, 195 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.5 us, 0.0 sy, 0.0 ni, 98.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 3914932 total, 195108 free, 2008376 used, 1711448 buff/cache
KiB Swap: 4058620 total, 3994692 free, 63928 used. 1487784 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13465 git 20 0 731848 496408 15184 S 2.0 12.7 1105:01 bundle
13342 gitlab-+ 20 0 39720 9320 2896 S 1.3 0.2 580:36.88 redis-server
9039 omar 20 0 41800 3772 3112 R 0.7 0.1 0:00.02 top
1 root 20 0 120416 6432 3840 S 0.0 0.2 0:30.43 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.21 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:11.62 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 58:43.51 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:29.90 migration/0
10 root rt 0 0 0 0 S 0.0 0.0 0:16.00 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:16.03 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:29.83 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:17.28 ksoftirqd/1
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
17 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns
18 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 perf
19 root 20 0 0 0 0 S 0.0 0.0 0:02.84 khungtaskd
20 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback
21 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
22 root 39 19 0 0 0 S 0.0 0.0 0:14.74 khugepaged
23 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 crypto
24 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd
25 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
26 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd
27 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ata_sff
1 [||||| 1.0%] 4 [ 0.0%]
2 [| 0.0%] 5 [| 1.3%]
3 [ 0.0%] 6 [ 0.0%]
Mem[||||||||||||||||||||||||||||||||||||||||||||||748M/15.6G] Tasks: 47, 108 thr; 1 running
Swp[| 29.7M/15.9G] Load average: 0.00 0.00 0.00
Uptime: 47 days, 08:00:26
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
17239 omar 20 0 24896 4112 3248 R 1.3 0.0 0:00.07 htop
1510 root 35 15 1812M 115M 4448 S 0.0 0.7 1h26:51 Plex Plug-in [com.plexapp.system] /usr/lib/plexmediaserver/Resources/P
1 root 20 0 117M 6012 3416 S 0.0 0.0 0:28.91 /lib/systemd/systemd --system --deserialize 19
432 root 20 0 35420 7316 7032 S 0.0 0.0 7:00.23 /lib/systemd/systemd-journald
475 root 20 0 100M 3180 944 S 0.0 0.0 0:00.00 /sbin/lvmetad -f
938 root 20 0 4396 1308 1220 S 0.0 0.0 0:00.04 /usr/sbin/acpid
964 syslog 20 0 250M 3916 2492 S 0.0 0.0 0:58.03 /usr/sbin/rsyslogd -n
965 syslog 20 0 250M 3916 2492 S 0.0 0.0 0:00.00 /usr/sbin/rsyslogd -n
966 syslog 20 0 250M 3916 2492 S 0.0 0.0 1:05.88 /usr/sbin/rsyslogd -n
943 syslog 20 0 250M 3916 2492 S 0.0 0.0 2:04.34 /usr/sbin/rsyslogd -n
967 root 20 0 273M 14348 9176 S 0.0 0.1 0:03.84 /usr/lib/snapd/snapd
968 root 20 0 273M 14348 9176 S 0.0 0.1 0:00.00 /usr/lib/snapd/snapd
969 root 20 0 273M 14348 9176 S 0.0 0.1 0:00.63 /usr/lib/snapd/snapd
1041 root 20 0 273M 14348 9176 S 0.0 0.1 0:00.75 /usr/lib/snapd/snapd
1043 root 20 0 273M 14348 9176 S 0.0 0.1 0:00.77 /usr/lib/snapd/snapd
1045 root 20 0 273M 14348 9176 S 0.0 0.1 0:00.00 /usr/lib/snapd/snapd
11707 root 20 0 273M 14348 9176 S 0.0 0.1 0:00.64 /usr/lib/snapd/snapd
947 root 20 0 273M 14348 9176 S 0.0 0.1 0:06.68 /usr/lib/snapd/snapd
1040 root 20 0 302M 2924 1384 S 0.0 0.0 0:11.12 /usr/bin/lxcfs /var/lib/lxcfs/
1042 root 20 0 302M 2924 1384 S 0.0 0.0 0:11.26 /usr/bin/lxcfs /var/lib/lxcfs/
20680 root 20 0 302M 2924 1384 S 0.0 0.0 0:11.19 /usr/bin/lxcfs /var/lib/lxcfs/
6250 root 20 0 302M 2924 1384 S 0.0 0.0 0:07.26 /usr/bin/lxcfs /var/lib/lxcfs/
953 root 20 0 302M 2924 1384 S 0.0 0.0 0:40.87 /usr/bin/lxcfs /var/lib/lxcfs/
958 root 20 0 28632 3020 2640 S 0.0 0.0 0:04.29 /lib/systemd/systemd-logind
F1Help F2Setup F3SearchF4FilterF5Tree F6SortByF7Nice -F8Nice +F9Kill F10Quit
Just like in Windows or any other operating system, looking for orphan, zombie, and suspicious processes is one of the tasks in Linux forensics. For instance, if you find a process running with open network sockets that doesn’t show up on a similar system, there may be something suspicious on that system. You may find network saturation originating from a single host (by way of tracing its Ethernet address or packet counts on its switch port) or a program eating up 100% of the CPU but nothing in the file system with that name.
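One way to look for a running program that has "nothing in the file system with that name" is to inspect the exe symbolic link that Linux exposes for each process under /proc. When the backing file has been removed, the link target ends with " (deleted)". The following Python sketch illustrates that idea. The function name is hypothetical, and the scan sees only the processes the current user is allowed to inspect.

```python
import os


def deleted_executables(proc_root="/proc"):
    """Return (pid, link_target) for processes whose executable was deleted from disk."""
    findings = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories represent processes
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            continue  # kernel threads, exited processes, or insufficient privileges
        if target.endswith(" (deleted)"):
            findings.append((int(entry), target))
    return sorted(findings)


if __name__ == "__main__" and os.path.isdir("/proc"):
    for pid, target in deleted_executables():
        print(f"pid {pid} is running from a removed file: {target}")
```

A hit is not always malicious, because a package upgrade can replace the binary of a long-running service. The finding should be correlated with open network sockets and CPU usage.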
You should also become familiar with the Linux file system. Ext4 is one of the most used Linux file systems. It has several improvements over its predecessors Ext3 and Ext2. Ext4 not only supports journaling (covered in the next section), but also modifies important data structures of the file system, such as the ones destined to store the file data. This is done for better performance, reliability, and additional features.
Ext3 supported a maximum file system size of 16 TB and a maximum file size of 2 TB. Ext4 supports a maximum file system size of 1 exabyte (EB), which equals 1,048,576 TB. The maximum possible number of subdirectories contained in a single directory in Ext3 is 32,000. Ext4 allows an unlimited number of subdirectories. Ext4 also uses a “multiblock allocator” (mballoc) to allocate many blocks in a single call, instead of a single block per call. This feature avoids a lot of overhead and improves system performance.
Becoming familiar with the Linux file system is recommended for any cyber forensics practitioner. For example, in a compromised system, you may find a partition showing 100% utilization, but if you use the du command, the system may only show 30% utilization.
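The discrepancy described above can be measured with a short Python sketch that compares what the file system reports as used with the sum of the file sizes that are reachable by walking the directory tree. This is an approximation: it adds up apparent file sizes rather than allocated blocks, and it does not stop at mount-point boundaries. The function names are hypothetical.

```python
import os
import shutil


def du_bytes(root):
    """Sum the apparent sizes of all files reachable under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is not accessible
    return total


def usage_gap(mount_point):
    """Bytes the file system reports as used that the directory walk cannot account for."""
    return shutil.disk_usage(mount_point).used - du_bytes(mount_point)


if __name__ == "__main__":
    print(f"reachable file data under the current directory: {du_bytes('.'):,} bytes")
```

A large positive gap can point to files that were deleted but are still held open by a process, or to data hidden beneath a mount point. File system metadata also accounts for part of the gap.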
Two popular tools are used to analyze the Linux file system for cyber forensics: the Sleuth Kit and Autopsy. These tools are designed to analyze hard disk drives, solid-state drives (SSDs), and mobile devices. You can download the software and obtain more information about these tools at http://www.sleuthkit.org.
Ext4 and Ext3 are journaling file systems. A journaling file system maintains a record of changes not yet committed to its main part. This data structure is referred to as a “journal,” which is a circular log. One of the main features of a file system that supports journaling is that if the system crashes or experiences a power failure, it can be restored back online a lot quicker while also avoiding system corruption. A journaling file system may only keep track of stored metadata, but this depends on the implementation. Keeping track of only stored metadata improves performance but increases the possibility of data corruption.
The journal is the most used part of the disk, making the blocks that form part of it more prone to hardware failure. One of the features of Ext4 is that it checksums the journal data to know if the journal blocks are failing or becoming corrupted. Journaling ensures the integrity of the file system by keeping track of all disk changes, but it introduces a bit of overhead.
As you learned earlier in this chapter, the MBR is a special type of boot sector that contains 512 or more bytes located in the first sector of the drive. The MBR includes instructions about how the logical partitions that have file systems are organized on the drive. It also has executable code to load the installed operating system.
The most common boot loaders in Linux are Linux Loader (LILO), Load Linux (LOADLIN), and the Grand Unified Bootloader (GRUB).
Figure 2-1 illustrates the Linux boot process in detail.
There are two main partitions on a Linux system:
The data partition, which contains all Linux system data, including the root partition
The swap partition, which is extra memory on the hard disk drive or SSD that is an expansion of the system’s physical memory
The swap space (otherwise known as just “swap”) is only accessible and viewable by the system itself. The swap makes sure that the operating system keeps working. Windows, Mac OS X, and other operating systems also use swap or virtual memory. The swap space is slower than real physical memory (RAM), but it helps the operating system immensely. A general rule of thumb is that Linux typically counts on having twice as much swap space as physical memory.
One interesting point related to cyber forensics is that pretty much everything in RAM has the potential of being stored in swap space at any given time. Subsequently, you may find interesting system data such as plaintext data, encryption keys, user credentials, emails, and other sensitive information—especially due to the weaknesses in some applications that allow unencrypted keys to reside in memory.
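A simple way to start looking for readable artifacts in an acquired swap or memory image is to extract runs of printable ASCII characters, similar to what the UNIX strings utility does. The following Python sketch illustrates the idea on an in-memory buffer. The function name, the minimum length of six characters, and the sample data are assumptions made for illustration.

```python
import re


def printable_strings(data, min_length=6):
    """Return runs of printable ASCII characters of at least `min_length` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Synthetic buffer standing in for a chunk of an acquired swap image.
    blob = b"\x00\x01password=hunter2\x00\xff\x10abc\x00"
    for text in printable_strings(blob):
        print(text)
```

For a real image, the data would be read in chunks from a forensic copy rather than from the live swap device, so that the original evidence is not altered.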
Review the most important topics in the chapter, noted with the Key Topic icon in the outer margin of the page. Table 2-2 lists these key topics and the page numbers on which each is found.
Define the following key terms from this chapter and check your answers in the glossary:
The answers to these questions appear in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Q&A.” For more practice with exam format questions, use the exam engine on the website.
1. Which of the following is true about VirtualAlloc?
a. It is a specialized allocation of the Windows virtual memory system, meaning it allocates straight into virtual memory via reserved blocks of memory.
b. It is another name for swap space.
c. It is a specialized allocation of the Linux virtual memory system, meaning it allocates straight into virtual memory via reserved blocks of memory.
d. It is a specialized allocation of the Mac OS X virtual memory system, meaning it allocates straight into virtual memory via reserved blocks of memory.
2. Which of the following is true about HeapAlloc?
a. It allocates any size of memory that is requested dynamically in Mac OS X. It is designed to be slow and used for special-purpose memory allocation.
b. It allocates any size of memory that is requested dynamically in Microsoft Windows. It is designed to be slow and used for special-purpose memory allocation.
c. It allocates any size of memory that is requested dynamically in Linux-based operating systems. It is designed to be very fast and used for general-purpose allocation.
d. It allocates any size of memory that is requested dynamically in Microsoft Windows. It is designed to be very fast and used for general-purpose allocation.
3. In cyber forensics, the storage device you are investigating should immediately be write-protected before it is imaged and should be labeled to include which of the following? (Choose two.)
a. Investigator’s name
b. Victim’s name
c. The date when the image was created
d. NetFlow record ID
4. Which of the following is a benefit in cyber forensics of being able to make an exact copy of the data being investigated?
a. The original device can be returned to the owner or stored for trial, normally without having to be examined repeatedly.
b. The original device can be returned to the owner or stored for trial, typically having to be examined repeatedly.
c. A backup of the data can be performed so that the case manager and investigator can retrieve any lost records.
d. A backup of the data can be performed so that the victim can retrieve any lost records.
5. What is best evidence?
a. Evidence that can be presented in court in the original form.
b. Evidence that tends to support a theory or an assumption deduced by some initial evidence. This best evidence confirms the proposition.
c. Evidence that cannot be presented in court in the original form.
d. Evidence that can be presented in court in any form.
6. Which of the following is extra memory on the hard disk drive or SSD that is an expansion of the system’s physical memory?
a. MBR
b. MFT
c. Swap
d. RAM partition
7. Which of the following is true about journaling?
a. A journaling file system provides less security than the alternatives.
b. Journaling file systems are slow and should be avoided.
c. A journaling file system maintains a record of changes not yet committed to the file system’s main part.
d. A journaling file system does not maintain a record of changes not yet committed to the file system’s main part.
8. Which type of evidence relies on an extrapolation to a conclusion of fact (such as fingerprints, DNA evidence, and so on)?
a. Indirect or circumstantial evidence
b. Secondary evidence
c. Corroborating evidence
d. Best evidence
9. Which of the following is one of the most used Linux file systems that has several improvements over its predecessors and that supports journaling?
a. NTFS
b. exFAT
c. Ext5
d. Ext4
10. Which of the following statements is true about heaps in Windows?
a. Heaps are set up by Malloc and are used to initially reserve allocation space from the operating system.
b. Heaps are set up by swap and are used to initially reserve allocation space at bootup from the operating system.
c. Heaps are set up by GRUB and are used to initially reserve allocation space from the operating system.
d. Heaps are set up by VirtualAlloc and are used to initially reserve allocation space from the operating system.