Chapter 14
Containment, Eradication, and Recovery

Chapter 11, “Building an Incident Response Program,” provided an overview of the steps required to build and implement a cybersecurity incident response program according to the process advocated by the National Institute of Standards and Technology (NIST). In their Computer Security Incident Handling Guide, NIST outlines the four-phase incident response process shown in Figure 14.1.

Schematic illustration of the incident response process.

FIGURE 14.1 Incident response process

Source: NIST SP 800-61: Computer Security Incident Handling Guide

The remainder of Chapter 11 provided an overview of the Preparation phase of incident response. Chapter 12, “Analyzing Indicators of Compromise,” and Chapter 13, “Performing Forensic Analysis and Techniques,” covered the details behind the Detection and Analysis phase, including sources of cybersecurity information and forensic analysis. This chapter concludes the coverage of CySA+ Domain 4.0: Incident Response with a detailed look at the final two phases of incident response: Containment, Eradication, and Recovery, and Postincident Activity.

Containing the Damage

The Containment, Eradication, and Recovery phase of incident response moves the organization from the primarily passive incident response activities that take place during the Detection and Analysis phase to more active undertakings. Once the organization understands that a cybersecurity incident is underway, it takes actions designed to minimize the damage caused by the incident and restore normal operations as quickly as possible.

Containment is the first activity that takes place during this phase, and it should begin as quickly as possible after analysts determine that an incident is underway. Containment activities are designed to isolate the incident and prevent it from spreading further. If that phrase sounds somewhat vague, that's because containment means very different things in the context of different types of security incidents. For example, if the organization is experiencing active exfiltration of data from a credit card processing system, incident responders might contain the damage by disconnecting that system from the network, preventing the attackers from continuing to exfiltrate information. On the other hand, if the organization is experiencing a denial-of-service attack against its website, disconnecting the network connection would simply help the attacker achieve its objective. In that case, containment might include placing filters on an upstream Internet connection that block all inbound traffic from networks involved in the attack or block web requests that bear a certain signature.

Containment activities typically aren't perfect and often cause some collateral damage that disrupts normal business activity. Consider the two examples described in the previous paragraph. Disconnecting a credit card processing system from the network may bring transactions to a halt, causing potentially significant losses of business. Similarly, blocking large swaths of inbound web traffic may render the site inaccessible to some legitimate users. Incident responders undertaking containment strategies must understand the potential side effects of their actions while weighing them against the greater benefit to the organization.
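The denial-of-service filtering described above can be illustrated with a short Python sketch. It is a minimal illustration rather than a production filter: the network ranges (taken from the address blocks reserved for documentation), the signature string, and the function name should_block are all hypothetical values chosen for this example.

```python
import ipaddress

# Hypothetical networks observed in the attack (documentation ranges).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical marker assumed to appear in every attack request.
ATTACK_SIGNATURE = "/search?q=AAAA"


def should_block(source_ip: str, request_line: str) -> bool:
    """Return True if a request matches either containment filter."""
    address = ipaddress.ip_address(source_ip)
    from_attacking_network = any(address in net for net in BLOCKED_NETWORKS)
    bears_signature = ATTACK_SIGNATURE in request_line
    return from_attacking_network or bears_signature
```

Note that a legitimate customer whose address happens to fall inside 203.0.113.0/24 is blocked along with the attackers, which is exactly the kind of collateral damage responders must weigh.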

Segmentation

Cybersecurity analysts often use network segmentation as a proactive strategy to prevent the spread of future security incidents. For example, the network shown in Figure 14.2 is designed to segment different types of users from each other and from critical systems. An attacker who is able to gain access to the guest network would not be able to interact with systems belonging to employees or in the datacenter without traversing the network firewall.

You learned how network segmentation is used as a proactive control in a defense-in-depth approach to information security in Chapter 7, “Infrastructure Security and Controls.”

In addition to being used as a proactive control, network segmentation may play a crucial role in incident response. During the early stages of an incident, responders may realize that a portion of systems are compromised but wish to continue to observe the activity on those systems while they determine other appropriate responses. However, they certainly want to protect other systems on the network from those potentially compromised systems.

Figure 14.3 shows an example of how an organization might apply network segmentation during an incident response effort. Cybersecurity analysts suspect that several systems in the datacenter were compromised and built a separate virtual LAN (VLAN) to contain those systems. That VLAN, called the quarantine network, is segmented from the rest of the datacenter network and controlled by very strict firewall rules. Putting the systems on this network segment provides some degree of isolation, preventing them from damaging systems on other segments but allowing continued live analysis efforts.

Schematic illustration of the proactive network segmentation.

FIGURE 14.2 Proactive network segmentation

Schematic illustration of the network segmentation for incident response.

FIGURE 14.3 Network segmentation for incident response

Isolation

Although segmentation does limit the access that attackers have to the remainder of the network, it sometimes doesn't go far enough to meet containment objectives. Cybersecurity analysts may instead decide that it is necessary to use stronger isolation practices to cut off an attack. Two primary isolation techniques may be used during a cybersecurity incident response effort: isolating affected systems and isolating the attacker.

Isolating Affected Systems

Isolating affected systems is, quite simply, taking segmentation to the next level. Affected systems are completely disconnected from the remainder of the network, although they may still be able to communicate with each other and the attacker over the Internet. Figure 14.4 shows an example of taking the quarantine VLAN from the segmentation strategy and converting it to an isolation approach.

Notice that the only difference between Figures 14.3 and 14.4 is where the quarantine network is connected. In the segmentation approach, the network is connected to the firewall and may have some limited access to other networked systems. In the isolation approach, the quarantine network connects directly to the Internet and has no access to other systems. In reality, this approach may be implemented by simply altering firewall rules rather than bypassing the firewall entirely. The objective is to continue to allow the attacker to access the isolated systems but restrict their ability to access other systems and cause further damage.

Isolating the Attacker

Isolating the attacker is an interesting variation on the isolation strategy and depends on the use of sandbox systems that are set up purely to monitor attacker activity and that do not contain any information or resources of value to the attacker. Placing attackers in a sandboxed environment allows continued observation in a fairly safe, contained environment. Some organizations use honeypot systems for this purpose. For more information on honeypots, see Chapter 1, “Today's Cybersecurity Analyst.”

Schematic illustration of the network isolation for incident response.

FIGURE 14.4 Network isolation for incident response

Removal

Removal of compromised systems from the network is the strongest containment technique in the cybersecurity analyst's incident response toolkit. As shown in Figure 14.5, removal differs from segmentation and isolation in that the affected systems are completely disconnected from other networks, although they may still be allowed to communicate with other compromised systems within the quarantine VLAN. In some cases, each suspect system may be physically disconnected from the network so that they are prevented from communicating even with each other. The exact details of removal will depend on the circumstances of the incident and the professional judgment of incident responders.

Schematic illustration of the network removal for incident response.

FIGURE 14.5 Network removal for incident response
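One way to keep the three containment strategies straight is to compare what a quarantined system can still reach under each one. The following Python sketch models that comparison under simplifying assumptions: the destination labels and the can_reach helper are invented for illustration, segmentation is assumed to leave Internet access in place so that observation can continue, and the "internal" entry stands for the limited, firewall-controlled access that segmentation permits rather than open access.

```python
from enum import Enum


class Strategy(Enum):
    SEGMENTATION = "segmentation"
    ISOLATION = "isolation"
    REMOVAL = "removal"


# Destinations a quarantined host may still reach under each strategy.
# "internal" = other enterprise systems (limited by strict firewall rules),
# "internet" = the attacker's path in, "quarantine" = other suspect hosts.
ALLOWED_DESTINATIONS = {
    Strategy.SEGMENTATION: {"quarantine", "internet", "internal"},
    Strategy.ISOLATION: {"quarantine", "internet"},
    Strategy.REMOVAL: {"quarantine"},
}


def can_reach(strategy: Strategy, destination: str) -> bool:
    """Return True if the strategy still permits traffic to the destination."""
    return destination in ALLOWED_DESTINATIONS[strategy]
```

Physically unplugging each suspect system, the strictest form of removal, would correspond to an empty set of destinations.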

Evidence Gathering and Handling

The primary objective during the containment phase of incident response is to limit the damage to the organization and its resources. Although that objective may take precedence over other goals, responders may still be interested in gathering evidence during the containment process. This evidence can be crucial in the continuing analysis of the incident for internal purposes, or it can be used during legal proceedings against the attacker.

Chapter 13 provided a thorough review of the forensic strategies that might be used during an incident investigation. Chapter 1 also included information on reverse engineering practices that may be helpful during an incident investigation.

If incident handlers suspect that evidence gathered during an investigation may be used in court, they should take special care to preserve and document evidence during the course of their investigation. NIST recommends that investigators maintain a detailed evidence log that includes the following:

  • Identifying information (for example, the location, serial number, model number, hostname, MAC addresses, and IP addresses of a computer)
  • Name, title, and phone number of each individual who collected or handled the evidence during the investigation
  • Time and date (including time zone) of each occurrence of evidence handling
  • Locations where the evidence was stored

Failure to maintain accurate logs will bring the evidence chain-of-custody into question and may cause the evidence to be inadmissible in court.
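The evidence log fields listed above map naturally onto a simple record structure. The sketch below is one possible layout, not a format prescribed by NIST: the class names, field names, and sample values are assumptions made for illustration. It rejects timestamps that lack a time zone, since the log must record the zone for each handling event.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class CustodyEvent:
    """One occurrence of evidence handling."""
    handler_name: str
    handler_title: str
    handler_phone: str
    occurred_at: datetime  # must carry a time zone
    storage_location: str

    def __post_init__(self) -> None:
        if self.occurred_at.utcoffset() is None:
            raise ValueError("occurred_at must include a time zone")


@dataclass
class EvidenceItem:
    """Identifying information plus the running custody log."""
    location: str
    serial_number: str
    model_number: str
    hostname: str
    mac_addresses: List[str]
    ip_addresses: List[str]
    custody_log: List[CustodyEvent] = field(default_factory=list)

    def record(self, event: CustodyEvent) -> None:
        self.custody_log.append(event)


# Hypothetical example entry.
item = EvidenceItem("Server room B", "SN-0001", "Model-X", "web01",
                    ["00:00:5e:00:53:01"], ["192.0.2.15"])
event = CustodyEvent("A. Analyst", "Incident Responder", "555-0100",
                     datetime(2024, 1, 5, 14, 30, tzinfo=timezone.utc),
                     "Evidence locker 3")
item.record(event)
```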

Identifying Attackers

Identifying the perpetrators of a cybersecurity incident is a complex task that often leads investigators down a winding path of redirected hosts that crosses international borders. Although you might find IP address records stored in your logs, it is incredibly unlikely that they correspond to the actual IP address of the attacker. Any attacker other than the most rank of amateurs will relay their communications through a series of compromised systems, making it very difficult to trace their actual origin.

Before heading down this path of investigating an attack's origin, it's very important to ask yourself why you are pursuing it. Is there really business value in uncovering who attacked you, or would your time be better spent on containment, eradication, and recovery activities? The NIST Computer Security Incident Handling Guide addresses this issue head on, giving the opinion that “Identifying an attacking host can be a time-consuming and futile process that can prevent a team from achieving its primary goal—minimizing the business impact.”

Law enforcement officials may approach this situation with objectives that differ from those of the attacked organization's cybersecurity analysts. After all, one of the core responsibilities of law enforcement organizations is to identify criminals, arrest them, and bring them to trial. That responsibility may conflict with the core cybersecurity objectives of containment, eradication, and recovery. Cybersecurity and business leaders should take this conflict into consideration when deciding whether to involve law enforcement agencies in an incident investigation and the degree of cooperation they will provide to an investigation that is already underway.

Incident Eradication and Recovery

Once the cybersecurity team successfully contains an incident, it is time to move on to the eradication phase of the response. The primary purpose of eradication is to remove any of the artifacts of the incident that may remain on the organization's network. This could include the removal of any malicious code from the network, the sanitization of compromised media, and the securing of compromised user accounts.

The recovery phase of incident response focuses on restoring normal capabilities and services. It includes reconstituting resources and correcting security control deficiencies that may have led to the attack. This could include rebuilding and patching systems, reconfiguring firewalls, updating malware signatures, and similar activities. The goal of recovery is not just to rebuild the organization's network but also to do so in a manner that reduces the likelihood of a successful future attack.

During the eradication and recovery effort, cybersecurity analysts should develop a clear understanding of the incident's root cause. This is critical to implementing a secure recovery that corrects control deficiencies that led to the original attack. After all, if you don't understand how an attacker breached your security controls in the first place, it will be hard to correct those controls so that the attack doesn't reoccur! Understanding the root cause of an attack is a completely different activity than identifying the attacker. Root cause assessment is a critical component of incident recovery whereas, as mentioned earlier, identifying the attacker can be a costly distraction.

Root cause analysis also helps an organization identify other systems they operate that might share the same vulnerability. For example, if an attacker compromises a Cisco router and root cause analysis reveals an error in that device's configuration, administrators may correct the error on other routers they control to prevent a similar attack from compromising those devices.

Reconstruction and Reimaging

During an incident, attackers may compromise one or more systems through the use of malware, web application attacks, or other exploits. Once an attacker gains control of a system, security professionals should consider it completely compromised and untrustworthy. It is not safe to simply correct the security issue and move on because the attacker may still have an undetected foothold on the compromised system. Instead, the system should be rebuilt, either from scratch or by using an image or backup of the system from a known secure state.

Rebuilding and/or restoring systems should always be done with the incident root cause analysis in mind. If the system was compromised because it contained a security vulnerability, as opposed to through the use of a compromised user account, backups and images of that system likely have that same vulnerability. Even rebuilding the system from scratch may reintroduce the earlier vulnerability, rendering the system susceptible to the same attack. During the recovery phase, administrators should ensure that rebuilt or restored systems are remediated to address known security issues.

Patching Systems and Applications

During the incident recovery effort, cybersecurity analysts will patch operating systems and applications involved in the attack. This is also a good time to review the security patch status of all systems in the enterprise, addressing other security issues that may lurk behind the scenes.

Cybersecurity analysts should first focus their efforts on systems that were directly involved in the compromise and then work their way outward, addressing systems that were indirectly related to the compromise before touching systems that were not involved at all. Figure 14.6 shows the phased approach that cybersecurity analysts should take to patching systems and applications during the recovery phase.

Schematic illustration of the patching priorities.

FIGURE 14.6 Patching priorities
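The phased approach amounts to sorting systems by how closely they were involved in the compromise. A minimal Python sketch follows; the Involvement levels mirror the three tiers described above, while the hostnames and the patch_order function are hypothetical.

```python
from enum import IntEnum
from typing import Dict, List


class Involvement(IntEnum):
    DIRECT = 1        # directly involved in the compromise
    INDIRECT = 2      # indirectly related to the compromise
    NOT_INVOLVED = 3  # not involved at all


def patch_order(systems: Dict[str, Involvement]) -> List[str]:
    """Return hostnames sorted so the most involved systems come first."""
    return sorted(systems, key=lambda name: (systems[name], name))


# Hypothetical inventory.
inventory = {
    "hr-laptop-7": Involvement.NOT_INVOLVED,
    "db01": Involvement.INDIRECT,
    "web01": Involvement.DIRECT,
}
queue = patch_order(inventory)
```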

Sanitization and Secure Disposal

During the recovery effort, cybersecurity analysts may need to dispose of or repurpose media from systems that were compromised during the incident. In those cases, special care should be taken to ensure that sensitive information that was stored on that media is not compromised. Responders don't want the recovery effort from one incident to lead to a second incident!

Generally speaking, there are three options available for the secure disposition of media containing sensitive information: clear, purge, and destroy. NIST defines these three activities in NIST SP 800-88: Guidelines for Media Sanitization:

  • Clear applies logical techniques to sanitize data in all user-addressable storage locations for protection against simple noninvasive data recovery techniques; this is typically applied through the standard Read and Write commands to the storage device, such as by rewriting with a new value or using a menu option to reset the device to the factory state (where rewriting is not supported).
  • Purge applies physical or logical techniques that render target data recovery infeasible using state-of-the-art laboratory techniques. Examples of purging activities include overwriting, block erase, and cryptographic erase activities when performed through the use of dedicated, standardized device commands. Degaussing is another form of purging that uses extremely strong magnetic fields to disrupt the data stored on a device.
  • Destroy renders target data recovery infeasible using state-of-the-art laboratory techniques and results in the subsequent inability to use the media for storage of data. Destruction techniques include disintegration, pulverization, melting, and incinerating.

These three levels of data disposal are listed in increasing order of effectiveness as well as difficulty and cost. Physically incinerating a hard drive, for example, removes any possibility that data will be recovered but requires the use of an incinerator and renders the drive unusable for future purposes.

Figure 14.7 shows a flowchart designed to help security decision makers choose appropriate techniques for destroying information and can be used to guide incident recovery efforts. Notice that the flowchart includes a Validation phase after efforts to clear, purge, or destroy data. Validation ensures that the media sanitization was successful and that remnant data does not exist on the sanitized media.

Flow chart depicts the sanitization and disposition decision flow.

FIGURE 14.7 Sanitization and disposition decision flow

Source: NIST SP 800-88: Guidelines for Media Sanitization
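The decision flow can also be expressed as a small function. The sketch below encodes one reading of the flowchart in NIST SP 800-88 Revision 1 (Figure 4-1); the branch outcomes are reproduced from that figure rather than from the text of this chapter, so they should be checked against the published flowchart before being relied on. The function name and parameter names are assumptions, and the Validation step that follows every outcome is not modeled.

```python
def choose_sanitization(category: str, reuse_media: bool,
                        leaving_org_control: bool) -> str:
    """Return 'clear', 'purge', or 'destroy' for a piece of media.

    category is the security categorization of the data:
    'low', 'moderate', or 'high'.
    """
    category = category.lower()
    if category not in {"low", "moderate", "high"}:
        raise ValueError("category must be low, moderate, or high")

    if category == "low":
        # Low-impact data: only the destination of the media matters.
        return "purge" if leaving_org_control else "clear"

    if not reuse_media:
        # Moderate or high data on media that will not be reused.
        return "destroy"

    if category == "moderate":
        return "purge" if leaving_org_control else "clear"

    # High-impact data on media intended for reuse.
    return "destroy" if leaving_org_control else "purge"
```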

Validating the Recovery Effort

Before concluding the recovery effort, incident responders should take time to verify that the recovery measures put in place were successful. The exact nature of this verification will depend on the technical circumstances of the incident and the organization's infrastructure. Four activities that should always be included in these validation efforts follow:

  • Validate that only authorized user accounts exist on every system and application in the organization. In many cases, organizations already undertake periodic account reviews that verify the authorization for every account. This process should be used during the recovery validation effort.
  • Verify the proper restoration of permissions assigned to each account. During the account review, responders should also verify that accounts do not have extraneous permissions that violate the principle of least privilege. This is true for normal user accounts, administrator accounts, and service accounts.
  • Verify that all systems are logging properly. Every system and application should be configured to log security-related information to a level that is consistent with the organization's logging policy. Those log records should be sent to a centralized log repository that preserves them for archival use. The validation phase should include verification that these logs are properly configured and received by the repository.
  • Conduct vulnerability scans on all systems. Vulnerability scans play an important role in verifying that systems are safeguarded against future attacks. Analysts should run thorough scans against systems and initiate remediation workflows where necessary. For more information on this process, see Chapter 4, “Designing a Vulnerability Management Program,” and Chapter 5, “Analyzing Vulnerability Scans.”

These actions form the core of an incident recovery validation effort and should be complemented with other activities that validate the specific controls put in place during the Containment, Eradication, and Recovery phase of incident response.
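The first two validation activities reduce to set comparisons: accounts that exist but are not authorized, and permissions that are granted but not approved. The following Python sketch shows the idea with hypothetical account and permission names; in practice the inputs would come from the organization's directory service and account review records.

```python
from typing import Dict, Set


def unauthorized_accounts(present: Set[str], authorized: Set[str]) -> Set[str]:
    """Accounts found on a system that no one authorized."""
    return present - authorized


def extraneous_permissions(granted: Dict[str, Set[str]],
                           approved: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Permissions each account holds beyond what was approved."""
    excess = {}
    for account, permissions in granted.items():
        extra = permissions - approved.get(account, set())
        if extra:
            excess[account] = extra
    return excess


# Hypothetical review data.
rogue = unauthorized_accounts({"alice", "bob", "svc_tmp"}, {"alice", "bob"})
excess = extraneous_permissions(
    {"alice": {"read"}, "bob": {"read", "admin"}},
    {"alice": {"read"}, "bob": {"read"}},
)
```

Any account or permission that appears in the results should be removed or formally approved before the incident is closed.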

Wrapping Up the Response

After the immediate, urgent actions of containment, eradication, and recovery are complete, it is very tempting for the CSIRT to take a deep breath and consider their work done. While the team should take a well-deserved break, the incident response process is not complete until the team completes postincident activities that include managing change control processes, conducting a lessons learned session, and creating a formal written incident report.

Managing Change Control Processes

During the containment, eradication, and recovery process, responders may have bypassed the organization's normal change control and configuration management processes in an effort to respond to the incident in an expedient manner. These processes provide important management controls and documentation of the organization's technical infrastructure. Once the urgency of response efforts passes, the responders should turn back to these processes and use them to document any emergency changes made during the incident response effort.

Conducting a Lessons Learned Session

At the conclusion of every cybersecurity incident, everyone involved in the response should participate in a formal lessons learned session that is designed to uncover critical information about the response. This session also highlights potential deficiencies in the incident response plan and procedures. For more information on conducting the post-incident lessons learned session, see the “Lessons Learned Review” section in Chapter 11.

During the lessons learned session, the organization may uncover potential changes to the incident response plan. In those cases, the leader should propose those changes and move them through the organization's formal change process to improve future incident response efforts.

During an incident investigation, the team may encounter new indicators of compromise (IOCs) based on the tools, techniques, and tactics used by attackers. As part of the lessons learned review, the team should clearly identify any new IOC and make recommendations for updating the organization's security monitoring program to include those IOCs. This will reduce the likelihood of a similar incident escaping attention in the future.
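Feeding new IOCs into monitoring can be as simple as extending a watchlist and checking log records against it. The sketch below is illustrative only: the indicator values, log lines, and function names are hypothetical, and a real monitoring program would use its SIEM or IDS rule format rather than substring matching.

```python
from typing import Iterable, List, Set


def update_watchlist(watchlist: Set[str], new_iocs: Iterable[str]) -> Set[str]:
    """Return the watchlist extended with normalized new indicators."""
    normalized = {ioc.strip().lower() for ioc in new_iocs if ioc.strip()}
    return watchlist | normalized


def matching_lines(log_lines: Iterable[str], watchlist: Set[str]) -> List[str]:
    """Return log lines that mention any indicator on the watchlist."""
    return [
        line for line in log_lines
        if any(ioc in line.lower() for ioc in watchlist)
    ]


# Hypothetical indicators identified during the lessons learned review.
watchlist = update_watchlist({"198.51.100.23"}, ["Bad.Example ", ""])
hits = matching_lines(
    ["GET http://bad.example/payload", "GET http://intranet.example/home"],
    watchlist,
)
```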

Developing a Final Report

Every incident that activates the CSIRT should conclude with a formal written report that documents the incident for posterity. This serves several important purposes. First, it creates an institutional memory of the incident that is useful when developing new security controls and training new security team members. Second, it may serve as an important record of the incident if there is legal action that results from the incident. Finally, the act of creating the written report can help identify previously undetected deficiencies in the incident response process that may feed back through the lessons learned process.

Important elements that the CSIRT should cover in a postincident report include the following:

  • Chronology of events for the incident and response efforts
  • Root cause of the incident
  • Location and description of evidence collected during the incident response process
  • Specific actions taken by responders to contain, eradicate, and recover from the incident, including the rationale for those decisions
  • Estimates of the impact of the incident on the organization and its stakeholders
  • Results of postrecovery validation efforts
  • Documentation of issues identified during the lessons learned review

Incident summary reports should be classified in accordance with the organization's classification policy and stored in an appropriately secured manner. The organization should also have a defined retention period for incident reports and destroy old reports when they exceed that period.

Evidence Retention

At the conclusion of an incident, the team should make a formal determination about the disposition of evidence collected during the incident. If the evidence is no longer required, then it should be destroyed in accordance with the organization's data disposal procedures. If the evidence will be preserved for future use, it should be placed in a secure evidence repository with the chain of custody maintained.

The decision to retain evidence depends on several factors, including whether the incident is likely to result in criminal or civil action and the impact of the incident on the organization. This topic should be directly addressed in an organization's incident response procedures.

Summary

After identifying a security incident in progress, CSIRT members should move immediately into the containment, eradication, and recovery phase of incident response. The first priority of this phase is to contain the damage caused by a security incident to lower the impact on the organization. Once an incident is contained, responders should take actions to eradicate the effects of the incident and recover normal operations. Once the immediate response efforts are complete, the CSIRT should move into the postincident phase, conduct a lessons learned session, and create a written report summarizing the incident response process.

Exam Essentials

Explain the purpose of containment activities. After identifying a potential incident in progress, responders should take immediate action to contain the damage. They should select appropriate containment strategies based on the nature of the incident and impact on the organization. Potential containment activities include network segmentation, isolation, and removal of affected systems.

Know the importance of collecting evidence during a response. Much of the evidence of a cybersecurity incident is volatile in nature and may not be available later if not collected during the response. CSIRT members must determine the priority that evidence collection will take during the containment, eradication, and recovery phase and then ensure that they properly handle any collected evidence that can later be used in legal proceedings.

Explain how identifying attackers can be a waste of valuable resources. Most efforts to identify the perpetrators of security incidents are futile, consuming significant resources before winding up at a dead end. The primary focus of incident responders should be on protecting the business interests of the organization. Law enforcement officials have different priorities, and responders should be aware of potentially conflicting objectives.

Explain the purpose of eradication and recovery. After containing the damage, responders should move on to eradication and recovery activities that seek to remove all traces of an incident from the organization's network and restore normal operations as quickly as possible. This should include validation efforts that verify security controls are properly implemented before closing the incident.

Define the purpose of postincident activities. At the conclusion of a cybersecurity incident response effort, CSIRT members should conduct a formal lessons learned session that reviews the entire incident response process and recommends changes to the organization's incident response plan, as needed. Any such changes should be made through the organization's change control process. The team should also complete a formal incident summary report that serves to document the incident for posterity. Other considerations during this process include evidence retention, indicator of compromise (IOC) generation, and ongoing monitoring.

Lab Exercises

Activity 14.1: Incident Containment Options

Label each one of the following figures with the type of incident containment activity pictured.

Schematic illustration of the type of incident containment activity.

________________________________________

Schematic illustration of the type of incident containment activity.

________________________________________

Schematic illustration of the type of incident containment activity.

________________________________________

Activity 14.2: Incident Response Activities

For each of the following incident response activities, assign it to one of the following CompTIA categories:

  • Containment
  • Eradication
  • Validation
  • Postincident Activities

Remember that the categories assigned by CompTIA differ from those used by NIST and other incident handling standards.

  • Patching  ___________________
  • Sanitization  ___________________
  • Lessons learned  ___________________
  • Reimaging  ___________________
  • Secure disposal  ___________________
  • Isolation  ___________________
  • Scanning  ___________________
  • Removal  ___________________
  • Reconstruction  ___________________
  • Permission verification  ___________________
  • User account review  ___________________
  • Segmentation  ___________________

Activity 14.3: Sanitization and Disposal Techniques

Fill in the flowchart with the appropriate dispositions for information being destroyed following a security incident.

Each box should be completed using one of the following three words:

  • Clear
  • Purge
  • Destroy

Flow chart depicts the sanitization and disposal techniques.

Review Questions

  1. Which one of the phases of incident response involves primarily active undertakings designed to limit the damage that an attacker might cause?
    1. Containment, Eradication, and Recovery
    2. Preparation
    3. Postincident Activity
    4. Detection and Analysis
  2. Which one of the following criteria is not normally used when evaluating the appropriateness of a cybersecurity incident containment strategy?
    1. Effectiveness of the strategy
    2. Evidence preservation requirements
    3. Log records generated by the strategy
    4. Cost of the strategy
  3. Alice is responding to a cybersecurity incident and notices a system that she suspects is compromised. She places this system on a quarantine VLAN with limited access to other networked systems. What containment strategy is Alice pursuing?
    1. Eradication
    2. Isolation
    3. Segmentation
    4. Removal
  4. Alice confers with other team members and decides that even allowing limited access to other systems is an unacceptable risk and decides instead to prevent the quarantine VLAN from accessing any other systems by putting firewall rules in place that limit access to other enterprise systems. The attacker can still control the system to allow Alice to continue monitoring the incident. What strategy is she now pursuing?
    1. Eradication
    2. Isolation
    3. Segmentation
    4. Removal
  5. After observing the attacker, Alice decides to remove the Internet connection entirely, leaving the systems running but inaccessible from outside the quarantine VLAN. What strategy is she now pursuing?
    1. Eradication
    2. Isolation
    3. Segmentation
    4. Removal
  6. Which one of the following tools may be used to isolate an attacker so that they may not cause damage to production systems but may still be observed by cybersecurity analysts?
    1. Sandbox
    2. Playpen
    3. IDS
    4. DLP
  7. Tamara is a cybersecurity analyst for a private business that is suffering a security breach. She believes the attackers have compromised a database containing sensitive information. Which one of the following activities should be Tamara's first priority?
    1. Identifying the source of the attack
    2. Eradication
    3. Containment
    4. Recovery
  8. Which one of the following activities does CompTIA classify as part of the recovery validation effort?
    1. Rebuilding systems
    2. Sanitization
    3. Secure disposal
    4. Scanning
  9. Which one of the following pieces of information is most critical to conducting a solid incident recovery effort?
    1. Identity of the attacker
    2. Time of the attack
    3. Root cause of the attack
    4. Attacks on other organizations
  10. Lynda is disposing of a drive containing sensitive information that was collected during the response to a cybersecurity incident. The information is categorized as a high security risk and she wishes to reuse the media during a future incident. What is the appropriate disposition for this information?
    1. Clear
    2. Erase
    3. Purge
    4. Destroy
  11. Which one of the following activities is not normally conducted during the recovery validation phase?
    1. Verify the permissions assigned to each account
    2. Implement new firewall rules
    3. Conduct vulnerability scans
    4. Verify logging is functioning properly
  12. What incident response activity focuses on removing any artifacts of the incident that may remain on the organization's network?
    1. Containment
    2. Recovery
    3. Postincident Activities
    4. Eradication
  13. Which one of the following is not a common use of formal incident reports?
    1. Training new team members
    2. Sharing with other organizations
    3. Developing new security controls
    4. Assisting with legal action
  14. Which one of the following data elements would not normally be included in an evidence log?
    1. Serial number
    2. Record of handling
    3. Storage location
    4. Malware signatures
  15. Sondra determines that an attacker has gained access to a server containing critical business files and wishes to ensure that the attacker cannot delete those files. Which one of the following strategies would meet Sondra's goal?
    1. Isolation
    2. Segmentation
    3. Removal
    4. None of the above
  16. Joe would like to determine the appropriate disposition of a flash drive used to gather highly sensitive evidence during an incident response effort. He does not need to reuse the drive but wants to return it to its owner, an outside contractor. What is the appropriate disposition?
    1. Destroy
    2. Clear
    3. Erase
    4. Purge
  17. Which one of the following is not typically found in a cybersecurity incident report?
    1. Chronology of events
    2. Identity of the attacker
    3. Estimates of impact
    4. Documentation of lessons learned
  18. What NIST publication contains guidance on cybersecurity incident handling?
    1. SP 800-53
    2. SP 800-88
    3. SP 800-18
    4. SP 800-61
  19. Which one of the following is not a purging activity?
    1. Resetting to factory state
    2. Overwriting
    3. Block erase
    4. Cryptographic erase
  20. Ben is responding to a security incident and determines that the attacker is using systems on Ben's network to attack a third party. Which one of the following containment approaches will prevent Ben's systems from being used in this manner?
    1. Removal
    2. Isolation
    3. Detection
    4. Segmentation