Security incidents occur when you least expect them. In a moment, business operations are interrupted, or news of a leak of company information spreads across social networks and goes viral. These are times of great uncertainty, and you need to respond quickly and appropriately.
It is a crucial moment, and the clock is ticking fast; there is no time for improvisation, and the only way to succeed is to have a plan and sufficient resources to deal with the security breach. Any organization must have the infrastructure, tools, and staff with the knowledge and skills to respond to and investigate security breaches.
Several frameworks, such as those published by the National Institute of Standards and Technology (NIST) and the SysAdmin, Audit, Network, and Security (SANS) Institute, stress the importance of developing an incident response capability, with preparation identified as the first step.
You will learn why it is important to develop and implement an incident response capability for different attack scenarios, and how to create a program that supports business continuity and the actions needed to identify, contain, and eradicate threats.
In this chapter, we are going to cover the following topics:
If you haven't done so already, download and install VMware Workstation Player from https://www.vmware.com/products/workstation-player/workstation-player-evaluation.html.
You'll also need to download the following from the book's official GitHub repository https://github.com/PacktPublishing/Incident-Response-with-Threat-Intelligence:
Responding to a cybersecurity incident could be considered a reactive activity since it is done after an offensive action has been detected or identified. So, what does it mean to take a proactive stance? It means that even when you can't avoid incidents, you can have defined procedures for acting in certain circumstances. Also, if you have an infrastructure that supports those activities and the necessary tools to work throughout the life cycle of the incident, you will be better prepared.
Some of the benefits of developing an incident response capability are the following:
Another vital aspect of adopting a proactive posture for incident management is the organization's ability to adapt to the constant evolution of threats.
Inspired by the hierarchy of needs pyramid of the American psychologist Abraham Maslow, Matt Swann from Microsoft developed the Incident Response Hierarchy of Needs: https://github.com/swannman/ircapabilities.
Starting from the base, this model describes how organizations should develop their maturity and technical capabilities to deal with risks and cyber threats. The main idea is to lay solid foundations in each layer in order to build a real capacity to detect and respond to cybersecurity incidents, as shown in the following figure:
Each element from the preceding diagram is described as follows:
You can also divide this structure into three stages:
The pyramid showing the three stages is as follows:
This pyramid is a very useful guide and can be applied to the maturity process of any organization to build a better incident response technical capacity.
According to the consulting firm Deloitte, the basis for the development of an incident response capacity consists of a strategy developed specifically for the company that includes the following:
All of this must be orchestrated under a governance structure, as shown in the following figure:
By integrating technical and organizational capabilities, you can not only respond effectively to cybersecurity incidents but also ensure that the response is aligned with the business's vision and requirements. In the next part, you will learn how to build a comprehensive incident response program.
Technology must be a facilitator of the business. The objective of digital transformation is to rely on technology to achieve the business's goals in the short, medium, and long term, evolving consistently and adapting to the characteristics of the environment.
In the same way, cybersecurity must be a component that, like a bodyguard, helps ensure that technology meets its objective by anticipating and protecting against any risk or threat that could interrupt business processes or the organization's technological infrastructure.
The challenge is to align technology with business processes and these, in turn, with security; this should be a priority for organizations. The best way to achieve it is by identifying critical business assets, operations, and the associated threats through a risk and threat assessment.
Business knowledge is essential to understand the organization's position regarding these threats and its risk appetite. For instance, for some companies, a leak of internal information in a ransomware attack can have more business impact than the encryption of the data itself. A company that is unaware of this will probably not handle the incident the way it would if it had identified the leak as the more serious threat.
In addition to understanding the critical business processes, it is also crucial to understand the priorities of the C-levels. Each area has different concerns, and therefore, their ways of seeing the business and the associated risks are not necessarily the same.
Keeping this in mind will help align the ideas and objectives of all essential business areas, define a unified vision, and make it easier for the different parties to accept it.
Some of the priorities from a business perspective can be the following:
Taking these concerns into account helps optimize the resources that will be allocated and obtain better results.
Another factor to consider is the impact of a changing external environment. An example was the COVID-19 pandemic, which forced many businesses to adopt new technology. Employees had to work remotely, which represented a real cybersecurity challenge.
Because many companies had not planned for this change, threat actors were able to attack them and compromise their infrastructure.
One of the most important points before starting to implement an incident response program is to understand the organization's position with respect to cybersecurity threats and to know its ability to respond to an attack.
The first step is to diagnose the organization's current capacity. Different tools are available for this diagnosis, some commercial and others free.
To learn how we can evaluate the maturity level of an organization, we will use a maturity assessment tool created by the international organization CREST (a non-profit organization established in the UK in 2006 for accreditations and certifications in security).
To start, follow these steps:
Username: investigator
Password: L34rn1ng!
Once you have downloaded the two documents, you can start the assessment of the maturity level of the organization.
For security reasons, macros are disabled by default. Always be careful with files downloaded from the internet; malicious macros are a tactic widely used by threat actors to compromise users' devices.
In this case, the source is reliable, so we are going to allow the execution of the macros.
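Before enabling macros, one quick sanity check is to confirm whether a downloaded workbook actually contains a VBA project. The following is a minimal sketch, assuming the file is a modern Office Open XML workbook (`.xlsx` or `.xlsm`), which is a ZIP archive that stores macros in a part named `vbaProject.bin`. The function name `contains_vba_project` and the in-memory sample are hypothetical, and the check neither covers the legacy binary `.xls` format nor judges whether a macro is malicious.

```python
import io
import zipfile


def contains_vba_project(source) -> bool:
    """Return True if an OOXML file (path or file-like object) embeds a VBA project."""
    if not zipfile.is_zipfile(source):
        # Legacy .xls/.doc files are not ZIP archives; this sketch cannot inspect them.
        return False
    with zipfile.ZipFile(source) as archive:
        return any(
            name.lower().endswith("vbaproject.bin") for name in archive.namelist()
        )


if __name__ == "__main__":
    # Build a tiny in-memory archive that mimics a macro-enabled workbook.
    sample = io.BytesIO()
    with zipfile.ZipFile(sample, "w") as archive:
        archive.writestr("xl/workbook.xml", "<workbook/>")
        archive.writestr("xl/vbaProject.bin", b"placeholder")
    print("Contains macros:", contains_vba_project(sample))
```

A positive result only tells you that macros are present; deciding whether to run them still depends on trusting the source.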
Now, to start working with this template, you will need information about a fictitious organization.
The results of this assessment will show the maturity level of the organization's security incident response capability and help you identify points for improvement and plan an efficient strategy.
Every organization must define the guidelines it will follow to prepare to respond to security incidents, and the starting point is an evaluation of the organization's maturity level and security posture.
Preparing to respond to cybersecurity incidents is an ongoing cycle that must take into account an up-to-date view of risks and threats.
Incident response is not just about the use of tools or procedures; it requires developing a comprehensive incident response program that helps the organization be more efficient in detecting threats and increasing preparedness to respond to incidents and security breaches.
Procedures and guidelines should be well documented. Having the procedures documented step by step helps reduce the number of errors and allows the work to be done more efficiently.
Workflows must be documented for the distinct types of incidents or activities; for example, define the actions to be taken if a ransomware incident materializes or if a lateral movement is detected with exfiltration of information from the network.
Activities should be documented clearly enough that the first responder does not necessarily need technical knowledge of incident response.
As in all aspects related to cybersecurity, it is very important to consider people, processes, and technology in the development of an incident response capacity as shown in the following figure:
As you will see later, you cannot respond to cybersecurity incidents efficiently if any of these three elements is missing. For instance, without technology, the response capacity will be insufficient and limited. Technology alone cannot solve the problem without the involvement of qualified personnel. And even if the organization has the technology and the personnel, the response will lack direction without defined processes.
There are different criteria for the creation of an incident response team, and much will depend on the characteristics of each organization.
According to NIST Special Publication 800-61 (Computer Security Incident Handling Guide), there are different models of incident response teams, and these can be the following:
Additionally, these teams can have several working models, such as the following:
To maintain an adequate level of knowledge and skills among team members, it is important to develop a continuous training program where different levels of learning are considered for participants in different areas.
Incident response processes are particularly important to avoid confusion, minimize damage, and reduce response time.
Incident handling criteria must be defined for the different circumstances and stages of the incident resolution life cycle, for example:
Reporting and record keeping are key procedures that must be in place for proper incident management.
Observe, Orient, Decide, Act (OODA) is a methodology for using the available information and its context to make decisions in incident response. One way to apply it is to gather information from different sources, such as logs, alerts, and threat intelligence, and formulate different hypotheses before taking action, as shown in the following figure:
The elements of the OODA methodology are described as follows:
This methodology is particularly useful to identify the best way to quickly respond to a security breach.
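One iteration of the loop can be sketched in code. This is a minimal illustration rather than a detection engine: the event fields (`source`, `dest_ip`), the indicator set, the confidence values, and the function names are all hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass

# Hypothetical threat-intelligence indicators (documentation-range addresses).
KNOWN_C2_ADDRESSES = {"203.0.113.10", "198.51.100.7"}


@dataclass
class Hypothesis:
    description: str
    confidence: float  # 0.0 (unsupported) to 1.0 (certain)


def observe(raw_events):
    """Observe: gather events from logs and alerts that carry a destination IP."""
    return [event for event in raw_events if "dest_ip" in event]


def orient(events):
    """Orient: add context by comparing observations with threat intelligence."""
    hypotheses = []
    for event in events:
        if event["dest_ip"] in KNOWN_C2_ADDRESSES:
            text = f"{event['source']} is communicating with a known C2 server"
            hypotheses.append(Hypothesis(text, 0.9))
        else:
            text = f"{event['source']} made an unclassified connection"
            hypotheses.append(Hypothesis(text, 0.2))
    return hypotheses


def decide(hypotheses, threshold=0.5):
    """Decide: pick the best-supported hypothesis and choose an action."""
    if not hypotheses:
        return "continue monitoring", None
    best = max(hypotheses, key=lambda h: h.confidence)
    if best.confidence >= threshold:
        return "contain host and start triage", best
    return "continue monitoring", best


def act(action, hypothesis):
    """Act: in practice this would open a ticket or trigger a documented workflow."""
    reason = hypothesis.description if hypothesis else "no observations"
    return f"{action} ({reason})"


if __name__ == "__main__":
    events = [
        {"source": "fileserver01", "dest_ip": "203.0.113.10"},
        {"source": "workstation17", "dest_ip": "192.0.2.44"},
    ]
    print(act(*decide(orient(observe(events)))))
```

Because the loop is iterative, the result of `act` would normally feed new observations into the next pass.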
The technology behind incident response is based on the systems that security analysts use to carry out their investigation, response, and management tasks. Incident response requires infrastructure, hardware, and software that support the activities needed in the event of a security breach.
Keep in mind that the organization's normal infrastructure could itself be affected, in which case alternative mechanisms are needed to follow the procedures defined in the incident response plan.
These tools should include the following, for example:
It is also particularly important that the organization's infrastructure is configured to facilitate the collection of valuable information. Examples include a centralized log collection system, or tools that allow the rapid deployment of data collection agents to devices on the network and support the execution of scripts as well as YARA and Sigma rules, which we will see in detail in Chapter 13, Creating and Deploying Detection Rules.
A cybersecurity incident could impact several aspects of the organization; in many cases, one of the most significant is the interruption in the continuity of business operations.
When this happens, a common mistake in organizations is prioritizing the continuity of business operations over the incident response procedures. This does not have to be the case; incident response, business continuity, and disaster recovery are processes that can and should be aligned. One of the goals of an incident response plan is to help ensure that business operations can continue through to the recovery stage.
But let's review from a simplified point of view the differences between these plans and how they are integrated.
The incident response plan defines the actions to be taken in a security breach and how to identify, analyze, contain, and eradicate threats. The lessons learned from that process are then used to improve the organization's security capabilities.
There are several aspects to consider when creating an incident response plan:
Now that we understand the aspects to consider when creating an incident response plan, we will look at the key roles in an incident response team.
According to NIST, the incident response team is a key component of responding efficiently to a cybersecurity incident. The team performs the following key roles:
Additionally, you need to define the best model to distribute your teams according to the characteristics of your organization, as you will see next.
Based on the requirements of the organization, teams can be organized into three different models:
Some considerations when defining the creation of these teams are the following:
Today, every business depends on its technological infrastructure to operate. For that reason, and regardless of their size, organizations need to develop an incident response capacity and a plan to deal with cyberattacks.
Business continuity (BC) is an integral part of good cybersecurity practices and corporate governance. It has gained relevance due to the increase in risks and threats that organizations face in their business and technological environments.
A Business Continuity Plan (BCP) is designed to ensure that the business continues to operate after an incident; this does not mean that the company should continue working at 100%, but that at least those processes and assets that are indispensable and critical for the organization must be kept in operation.
According to the European Union Agency for Cybersecurity (ENISA), to implement a successful Business Continuity Management (BCM) plan, you need to clearly identify the critical business processes and define the project's scope, objectives, and deliverables so that they align with the business objectives.
Responsibility for management falls on the Business Continuity Management team, and the number of members and the specific roles will depend on the size and type of the organization.
Additionally, to govern the business continuity process, a Business Continuity Steering Committee (BCSC) must be set up. Unlike the Business Continuity Management team, the BCSC acts when required to ensure business continuity and to make sure that plans are kept updated, reviewed, and tested.
NIST SP 800-34 (Contingency Planning Guide for Federal Information Systems) recommends that the BCSC be overseen by a senior manager such as the Chief Information Officer (CIO).
A key part of business continuity planning is the Business Impact Analysis (BIA). This analysis helps to measure the impact and losses caused by a cybersecurity incident.
Along with the BIA, the risk assessment information should be considered to determine the likelihood of a disruption in business processes. In this way, it will be possible to define the best strategy to ensure business continuity.
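As a worked illustration, the impact estimates from the BIA and the likelihood from the risk assessment can be combined into an expected loss per process to decide where continuity efforts should go first. The sketch below assumes a simple likelihood-times-impact model, which is one common simplification and not a method prescribed here; the process names, hourly losses, outage durations, and probabilities are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ProcessAssessment:
    name: str
    hourly_loss: float            # BIA: estimated loss per hour of disruption
    expected_outage_hours: float  # BIA: estimated duration of a disruption
    annual_likelihood: float      # Risk assessment: probability of disruption per year

    @property
    def impact(self) -> float:
        """Estimated loss of a single disruption."""
        return self.hourly_loss * self.expected_outage_hours

    @property
    def expected_annual_loss(self) -> float:
        """Impact weighted by how likely the disruption is in a year."""
        return self.impact * self.annual_likelihood


def rank_by_expected_loss(assessments):
    """Order processes from highest to lowest expected annual loss."""
    return sorted(assessments, key=lambda a: a.expected_annual_loss, reverse=True)


if __name__ == "__main__":
    # Hypothetical figures for a fictitious organization.
    processes = [
        ProcessAssessment("internal wiki", 100.0, 48.0, 0.5),
        ProcessAssessment("online payments", 5000.0, 8.0, 0.25),
        ProcessAssessment("payroll", 1000.0, 24.0, 0.125),
    ]
    for item in rank_by_expected_loss(processes):
        print(f"{item.name}: impact={item.impact:.0f}, "
              f"expected annual loss={item.expected_annual_loss:.0f}")
```

In this toy data set, the process with the most frequent disruption is not the one with the highest expected loss, which is why both inputs are needed.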
The BIA is also an essential part of disaster recovery (DR) and incident management (IM) plans.
A disaster recovery plan helps to ensure that the business can return to the normal operating state it was in before the security incident occurred. A recovery plan must include the development of processes and procedures to ensure the restoration of the systems and assets affected in an incident.
Depending on the incident, the outcome of the threat hunting activities and the investigation will help ensure that no residual risks or threats remain that could prevent a return to normalcy.
The NIST Special Publication 800-184, "Guide for Cybersecurity Event Recovery", states that recovering from a cybersecurity incident could require rebuilding a system and restoring backup information, involving people, processes, and technology.
According to the NIST Cybersecurity Framework (CSF), recovery is a critical function for a complete defense. The recovery process involves two phases focused on tactical and strategic results.
The tactical recovery phase involves executing a recovery plan defined proactively before an incident occurs.
The second phase is more strategic and focuses on mitigating the incident's impact and reducing the likelihood of future incidents.
Planning is a critical component of recovering from a cybersecurity incident. Recovery planning must include an in-depth analysis of the essential business areas and the dependencies between processes and systems. It is also crucial to explore different scenarios and how threats could impact the business.
Threat modeling is an integral part of scenario exploration because, rather than considering risks superficially, it helps to identify the capabilities and tools of adversaries in a more detailed way.
It is also essential to consider the legal, regulatory, and operational requirements to calculate the business impact and reduce recovery time.
Depending on the nature of the incident, the decision to initiate a recovery process may involve the recovery personnel, the incident response team, the CISO, or the business owners.
As you can see, there are points of convergence between incident response, business continuity, and disaster recovery. Therefore, these processes must be aligned to develop an effective response to cyberattacks, as shown in the following figure:
It is essential to coordinate actions between the different areas so that each plan is implemented at the right moment of the incident, balancing the priorities of business continuity and the Digital Forensics and Incident Response (DFIR) investigation.
As you can see, the common goal of the Incident Response (IR), Business Continuity (BC), and Disaster Recovery (DR) plans is to reduce the impact of cybersecurity incidents on the business.
For example, suppose that your Security Operations Center (SOC) detects a connection from a file server on the network to an IP address identified as a command and control (C2) server related to a malicious campaign. This incident triggers the first response procedures.
As part of the IR procedures, you may need to initiate triage procedures to collect information from the compromised server and perform memory and hard drive acquisition. Therefore, the operation of the server must be interrupted. If you do not have a secondary server, this probably also means interrupting the functions associated with that server.
This scenario should be considered when developing the business continuity plan and defining the Recovery Time Objective (RTO); otherwise, there will be a conflict between following incident response protocols and fulfilling the business continuity objectives.
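This check can be made explicit during planning. The following is a minimal sketch, assuming that the downtime caused by response activities is the sum of sequential steps and that a secondary server removes the interruption entirely; the step names, durations, and the RTO value are hypothetical.

```python
from datetime import timedelta


def response_downtime(steps, has_secondary_server=False):
    """Total service interruption caused by IR steps that take the server offline."""
    if has_secondary_server:
        # Assumption: a standby server takes over, so the service is not interrupted.
        return timedelta(0)
    return sum((duration for _, duration in steps), timedelta(0))


def fits_rto(steps, rto, has_secondary_server=False):
    """Return (fits, downtime): whether the IR downtime stays within the RTO."""
    downtime = response_downtime(steps, has_secondary_server)
    return downtime <= rto, downtime


if __name__ == "__main__":
    # Hypothetical estimates for the compromised file server.
    ir_steps = [
        ("triage collection", timedelta(minutes=45)),
        ("memory acquisition", timedelta(minutes=30)),
        ("disk acquisition", timedelta(hours=4)),
    ]
    rto = timedelta(hours=4)
    for standby in (False, True):
        ok, downtime = fits_rto(ir_steps, rto, has_secondary_server=standby)
        print(f"secondary server={standby}: downtime={downtime}, within RTO={ok}")
```

If the estimated downtime exceeds the RTO, either the RTO or the acquisition strategy needs to be revisited before an incident forces the choice.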
In this chapter, you learned the importance of adopting a proactive posture for incident response and how these strategies can help you deal with different security breaches.
You learned about the correlation between people, processes, and technology to develop successful incident response programs and the importance of aligning business requirements with incident response procedures.
Finally, you learned about the relationship between incident response, business continuity, and disaster recovery plans and the importance of integrating them to respond more efficiently to cyberattacks.
In the next chapter, you will create an incident response policy, an incident response plan, and playbooks to respond to different categories of incidents.