Risk Analysis

I have determined that our greatest risk is this paperclip.

Response: Nice work.

Risk analysis, which is really a tool for risk management, is a method of identifying vulnerabilities and threats and assessing the possible damage to determine where to implement security safeguards. Risk analysis is used to ensure that security is cost-effective, relevant, timely, and responsive to threats. Security can be quite complex, even for well-versed security professionals, and it is easy to apply too much security, not enough security, or the wrong security components, and spend too much money in the process without attaining the necessary objectives. Risk analysis helps companies prioritize their risks and shows management the amount of money that should be applied to protecting against those risks in a sensible manner.

A risk analysis has four main goals:

  • Identify assets and their values

  • Identify vulnerabilities and threats

  • Quantify the probability and business impact of these potential threats

  • Provide an economic balance between the impact of the threat and the cost of the countermeasure

Risk analysis provides a cost/benefit comparison, which compares the annualized cost of safeguards to the potential cost of loss. A safeguard, in most cases, should not be implemented unless the annualized cost of loss exceeds the annualized cost of the safeguard itself. This means that if a facility is worth $100,000, it does not make sense to spend $150,000 trying to protect it.
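The comparison described above can be sketched in a few lines. This is a minimal illustration only; the function name and the second set of figures are assumptions made for the example, and the first call treats the $100,000 facility value as the most that could be lost in a year.

```python
def safeguard_is_justified(annualized_loss: float, annualized_safeguard_cost: float) -> bool:
    """Return True when the expected yearly loss exceeds the yearly safeguard cost."""
    return annualized_loss > annualized_safeguard_cost


# The facility example from the text: $150,000 of protection for a $100,000 asset.
print(safeguard_is_justified(100_000, 150_000))  # False

# Hypothetical alternative: a $40,000 safeguard for the same exposure.
print(safeguard_is_justified(100_000, 40_000))   # True
```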

It is important to figure out what you are supposed to be doing before you dig right in and start working. Anyone who has worked on a project without a properly defined scope can attest to this statement. Before an assessment and analysis is started, the team must carry out project sizing to understand what assets and threats should be evaluated. Most assessments are focused on physical security, technology security, or personnel security. Trying to assess all of them at the same time can be quite an undertaking.

One of the team’s tasks is to create a report that details the asset valuations. Senior management should review and accept the lists, and make them the scope of the IRM project. If management determines at this early stage that some assets are not important, the risk assessment team should not spend additional time or resources evaluating those assets. During discussions with management, everyone involved must have a firm understanding of the value of the security AIC triad—availability, integrity, and confidentiality—and how it directly relates to business needs.

Management should outline the scope, which most likely will be dictated by regulations and funds. Many projects have run out of funds, and consequently stopped, because proper project sizing was not conducted at the onset of the project. Don’t let this happen to you.

A risk analysis helps integrate the security program objectives with the company’s business objectives and requirements. The more the business and security objectives are in alignment, the more successful the two will be. The analysis also helps the company draft a proper budget for a security program and its constituent security components. Once a company knows how much its assets are worth and the possible threats they are exposed to, it can make intelligent decisions about how much money to spend protecting those assets.

A risk analysis must be supported and directed by senior management if it is to be successful. Management must define the purpose and scope of the analysis, appoint a team to carry out the assessment, and allocate the necessary time and funds to conduct the analysis. It is essential for senior management to review the outcome of the risk assessment and analysis and act on its findings. After all, what good is it to go through all the trouble of a risk assessment and not react to its findings? Unfortunately, this does happen all too often.

The Risk Analysis Team

Each organization has different departments, and each department has its own functionality, resources, tasks, and quirks. For the most effective risk analysis, an organization must build a risk analysis team that includes individuals from many or all departments to ensure that all of the threats are identified and addressed. The team members may be part of management, application programmers, IT staff, systems integrators, and operational managers—indeed, any key personnel from key areas of the organization. This mix is necessary because if the risk analysis team comprises only individuals from the IT department, it may not understand, for example, the types of threats the accounting department faces with data integrity issues, or how the company as a whole would be affected if the accounting department’s data files were wiped out by an accidental or intentional act. Or, as another example, the IT staff may not understand all the risks the employees in the warehouse would face if a natural disaster were to hit, or what it would mean to their productivity and how it would affect the organization overall. If the risk analysis team is unable to include members from various departments, it should, at the very least, make sure to interview people in each department so it fully understands and can quantify all threats.

The risk analysis team must also include people who understand the processes that are part of their individual departments, meaning individuals who are at the right levels of each department. This is a difficult task, since managers tend to delegate any sort of risk analysis task to lower levels within the department. However, the people who work at these lower levels may not have adequate knowledge and understanding of the processes that the risk analysis team may need to deal with.

When looking at risk, it’s good to keep several questions in mind. Raising these questions helps ensure that the risk analysis team and senior management know what is important. Team members must ask the following: What event could occur (threat event)? What could be the potential impact (risk)? How often could it happen (frequency)? What level of confidence do we have in the answers to the first three questions (certainty)? A lot of this information is gathered through internal surveys, interviews, or workshops.
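One way a team might record the answers to these four questions is as a small structured record per threat. The sketch below is only an illustration: the class name, field names, the 0.0 to 1.0 certainty scale, and the 0.5 threshold are all assumptions, not part of any standard.

```python
from dataclasses import dataclass


@dataclass
class ThreatAssessment:
    threat_event: str          # What event could occur?
    potential_impact: str      # What could be the potential impact?
    frequency_per_year: float  # How often could it happen?
    certainty: float           # Confidence in the three answers above, 0.0 to 1.0

    def needs_more_evidence(self, threshold: float = 0.5) -> bool:
        """Flag entries whose answers the team is not yet confident in."""
        return self.certainty < threshold


entry = ThreatAssessment(
    threat_event="Accounting data files wiped out",
    potential_impact="Loss of data integrity in financial records",
    frequency_per_year=0.2,  # hypothetical estimate
    certainty=0.4,           # hypothetical estimate
)
print(entry.needs_more_evidence())  # True
```

Entries flagged this way are candidates for further surveys, interviews, or workshops.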

Viewing threats with these questions in mind helps the team focus on the tasks at hand and assists in making the decisions more accurate and relevant.

Risk Ownership

One of the more important questions that face people working within an organization is who owns the risk? The answer really isn’t straightforward because it depends upon the situation and what kind of risk is being discussed. Senior management owns the risk present during the operation of the organization, but there may be times when senior management also relies upon data custodians or business units to conduct work and it is during this time that these other elements of the organization also shoulder some of the responsibility of risk ownership. Granted, it always ultimately rests on senior management, but they also must be able to trust that the work they have delegated is being handled in a manner that understands, accepts the existence of, and works to minimize the risks the organization faces in the course of its regular operations.

The Value of Information and Assets

If information does not have any value, then who cares about protecting it?

The value placed on information is relative to the parties involved, what work was required to develop it, how much it costs to maintain, what damage would result if it were lost or destroyed, what enemies would pay for it, and what liability penalties could be incurred. If a company does not know the value of the information and the other assets it is trying to protect, it does not know how much money and time it should spend on protecting them. If you were in charge of making sure Russia does not know the encryption algorithms used when transmitting information to and from U.S. spy satellites, you would use more extreme (and expensive) security measures than you would use to protect your peanut butter and banana sandwich recipe from your next-door neighbor. The value of the information supports security measure decisions.

The previous examples refer to assessing the value of information and protecting it, but this logic applies toward an organization’s facilities, systems, and resources. The value of the company’s facilities must be assessed, along with all printers, workstations, servers, peripheral devices, supplies, and employees. You do not know how much is in danger of being lost if you don’t know what you have and what it is worth in the first place.

Costs That Make Up the Value

An asset can have both quantitative and qualitative measurements assigned to it, but these measurements need to be derived. The actual value of an asset is determined by the cost it takes to acquire, develop, and maintain it. The value is also determined by the importance the asset has to its owners, authorized users, and unauthorized users. Some information is important enough to a company to go through the steps of making it a trade secret.

The value of an asset should reflect all identifiable costs that would arise if there were an actual impairment of the asset. If a server cost $4000 to purchase, this value should not be input as the value of the asset in a risk assessment. Rather, the cost of replacing or repairing it, the loss of productivity, and the value of any data that may be corrupted or lost must be accounted for to properly capture the amount the company would lose if the server were to fail for one reason or another.
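As a rough illustration of the server example, the value entered into the assessment can be built up from its component costs. Only the $4,000 purchase price comes from the text; the other figures and the function name are hypothetical.

```python
def asset_value(replacement_cost: int, productivity_loss: int, data_value: int) -> int:
    """Sum the identifiable costs that would arise if the asset were impaired."""
    return replacement_cost + productivity_loss + data_value


server_value = asset_value(
    replacement_cost=4_000,    # purchase price from the text
    productivity_loss=12_000,  # hypothetical estimate
    data_value=30_000,         # hypothetical estimate
)
print(server_value)  # 46000
```

The point of the sketch is that the purchase price is only one term in the total.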

The following issues should be considered when assigning values to assets:

  • Cost to acquire or develop the asset

  • Cost to maintain and protect the asset

  • Value of the asset to owners and users

  • Value of the asset to adversaries

  • Value of intellectual property that went into developing the information

  • Price others are willing to pay for the asset

  • Cost to replace the asset if lost

  • Operational and production activities affected if the asset is unavailable

  • Liability issues if the asset is compromised

  • Usefulness and role of the asset in the organization

Understanding the value of an asset is the first step to understanding what security mechanisms should be put in place and what funds should go toward protecting it. A very important question is how much it could cost the company to not protect the asset.

Determining the value of assets may be useful to a company for a variety of reasons, including the following:

  • To perform effective cost/benefit analyses

  • To select specific countermeasures and safeguards

  • To determine the level of insurance coverage to purchase

  • To understand what exactly is at risk

  • To conform to due care and comply with legal and regulatory requirements

Assets may be tangible (computers, facilities, supplies) or intangible (reputation, data, intellectual property). It is usually harder to quantify the values of intangible assets, which may change over time. How do you put a monetary value on a company’s reputation? Sometimes that’s harder to figure out than a Rubik’s Cube.

Identifying Threats

Okay, what should we be afraid of?

Earlier, it was stated that the definition of a risk is the probability of a threat agent exploiting a vulnerability to cause harm to a computer, network, or company and the resulting business impact. Many types of threat agents can take advantage of several types of vulnerabilities, resulting in a variety of specific threats, as outlined in Table 3-2, which represents only a sampling of the risks many organizations should address in their risk management programs.

Table 3-2. Relationship of Threats and Vulnerabilities
Threat Agent | Can Exploit This Vulnerability | Resulting in This Threat
Virus | Lack of antivirus software | Virus infection
Hacker | Powerful services running on a server | Unauthorized access to confidential information
Users | Misconfigured parameter in the operating system | System malfunction
Fire | Lack of fire extinguishers | Facility and computer damage, and possibly loss of life
Employee | Lack of training or standards enforcement; Lack of auditing | Sharing mission-critical information; Altering data inputs and outputs from data processing applications
Contractor | Lax access control mechanisms | Stealing trade secrets
Attacker | Poorly written application; Lack of stringent firewall settings | Conducting a buffer overflow; Conducting a Denial-of-Service attack
Intruder | Lack of security guard | Breaking windows and stealing computers and devices

Other types of threats can arise in a computerized environment that are much harder to identify than those listed in Table 3-2. These other threats have to do with application and user errors. If an application uses several complex equations to produce results, the threat can be difficult to discover and isolate if these equations are incorrect or if the application is using inputted data incorrectly. This can result in illogical processing and cascading errors as invalid results are passed on to another process. These types of problems can lie within applications’ code and are very hard to identify.

User errors, intentional or accidental, are easier to identify by monitoring and auditing user activities. Audits and reviews must be conducted to discover if employees are inputting values incorrectly into programs, misusing technology, or modifying data in an inappropriate manner.

Once the vulnerabilities and associated threats are identified, the ramifications of these vulnerabilities being exploited must be investigated. Risks have loss potential, meaning what the company would lose if a threat agent were actually to exploit a vulnerability. The loss may be corrupted data, destruction of systems and/or the facility, unauthorized disclosure of confidential information, a reduction in employee productivity, and so on. When performing a risk analysis, the team also must look at delayed loss when assessing the damages that can occur. Delayed loss has negative effects on a company after a vulnerability is initially exploited. The time period can be anywhere from 15 minutes to years after the exploitation. Delayed loss may include reduced productivity over a period of time, damage to the company’s reputation, reduced income to the company, accrued late penalties, extra expense to get the environment back to proper working conditions, the delayed collection of funds from customers, and so forth.

For example, if a company’s web servers are attacked and taken offline, the immediate damage could be data corruption, the man-hours necessary to place the servers back online, and the replacement of any code or components required. The company could lose revenue if it usually accepts orders and payments via its web site. If it takes a full day to get the web servers fixed and back online, the company could lose a lot more sales and profits. If it takes a full week to get the web servers fixed and back online, the company could lose enough sales and profits to not be able to pay other bills and expenses. This would be a delayed loss. If the company’s customers lose confidence in it because of this activity, it could lose business for months or years. This is a more extreme case of delayed loss.
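The way the loss grows with the length of the outage can be sketched with a simple model. Everything here is a simplifying assumption: the function name, a flat daily sales figure, and a one-time recovery cost. Longer-term effects such as lost customer confidence are deliberately left out.

```python
def outage_loss(days_offline: int, recovery_cost: int, daily_sales: int) -> int:
    """One-time recovery cost plus sales lost for every day the site is offline."""
    return recovery_cost + days_offline * daily_sales


# Hypothetical figures: $10,000 to restore the servers, $25,000 in sales per day.
for days in (1, 7):
    print(days, outage_loss(days, recovery_cost=10_000, daily_sales=25_000))
# 1 35000
# 7 185000
```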

These types of issues make the process of properly quantifying losses that specific threats could cause more complex, but they must be taken into consideration to ensure reality is represented in this type of analysis.

Methodologies for Risk Assessment

Risk assessment has several different methodologies. Let’s take a look at a couple of them.

NIST SP 800-30 and SP 800-66 are methodologies that can be used by the general public. SP 800-30 is a general guide to conducting risk assessments that was written for federal information systems, while SP 800-66 was written to help healthcare organizations implement the HIPAA Security Rule. Although SP 800-66 was designed for organizations covered by HIPAA, it can be readily adopted by other regulated industries, which makes it an example of a methodology that was intended for one regulated industry but can be used by another.

A second risk assessment methodology is the Facilitated Risk Analysis Process (FRAP). It is a qualitative risk assessment process designed so that different aspects and variations of the methodology can be tested. Its intent is to give an organization a means of deciding what course of action to take in specific circumstances. Through a prescreening process, users can determine which areas of the organization really need risk analysis. FRAP is designed so that, according to its proponents, anyone with good facilitation skills can run it successfully.

Another methodology, OCTAVE, was created by Carnegie Mellon University’s Software Engineering Institute. It is intended for situations where people inside the company manage and direct the information security risk evaluation themselves. This puts the people who work inside the organization in the position of deciding the best approach for evaluating its security. The approach relies on the idea that the people working in these environments best understand what is needed and what kinds of risks they face.

CRAMM is yet another kind of methodology. The acronym stands for CCTA Risk Analysis and Management Method. Though implemented in a manner similar to other methodologies we’ve discussed, it is divided into three segments: countermeasure selection, threat and vulnerability analysis, and valuation and identification of assets. This is intended to deal with the technical aspects of an organization as well as the nontechnical portions.

Spanning Tree Analysis is a methodology that develops a tree of all the potential threats and faults that can disrupt a system. Each of the branches is a general topic or category, and as the risk analysis is conducted, the branches that do not apply can be removed (or “pruned” if you care to stay with the tree motif).


Failure and Fault Analysis

Failure Modes and Effects Analysis (FMEA) is a structured method for determining functions, identifying functional failures, and assessing the causes of failure and their effects. Applying this process to a chronic failure helps determine exactly where the failure is most likely to occur. This is very helpful in pinpointing where a vulnerability exists and in determining its scope, meaning the secondary ramifications of its exploitation. This in turn makes it easier to apply a corrective fix to the vulnerability and allows for a much more effective application of resources to the issue. Think of it as being able to look into the future and locate areas that have the potential for failure, or find vulnerabilities, and then apply corrective measures before they become actual liabilities.

Following a specific order of steps produces the best results from an FMEA:

1. Start with a block diagram of a system or control.

2. Consider what happens if each block of the diagram fails.

3. Draw up a table in which failures are paired with their effects and an evaluation of the effects.

4. Correct the design of the system and adjust the table until the system is not known to have unacceptable problems.

5. Have several engineers review the failure modes and effects analysis.

Table 3-3 is an example of how an FMEA can be carried out and documented. Although most companies will not have the resources to do this level of detailed work for each and every system and control, it should be carried out on critical functions and systems that can drastically affect the company.

Table 3-3. How an FMEA Can Be Carried Out and Documented
Prepared by:
Approved by:
Date:
Revision:
Item Identification | Function | Failure Mode | Failure Cause | Failure Effect on Component or Functional Assembly | Failure Effect on Next Higher Assembly | Failure Effect on System | Failure Detection Method
IPS application content filter | Inline perimeter protection | Fails to close | Traffic overload | Single point of failure; Denial of service | IPS blocks ingress traffic stream | IPS is brought down | Health check status sent to console and e-mail to security administrator
Central antivirus signature update engine | Push updated signatures to all servers and workstations | Fails to provide adequate, timely protection against malware | Central server goes down | Individual node’s antivirus software is not updated | Network is infected with malware | Central server can be infected and/or infect other systems | Heartbeat status check sent to central console and page network administrator
Fire suppression water pipes | Suppress fire in building 1 in 5 zones | Fails to closed | Water in pipes freeze | None | Building 1 has no suppression agent available | Fire suppression system pipes break | Suppression sensors tied directly into fire system central console
Etc.
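A worksheet like Table 3-3 can also be kept as structured records so that the unresolved items are easy to list. The following is a minimal sketch: the class name, the reduced set of columns, and the acceptable ratings are assumptions made for illustration, while the row text is taken from the table.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    item: str
    failure_mode: str
    failure_cause: str
    system_effect: str
    detection_method: str
    acceptable: bool  # assumption: the reviewers' evaluation of the effect


worksheet = [
    FmeaRow(
        item="IPS application content filter",
        failure_mode="Fails to close",
        failure_cause="Traffic overload",
        system_effect="IPS is brought down",
        detection_method="Health check status sent to console",
        acceptable=False,  # hypothetical rating
    ),
    FmeaRow(
        item="Central antivirus signature update engine",
        failure_mode="Fails to provide adequate, timely protection against malware",
        failure_cause="Central server goes down",
        system_effect="Central server can be infected and/or infect other systems",
        detection_method="Heartbeat status check sent to central console",
        acceptable=True,  # hypothetical rating
    ),
]


def unacceptable_items(rows):
    """Items whose design still needs correcting (step 4 of the procedure)."""
    return [row.item for row in rows if not row.acceptable]


print(unacceptable_items(worksheet))  # ['IPS application content filter']
```

The design is corrected and the worksheet updated until this list is empty.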

Note

Compliance auditors review the documentation of processes, controls, testing activities, and results. This type of documentation (as long as it is accurate) will illustrate to the auditors how well your organization knows its systems and how you plan to address failures that may take place.


It is important to look at a control or system from the micro to the macro level to fully understand where a vulnerability or potential fault resides and the full ramifications of its exploitation. Each computer system is potentially made up of many different time bombs at different layers of its makeup. At the component level, a buffer overflow or dangerous ActiveX control could cause the system to be controlled by an attacker after exploitation. At the program level, an application may not be carrying out proper authorization steps or may not protect its cryptographic keys properly. At the systemwide level, the kernel of an operating system may be flawed, allowing root access to be easily accomplished. Scary stuff can arise at each level, which is why such a detailed approach is necessary.

FMEA was first developed for systems engineering. Its purpose is to examine the potential failures in products and the processes involved with them. This approach proved to be successful and has more recently been adapted for use in evaluating risk management priorities and mitigating known threat-vulnerabilities.

FMEA is used in assurance risk management because the level of detail, the number of variables, and the complexity continue to rise as corporations understand risk at more granular levels. This methodical way of identifying potential pitfalls is coming into play more as the need for risk awareness, down to the tactical and operational levels, continues to expand.

While FMEA is most useful as a survey method in order to identify major failure modes in a given system, the method is not as useful in discovering complex failure modes that may be involved in multiple systems or subsystems. A fault tree analysis usually proves to be a more useful approach to identifying failures that can take place within more complex environments and systems.

Fault tree analysis follows this general process. First, an undesired effect is taken as the root or top event of a tree of logic. Then, each situation that has the potential to cause that effect is added to the tree as a series of logic expressions. Fault trees are then labeled with actual numbers pertaining to failure probabilities. This is typically done by using computer programs that can calculate the failure probabilities from a fault tree.
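The calculation such programs perform can be sketched for a very small tree. This example assumes the basic events are independent, which real analyses cannot always assume, and all probabilities and event names are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must fail: multiply the probabilities (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Any single input failing is enough: 1 minus the chance that none fail."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical tree: the service is lost if power fails OR both redundant servers fail.
power_failure = 0.01
server_a_failure = 0.1
server_b_failure = 0.1

both_servers_fail = and_gate([server_a_failure, server_b_failure])
top_event = or_gate([power_failure, both_servers_fail])
print(round(top_event, 4))  # 0.0199
```

The top event is the root of the tree, and each gate corresponds to one of the logic expressions added beneath it.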

Figure 3-7 shows a simplistic fault tree and the different logic symbols used to represent what must take place for a specific fault event to occur.

Figure 3-7. Fault tree and logic components


When setting up the tree, you must accurately list all the threats or faults that can occur with a system. The branches of the tree can be divided into general categories such as physical threats, network threats, software threats, Internet threats, and component failure threats. Then, once all possible general categories are in place, you can prune the branches that won’t apply to the system in question. For example, if a system is not connected to the Internet by any means, remove that general branch from the tree.
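Pruning can be illustrated with the tree held as a simple mapping from category to threats. The category names come from the text; the individual threats listed under each branch and the function name are hypothetical.

```python
threat_tree = {
    "physical threats": ["fire", "theft of equipment"],
    "network threats": ["denial of service"],
    "software threats": ["insufficient error handling"],
    "Internet threats": ["web server attack"],
    "component failure threats": ["disk failure"],
}


def prune(tree, not_applicable):
    """Return a copy of the tree without the branches that do not apply."""
    return {branch: leaves for branch, leaves in tree.items() if branch not in not_applicable}


# A system with no Internet connection: drop that general branch.
isolated_system_tree = prune(threat_tree, {"Internet threats"})
print(list(isolated_system_tree))
# ['physical threats', 'network threats', 'software threats', 'component failure threats']
```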

Some of the most common software failure events that can be explored through a fault tree analysis are the following:

  • False alarms

  • Insufficient error handling

  • Sequencing or order

  • Timing outputs are incorrect

  • Outputs are valid but not expected

Of course, because of the complexity of software and heterogeneous environments, this is a very small list.

Note

Six Sigma is a process improvement methodology. It is the “new and improved” Total Quality Management (TQM) that hit the business sector in the 1980s. Its goal is to improve process quality by using statistical methods of measuring operation efficiency and reducing variation, defects, and waste. Six Sigma is being used in the assurance industry in some instances to measure the success factors of different controls and procedures.


So, up to now, we have secured management’s support of the risk analysis, constructed our team so it represents different departments in the company, placed a value on each of the company’s assets, and identified all the possible threats that could affect the assets. We have also taken into consideration all potential and delayed losses the company may endure per asset per threat. We have carried out a failure mode analysis and/or a fault tree analysis to understand the underlying causes of the identified threats. The next step is to use qualitative or quantitative methods to calculate the actual risk the company faces.

Quantitative Risk Analysis

The two types of approaches to risk analysis are quantitative and qualitative. Quantitative risk analysis attempts to assign real and meaningful numbers to all elements of the risk analysis process. These elements may include safeguard costs, asset value, business impact, threat frequency, safeguard effectiveness, exploit probabilities, and so on. When all of these are quantified, the process is said to be quantitative. Quantitative risk analysis also provides concrete probability percentages when determining the likelihood of threats. Each element within the analysis (asset value, threat frequency, severity of vulnerability, impact damage, safeguard costs, safeguard effectiveness, uncertainty, and probability items) is quantified and entered into equations to determine total and residual risks.

Purely quantitative risk analysis is not possible because the method attempts to quantify qualitative items, and there are always uncertainties in quantitative values. How do you know how often a vulnerability will be exploited? How do you know the exact monetary business impact that would arise?

Quantitative and qualitative approaches have their own pros and cons, and each applies more appropriately to some situations than others. Company management and the risk analysis team, and the tools they decide to use, will determine which approach is best.

Note

Quantitative analysis uses risk calculations that attempt to predict the level of monetary losses and the percentage of chance for each type of threat. Qualitative analysis does not use calculations. Instead, it is more opinion- and scenario-based.


Automated Risk Analysis Methods

Collecting all the necessary data that needs to be plugged into risk analysis equations and properly interpreting the results can be overwhelming if done manually. Several automated risk analysis tools on the market can make this task much less painful and, hopefully, more accurate. The gathered data can be reused, greatly reducing the time required to perform subsequent analyses. The risk analysis team can also print out reports and comprehensive graphs to be presented to the management.

Note

Vulnerability assessment and risk analysis tools are available in freeware and commercial versions. Obtaining serious results often requires taking a serious approach to finding the tools that best serve the accuracy of the project.


The objective of these tools is to reduce the manual effort of these tasks, perform calculations quickly, estimate future expected losses, and determine the effectiveness and benefits of the security countermeasures chosen. Most automated risk analysis products port information into a database and run several types of scenarios with different parameters to give a panoramic view of what the outcome will be if different threats come to bear. For example, after such a tool has all the necessary information inputted, it can be rerun several times with different parameters to compute the potential outcome if a large fire were to take place; the potential losses if a virus were to damage 40 percent of the data on the main file server; how much the company would lose if an attacker were to steal all the customer credit card information held in three databases; and so on. Running through the different risk possibilities gives a company a more detailed understanding of which risks are more critical than others, and thus which ones to address first. Figure 3-8 shows a simple output of this process.
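Rerunning a scenario with different parameters amounts to a parameter sweep. The sketch below is not how any particular product works; the function name, the $200,000 data value, and the percentages other than 40 are assumptions chosen for illustration.

```python
def virus_scenario_loss(data_value: int, percent_damaged: int) -> int:
    """Loss if a virus damages the given percentage of the data on a file server."""
    return data_value * percent_damaged // 100


# Hypothetical: the data on the main file server is valued at $200,000.
for percent in (10, 40, 80):
    print(percent, virus_scenario_loss(200_000, percent))
# 10 20000
# 40 80000
# 80 160000
```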

Figure 3-8. A simplistic example showing the severity of current threats versus the probability of them occurring


Steps of a Risk Analysis

Many methods and equations can be used when performing a quantitative risk analysis, and many different variables can be inserted into the process. This section covers some of the main steps that should take place in every risk analysis.

Step 1: Assign Value to Assets

For each asset, answer the following questions to determine its value:

  • What is the value of this asset to the company?

  • How much does it cost to maintain?

  • How much does it make in profits for the company?

  • How much would it be worth to the competition?

  • How much would it cost to re-create or recover?

  • How much did it cost to acquire or develop?

  • How much liability do you face if the asset is compromised?

Step 2: Estimate Potential Loss per Threat

To estimate potential losses posed by threats, answer the following questions:

  • What physical damage could the threat cause and how much would that cost?

  • How much loss of productivity could the threat cause and how much would that cost?

  • What is the value lost if confidential information is disclosed?

  • What is the cost of recovering from this threat?

  • What is the value lost if critical devices were to fail?

  • What is the single loss expectancy (SLE) for each asset, and each threat?

This is just a small sample of questions that should be answered. The specific questions will depend upon the types of threats the team uncovers.

Step 3: Perform a Threat Analysis

Take the following steps to perform a threat analysis:

  • Gather information about the likelihood of each threat taking place from people in each department. Examine past records and official security resources that provide this type of data.

  • Calculate the annualized rate of occurrence (ARO), which is how many times the threat can take place in a 12-month period.

Step 4: Derive the Overall Annual Loss Potential per Threat

To derive the overall loss potential per threat, do the following:

  • Combine potential loss and probability.

  • Calculate the annualized loss expectancy (ALE) per threat by using the information calculated in the first three steps.

  • Choose remedial measures to counteract each threat.

  • Carry out cost/benefit analysis on the identified countermeasures.

Step 5: Reduce, Transfer, Avoid, or Accept the Risk

For each risk, you can choose whether to reduce, transfer, avoid, or accept the risk:

  • Risk reduction methods

    • Install security controls and components.

    • Improve procedures.

    • Alter the environment.

    • Provide early detection methods to catch the threat as it’s happening and reduce the possible damage it can cause.

    • Produce a contingency plan of how business can continue if a specific threat takes place, reducing further damages of the threat.

    • Erect barriers to the threat.

    • Carry out security-awareness training.

  • Risk transfer: Buy insurance to transfer some of the risk, for example.

  • Risk acceptance: Live with the risks and spend no more money toward protection.

  • Risk avoidance: Discontinue the activity that is causing the risk.

Because we are stepping through a quantitative risk analysis, real numbers are used and calculations are necessary. Single loss expectancy (SLE) and annualized loss expectancy (ALE) were mentioned in the previous analysis steps. The SLE is a dollar amount that is assigned to a single event that represents the company’s potential loss amount if a specific threat were to take place:

asset value × exposure factor (EF) = SLE

The exposure factor (EF) represents the percentage of loss a realized threat could have on a certain asset. So, for example, if a data warehouse has the asset value of $150,000, it might be estimated that if a fire were to occur, 25 percent of the warehouse would be damaged (and not more, because of a sprinkler system and other fire controls, proximity of a firehouse, and so on), in which case the SLE would be $37,500. This figure is derived to be inserted into the ALE equation:

SLE × annualized rate of occurrence (ARO) = ALE

Accepting Risk

When a company decides to accept a risk, the decision should be based on cost (countermeasure costs more than potential loss) and an acceptable level of pain (company can live with the vulnerability and threat). But the company must also understand this is a visibility decision, insofar as accepting a specific risk may impact its industry reputation.


The annualized rate of occurrence (ARO) is the value that represents the estimated frequency of a specific threat taking place within a one-year timeframe. The range can be from 0.0 (never) to 1.0 (once a year) to greater than 1.0 (several times a year), and anywhere in between. For example, if the probability of a flood taking place in Mesa, Arizona is once in 1000 years, the ARO value is 0.001.

So, if a fire taking place within a company’s data warehouse facility can cause $37,500 in damages, and the ARO of such a fire is 0.1 (indicating once in ten years), then the ALE value is $3750 ($37,500 × 0.1 = $3750).
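The two equations can be checked with a few lines of Python. This is a minimal sketch rather than a prescribed tool: the function and parameter names are illustrative assumptions, and the figures are the data warehouse fire numbers from the text.

```python
def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """SLE = asset value x exposure factor (EF given as a fraction, e.g. 0.25)."""
    if not 0.0 <= exposure_factor <= 1.0:
        raise ValueError("exposure factor must be between 0.0 and 1.0")
    return asset_value * exposure_factor


def annualized_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE x annualized rate of occurrence (ARO)."""
    if aro < 0:
        raise ValueError("ARO cannot be negative")
    return sle * aro


# Data warehouse fire example: $150,000 asset, 25 percent exposure,
# one fire expected every ten years (ARO = 0.1).
sle = single_loss_expectancy(asset_value=150_000, exposure_factor=0.25)
ale = annualized_loss_expectancy(sle, aro=0.1)
print(f"SLE = ${sle:,.0f}, ALE = ${ale:,.0f}")
```

Expressing the exposure factor as a fraction instead of a percentage is a design choice made here so that the multiplication mirrors the written formula directly.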

The ALE value tells the company that if it wants to put in controls or safeguards to protect the asset from this threat, it can sensibly spend $3750 or less per year to provide the necessary level of protection. Knowing the real possibility of a threat and how much damage, in monetary terms, the threat can cause is important in determining how much should be spent to try to protect against that threat in the first place. It would not make good business sense for the company to spend more than $3750 per year to protect itself from this threat.

Now that we have all these numbers, what do we do with them? Let’s look at the example in Table 3-4, which shows the outcome of a risk analysis. With this data, the company can make intelligent decisions on what threats must be addressed first because of the severity of the threat, the likelihood of it happening, and how much could be lost if the threat were realized. The company now also knows how much money it should spend to protect against each threat. This will result in good business decisions, instead of just buying protection here and there without a clear understanding of the big picture. Because the company has a risk of losing up to $6500 if data is corrupted by virus infiltration, up to this amount of funds can be earmarked toward providing antivirus software and methods to ensure that a virus attack will not happen.

Table 3-4. Breaking Down How SLE and ALE Values Are Used

| Asset | Threat | Single Loss Expectancy (SLE) | Annualized Rate of Occurrence (ARO) | Annualized Loss Expectancy (ALE) |
|---|---|---|---|---|
| Facility | Fire | $230,000 | 0.1 | $23,000 |
| Trade secret | Stolen | $40,000 | 0.01 | $400 |
| File server | Failed | $11,500 | 0.1 | $1150 |
| Data | Virus | $6500 | 1.0 | $6500 |
| Customer credit card info | Stolen | $300,000 | 3.0 | $900,000 |
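
The ALE column of Table 3-4 can be recomputed from the SLE and ARO columns, and the threats can then be sorted so the largest annual exposure comes first. The snippet below is an illustrative sketch; the variable names and the tuple layout are assumptions, and only the figures come from the table.

```python
# (asset, threat, SLE in dollars, ARO) rows taken from Table 3-4.
risks = [
    ("Facility", "Fire", 230_000, 0.1),
    ("Trade secret", "Stolen", 40_000, 0.01),
    ("File server", "Failed", 11_500, 0.1),
    ("Data", "Virus", 6_500, 1.0),
    ("Customer credit card info", "Stolen", 300_000, 3.0),
]

# ALE = SLE x ARO; round to whole dollars to hide floating-point noise.
ranked = sorted(
    ((asset, threat, round(sle * aro)) for asset, threat, sle, aro in risks),
    key=lambda row: row[2],
    reverse=True,
)

# Print the threats from highest to lowest annualized loss expectancy.
for asset, threat, ale in ranked:
    print(f"{asset:<26} {threat:<7} ${ale:>9,}")
```

Sorting by ALE is only one possible way to prioritize; a team might also weigh factors that the table does not capture.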

We have just explored the ways of performing risk analysis through quantitative means. This method tries to measure the loss in monetary value and assign numeric sums to each component within the analysis. As stated previously, however, a purely quantitative analysis is difficult to achieve, not least because of the resources required to gather all of the necessary information and values.

Many people also consider a quantitative analysis to be subjective rather than objective. Although we can look at past events, do our best to assess the value of the assets, and contact agencies that provide frequency estimates of disasters happening in our area, we still cannot say for a fact that we have a 10 percent chance of a fire happening in a year and that it will cause exactly $230,000 in damage. In quantitative risk analysis, we can do our best to provide all the correct information, and by doing so we will come close to the risk values, but we cannot predict the future and how much the future will cost us or the company.

Results of a Risk Analysis

The risk analysis team should have clearly defined goals. The following is a short list of what generally is expected from the results of a risk analysis:

  • Monetary values assigned to assets

  • Comprehensive list of all possible and significant threats

  • Probability (occurrence rate) of each threat

  • Loss potential the company can endure per threat in a 12-month time span

  • Recommended safeguards, countermeasures, and actions

Although this list looks short, there is usually an incredible amount of detail under each bullet item. This report will be presented to senior management, which will be concerned with possible monetary losses and the necessary costs to mitigate these risks. Although the reports should be as detailed as possible, there should be executive abstracts so senior management can quickly understand the overall findings of the analysis.

Note

A risk analysis is considered fully quantitative if all elements of the process are quantified (asset value, business impact, frequency, countermeasure effectiveness, countermeasure costs, probability, and uncertainty).


Qualitative Risk Analysis

I think we are secure.

Response: Great! Let’s all go home.

Another method of risk analysis is qualitative, which does not assign numbers and monetary values to components and losses. Instead, qualitative methods walk through different scenarios of risk possibilities and rank the seriousness of the threats and the validity of the different possible countermeasures based on opinions. (A wide sweeping analysis can include hundreds of scenarios.) Qualitative analysis techniques include judgment, best practices, intuition, and experience. Examples of qualitative techniques to gather data are Delphi, brainstorming, storyboarding, focus groups, surveys, questionnaires, checklists, one-on-one meetings, and interviews. The risk analysis team will determine the best technique for the threats that need to be assessed, as well as the culture of the company and individuals involved with the analysis.

Uncertainty

In risk analysis, uncertainty refers to the degree to which you lack confidence in an estimate. This is expressed as a percentage, from 0 to 100 percent. If you have a 30 percent confidence level in something, then it could be said you have a 70 percent uncertainty level. Capturing the degree of uncertainty when carrying out a risk analysis is important, because it indicates the level of confidence the team and management should have in the resulting figures.
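
The relationship between confidence and uncertainty is a simple complement, as this small sketch shows. The function name is an assumption made for illustration.

```python
def uncertainty_percent(confidence_percent: float) -> float:
    """Uncertainty is the complement of confidence, both on a 0-100 scale."""
    if not 0 <= confidence_percent <= 100:
        raise ValueError("confidence must be between 0 and 100 percent")
    return 100 - confidence_percent


# A 30 percent confidence level corresponds to 70 percent uncertainty.
print(uncertainty_percent(30))
```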


The team that is performing the risk analysis gathers personnel who have experience and education on the threats being evaluated. When this group is presented with a scenario that describes threats and loss potential, each member responds with their gut feeling and experience on the likelihood of the threat and the extent of damage that may result.

A scenario approximately one page in length is written for each major threat. The “expert,” who is most familiar with this type of threat, should review the scenario to ensure it reflects how an actual threat would be carried out. Safeguards that would diminish the damage of this threat are then evaluated, and the scenario is played out for each safeguard. The exposure possibility and loss possibility can be ranked as high, medium, or low, or on a scale of 1 to 5 or 1 to 10. Once the selected personnel rank the possibility of a threat happening, the loss potential, and the advantages of each safeguard, this information is compiled into a report and presented to management to help it make better decisions on how best to implement safeguards into the environment. The benefits of this type of analysis are that team members must communicate to rank the risks, rate safeguard strengths, and identify weaknesses, and that the people who know these subjects best provide their opinions to management.

Let’s look at a simple example of a qualitative risk analysis.

The risk analysis team writes a one-page scenario explaining the threat of a hacker accessing confidential information held on the five file servers within the company. The risk analysis team then distributes the one-page scenario to a team of five people (the IT manager, database administrator, application programmer, system operator, and operational manager), who are also given a sheet to rank the threat’s severity, loss potential, and each safeguard’s effectiveness, with a rating of 1 to 5, 1 being the least severe, effective, or probable. Table 3-5 shows the results.

Table 3-5. Example of a Qualitative Analysis

| Threat = Hacker Accessing Confidential Information | Severity of Threat | Probability of Threat Taking Place | Potential Loss to the Company | Effectiveness of Firewall | Effectiveness of Intrusion Detection System | Effectiveness of Honeypot |
|---|---|---|---|---|---|---|
| IT manager | 4 | 2 | 4 | 4 | 3 | 2 |
| Database administrator | 4 | 4 | 4 | 3 | 4 | 1 |
| Application programmer | 2 | 3 | 3 | 4 | 2 | 1 |
| System operator | 3 | 4 | 3 | 4 | 2 | 1 |
| Operational manager | 5 | 4 | 4 | 4 | 4 | 2 |
| Results | 3.6 | 3.4 | 3.6 | 3.8 | 3 | 1.4 |

This data is compiled and inserted into a report and presented to management. When management is presented with this information, it will see that its staff (or a chosen set of security professionals) feels that purchasing a firewall will protect the company from this threat more than purchasing an intrusion detection system, or setting up a honeypot system.
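
The Results row of Table 3-5 is the average of each column, which can be reproduced as follows. This is a sketch under stated assumptions: the shortened category labels and the dictionary layout are illustrative, and only the ratings come from the table.

```python
from statistics import mean

# Column labels, shortened from the Table 3-5 headings.
categories = [
    "Severity of threat",
    "Probability of threat",
    "Potential loss",
    "Firewall effectiveness",
    "IDS effectiveness",
    "Honeypot effectiveness",
]

# Ratings (1 = lowest, 5 = highest) from Table 3-5, one row per respondent.
ratings = {
    "IT manager": [4, 2, 4, 4, 3, 2],
    "Database administrator": [4, 4, 4, 3, 4, 1],
    "Application programmer": [2, 3, 3, 4, 2, 1],
    "System operator": [3, 4, 3, 4, 2, 1],
    "Operational manager": [5, 4, 4, 4, 4, 2],
}

# Average each column across respondents, as in the table's Results row.
results = {
    name: round(mean(column), 1)
    for name, column in zip(categories, zip(*ratings.values()))
}

# Pick the safeguard (last three columns) with the highest average rating.
best_safeguard = max(categories[3:], key=results.get)

for name, score in results.items():
    print(f"{name:<24} {score}")
print(f"Highest-rated safeguard: {best_safeguard}")
```

A plain average treats every respondent's opinion equally; a team could weight the ratings differently, but the table in the text uses the unweighted mean.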

This is the result of looking at only one threat, and management will view the severity, probability, and loss potential of each threat so it knows which threats cause the greatest risk and should be addressed first.

The Delphi Technique

The Delphi technique is a group decision method used to ensure each member gives an honest opinion of what he or she thinks the result of a particular threat will be. This avoids a group of individuals feeling pressured to go along with others’ thought processes and enables them to participate in an independent and anonymous way. Each member of the group provides his or her opinion of a certain threat and turns it in to the team that is performing the analysis. The results are compiled and distributed to the group members, who then write down their comments anonymously and return them to the analysis group. The comments are compiled and redistributed for more comments until a consensus is formed. This method is used to obtain an agreement on cost, loss values, and probabilities of occurrence without individuals having to agree verbally.

Delphi Methods

In this text we are describing the consensus Delphi method, where experts help to identify the highest-priority security issues and corresponding countermeasures. Another Delphi method, the Modified Delphi technique, is a silent form of brainstorming. Participants develop ideas individually and silently with no group interaction. The ideas are submitted to a group of decision makers for consideration and action.


Quantitative vs. Qualitative

So which method should we use?

Each method has its advantages and disadvantages, some of which are outlined in Table 3-6 for purposes of comparison.

Table 3-6. Quantitative vs. Qualitative Characteristics

| Attribute | Quantitative | Qualitative |
|---|---|---|
| Requires no calculations | | X |
| Requires more complex calculations | X | |
| Involves high degree of guesswork | | X |
| Provides general areas and indications of risk | | X |
| Is easier to automate and evaluate | X | |
| Used in risk management performance tracking | X | |
| Provides credible cost/benefit analysis | X | |
| Uses independently verifiable and objective metrics | X | |
| Provides the opinions of the individuals who know the processes best | | X |
| Shows clear-cut losses that can be accrued within one year’s time | X | |

The risk analysis team, management, risk analysis tools, and culture of the company will dictate which approach—quantitative or qualitative—will be used. The goal of either method is to estimate a company’s real risk and rank the severity of the threats so the correct countermeasures can be put into place within a practical budget.

Table 3-6 refers to some of the positive aspects of the qualitative and quantitative approaches. However, not everything is always easy. In deciding to use either a qualitative or quantitative approach, the following points might need to be considered:

Qualitative Cons
  • The assessments and results are basically subjective.

  • Usually eliminates the opportunity to create a dollar value for cost/benefit discussions.

  • Difficult to track risk management objectives with subjective measures.

  • Standards are not available. Each vendor has its own way of interpreting the processes and their results.

Quantitative Cons
  • Calculations are more complex. Can management understand how these values were derived?

  • Without automated tools, this process is extremely laborious.

  • More preliminary work is needed to gather detailed information about environment.

  • Standards are not available. Each vendor has its own way of interpreting the processes and their results.

Protection Mechanisms

Okay, so we know we are at risk, and we know the probability of it happening. Now, what do we do?

Response: Run.

The next step is to identify the current security mechanisms and evaluate their effectiveness.

Because a company has such a wide range of threats (not just computer viruses and attackers), each threat type must be addressed and planned for individually. Access control mechanisms used as security safeguards are discussed in Chapter 4. Software applications and data malfunction considerations are covered in Chapters 5 and 11. Site location, fire protection, site construction, power loss, and equipment malfunctions are examined in detail in Chapter 6. Telecommunication and networking issues are analyzed and presented in Chapter 7. Business continuity and disaster recovery concepts are addressed in Chapter 9. All of these subjects have their own associated risks and planning requirements.

This section addresses identifying and choosing the right countermeasures for computer systems. It gives the best attributes to look for and the different cost scenarios to investigate when comparing different types of countermeasures. The end product of the analysis of choices should demonstrate why the selected control is the most advantageous to the company.

Countermeasure Selection

A security countermeasure, sometimes called a safeguard, must make good business sense, meaning it is cost-effective (its benefit outweighs its cost). This requires another type of analysis: a cost/benefit analysis. A commonly used cost/benefit calculation for a given safeguard is

(ALE before implementing safeguard) – (ALE after implementing safeguard) – (annual cost of safeguard) = value of safeguard to the company

For example, if the ALE of the threat of a hacker bringing down a web server is $12,000 prior to implementing the suggested safeguard, and the ALE is $3000 after implementing the safeguard, while the annual cost of maintenance and operation of the safeguard is $650, then the value of this safeguard to the company is $8350 each year.
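
The cost/benefit calculation can be written as a one-line function. The sketch below uses the web server figures from the example; the function and parameter names are assumptions chosen for readability.

```python
def safeguard_value(ale_before: float, ale_after: float, annual_cost: float) -> float:
    """Value of safeguard = ALE before - ALE after - annual cost of safeguard."""
    return ale_before - ale_after - annual_cost


# Web server example: ALE drops from $12,000 to $3000, and the safeguard
# costs $650 per year to maintain and operate.
value = safeguard_value(ale_before=12_000, ale_after=3_000, annual_cost=650)
print(f"Value of safeguard: ${value:,.0f} per year")
```

A negative result would indicate that the safeguard costs more each year than the loss it prevents.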

The cost of a countermeasure is more than just the amount filled out on the purchase order. The following items should be considered and evaluated when deriving the full cost of a countermeasure:

  • Product costs

  • Design/planning costs

  • Implementation costs

  • Environment modifications

  • Compatibility with other countermeasures

  • Maintenance requirements

  • Testing requirements

  • Repair, replacement, or update costs

  • Operating and support costs

  • Effects on productivity

  • Subscription costs

  • Extra man- or woman-hours for monitoring and responding to alerts

  • Beer for the headaches that this new tool will bring about

Warning

As a consultant, I have repeatedly seen companies purchase new security products without understanding that they will need the staff to maintain those products. Although tools automate tasks, many companies were not even carrying out these tasks before, so they do not save on man-hours, but many times require more.


Consider an example. Company A decides that to protect many of its resources, purchasing an IDS is warranted. So, the company pays $5500 for an IDS. Is that the total cost? Nope. This software should be tested in an environment that is segmented from the production environment to uncover any unexpected activity. After this testing is complete and the IT group feels it is safe to insert the IDS into its production environment, the IT group must install the monitoring management software, install the sensors, and properly direct the communication paths from the sensors to the management console. The IT group may also need to reconfigure the routers to redirect traffic flow, and it definitely needs to ensure that users cannot access the IDS management console. Finally, the IT group should configure a database to hold all attack signatures, and then run simulations.

Costs associated with an IDS alert response should most definitely be considered. Now that Company A has an IDS in place, security administrators may need additional alerting equipment such as pagers or Blackberrys. And then there are the time costs associated with a response to an IDS event.

Anyone who has worked in an IT group knows that some adverse reaction almost always takes place in this type of scenario. Network performance can take an unacceptable hit after installing a product, if it is an inline or proactive product. Users may no longer be able to access the Unix server for some mysterious reason. The IDS vendor may not have explained that two more service patches are necessary for the whole thing to work correctly. Staff time will need to be allocated for training, and to respond to all of the positive and false positive alerts the new IDS sends out.

So, for example, the cost of this countermeasure could be $5500 for the product, $2500 for training, $3400 for the lab and testing time, $2600 for the loss in user productivity once the product was introduced into production, and $4000 in labor for router reconfiguration, product installation, troubleshooting, and installation of the two service patches. The real cost of this countermeasure is $18,000. If our total potential loss was calculated at $9000, we went over budget by 100 percent when applying this countermeasure for the identified risk. Some of these costs may be hard or impossible to identify before they are incurred, but an experienced risk analyst would account for many of these possibilities.
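
Adding up the individual cost items makes the overrun easy to see. This is a minimal sketch using the Company A figures above; the dictionary keys are shorthand labels introduced here for illustration.

```python
# Cost components of the IDS countermeasure, in dollars, from the example.
cost_items = {
    "product": 5_500,
    "training": 2_500,
    "lab and testing time": 3_400,
    "lost user productivity": 2_600,
    "installation and troubleshooting labor": 4_000,
}

total_cost = sum(cost_items.values())
potential_loss = 9_000  # total potential loss calculated for the risk

# How far the real cost exceeds the potential loss, as a percentage.
overrun_percent = (total_cost - potential_loss) / potential_loss * 100

print(f"Real cost of countermeasure: ${total_cost:,}")
print(f"Over the potential loss by {overrun_percent:.0f} percent")
```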

Functionality and Effectiveness of Countermeasures

The countermeasure doesn’t work, but it’s pretty.

Response: Good enough.

The risk analysis team must evaluate the safeguard’s functionality and effectiveness. When selecting a safeguard, some attributes are more favorable than others. Table 3-7 lists and describes attributes that should be considered before purchasing and committing to a security protection mechanism.

Table 3-7. Characteristics to Seek When Obtaining Safeguards

| Characteristic | Description |
|---|---|
| Modular in nature | It can be installed or removed from an environment without adversely affecting other mechanisms. |
| Provides uniform protection | A security level is applied to all mechanisms it is designed to protect in a standardized method. |
| Provides override functionality | An administrator can override the restriction if necessary. |
| Defaults to least privilege | When installed, it defaults to a lack of permissions and rights instead of installing with everyone having full control. |
| Independent of safeguards and the asset it is protecting | The safeguard can be used to protect different assets, and different assets can be protected by different safeguards. |
| Flexibility and security | The more security the safeguard provides, the better. This functionality should come with flexibility, which enables you to choose different functions instead of all or none. |
| User interaction | Does not panic users. |
| Clear distinction between user and administrator | A user should have fewer permissions when it comes to configuring or disabling the protection mechanism. |
| Minimum human intervention | When humans have to configure or modify controls, this opens the door to errors. The safeguard should require the least amount of input from humans as possible. |
| Asset protection | Asset is still protected even if countermeasure needs to be reset. |
| Easily upgraded | Software continues to evolve, and updates should be able to happen painlessly. |
| Auditing functionality | There should be a mechanism that is part of the safeguard that provides minimum and/or verbose auditing. |
| Minimizes dependence on other components | The safeguard should be flexible and not have strict requirements about the environment into which it will be installed. |
| Easily useable, acceptable, and tolerated by personnel | If the safeguards provide barriers to productivity or add extra steps to simple tasks, users will not tolerate it. |
| Must produce output in usable and understandable format | Important information should be presented in a format easy for humans to understand and use for trend analysis. |
| Must be able to reset safeguard | The mechanism should be able to be reset and returned to original configurations and settings without affecting the system or asset it is protecting. |
| Testable | The safeguard should be able to be tested in different environments under different situations. |
| Does not introduce other compromises | The safeguard should not provide any covert channels or backdoors. |
| System and user performance | System and user performance should not be greatly affected. |
| Universal application | The safeguard can be implemented across the environment and does not require many, if any, exceptions. |
| Proper alerting | Thresholds should be able to be set as to when to alert personnel of a security breach, and this type of alert should be acceptable. |
| Does not affect assets | The assets in the environment should not be adversely affected by the safeguard. |

Safeguards can provide deterrence attributes if they are highly visible. This tells potential evildoers that adequate protection is in place and that they should move on to an easier target. Although the safeguard may be highly visible, attackers should not be able to discover how it works, because that knowledge would enable them to attempt to modify the safeguard or to get around the protection mechanism. If users know how to disable the antivirus program that is taking up CPU cycles or know how to bypass a proxy server to get to the Internet without restrictions, they will do so.

Putting It Together

To perform a risk analysis, a company first decides what assets must be protected and to what extent. It also indicates the amount of money that can go toward protecting specific assets. Next, it must evaluate the functionality of the available safeguards and determine which ones would be most beneficial for the environment. Finally, the company needs to appraise and compare the costs of the safeguards. These steps and the resulting information enable management to make the most intelligent and informed decisions about selecting and purchasing countermeasures. Figure 3-9 illustrates these steps.

Figure 3-9. The main three steps in risk analysis


We Are Never Done

Only by reassessing the risks on a periodic basis can a statement of safeguard performance be trusted. If the risk has not changed, and the safeguards implemented are functioning in good order, then it can be said that the risk is being properly mitigated. Regular information risk management (IRM) monitoring will support the information security risk ratings.

Vulnerability analysis and continued asset identification and valuation are also important tasks of risk management monitoring and performance. The cycle of continued risk analysis is a very important part of determining whether the safeguard controls that have been put in place are appropriate and necessary to safeguard the assets and environment.


Total Risk vs. Residual Risk

The reason a company implements countermeasures is to reduce its overall risk to an acceptable level. As stated earlier, no system or environment is 100 percent secure, which means there is always some risk left over to deal with. This is called residual risk.

Residual risk is different from total risk, which is the risk a company faces if it chooses not to implement any type of safeguard. A company may choose to take on total risk if the cost/benefit analysis results indicate this is the best course of action. For example, if there is a small likelihood that a company’s web servers can be compromised and the necessary safeguards to provide a higher level of protection cost more than the potential loss in the first place, the company will choose not to implement the safeguard, choosing to deal with the total risk.

There is an important difference between total risk and residual risk and which type of risk a company is willing to accept. The following are conceptual formulas:

threats × vulnerability × asset value = total risk

(threats × vulnerability × asset value) × controls gap = residual risk

You may also see these concepts illustrated as the following:

(threats, vulnerability, and asset value) = total risk

total risk – countermeasures = residual risk

Note

The previous formulas are not constructs you can actually plug numbers into. They are instead used to illustrate the relation of the different items that make up risk in a conceptual manner. This means no multiplication or mathematical functions actually take place. It is a “tool” to understand what items are involved when defining either total or residual risk.


During a risk assessment, the threats and vulnerabilities are identified. The possibility of a vulnerability being exploited is combined, conceptually, with the value of the assets being assessed, which results in the total risk. Once the controls gap (protection the control cannot provide) is factored in, the result is the residual risk. Implementing countermeasures is a way of mitigating risks. Because no company can remove all threats, there will always be some residual risk. The question is what level of risk the company is willing to accept.

Handling Risk

Now that we know about the risk, what do we do with it?

Response: Hide it behind that plant.

Once a company knows the amount of total and residual risk it is faced with, it must decide how to handle it. Risk can be dealt with in four basic ways: transfer it, avoid it, reduce it, or accept it.

Many types of insurance are available to companies to protect their assets. If a company decides the total or residual risk is too high to gamble with, it can purchase insurance, which would transfer the risk to the insurance company.

If a company decides to terminate the activity that is introducing the risk, this is known as risk avoidance. For example, if a company allows employees to use instant messaging (IM) there are many risks surrounding this technology. The company could decide not to allow any IM activity by their users because there is not a strong enough business need for its continued use. Discontinuing this service is an example of risk avoidance.

Another approach is risk mitigation, where the risk is decreased to a level considered acceptable enough to continue conducting business. Examples of this kind of approach toward handling risk can be seen in many aspects of our lives. The implementation of firewalls, training, and intrusion detection/protection systems represents types of risk mitigation.

The last approach is to accept the risk, which means the company understands the level of risk it is faced with, as well as the potential cost of damage, and decides to just live with it and not implement the countermeasure. Many companies will accept risk when the cost/benefit ratio indicates that the cost of the countermeasure outweighs the potential loss value.

A crucial issue with risk acceptance is understanding why this is the best approach for a specific situation. Unfortunately, today many people in organizations are accepting risk and not understanding fully what they are accepting. This usually has to do with the immaturity of risk management in the security field and the lack of education and experience in those personnel making risk decisions. When business managers are charged with the responsibility of dealing with risk in their department, most of the time they will accept whatever risk is put in front of them because their real goals pertain to getting a project finished and out the door. They don’t want to be bogged down by this silly and irritating security stuff.

Risk acceptance should be based on several factors. For example, is the potential loss lower than the cost of the countermeasure? Can the organization deal with the “pain” that will come with accepting this risk? This second consideration is not purely a cost decision, but may entail noncost issues surrounding the decision. For example, if we accept this risk, we must add three more steps in our production process. Does that make sense for us? Or if we accept this risk, more security incidents may arise from it. Are we prepared to handle those?

The individual, or group, accepting risk must also understand the potential visibility of this decision. Let’s say it has been determined that the company does not need to actually protect customers’ first names, but it does have to protect other items like Social Security numbers, account numbers, and so on. So these current activities are in compliance with the regulations and laws, but what if your customers find out you are not properly protecting their names and they associate such things with identity fraud because of their lack of education on the matter? The company may not be able to handle this potential reputation hit, even if it is doing all it is supposed to be doing. Perceptions of a company’s customer base are not always rooted in fact, but the possibility that customers will move their business to another company is a potential fact your company must comprehend.

Figure 3-10 shows how a risk management program can be set up, which ties together all the concepts covered in this section.

Figure 3-10. How a risk management program can be set up

