Chapter 2
Risk Assessment Basics

Street Calculus and Perceived Risk

Risk is around us everywhere. We cannot begin to understand specific risks without an evaluation of the nature of relative risk and our perception of it. We perform a risk analysis whenever we express fear, concern, or doubt. The fear of failure, injury, death, or loss is something we experience every day, and we express fear when our “street calculus” tells us that the risk we are about to take may be unacceptable and will result in injury, death, or loss.

Similarly, when we look at plant operations, we often fail to consider risks beyond what we regard as “normal” operations. Familiarity with those risks often leads us to dismiss them, saying “It will never happen here,” or to ignore them completely. Often, plant management wants to look only at “special” risks, unusual events that are not part of daily operations. True plant security must consider natural and routine risks, both internal and external to the plant environment.

Street Calculus

Street Calculus is the conscious and unconscious evaluation of risk. A collision, an accident, or a harmful event is a failure of risk assessment. We perform a “Street Calculus” or small risk assessment whenever we cross the street, drive through a strange neighborhood, or encounter someone on the street. There are as many different ways of assessing risk as there are people. There is no right or wrong way, but some ways are more complete than others.

Imagine yourself walking down a street at dusk when the visibility is not so good. In the gloom, you see someone approaching you, and they cross the street to come toward you. What do you do?

Do you avoid the person, turn around and walk away quickly, or do you meet the person head on? What is your risk level before you recognize the individual? As an example of low risk, as you get closer you might recognize that the person approaching is either a police officer or a little old lady of small stature. Both are perceived as relatively low-risk persons.

However, if you are a criminal or carrying illegal drugs in your possession, you may view the policeman as a high-risk threat rather than a person of low risk.

There are mitigating factors in this scenario. You are carrying a FedEx package you have just picked up, and as the person approaches, you determine that it is a little old lady carrying groceries. The mitigating factors change the relative risk equation from one of potential suspicion and hazard to one of nonhazard.

Now, changing the scenario slightly: a police officer has just been told to be on the lookout for a young man of swarthy complexion wearing a baseball cap, sweatshirt, and jeans, and that is precisely what you are wearing. Moreover, the officer has been told that the suspect has recently committed an armed assault.

The mitigating factors are that the colors you are wearing are not those described to the officer and that you are black, while the perpetrator described to the officer was white. How does that visual identification change the relative risk?

Where is this going? It depends on the perception of relative risk for each of the individuals. If you are the police officer, you might approach cautiously and undo the flap on your holster while you are still far away. You might call out to the suspect and ask him to stop, and approach cautiously, ready for trouble. You might do any number of things to ascertain the subject’s identity.

If you are the young person and have committed no offenses or crimes, you might unhesitatingly stop to be questioned, which would alleviate the officer’s concerns about your identity and behavior. If you are guilty, you might run or take other actions to avoid the officer.

This is relative risk management. What do we know versus what is happening at the moment, and how does that affect us?

Perception of risk is not always rational. It is often the intangible factors that lead us to make a risk calculation. For example, the other night, I was driving home and passed a police car that was traveling below the speed limit. I accelerated to just a little above the speed limit in order to get past the police car within the passing zone and was back in the correct lane, well within the dividing line. But the police car turned on its lights and siren and pulled me over.

I did not have any illegal substances in the vehicle, and I was within the law, but the thoughts that flashed through my mind were myriad and analytical, all operating without any information until the officer approached and told me what the problem was. As it turned out, my headlights were set too high, and he commented that they needed adjustment or should be kept on low beams when following someone.

Risks are acceptable or unacceptable depending upon one’s tolerance for fear of what cannot be controlled. But risks are everywhere. Fear of flying is just one of many types of risks we encounter every day. Some of these risks are pointed out in Table 2.1. The purpose is to help put some of the risks in perspective. The data apply to the United States.

Table 2.1 Common Daily Risksa

Activity | Type of risk | Probability
Driving a car | Death by accident | 1:5,000 (0.02%) per year
Rock climbing | Death by fall | 1:25,000 per hour
Motorcycle riding | Death by accident | 1:55,000 per hour
Flying on an airline | Death (all sources) | 1:1,200,000 per hour
Living | Death from heart disease (annual) | 1:340
Living | Death by murder (annual) | 1:11,000
Living | Death from cancer (annual) | 1:500

aFrom Johnson RR. Background Information: A Scientific View of Risk. Oxford (OH): Center for Chemistry Education, Miami University; 1998. www.terrificscience.org.
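Table 2.1 mixes per-year and per-hour probabilities, so the figures are not directly comparable without an assumed annual exposure. A minimal sketch of the conversion (the 500 riding hours per year is an illustrative assumption, not a figure from the table):

```python
# Convert a per-hour death risk into an approximate annual risk,
# given an assumed number of exposure hours per year.

def annual_risk(p_per_hour: float, hours_per_year: float) -> float:
    """Probability of at least one fatal event over a year of exposure."""
    return 1.0 - (1.0 - p_per_hour) ** hours_per_year

# Illustrative assumption: 500 riding hours per year (not from Table 2.1).
motorcycle = annual_risk(1 / 55_000, 500)
driving = 1 / 5_000  # Table 2.1 already states this one per year

print(f"motorcycle (500 h/yr): 1 in {1 / motorcycle:,.0f}")
print(f"driving (per year):    1 in {1 / driving:,.0f}")
```

For rare events, 1 − (1 − p)^n is close to n × p, which is why quick mental comparisons of small risks usually just multiply rate by exposure.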

The risks we must assess to keep a building, a company, or an industry safe are both internal and external. What we want to do is understand the nature of risks and the structure we use for assessment and look at the different ways we can begin to assign and evaluate risks.

All risks are relative. It is not the risk itself but the perception and presentation of the risk that drive our actions. These actions are influenced by internal and external factors. News, rumor, personal experiences, etc. all influence our perceptions. Flying is perceived as being more dangerous than driving. Other risks are relative as well. Table 2.2 reflects the perception of risk by the US Public and the USEPA.

Table 2.2 Relative ranking of perceived risksa

Relative risks as perceived by the US Public and by the US Environmental Protection Agency
Risks ranked from highest to lowest by the public | Risks (unranked list) by the USEPA
Hazardous waste sites | Global warming
Industrial water pollution | Urban smog
On-the-job exposure to chemicals | Ozone depletion (caused by CFCs)
Oil spills | Toxic air pollutants
Ozone depletion | Alteration of critical habitats
Nuclear power accidents | Biodiversity loss
Radioactive wastes | Indoor air pollution
Air pollution from industrial sources | Drinking water contamination
Leaking underground storage tanks | Industrial chemical exposure

aFrom Nebel B, Wright R. Environmental Science. Elsevier; 2000. p. 403.

The table ranked risks in 2000, but since that time, we have had global cooling, the Fukushima Daiichi nuclear disaster and the radiation contamination of foods from that disaster, several tsunamis, earthquakes in California and predictions of “The Big One,” concerns about BPA and other chemicals leaching into foods from their containers, genetically modified organisms, and global warming, to name a few. Whether these predicted disasters are real or imaginary, such events change the perception of risks because the media help shape our perceptions either by repetition, exaggeration, or both. According to The Economist, the annual risk of death by suicide (1:8,447) is roughly three times that of death by firearm assault (1:24,974).1

We are frightened of new things, and especially of new risks, because of our unfamiliarity with them or their causes. Our perception of a risk fades, however, the longer we live with it. How many times have you heard, “It will never happen here because I’ve been at this plant for X years, and it hasn’t happened yet!”? While true, the statement provides false assurance and should not lead the risk assessor to derate the risk or its potential consequences.

Security Risk Assessment Structure

Risk assessment documents can take many forms depending upon the type of information one wants to consider. Almost all the assessments consist of tables that summarize the risk in terms of facilities and cost.

The regulatory community often seeks to reduce public risk through increased regulation. That is slightly different than our focus, but it may be instructive even if only from the standpoint of the “law of unintended consequences.” Two quick examples will suffice:

  1. According to National Highway Traffic Safety Administration Report DOT HS 809 835, the new technologies mandated for passenger cars and light trucks between 1968 and 2002 cost an estimated $750,782 per life saved in 2002.
  2. The proposed rules on installation of new backup television cameras in cars and light trucks could save 292 lives, on average, per year, at a cost of approximately $18.5 million per life saved.2

Value at Risk

Industry generally uses a more direct type of risk-benefit analysis. It is often simply called risk analysis or risk assessment, and its scope, focus, and costs are generally better than most government figures because industrial practice constrains the estimates to be relevant, reasonable, and focused on the facilities and the outcomes of specific events on those facilities. This is much more constrained than value at risk (VaR), which may include market share and financial risks.

Table 2.3 and Figures 2.1–2.3 present three types of risk assessment forms in current use. Each has its advantages and disadvantages. There is no one “right” form for data presentation.

Table 2.3 SANDIA National Laboratory risk assessment tablea


The shades indicate relative importance; the deeper shading indicates higher priority.

aFrom Duggan DP, Thomas SR, Veitch CKK, Woodard L. Categorizing threat: building and using a generic threat matrix. SANDIA Report, SAND2007-5791. Livermore: Sandia National Laboratories; 2007.


Figure 2.1 Classical risk assessment form. The risk analysis matrix is usually in color. Red indicates high risk, yellow indicates moderate risk, and green indicates lower levels of risk, but we have chosen to use stripes, dots, and white spaces to highlight the risk levels, respectively.


Figure 2.2 Cost-based risk assessment for annual loss expectancy.

Priorities are generally assigned by vulnerability (column) and frequency (top row). If a structure or an event is highly vulnerable and the frequency is high, it will be in the striped zone and should receive priority consideration. Costs are associated with these events and are generated separately for presentation.

Sandia Laboratory’s Risk Assessment Analysis

The Sandia National Laboratory suggests that threats be categorized in two very specific ways. The first is commitment attributes, which measure the attacker’s intent or willingness. The second is the ability or resource attribute, and that is a measure of the attacker’s ability to execute the intent.

The intent attribute is further classified into intensity, stealth, and time. The first two are measured on a high/medium/low scale, and time is measured by the immediacy of the planned activity, which can range from days to years.

The ability or resource attribute also has three components, personnel, knowledge, and access. The personnel category has several ranges depending upon the number of people who can be applied to the task of planning or executing an attack: hundreds, tens of tens, ten, or ones. The knowledge category has two major groups, cyber knowledge and kinetic knowledge. Each of these is further refined into high, medium, and low categories.
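The two attribute groups can be captured in a small data structure for side-by-side comparison of postulated adversaries. The field names and the crude ordinal score below are illustrative assumptions, not Sandia's own scoring scheme:

```python
from dataclasses import dataclass
from enum import Enum

class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class ThreatProfile:
    """Sketch of the two Sandia attribute groups; field names and the
    scoring are illustrative, not Sandia's own scheme."""
    # Commitment attributes
    intensity: Level
    stealth: Level
    time_days: int          # immediacy of the planned activity, in days
    # Ability/resource attributes
    personnel: int          # people available to plan or execute
    cyber_knowledge: Level
    kinetic_knowledge: Level
    access: Level

    def score(self) -> int:
        """A naive ordinal score for ranking profiles side by side."""
        levels = (self.intensity, self.stealth, self.cyber_knowledge,
                  self.kinetic_knowledge, self.access)
        return sum(level.value for level in levels)

# Hypothetical adversary: a committed lone insider.
insider = ThreatProfile(Level.HIGH, Level.HIGH, 30, 1,
                        Level.LOW, Level.MEDIUM, Level.HIGH)
print(insider.score())
```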

The net result of this type of planning is a threat matrix table shown in Table 2.3.

The threats are ranked in order of their significance when the resources are accounted for in each of the categories. The purpose is to help delineate the threat categories and separate the rumored threats or theoretical threats from the actual threats.

There are modifiers for the threats that are known as force multipliers. These are funding, assets, and technology. Funding is a critical element and traditionally reflects the idea of capability, but if the attacker uses the funding to gain outside resources or make payments for information, those acts can make his activities more visible, decreasing his stealth and surprise element.

Assets are the ability to gain other forces or to multiply the attacker’s forces and capabilities. Gaining other assets can also involve the introduction of outside forces. “Three may keep a secret, if two of them are dead,” Benjamin Franklin observed, and the point stands: the more people involved in a plot, the greater the chance that it will be discovered. Gaining assets may also serve to reduce stealth and be self-defeating.

Technology is rapidly changing. Is the company at or ahead of the technology curve, or does the adversary have the ability to penetrate and defeat the company’s security measures? This is especially true if the company is using unsecured and unencrypted wireless communications for controls and measurements inside the plant or for outside communications.

Annualized Cost Analysis of Risk

A third type of risk assessment considers cost and frequency in terms of annualized costs of events. While the table form is the same, the data are in powers of 10, and for ease of interpretation, a year is approximated as 333 days (3 years ≈ 1000 days). Similarly, the costs are in powers of 10 and are generally expressed in millions of dollars of damage. This type of risk assessment requires more work because it estimates both the frequency of an event and the damage from it.

Annual loss expectancy/estimated replacement cost
Cost expressed as $X.XX × 10^N, Rating = N
Frequency of occurrence of undesirable event (3 years approximates 1000 days):

1/300 years | f = 1 | Type of event
1/30 years | f = 2 | Type of event
1/3 years | f = 3 | Type of event
1/100 days | f = 4 | etc.
1/10 days | f = 5 |
1/day | f = 6 |
10/day | f = 7 | etc.

Calculated annual loss expectancy = 10^(f + N − 3)/3
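Under this convention, the annual loss expectancy follows directly from the two ratings. A minimal sketch (the example event is hypothetical):

```python
def annual_loss_expectancy(f: int, n: int) -> float:
    """ALE = 10**(f + n - 3) / 3, where f is the frequency rating
    (f = 3 is roughly once every 3 years) and n is the cost rating
    (single-event loss of about 10**n dollars)."""
    return 10 ** (f + n - 3) / 3

# Hypothetical event: about $1,000,000 per occurrence (n = 6),
# roughly once every 100 days (f = 4), i.e., ~3.33 events per year.
print(f"${annual_loss_expectancy(4, 6):,.0f} per year")
```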

In this model, one way of calculating annual loss expectancy may require updating construction cost estimates. If the plant is old, a new engineering-based cost estimate may be required. If the replacement cost estimate was made within the last 10 years, Engineering News-Record, RS Means, or other construction cost indices may be used to bring the cost up to the present.

Yet another method of presenting annual cost data depends upon the historical database and the confidence that one has in the predictions.

If you believe that it will be 30 years until the next major hurricane that would damage or destroy the plant, then use a 30-year capital recovery cost factor. If the plant will be destroyed in 30 years, and the annual interest rate plus cost of inflation is 8%, then the capital recovery factor tables use the following formula:

CRF = i(1 + i)^n / [(1 + i)^n − 1], where i is the annual interest rate and n is the number of years

If the replacement cost of the plant in today’s dollars is $10,000,000, the time frame is 30 years, and the interest rate is 8%, then the estimated annual cost is $10,000,000 × 0.0888 = $888,274.
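The capital recovery computation above can be checked with a few lines of code; this reproduces the chapter's $888,274 figure:

```python
def capital_recovery_factor(i: float, n: int) -> float:
    """CRF = i(1 + i)**n / ((1 + i)**n - 1): annualizes a present
    cost over n years at annual rate i."""
    growth = (1 + i) ** n
    return i * growth / (growth - 1)

# The chapter's example: $10M replacement cost, 30 years, 8% rate.
annual_cost = 10_000_000 * capital_recovery_factor(0.08, 30)
print(f"${annual_cost:,.0f}")  # ≈ $888,274
```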

The problem with this is that the cost projections, even for partial damage, require an additional computation and depend upon the confidence one has in one’s ability to predict the likelihood of adverse events with any degree of accuracy. If the database is reliable, the predictions will reflect costs more accurately. This is true no matter what method you use to estimate risk. The ability to forecast future replacement costs is often more art than science and is subject to interpretation and analysis (Figure 2.2).

Scenario-driven Cost Risk Analysis3

Scenario-driven cost risk analysis is a way of looking at risk costs using a simple technique to develop the ranges of costs associated with risk. The procedure calls for several scenarios to be developed and costed. The minimum number of scenarios to be analyzed is two, but more are better. Of these scenarios, select the one considered most critical to guard against, and refer to it as the prime scenario. The procedure for scenario-driven cost risk analysis follows in the steps below:

  • First: Start with the baseline or base replacement costs for the system or unit you are considering. Use current cost estimates for the value of the facility, adjusted to current dates, as if you were going to replace the facility as brand new. Do not allow any adjustment for risk or replacements. Define this as the base cost, or Cb.
  • Second: Define the prime scenario as the cost elements adjusted for the risk against which you want to guard. Be sure to include all costs and replacement and cleanup costs, including loss of product and other associated costs. Define this as Cps. Note that the damage and cleanup costs as well as replacement costs for the unit damaged should be included. Make sure that you have not incorporated other elements, especially support elements into the replacement and remedial costs.
  • Third: Subtract Cb from Cps. The difference is a measure of the amount of reserve money needed to guard against the prime scenario: Cg = Cps − Cb.
  • Fourth: Assume a probability D that the actual cost will fall between Cb and Cps. This establishes the upper and lower bounds of your cost estimate.
  • Fifth: Assume that the statistical distribution for the cost lies within the interval of D and that the cost for D is normally distributed between Cps and Cb. D is a probability percentage expressed as a number less than 1, that is, 0.0 < D < 1.00.
  • Sixth: Assume that the minimum cost for the system will be Cmin = Cb − Cg × (1 − D)/2 and the maximum cost for the system will be Cmax = Cps + Cg(1 − D)/2. This establishes the range of costs for probable upper and lower costs.

This gives you four data points:

  • Cmin, Cb, Cps, and Cmax and a probability (you selected) that the actual number will be somewhere between Cmin and Cmax. The probability that cost ≤ Cmax = 1/2 + 1/2D.
  • The probability that cost ≤ (Cmax + Cmin)/2 (average of your estimate of Cmin and Cmax) = 1/2.
  • The mean of the cost = (Cmax + Cmin)/2.
  • Variance of the cost = σ2(Cost) = (1/12) × [(Cps − Cb)/Cb]2.
  • The probability that the cost will be equal or less than a specific figure is
    P(cost ≤ X) = 1/2 + D × [X − (Cmax + Cmin)/2]/(Cmax − Cmin)
    where X is a real number in currency, as are the other figures.
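The six steps and the probability statements above can be collected into a short sketch. It implements the Step 6 bounds as written (the chapter's worked example applies the factor slightly differently) and uses a linear CDF centered on (Cmin + Cmax)/2, so that P(cost ≤ Cmax) = 1/2 + D/2 as stated:

```python
from dataclasses import dataclass

@dataclass
class ScenarioCost:
    cb: float   # base replacement cost, Cb
    cps: float  # prime-scenario cost, Cps
    d: float    # probability mass assumed between Cb and Cps, 0 < D < 1

    @property
    def cg(self) -> float:
        """Reserve needed to guard against the prime scenario."""
        return self.cps - self.cb

    @property
    def cmin(self) -> float:
        """Step 6 lower bound: Cb - Cg(1 - D)/2."""
        return self.cb - self.cg * (1 - self.d) / 2

    @property
    def cmax(self) -> float:
        """Step 6 upper bound: Cps + Cg(1 - D)/2."""
        return self.cps + self.cg * (1 - self.d) / 2

    def prob_cost_at_most(self, x: float) -> float:
        """Linear CDF, symmetric about (Cmin + Cmax)/2, chosen so that
        P(cost <= Cmax) = 1/2 + D/2."""
        mid = (self.cmin + self.cmax) / 2
        return 0.5 + self.d * (x - mid) / (self.cmax - self.cmin)

s = ScenarioCost(cb=44.3, cps=75.0, d=0.8)   # $ millions
print(round(s.cg, 2), round(s.cmin, 2), round(s.cmax, 2))
```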

Real-world example

If Cb in Table 1.1 is $44.3 million, we can estimate that the total cost of the plant replacement with cleanup from a devastating incident would be $75 million. Assuming further that the probability of an attack might be 80%, or D = 0.8, Cmin would be 44.3 − (75 − 44.3) × 0.8/2 = $32.02 million:

Cmax = 75 + (75 − 44.3) × 0.8/2 = $87.28 million

The equation plotting the cost for the scenario is

P(%) ≈ 2.475X − 85.6, where X is the cost in millions of dollars

It is linear over the range of the costs, and the table and graph for our probability plot of costs are Table 2.4 and Figure 2.3:


Figure 2.3 Cost versus probability of occurrence.

Table 2.4 Probability of occurrence

Estimated cost X ($ millions) | Probability of occurrence (%)
35 | 1
40 | 14
45 | 26
50 | 38
55 | 51
60 | 63
65 | 75
70 | 88
75 | 100

So under the assumptions above, at an 80% risk level, there is a 75% chance that the total cost for the scenario we described is equal to or less than $65 million.

This represents one way of evaluating the financial risk. The technique can also be applied to subsystems.

Model-Based Risk Analysis

The model-based risk analysis (MBRA) is a way of prioritizing costs for reducing risk. The program was developed by the Naval Postgraduate School Center for Homeland Defense and Security (http://www.chds.us). The MBRA is a risk assessment model that takes a network approach to risk assessment.

The tutorial is excellent and easy to use, and the program is clear. While it is designed by and for the United States, other maps can be input, and the model can be run without mapping. In the program, one assigns nodes and links, the likelihood of an attack, and the total amount of money that can be spent. The program can allocate and prioritize the amounts of money to be spent on each improvement. Sample inputs and outputs from the MBRA program are shown below.

MBRA example case

The MBRA test case was created to simulate the material flow in an ammonia plant, using hypothetical input numbers and following the MBRA tutorial. The example was based on the Ammonia Plant example in the previous chapter. The flow of materials in the plant is shown in Figure 2.4. Tables 2.5 and 2.6 show the allocation of resources for an attack scenario, and Figure 2.5 shows the allocation of resources to minimize the damage from an attack and prioritize the expenditures.


Figure 2.4 Diagram of product flow in an ammonia plant.

In Tables 2.5 and 2.6, the data entered are shown with a shaded background; the balance of the data is calculated from those values. MBRA has allocated the proposed expenses to reduce the risk of the various units of the facility. It also calculates the reduced vulnerability from the expenditure of a portion of the budget on the proposed enhancements of the facility. It is a good guide, but alas not perfect. The problem is that projects have finite boundaries: a planned upgrade costs what it costs and cannot be trimmed to fit a risk management program’s allocation. So if the planned enhancements to a particular area cost more than the computer-allocated expenditures, the budget will have to be adjusted by cutting things elsewhere.

Table 2.5 Part 1 of two-part data table for MBRA analysis

Threat | Vulnerability | Consequence ($) | Name | Prevention cost ($) | Response cost ($) | Risk initial ($) | Risk reduced ($) | Flow consequence ($)
100 | 100 | 30 | Receiving | 15 | 75 | 10 | 0.48 | 10
90 | 100 | 150 | Gas reformer | 200 | 40 | 9 | 0.48 | 10
90 | 100 | 80 | Urea | 45 | 20 | 0 | 0.05 | 0
100 | 100 | 200 | Ammonia | 500 | 150 | 0 | 0.05 | 0
100 | 100 | 60 | Storage | 300 | 2 | 3.33 | 0.17 | 3.33
100 | 100 | 85 | Shipping | 85 | 300 | 10 | 0.48 | 10
100 | 100 | 10 | Incoming gas | 15 | 15 | 10 | 0.32 | 10
100 | 100 | 50 | Gas/ammonia | 20 | 50 | 0 | 0.05 | 0
100 | 100 | 0 | Ammonia/urea | 0 | 0 | 0 | 0.05 | 0
100 | 100 | 10 | Reformer/urea | 15 | 2 | 0 | 0.05 | 0
100 | 100 | 40 | Urea/storage | 25 | 60 | 0 | 0.05 | 0
100 | 100 | 12 | NH3/storage | 10 | 6 | 3.33 | 0.13 | 3.33
100 | 100 | 300 | Storage/out | 55 | 250 | 6.99 | 0.33 | 6.99
100 | 100 | 5 | NH3/out | 15 | 25 | 0 | 0.05 | 0
100 | 100 | 20 | Urea/out | 10 | 40 | 0 | 0.05 | 0

Table 2.6 Part 2 of two-part data table for MBRA analysis

Name | Prevention allocation ($) | Response allocation ($) | Attack allocation ($) | Vulnerability reduced | Consequence reduced ($) | Calculated threat
Receiving | 15 | 0 | 15 | 5 | 10 | 95
Gas reformer | 200 | 0 | 200 | 5 | 10 | 95
Urea | 0 | 0 | 0 | 100 | 0.05 | 90
Ammonia | 0 | 0 | 0 | 100 | 0.05 | 100
Storage | 0 | 2 | 0 | 100 | 0.17 | 100
Shipping | 85 | 0 | 85 | 5 | 10 | 95
Incoming gas | 15 | 1.98 | 15 | 5 | 6.73 | 95
Gas/ammonia | 0 | 0 | 0 | 100 | 0.05 | 100
Ammonia/urea | 0 | 0 | 0 | 100 | 0.05 | 100
Reformer/urea | 0 | 0 | 0 | 100 | 0.05 | 100
Urea/storage | 0 | 0 | 0 | 100 | 0.05 | 100
NH3/storage | 10 | 0.43 | 10 | 5 | 2.69 | 95
Storage/out | 55 | 0 | 55 | 5 | 6.99 | 95
NH3/out | 0 | 0 | 0 | 100 | 0.05 | 100
Urea/out | 0 | 0 | 0 | 100 | 0.05 | 100

The other significant feature of MBRA is the ability to prioritize the important links and nodes for reduction of risk. This is shown in Figure 2.5. The figure illustrates the significant links and relative importance of each of the links and nodes. Of course, the data are the limiting factor, and this also illustrates the limits of the program. In the ammonia plant example, the receiving is only a pipeline. The gas reformer and the ammonia conversion and urea conversion units are the heart of the plant and need to be protected. Urea cannot be made without ammonia, and the shipping and storage departments are relatively dispersed. The MBRA program can also analyze risk using the fault tree method.


Figure 2.5 Diagram to prioritize the important links and nodes for reduction of risk.
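MBRA's optimization is internal to the tool, but the general idea of spreading a fixed protection budget across units in proportion to their initial risk can be sketched as follows. This is a deliberate simplification, not MBRA's actual algorithm; the risk figures are taken from the "Risk initial" column of Table 2.5:

```python
def allocate_budget(risks: dict[str, float], budget: float) -> dict[str, float]:
    """Split a protection budget across units in proportion to each
    unit's initial risk score. Zero-risk units receive nothing."""
    total = sum(risks.values())
    if total == 0:
        return {name: 0.0 for name in risks}
    return {name: budget * r / total for name, r in risks.items()}

# Nonzero "risk initial" values from Table 2.5 ($):
risks = {"Receiving": 10, "Gas reformer": 9, "Storage": 3.33,
         "Shipping": 10, "Incoming gas": 10, "NH3/storage": 3.33,
         "Storage/out": 6.99}
allocation = allocate_budget(risks, budget=500)
for name, amount in sorted(allocation.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} ${amount:,.2f}")
```

A proportional split like this ignores the finite-project problem noted above: real upgrades come in indivisible chunks, so the computed allocation is a starting point for budgeting, not a final answer.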

Risk Management by Fault Tree Methods and Risk-Informed Decision Management

Fault tree analysis

Fault tree analysis is a method of diagramming and assigning probabilities to the risk of complex events by breaking them down into logical steps. The completed diagram is very much treelike: it has a single top event (the attack), and the steps that lead up to the attack are broken down in fine detail. Those familiar with program evaluation and review technique (PERT) or critical path method (CPM) construction management techniques will be comfortable with fault tree, event tree, or root cause analyses. Event tree analysis (ETA) and root cause analysis (RCA) focus on the past, seeking to understand what did occur; fault tree analysis (FTA) is forward-looking, trying to anticipate what could occur or go wrong.

All of these techniques focus on a stepwise analysis of the logical progression of events. All of the techniques are binary, in that while there may be multiple events feeding a particular condition, that fault will either occur or not occur and pass through to the next level or to a conclusion.
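The binary pass/fail logic is what makes a fault tree computable: each gate combines independent events that must all occur (AND) or any one of which suffices (OR). A minimal sketch with hypothetical probabilities:

```python
def and_gate(*probs: float) -> float:
    """All contributing (independent) events must occur."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def or_gate(*probs: float) -> float:
    """Any one of the (independent) events suffices."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur

# Hypothetical top event: an intruder reaches the control room.
# Requires defeating the fence AND (bribing a guard OR tailgating).
p_top = and_gate(0.10, or_gate(0.01, 0.05))
print(f"{p_top:.5f}")  # 0.10 * (1 - 0.99 * 0.95)
```

The independence assumption matters: shared causes (a single power failure disabling several barriers) must be modeled as their own events, not folded into the gate arithmetic.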

When an ETA is performed at the same time as an FTA around a central event, the result is generally referred to as bow-tie analysis. In most bow-tie analyses, the central event is displayed in graphical form, with the faults plotted to its left and the event outcomes plotted to its right. The advantage of bow-tie analysis, which will be covered later, is that it displays faults and barriers in one graph.

RIDM

RIDM is an acronym for risk-informed decision management; it and continuous risk management (CRM) are used by NASA for risk management. The overall NASA formula for risk management is

Risk Management = RIDM + CRM

Quoting from a NASA document4,

RIDM is a fundamentally deliberative process that uses a diverse set of performance measures, along with other considerations, to inform decision making. The RIDM process acknowledges the role that human judgment plays in decisions, and that technical information cannot be the sole basis for decision making. This is not only because of inevitable gaps in the technical information, but also because decision making is an inherently subjective, values-based enterprise. In the face of complex decision making involving multiple competing objectives, the cumulative wisdom provided by experienced personnel is essential for integrating technical and nontechnical factors to produce sound decisions.

Risk management by NASA’s standards is a continuous process because of the levels of uncertainty with which they deal. We have chosen NASA’s risk management process for further explanation and evaluation because many of the uncertainties in the security process and in NASA’s work product are similar in that they deal with many unknowns. NASA may have one of the best and most effective risk management systems, as it focuses on continuous improvements, and once mastered, it will lead to an understanding of many different types of risk management systems.

NASA defines risk as an operational “set of triplets”: scenarios, likelihoods, and consequences. NASA further defines the RIDM process as a set of continuous interactions among the stakeholders, risk analysts, subject matter experts, technical authorities, and decision makers. The RIDM process is conducted in the same manner as conventional risk assessment processes. Figures 2.6, 2.7, and 2.8 illustrate the overall process.


Figure 2.6 NASA’s risk-informed decision management process.

From Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners. 2nd ed. NASA/SP-2011-3421, NASA; 2011.


Figure 2.7 Factors that go into a risk-informed decision management process.

From Stamatelatos M, Dezfuli H. Probabilistic Risk Assessment Procedures and Guidance for NASA Managers and Practitioners. NASA/SP-2011-3421, NASA; 2011. http://www.hq.nasa.gov/office/codeq/doctree/SP20113421.pdf.


Figure 2.8 Steps in the RIDM process.

From Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners. 2nd ed. NASA/SP-2011-3421, NASA; 2011.

The RIDM process is very much like the ISO 9000, 14000, and 18000 processes and related systems, in which evaluation and review are continuous until the process is complete or a consensus of optimization has been achieved, that is, until the participants agree that the optimum balance between risk and other factors has been reached. The RIDM process was specifically designed for NASA’s production and mission-related activities, such as shuttle launch decisions and the manufacture of special parts and systems, but the processes are the same for a security system.5

The International Atomic Energy Agency has adopted the RIDM process and adapted it to nuclear works. The similarities are obvious and are shown in Figure 2.9.


Figure 2.9 The IAEA’s adaptation of the RIDM process.

From IAEA, A framework for an integrated risk informed decision making process. INSAG-25.

RIDM process steps

The steps for the RIDM process are as follows.

Step 1: Identify

Capture stakeholders’ concerns regarding the performance requirements and/or the critical elements that must be protected. (Note: the following section on CARVER + Shock has some very good suggestions about ranking and setting priorities.) Each element must have an associated risk, and the probability of success or failure is dictated by one or more scenarios. It is the consequences of the risk scenario that determine the outcome.

As an example, if we say that the gas reformer’s maximum scenario is the complete destruction of the reformer and the resulting shutdown of the plant, then anything less is a partial success, and we can attach numbers to it in terms of percentages of shutdown and/or lost production in the other operations as well.

Other scenarios and multiple event scenarios are equally likely. There are lots of things that can take place, and some of them are routine, internal, or external and are often considered during the design stage. Examples might include power interruptions, an electrical voltage spike, simple corrosion, advanced corrosion due to electrolysis, and wear and tear on the system. These are technically “attacks,” but their nature and response are more in the realm of design deficiencies and maintenance and repair activities.

Sabotage, on a large or small scale, is just one type of attack. Technically, there is little difference between sabotage and “mischief” except in the damage done. Mischief may range from disabling equipment so that an employee gets additional overtime pay to plugging a vent just before the plant union goes out on strike so that a chamber is difficult to clean out. Sabotage can create actual and willful harm to major elements of the operating portion of the plant. The difference is that mischief is harder to detect and is more of an inconvenience.

Other types of scenarios may include physical or electronic interference with remote locations or communications. A remote pumping station is a vulnerable area that is easy to attack. The scenario might be someone breaking in and stealing components, simply cutting the power, or massive vandalism.

Any teenager with some talent, a modem, and a wire wrapped around an orange juice container can make a transmitter that could send signals to unsecured transmitters and receivers, which open and close valves or control chemical or thermal reactions.

While some of these scenarios are very simple, the threats posed can be quite complex. The point is that they need to be considered in the planning and posed as potential occurrences.

In the overall preparation of scenarios, it is important to consider the environmental and other effects of any unplanned releases of chemicals within the plant. If incidents within the plant result in the release of volatile chemicals, community notifications and environmental regulatory notifications may be required. There are several good programs that can assist in the planning for various types of environmental releases.6

NASA, because it is focused on specific missions, recommends that scenario development be decomposed into a set of steps, consulting with the stakeholders to gain their concurrence that the scenario is realistic. Quoting from their RIDM document:

Stakeholder expectations result when they i) specify what is desired as a final outcome or as a thing to be produced and ii) establish bounds on the achievements of goals (these bounds may, for example, include costs, time to delivery, performance objectives, organizational needs). In other words, the stakeholder expectations that are the outputs of this step consist of i) top-level objectives and ii) imposed constraints. Top-level objectives state what the stakeholders want to achieve from the activity: these are frequently qualitative and multifaceted, reflecting competing sub-objectives (e.g., increase reliability vs. decrease cost). Imposed constraints represent the top-level success criteria outside of which the top-level objectives are not achieved.

So, develop your scenarios as completely as you can, and involve the various departments in defining the scope of each scenario. Get consensus where possible, especially if it involves outside notifications and support from others.

Step 2: Analyze

The analysis step requires estimating the magnitude and consequences of individual risk elements and working through each scenario to completion. These are the consequences of the scenarios in Step 1 and should include recovery steps and related costs until the plant is fully restored and operating.

One of the chief problems in the analysis is related to the timing of the attack and the probability of its success. If there are n people on the scenario committee, anticipate that the number of attack and timing scenarios to be evaluated will be between 120 and 250% of n. These opinions are just that, and the essential difficulty in producing a coherent report is that you do not know what you do not know, and you may never know it until after an attack has occurred. Without good intelligence from the community, you will never know whose scenario and timeline for an attack is more accurate until after it occurs.

The second area of possible disagreement is the assessment of the damage. One person's minor damage is another person's total destruction. Here is where you must involve the process engineering group to designate the key elements of the plant that are vulnerable to explosions, fire, weather, control failures, etc.

The third area will be the cost of reconstruction. If the plant has been built in the past 5 years or so, it might be possible to use a construction cost index to estimate the cost of replacement. Otherwise, you could be looking at a much more comprehensive engineering study to estimate the costs. Vendors can and will provide quotations for their good clients. It is also possible to get some indication of the cost of replacement by comparison with published data on comparably sized new equipment. Chemical engineering handbooks offer guidance on scaling up capital costs for process equipment.
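As a rough illustration of the scaling guidance mentioned above, the "six-tenths rule" common in chemical engineering cost references estimates the cost of a unit at a new capacity from the known cost of a comparable unit at a reference capacity. The sketch below is hypothetical: the function name, the default 0.6 exponent, and all dollar and capacity figures are illustrative assumptions, not data from any vendor or published index.

```python
# Illustrative capital-cost scaling using the "six-tenths rule" found in
# chemical engineering cost-estimation references. All figures below are
# hypothetical placeholders, not vendor quotations.

def scaled_cost(known_cost, known_capacity, new_capacity, exponent=0.6):
    """Estimate replacement cost of equipment at a new capacity from the
    published cost of a comparable unit at a known capacity."""
    return known_cost * (new_capacity / known_capacity) ** exponent

# Example: a published 500 t/day unit cost $2.0M; estimate an 800 t/day unit.
estimate = scaled_cost(2_000_000, 500, 800)
print(f"Estimated replacement cost: ${estimate:,.0f}")
```

For equipment classes with published cost exponents, the 0.6 default should be replaced with the tabulated value for that class.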

The preparation of a detailed cost estimate for a facility can take several hundred to several thousand hours of an estimator's time just to get within 25% of the actual cost, depending upon the level of detail required, the complexity of the plant or process, and the accuracy demanded of the estimate.

We never have sufficient data to predict when an attack might occur, so continuous vigilance is required, along with the capability for extremely rapid response when and if an event occurs. There are scenarios and then there are extreme scenarios, and the response must be proportionate to the type and kind of attack.

The analysis of the effects of an attack depends upon the scenario. Some attacks will occur quickly, and others will provide warning. Floods, storms, and other adverse weather events give warning of their occurrence and severity. Other attacks will not provide warnings, and their severity can be much greater. A fire in the pump house may have minor consequences, or it may lead to a plant-wide conflagration. It is sometimes difficult to distinguish between plant safety issues and plant security issues. However, for the purposes of our discussion and planning, an attack will be an external event, caused by outside forces or outside persons.

Confining ourselves to events attributed to outside forces still leaves us with a problem in estimating timing and duration. A blizzard or bad ice storm may cause major disruption, or it may be a minor event that briefly disrupts the power. A physical assault by a terrorist organization, or a cyber attack designed to disrupt processes, steal information, divert shipments, or conceal other misdeeds, can be difficult to analyze because one does not know the scale of the event. In the case of theft or diversion, detection often comes after the fact.

Sabotage, as opposed to mischief, is also an attack. Someone deliberately leaving a door or gate unlocked to allow an intruder into the facility could be classified as an attack, or it could be simple theft or sabotage. It is hard to tell beforehand what can happen unless one gets down to area-specific planning for intrusion attacks. An intruder in the tank farm area may or may not be harmful, whereas an unauthorized person in the computer center will probably be a serious event because of the harm that can be done.

The analysis process will have multiple outcomes based on each scenario. The timing and nature of an attack is a best guess. Where there is information available from police, military, or reliable intelligence sources, the timing of an attack may be known or estimated. Otherwise, the system must be ready for eternal vigilance.

The overall process of analysis includes several important elements:

  • What can go wrong—or how can an attack occur?
  • How big would the attack be?
  • What are the security systems to prevent, deter, or defeat an attack?
  • What is the margin or reserve to prevent an incident or follow-up attack?7

Obviously, the attack scenarios must be limited to those that have a realistic chance of occurring. Otherwise, a plant on the equator would be evaluating the possibility of being hit with a blizzard.

In planning the scenarios, you want to include the ultimate scenario even if it does seem remote. Unless you are confident that the likelihood of the ultimate destruction scenario is less than about 1 in 10 million, you should include it in your planning. That should, of course, cover the ultimate destruction of the facility by explosion or fire or both, along with community evacuations and cleanups.

Step 3: Planning

This is the step where the response to the attack scenario is plotted out. Logistics should include the manpower required for the type of response and should be specific to the scenario. Options tend to be the enemy of detailed planning, but they are necessary. It is an exercise in visualization: one has to (i) put oneself in the place of the attacker, (ii) if the attacker is detected, estimate how much force is required, and (iii) if the attacker is not detected, estimate what could or would be damaged by the attacker.

One of the easiest considerations in this type of exercise is to estimate how long the attacker can operate before the attack is detected. If, for example, an attacker manages to penetrate the outer perimeter of the plant, how far can he get before detection occurs and the plant is alerted? Think of the attacker’s presence as a wave spreading out from a pebble tossed into a pond. Gates, locked doors, fences, etc. are not necessarily deterrents, but they will slow an attacker down, making the area of probable access and search radius smaller.
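The pebble-in-a-pond picture above can be sketched quantitatively: the radius to be searched grows with the time since the breach, and each barrier the intruder must cross adds delay that shrinks it. This is a hypothetical sketch; the function name, the walking speed, and the barrier delay values are invented for illustration, not measured data.

```python
# A rough sketch of the "pebble in a pond" idea: the area an undetected
# intruder could have reached grows with time, and each barrier crossed adds
# delay that shrinks the effective search radius. Speeds and delay times are
# illustrative assumptions, not measured values.

def search_radius_m(minutes_since_breach, walking_speed_m_per_min=80,
                    barrier_delays_min=()):
    """Upper-bound radius (meters) to search, net of barrier delay time."""
    travel_time = max(0.0, float(minutes_since_breach) - sum(barrier_delays_min))
    return walking_speed_m_per_min * travel_time

# 10 minutes after a perimeter breach, with a fence (2 min) and a locked
# door (3 min) between the intruder and the process area:
print(search_radius_m(10, barrier_delays_min=(2, 3)))  # 400.0
```

The same arithmetic works in reverse: given a required response time, it indicates how much cumulative barrier delay is needed to keep a critical area outside the attacker's reach.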

Other questions in the response planning scenario include:

  • What resources are required?
  • How will outside resources be used, including fire, police, and emergency services?
  • How fast can these resources, both internal and external, be tapped, and what should their response be?
  • Are the equipment or personnel prepositioned so that they can respond rapidly to the event?

One of the other critical elements in the response planning scenario should be the availability of emergency services, such as a hospital, and its response time. That series of questions should include analysis of the directions and transit times, and of the ability of the hospital to handle or decontaminate injured personnel without damage to its emergency room or other facilities.

Sample planning exercise

Another way of looking at the planning exercise is to consider the types of elements outlined below:

Start here (deterministic elements):

  • Defense in depth
  • Barriers and levels
  • Security margins
  • Flexibility of defense force
  • Experience and history
  • Communications systems
  • Backup communications
  • Exercises and drills
  • Outside resources

Continue here (deterministic approach). Answer the questions:

  • Are the existing security systems meeting their required purpose and function?
  • If not, what are the consequences?
  • What are the constraints on the systems?
  • What can be done to improve the systems?
  • How much will the improvements cost?

Start here (probabilistic elements):

  • Provide analysis of likelihoods of various penetration and attack scenarios
  • Can the facility handle multiple events for both attacks and disasters if they happen concurrently?
  • What are the notification and compliance requirements?

Continue here (probabilistic approaches). Answer the questions:

  • What can happen with a security breach or an attack?
  • How serious could it be?
  • How likely is it to occur?
  • How reliable are the outside resources?
  • What are the capabilities of the outside resources to handle a catastrophe?

Then compare both sets of answers to target goals to see if the security is adequate for the facility. If yes, you are done; now monitor the system. If no, start back at the beginning and reanalyze the system.

The company’s tolerance for risk should be addressed, as it may be a determining factor in the overall security planning. In most cases, the risk statistics will be low, and the risks are probabilistic rather than actual.8 Risk levels based upon experience are generally of little use, because the history will likely show that the “attack” did not happen. “Did not happen during previous history” does not mean that the “attack” cannot or will not happen. The probability may be very low, and one will be trying to decide between events with time-based probabilities on the order of 0.001% and 0.0001% or less. One has to determine the tolerance for risk, even for low-probability catastrophic events, as it will affect the resources made available.
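One hedged way to compare events whose probabilities differ by orders of magnitude is to annualize each as probability times consequence. The figures below are invented purely to illustrate the comparison; they are not statistics from any plant.

```python
# Hypothetical illustration of comparing very small event probabilities:
# annualized expected loss = annual probability x consequence in dollars.
# All probabilities and losses here are invented for illustration.

def expected_annual_loss(annual_probability, consequence_dollars):
    return annual_probability * consequence_dollars

plant_destroyed = expected_annual_loss(1e-5, 500_000_000)   # about $5,000/yr
pump_house_fire = expected_annual_loss(1e-2, 2_000_000)     # about $20,000/yr

# Even though total destruction is 1,000x less likely, the routine fire
# dominates the annualized figure, which argues for budgeting against both.
print(plant_destroyed, pump_house_fire)
```

The annualized figure is only a screening aid; a company with low tolerance for catastrophic outcomes may still rank the low-probability scenario first.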

Risk management decisions should be elevated to the sponsoring organization at the next higher level of the hierarchy when the risk cannot be managed by the present organizational unit. If the risk is plant-wide and may have major consequences for the corporation, the decision about acceptable levels of risk should be sent to the appropriate corporate management level for sign-off or commitment. A risk can be dropped from consideration only when its associated risk drivers are no longer considered potentially significant.

Step 4: Track

All risk decisions must leave a paper trail so that they can be examined and evaluated after an exercise, an event, or even a failure. The purpose is not to see who failed, but to find out how the system worked and how well it worked in preventing the event. The meetings and documents should be recorded so that the thought process of the risk managers can be captured for later evaluation.

Unfortunately, this can result in huge paper and electronic files. The most effective “paper trail” will be a time-stamped set of electronic communications files, followed up with written reports every time there is an incident, no matter how small. This documentation should include copies of video files and radio communications both internal and external to the plant. Especially where there is an injury or an arrest, it will be vital to the defense of the plant and the employee to ensure that actions are defensible and that the appropriate level of force has been employed.

Step 5: Control

Periodic review and editing of the risk management plans are required. Optimally, this will result in a book of procedures that is used in training the security force and key managers. There should also be a set of training requirements that relate to the procedures; this training document should specify the levels of education, positions, qualifications, and examinations for each level of security personnel. The procedure manual does not need to be highly detailed, but it should be specific enough that security force personnel know what actions to take and what levels of force, response, and reserve are appropriate. The procedure manual should also specify the qualifications, responsibilities, duties, and authority for each level of the security force.

Much of the language in the procedure manual will be repetitive, and it may be helpful to have some prewritten descriptions of standard actions available to plug into the documents. CAUTION: This type of standard language can lead to unimaginative and preprogrammed actions where the plan is not taken seriously or is put on the shelf as an unused reference work, only to be used as a justification of actions taken whether they were appropriate or not.

If an armed guard force is used, the plan must specify the levels of force and specify the training requirements and the qualifications for the security officers before they are allowed to be armed. If the guards are not armed, the plan should specify the procedures for contact of the local police and/or fire department and the levels of supervisory control who can make that contact.

Ideally, the review process should be conducted on a continuous or semicontinuous basis. The security manager and/or the security team should review the existing plans regularly. Depending on the size of the plans and the complexity of the facility, that may require almost continuous review and updating. At a minimum, the plan needs to be reviewed at least yearly, and more frequently wherever possible.

Every incident needs to be reviewed to ensure that the response was appropriate to the incident and that the plan is revised to ensure appropriate force/response levels are employed. If a portion of the response or risk management plan becomes untenable or nonviable, it should be stricken from the master planning book.

The universal limitation of all the planning techniques we have discussed so far is their inability to tell us when an incident could occur. What we do not know, we do not know. The solution is planning and eternal vigilance against the unplanned, accidents, and incidents.

RIDM procedures

The procedures for preparing decisions under the RIDM process start with the preparation of boundary statements and objectives.

  • The objective of the plant security is to protect the facility and prevent damage, theft, sabotage, external aggression, and loss from other internal or external sources.
  • The security operations will be responsible for all of its own computer and related communications activities separate from the computer and communications resources employed by the information technology (IT) department.
  • The IT department shall support the security department where and when requested.
  • The security department is responsible for control of external events and internal threats including the prevention of theft, diversion of assets, and physical intrusion.
  • The security department has the responsibility to inspect and control movements into and from the plant, including all shipments.
  • The security department will address all potential events that can create a loss event greater than $______ (a specified sum).
Those types of statements establish the duties, responsibilities, and boundaries of the security division. The last statement establishes a performance objective, or metric, against which security can be measured. The next steps are to determine what type of response is required and to establish what types of attack scenarios can create damage, and the magnitude of that damage. For example:

  • An intruder in the plant could shut down the reformer.
  • A car bomb could be detonated inside or near the plant.
  • An intruder with a bomb could cripple or disable the power plant.
  • A shipment of drugs or explosives could be smuggled into the warehouse.
  • A suicide bomber could take out the front gate.
  • Multiple closely spaced incidents (a ripple attack) could provide a diversion and allow multiple entries into the plant.
  • Be sure to include the ultimate scenario of plant-wide destruction in your disaster and attack scenarios.
  • Etc.

Each of these scenarios has a damage level associated with it, and depending upon the other assumptions made, there may be several possibilities for the location, timing, etc. of the event. At this point, a review is needed to select the most likely events based upon the facilities, location, etc. The evaluation of the cost will require input from engineering and other specialists. The scenarios should also be prioritized and ranked by the likelihood of their occurrence and the damage associated with each, using the techniques outlined earlier.
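The prioritization described above can be sketched as a simple likelihood-times-damage ranking. The scenario names echo the examples in the list, but the probabilities and damage figures below are hypothetical placeholders for a committee's real estimates.

```python
# A minimal sketch of ranking attack scenarios by likelihood x damage.
# Scenarios, annual probabilities, and damage figures are hypothetical
# placeholders, not assessments of any real facility.

scenarios = [
    ("Intruder shuts down reformer",  0.002,  40_000_000),
    ("Car bomb at gate",              0.0005, 15_000_000),
    ("Drug/explosive smuggling",      0.01,    1_000_000),
    ("Ripple attack, multiple entry", 0.0001, 120_000_000),
]

# Sort by expected loss, highest first.
ranked = sorted(scenarios, key=lambda s: s[1] * s[2], reverse=True)
for name, p, damage in ranked:
    print(f"{name}: expected loss ${p * damage:,.0f}")
```

A real ranking would also weigh consequence severity separately from expected loss, since a committee may treat rare catastrophic scenarios as qualitatively different.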

There are several ways to arrive at the probability of occurrence, estimate the likelihood of the success of an attack, and/or prioritize the targets. One way of prioritizing targets is CARVER + Shock, which is discussed next. Fault tree analysis (FTA) can be used to estimate the success or failure of an attack.

CARVER + Shock

CARVER + Shock is a planning tool that aids in developing and selecting priority targets for protection. It was first developed for the food industry, and it is now being expanded into free software that enables other process industries to use it with slight adaptation. The CARVER + Shock system uses a matrix formulation of the kind already discussed. The interesting and different element is that it also considers the shock value of an attack and the resulting damage as part of the matrix. The word CARVER is an acronym for criticality, accessibility, recuperability, vulnerability, effect, and recognizability. The Shock part comes in when we evaluate the psychological effect of the attack or incident on the affected facility or nation. It is rumored that during Desert Storm, US Special Operations Forces used CARVER to identify the critical air defense network in Iraq. The thorough analysis, breaking the system into critical parts, identified the communications bunkers that made the functioning of Iraq's radar system possible. When these bunkers were destroyed by small strikes, the larger air campaign against Iraq became possible with minimal loss of aircraft.

Keep in mind that if you perform a CARVER + Shock analysis, you will need to make several iterations to resolve the scenarios into their smallest parts, which have critical impact upon the facility. As you work through the scenarios, some activities will be important, and others will not have an impact.

Defining the terms for CARVER + Shock

Criticality is the measure of the public health or economic impact of the attack. It should be a scaled evaluation of the cost of the attack in terms of budgets, corporate value, facility value, or other appropriate measures.

Accessibility is the ease of ingress and egress to the facility or the site. A site with a fence will have a lower accessibility than one without. Similarly, natural barriers such as rivers, lakes, and natural terrain would give a site a lower score.

Recuperability or recoverability is a measure of the ease with which the facility can be repaired or recovered from an attack on a specific part of the facility or the entire facility.

Vulnerability is the ease of accomplishing the attack. The challenge to assigning a priority or rank to this item is that it also depends upon the type of attack. If a facility is relatively wide open and an intruder can penetrate the perimeter and walk around the facility unchallenged, then the vulnerability would be very high. Even if a facility had a relatively secure perimeter, it might be vulnerable to an attack by a standoff weapon such as a military-grade or homemade mortar.

Effect is the direct loss the attack would cause, such as loss of production capacity or inventory. It might be important to separate inventory from production in this category for separate parts of the facility.

Recognizability is the ease of identifying the target. Much of that may depend upon the attacker's familiarity with the plant and his ability to distinguish high-value targets from those having little or no value. A good example of a high-value target might be the gas reformer in an ammonia plant, versus the bulk gas or liquid storage tanks. Often, the tanks will also be protected by a dike or other containment. The loss of tank integrity, if there is no fire and the tank containment is intact, may be of minimal consequence.

Shock is an attempt to evaluate the combined health, economic, and psychological impacts of an attack. While the measure was developed primarily for the food industry, where an attack impacts individuals directly, the shock value of an attack on a single facility may be insignificant unless that facility is making something of national importance.

CARVER + Shock definitions need to be scaled to suit the plant size and other impacts. The top-end scale should be whatever is agreed upon by plant management, and it is suggested that the upper end of the scale might be, for example, the loss of the entire plant, a large corporation’s projected profits for a year, the net worth of the facility, or other appropriate measures. It is important not to set the upper levels of impact too high, and each category should have some 1s and 10s in it. It is important to recognize that there might be a tendency to set values artificially high to stimulate plant investment or to set them too low to avoid inspection and accountability. It is also important to consider that the various categories of attack may produce some casualties that are unavoidable.

Applying CARVER + Shock

Know yourself

This includes the entire infrastructure or facility to be assessed. Many other VA systems attempt to identify only critical systems to evaluate, in the hope of saving time and resources. This shortcut can overlook crucial vulnerabilities, as users of this tool have proven time and again: significant vulnerabilities often lie in areas that most experts never considered critical and may have overlooked.

Know the threat
  • Actual, localized threat for a specific target system.
  • Design basis threat for a higher-level assessment.
  • We must understand who the threat is, why they want to attack, how they will attack you, and what the desired effect is.
Know your environment

This is information about the physical, political, and legal environment that affects the target system and the threat.

Know what your enemy knows about you

This additional component to the preassessment is sometimes called red teaming. It is not required to identify the actual vulnerabilities; it is used more to predict the probability of attack.

CARVER + Shock produces a bar graph that indicates the levels of vulnerability of the facility. It also includes a 108-question interview section covering recall and processing safety that is primarily designed for the food industry, but with some imagination, it can be used in other industries (Tables 2.7, 2.8, 2.9, 2.10, and 2.11).

Table 2.7 CARVER + Shock criticality table

Criticality criteria (score or ranking)

  • 10–9: Loss of over 10,000 lives OR loss of more than $100 billion (Note: if looking at a company level, loss of >90% of the total economic value for which you are concerned*)
  • 8–7: Loss of life between 1,000 and 10,000 OR loss of between $10 and $100 billion (Note: if looking at a company level, loss of between 61 and 90% of the total economic value for which you are concerned*)
  • 6–5: Loss of life between 100 and 1,000 OR loss of between $1 and $10 billion (Note: if looking at a company level, loss of between 31 and 60% of the total economic value for which you are concerned*)
  • 4–3: Loss of life less than 100 OR loss of between $100 million and $1 billion (Note: if looking at a company level, loss of between 10 and 30% of the total economic value for which you are concerned*)
  • 2–1: No loss of life OR loss of less than $100 million (Note: if looking at a company level, loss of <10% of the total economic value for which you are concerned*)

Table 2.8 CARVER + Shock accessibility criteria

Accessibility criteria (score or ranking)

  • 10–9: Easily accessible (e.g., target is outside the building with no perimeter fence). Limited physical or human barriers or observation. Attacker has relatively unlimited access to the target. Attack can be carried out using medium or large volumes of contaminant without undue concern of detection. Multiple sources of information concerning the facility and the target are easily available
  • 8–7: Accessible (e.g., target is inside the building but in an unsecured part of the facility). Human observation and physical barriers limited. Attacker has access to the target for an hour or less. Attack can be carried out with moderate to large volumes of contaminant but requires the use of stealth. Only limited specific information is available on the facility and the target
  • 6–5: Partially accessible (e.g., inside the building but in a relatively unsecured, but busy, part of the facility). Under constant possible human observation. Some physical barriers may be present. Contaminant must be disguised, and time limitations are significant. Only general, nonspecific information is available on the facility and the target
  • 4–3: Hardly accessible (e.g., inside the building in a secured part of the facility). Human observation and physical barriers with an established means of detection. Access generally restricted to operators or authorized persons. Contaminant must be disguised, and time limitations are extreme. Limited general information available on the facility and the target
  • 2–1: Not accessible. Physical barriers, alarms, and human observation. Defined means of intervention in place. Attacker can access target for <5 minutes with all equipment carried in pockets. No useful publicly available information concerning the target

Table 2.9 CARVER + Shock recognizability criteria

Recognizability criteria (score or ranking)

  • 10–9: The target is clearly recognizable from a distance and requires little or no training to identify
  • 8–7: The target is clearly recognizable and requires only a little training to identify
  • 6–5: The target is difficult to recognize at night or in bad weather, or might be confused with other targets or target components, and requires some training for recognition
  • 4–3: The target is difficult to recognize at night or in bad weather. It is easily confused with other targets or components and requires extensive training for recognition
  • 2–1: The target cannot be recognized under any conditions, except by experts or insiders

Table 2.10 CARVER + Shock vulnerability criteria and effect criteria

Vulnerability is a measure of how easy it would be to introduce weapons, bombs, poisons, or other foreign substances into the plant, near the target, or into the target processes.

Vulnerability criteria (score or ranking)

  • 10–9: Target characteristics allow for easy introduction of sufficient agents to achieve the aim
  • 8–7: Target characteristics almost always allow for introduction of sufficient agents to achieve the aim
  • 6–5: Target characteristics allow a 30–60% probability that sufficient agents can be added to achieve the aim
  • 4–3: Target characteristics allow a moderate probability (10–30%) that sufficient agents can be added to achieve the aim
  • 2–1: Target characteristics allow a low probability (≤10%) that sufficient agents can be added to achieve the aim

Effect criteria (score or ranking)

  • 10–9: Greater than 50% of the system's production or function impacted
  • 8–7: 25–50% of the system's production or function impacted
  • 6–5: 10–25% of the system's production or function impacted
  • 4–3: 1–10% of the system's production or function impacted
  • 2–1: Less than 1% of the system's production or function impacted

Note: Criteria established by the CARVER + Shock software.

Table 2.11 CARVER + Shock shock value

Shock value is the combined measure of the health, psychological, and collateral national economic impacts of a successful attack on the target system. The psychological impact is increased if there are a large number of deaths or if the target has historical, cultural, religious, or other symbolic significance. National economic damage and casualties among innocents (children and the elderly) are also factors.

Shock criteria (score or ranking)

  • 10–9: Target has major historical, cultural, religious, or other symbolic importance. Loss of over 10,000 lives. Major impact on sensitive subpopulations, for example, children or the elderly. National economic impact of more than $100 billion
  • 8–7: Target has high historical, cultural, religious, or other symbolic importance. Loss of between 1,000 and 10,000 lives. Significant impact on sensitive subpopulations, for example, children or the elderly. National economic impact between $10 and $100 billion
  • 6–5: Target has moderate historical, cultural, religious, or other symbolic importance. Loss of life between 100 and 1,000. Moderate impact on sensitive subpopulations, for example, children or the elderly. National economic impact between $1 and $10 billion
  • 4–3: Target has little historical, cultural, religious, or other symbolic importance. Loss of life <100. Small impact on sensitive subpopulations, for example, children or the elderly. National economic impact between $100 million and $1 billion
  • 2–1: Target has no historical, cultural, religious, or other symbolic importance. Loss of life <10. No impact on sensitive subpopulations, for example, children or the elderly. National economic impact <$100 million

Note: Criteria established by the CARVER + Shock software.
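Putting the tables together, a CARVER + Shock assessment ultimately reduces to summing the seven attribute scores for each candidate target and ranking the totals. The sketch below assumes hypothetical targets and scores; a real assessment would take each value from Tables 2.7 through 2.11 and plant-specific judgment.

```python
# Sketch of tallying CARVER + Shock scores for candidate targets using the
# 1-10 scales of Tables 2.7-2.11. The targets and every score below are
# hypothetical illustrations, not a real assessment.

ATTRIBUTES = ["criticality", "accessibility", "recuperability",
              "vulnerability", "effect", "recognizability", "shock"]

targets = {
    # scores listed in ATTRIBUTES order
    "Gas reformer":    [9, 4, 8, 5, 9, 7, 6],
    "Tank farm":       [5, 7, 4, 6, 5, 9, 4],
    "Computer center": [7, 3, 6, 4, 6, 3, 3],
}

totals = {name: sum(scores) for name, scores in targets.items()}
for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {total}")  # higher total = higher protection priority
```

Iterating the scoring, as recommended earlier, means re-running this tally as scenarios are decomposed into smaller parts and the scores are refined.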

Fault tree analysis

Analysis of a failure by fault tree is yet another way of determining the likelihood and/or probability of an attack. The fault tree system breaks the planning and activity process down into binary actions that have either "go" or "no-go" outcomes, and when probabilities are assigned to each of the elements, a picture of the overall probability emerges. One of the hardest things to do in an FTA is to decide what the individual probabilities are for the individual actions that lead into events. Monte Carlo methods are ideally suited to FTA because the variables can be stepped through to gain an overall probability of success. In order to estimate the probability of an attack, a large number of variables have to be examined and scenarios run with realistic probabilities. The principal limitations are in the reliability of the assumptions regarding the possibility of an attack and the likelihood of the success or failure of that attack.

The FTA is the most comprehensive type of analysis and can be used whenever there is a logical relationship between events and consequences. Related to FTA is the event tree analysis (ETA). The two work in opposite directions: an FTA starts with a failure (the top event) and reasons backward to its possible causes, while an ETA starts with an initiating event and works forward through its possible consequences.

For each event, a fault tree is created from the possible outcomes, and an event sequence is constructed using a series of AND and OR gates with modifiers. While this book is not a fault tree tutorial, some background on the uniform notation is necessary to understand FTA construction.

Each element in the event sequence, and each subelement in the event sequence, has a probability. At each gate, the probabilities of the possible outcomes sum to 1, and the joint probability of a sequence of independent elements is their product, symbolized by Π:

P = P1 × P2 × ⋯ × Pn = ΠPi

The FTA starts with a top event, usually described by a rectangle; basic events are drawn as circles. The rectangle is generally used for information or description purposes. OR gates are sometimes drawn as chevrons or a rounded chevron shape (shown in the following).

Other element shapes include diamonds, which indicate undeveloped events, along with the distinct shapes indicating an OR gate and an AND gate. The gates express the likelihood of an event or the probability of the event occurring. Note that for an AND gate, both A and B must occur for the activity to pass out of the gate; the presence of A or B alone will not activate the gate.

For an OR gate, A or B (or both) can cause the fault to propagate through the gate: it is an inclusive either/or condition. Consequently, the probability of an event passing through an OR gate is higher than the probability of the highest single event leading into the gate, because one input OR the other, or both, can occur (Fig. 2.10).


Figure 2.10 Common fault tree analysis symbols in current usage.

  • The examples that follow will help explain the functions.
  • The mathematical basis for an AND gate is PAND gate = P1 × P2, denoted by Pa.
  • The mathematical basis for an OR gate is POR gate = 1 − (1 − P1) × (1 − P2), denoted by Po (Figs. 2.11 and 2.12).
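The two gate formulas above can be written directly as small functions. This is only a sketch for experimenting with the arithmetic, not part of any software package named in this book; the functions accept any number of independent inputs.

```python
from functools import reduce

def p_and(*probs: float) -> float:
    """AND gate: every input must occur, so the probabilities multiply."""
    return reduce(lambda acc, p: acc * p, probs, 1.0)

def p_or(*probs: float) -> float:
    """OR gate: the fault fails to propagate only if every input fails,
    so P = 1 - product of the complements (1 - Pi)."""
    return 1.0 - reduce(lambda acc, p: acc * (1.0 - p), probs, 1.0)

# Two inputs, matching the formulas in the text:
# p_and(0.5, 0.4) -> 0.5 * 0.4 = 0.2
# p_or(0.5, 0.4)  -> 1 - 0.5 * 0.6 = 0.7
```

Note that the OR-gate result (0.7) exceeds the larger of its two inputs (0.5), illustrating the point made above about OR gates raising the probability of propagation.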

Figure 2.11 Fault tree analysis example after Lewis.

The example for this was taken from Dr. Ted Lewis’s work on network analysis, op. cit.


Figure 2.12 Fault tree analysis example for different pathways of entry for a bomb in the plant.

Similarly, if there were three events on an OR gate, with threat probabilities P1, P2, and P3, respectively, the probability of a successful attack would be

P = 1 − (1 − P1) × (1 − P2) × (1 − P3)

and the probability of all three attacks failing is

P = (1 − P1) × (1 − P2) × (1 − P3)
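As a quick numeric check of the three-event OR relation described above, with assumed illustrative probabilities P1 = 0.2, P2 = 0.1, and P3 = 0.05 (values chosen for this sketch, not taken from the text):

```python
# Assumed illustrative threat probabilities (not from the text)
p1, p2, p3 = 0.2, 0.1, 0.05

# Probability that all three attacks fail
p_all_fail = (1 - p1) * (1 - p2) * (1 - p3)

# Probability that at least one attack succeeds (OR gate)
p_success = 1 - p_all_fail

print(round(p_all_fail, 4), round(p_success, 4))  # 0.684 0.316
```

The two results are complements: a successful attack through the OR gate is simply the failure of all three attacks to fail.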

In some instances, failures can be traced to equipment or events, and data on mean time between failures (MTBF) often help us evaluate the reliability of a particular system. Data on specific items of equipment are available from the American Society for Quality (www.asq.org). For certain classes of equipment, there is MIL-STD-781C, “Reviews of Standards and Specifications: MTBF Confidence Bounds Based on Fixed Length Test Results,” also available from the ASQ.

Event trees can be used to evaluate the cause of failures. An excellent example of the application of a detailed fault tree is found in BP’s analysis of the cause of the Deepwater Horizon failure.9 The Deepwater Horizon Accident Investigation fault tree report has since been deleted from the BP website and replaced with a discussion of the implementation of their “internal” report findings and recommendations to improve safety, in an attempt to manipulate and polish their poor corporate performance record on safety.

The Deepwater Horizon event was, for that facility, the ultimate disaster scenario, and after paying out several billion dollars, we are quite sure that BP would agree. The point is that little consideration was given to the possible occurrence of that type of disaster and ultimate failure, or to the consequences arising from the well failure, gas release, explosion and fire, and the failure of the well-sealing valve.

In hindsight (and hindsight is always perfect), an event tree analysis that considered the consequences of well failure and the events that followed might have foreseen them and could have led to the installation of safety equipment and better procedures that would have prevented the catastrophe. Additional controls and procedures to prevent the taking of shortcuts during drilling operations, together with greater corporate emphasis and focus on safety and environmental protection, should have led to the successful development of the well.

Conclusion

Risk assessment must be comprehensive and must be a continuous process performed with upper management’s knowledge and participation. The specific assessment criteria depend heavily on the type of threat and the method of attack. In most cases, those elements are unknown until after they occur. It is therefore necessary to consider a number of scenarios for each important process unit, addressing a worst-case, expected-case, and minimal-case set of attacks. It is equally important to drill for the worst case and expected case, just as one expects the plant fire brigade to practice fire drills. Only through planning and rehearsed drills can plant employees successfully handle attacks.

Steps for a good analysis

The steps for a good analysis include upper management participation and a thorough investigation of many cases. It is a team effort. Briefly, the steps for analysis are outlined below:

  1. List assets. Take an inventory of the assets.
  2. Select the type of analysis you are going to use, determining the most applicable type for your plant or corporation. Be sure to include an ultimate disaster scenario for each type of plant present at a site.
  3. Perform network analysis. Determine hubs and links to find out if there are critical elements that you may have missed.
  4. Build a model using fault tree, CARVER, RIDM, MBRA, or other software package.
  5. Analyze the model and prepare a fault tree or an event tree to verify the paths.
  6. From the event tree, you can calculate the optimum events and impacts to guide you in allocation of resources to reduce the hazards.
  7. Circulate the information and get consensus on the inputs, formalize the plan, make upgrades, and use the plan and the scenarios to train the plant personnel.
  8. Review the exercises, and then revise the plan as may be required.

Notes
