8.2. Stochastic Modeling

At the highest level of a system’s description is the specification of the system’s functionality. The security policy is normally a part of this specification. This high-level description can be used to perform a qualitative assessment of system properties, such as the security levels obtained by Common Criteria evaluation [4]. Even though a qualitative evaluation can be used to rank a particular security design, its main focus is on the safeguards introduced during the development and design of a system. Moreover, such methods evaluate only the static behavior of a system and consider neither dependencies between events nor the timing aspects of failures. As a consequence, the achieved security level cannot be used to predict a system’s actual robustness, that is, its ability to withstand attacks when running in a certain threat environment. To create a model suitable for quantitative analysis and assessment of operational security and dependability, one needs a fine-grained system description that is capable of capturing the dynamic behavior of the system. This is the main strength of state transition models, in which, at a low level, a system is modeled as a finite state machine.

A state in this context means an operational mode of a system, characterized by which units of the system are operational or have failed, whether there are ongoing attacks, active countermeasures, or operation and maintenance activities, whether parts of the system have been compromised, and so on. It is assumed that a finite number N of such states Si, i = 1, ..., N, may be defined. States are mutually exclusive; that is, a system is at any time in one and only one state. Most systems consist of a set of interacting components, and the system state is therefore the set of its component states. Making such subdivisions may simplify the modeling. It would, however, complicate the presentation of the modeling approach and is omitted here.

Normally, a system will be subject to multiple failure causes, so that the model will have multiple failure modes. During its operational lifetime, a system will alternate between its different states. This may be due to normal usage as well as misuse, administrative measures and maintenance, and software and hardware failures and repairs. The behavior of a system is, therefore, characterized by the transitions between the states, each transition triggered by an event. The event that will occur next, as well as the time until the next event, is random. Hence, the behavior of a system is a stochastic process.

In a state transition model, one usually discriminates between good states and failed states, depending on whether the required service is delivered or not. Note that in the required service, we also include whether the security requirements of a system are met. We may, of course, define more than one good/failed state classification in order to analyze the system with respect to several criteria; for example, one classification may regard the system as working whenever it delivers service, irrespective of whether the security requirements are met, whereas another may require that a secure service is delivered. In this chapter, the simple binary working/failed classification and the corresponding measures are used for illustration. However, once CTMC models are established, there is a potential to include rewards and the finer details of performability modeling [19, 20].

8.2.1. Failure Process

It has been shown in Avizienis et al. [1], Jonsson [2], and Powell and Stroud [21] that the “fault-error-failure” pathology, which is commonly used for modeling the failure process in a dependability context, can be applied in the security domain as well. Based on the results of this research, we demonstrate how a stochastic process can be used to model security failures in much the same way as the dependability community usually treats accidental and unintentional failures.

By definition, the fault-error-failure process is a sequence of events. A fault is an atomic phenomenon, either internal or external, that causes an error in a system. An error is a deviation from the correct operation of a system. An error is always internal and will not be visible from outside the system; even though a system is erroneous, it may still manage to deliver its intended services. An error may, however, lead to a failure of the system.

  • In a dependability context, a failure is an event that causes the delivered service to deviate from the correct service, as described in a system’s functional specification.

  • Similarly, a security failure causes a system service to deviate from its security requirements, as specified in the security policy.

For each failure state, which conflicts with a system’s intended functionality, we can assign a corresponding property that is violated (e.g., confidentiality-failed or availability-failed). Both security and dependability failures can be caused by a number of accidental fault sources, such as erroneous user input, administrative misconfiguration, software bugs, hardware deterioration, and so on. The failures originating from most of these faults can be modeled as randomly distributed in time, as is common practice in dependability modeling and analysis. However, the ones hardest to predict are the external malicious human-made faults, which are introduced with the objective of altering the functioning of a system during use [1].

In a security context, the result of such a fault is generally referred to as an intrusion. Because they are intentional in nature, intrusions cannot be modeled as truly random processes. Even though the time, or effort, to perform an intrusion may be randomly distributed, the decision to perform the action is not. As pointed out in Nicol et al. [11], security analysis must assume that an attacker’s choice of action will depend on the system state, may change over time, and will result in security failures that are highly correlated.

8.2.2. Modeling Intrusions as Transitions

To be able to model the effect of an intrusion as a transition between a good system state and a failed system state, one needs to take a closer look at the intrusion process itself. According to Powell and Stroud [21], there are two underlying causes of any intrusion:

  • At least one vulnerability (i.e., weakness) in a system. The vulnerability can be exploited; however, the exploitation will require a certain amount of time from an attacker.

  • A malicious action that tries to exploit the vulnerability. Since the action is intentional, a decision is implicitly made by the attacker. Not all attackers will choose the same course of action. Hence, there will be a probability that an attacker decides to perform a particular action.

An intrusion will, therefore, result from an action that has been successful in exploiting a vulnerability. Assume that i is a good (but vulnerable) system state and that j is a failed system state. To formalize the idea of an attacker’s decision, we define πi(a) as the probability that an attacker will choose action a when the system is in state i. In a low-level system abstraction model, a successful intrusion will cause a transition of the system state from the good state i to the failed state j. We assume that the system regarded is networked and may be attacked by a large number of intruders, n, as soon as it enters a vulnerable state (i.e., when a vulnerability is exposed and there is no vulnerability detection time). We model the accumulated failure intensity if all n potential attackers always take action a as λij(a). This corresponds to a probability R̃(t) of surviving a specific attack for a time t, which for small t may be approximated by:

Equation 8.1

$$\tilde{R}(t) = e^{-\lambda_{ij}(a)\,t/n} \approx 1 - \frac{\lambda_{ij}(a)}{n}\,t$$

The probability of surviving all n attacks for a time t is then R̃(t)^n. If we let the rate of an immediately successful intrusion, λij(a)/n, decrease at the same speed as the number of intruders increases, the survival probability until a successful attack becomes

Equation 8.2

$$\lim_{n \to \infty} \tilde{R}(t)^{n} = \lim_{n \to \infty} \left(1 - \frac{\lambda_{ij}(a)}{n}\,t\right)^{n} = e^{-\lambda_{ij}(a)\,t}$$

That is, the time until failure/intrusion is negatively exponentially distributed. Taking the attack probability into account, the failure rate between states i and j becomes

Equation 8.3

$$q_{ij} = \pi_i(a)\,\lambda_{ij}(a)$$

This is illustrated in Figure 8.3 where the good state i = 1 is depicted as a circle, and the failed state j = 2 as a square. A gray state symbol indicates that the system is vulnerable in this state.

Figure 8.3. A two-state Markov model with assigned failure rate.


By introducing the attack probability πi(a), the result from a successful intrusion can be modeled as one or more intentional state changes of the underlying stochastic process, which represents the dynamic behavior of the system. The adopted method for determining the attack probabilities will be explained in Sections 8.3 and 8.4.
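
As a concrete illustration, the following minimal Python sketch (all numeric values are hypothetical, chosen only for illustration) computes the effective failure rate of Eq. 8.3 for the two-state model in Figure 8.3 and numerically checks the limit in Eq. 8.2:

```python
import numpy as np

# Hypothetical parameters for the two-state model of Figure 8.3
lam_12 = 0.5   # accumulated attack intensity lambda_12(a), per hour (assumed)
pi_1 = 0.3     # probability pi_1(a) that an attacker chooses action a in state 1 (assumed)

# Eq. 8.3: effective transition rate from good state 1 to failed state 2
q_12 = pi_1 * lam_12
print(f"q_12 = {q_12}")  # 0.15 per hour

# Eq. 8.2: (1 - lam*t/n)^n approaches exp(-lam*t) as the number of intruders n grows
t = 2.0
for n in (10, 100, 10_000):
    print(n, (1 - lam_12 * t / n) ** n, np.exp(-lam_12 * t))
```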

In contrast to attack graphs, as used in, e.g., Jha et al. [22] (see also Chapter 9), where each state transition corresponds to a single atomic step of a penetration, the modeling approach in this chapter aims to be more high-level and to focus on the impact of intrusions on a system rather than on the specific attack procedures themselves. This facilitates the modeling of unknown attacks in terms of generic state transitions. For example, in the stochastic model depicted in Figure 8.3, the attack a can simply be explained as “the action that seeks to transfer the system from the good state 1 to the failed state 2.”

During the modeling process, the granularity of the state space needs to be carefully considered. Models that are too simple (such as the one shown in Figure 8.3) will not provide any valuable insight into a system’s behavior, whereas models that are too complex may quickly lead to state space explosion. The choice of what to include in the state definitions will therefore be a trade-off between model representativeness and complexity. An example, primarily for illustration purposes, will be provided in Section 8.6.

8.2.3. Modeling the System

So far, the presentation has primarily concentrated on how we may model attacks. In Section 8.4, we will discuss how actual attack intensities appear as the solution of a strategic game between attackers and a system, with its detection and defense mechanisms. The model of the entire system, including other events like failures, repairs, restorations, operations and maintenance (O&M) actions, and so on, defines the “battleground” (e.g., which states are vulnerable) and influences the outcome (e.g., a restoration action may remove a vulnerability before it is exploited). The system model is also used to find the system properties of interest, like its availability or mean time to failure, as a result of the game and of the parameters associated with the random events. Below, we define the formalism of the system model, and in Section 8.2.4, we explain how the measures of interest may be obtained from this model.

In the previous subsection, it was established that successful attacks may be modeled as occurring with a constant intensity in a given state. Without further discussion, we also adopt the common assumption in dependability modeling and analysis that, in a specific state of a system, all failure intensities, repair and restoration rates, intensities of operation and maintenance actions, and so on are either constant or may be approximated by a constant. Furthermore, these rates and intensities depend only on the current state of the system. Under these assumptions, we may formalize the ideas discussed above as a CTMC system model with a finite state space S = {S1,...,SN}. The CTMC is defined by its N × N state transition rate matrix Q, whose elements qij ∈ Q for i ≠ j are the transition rates between states i and j in the model, and whose diagonal elements are $q_{ii} = -\sum_{j \neq i} q_{ij}$, so that each row of Q sums to zero. The qij for i ≠ j represent the attack success rates in Eq. 8.3, as well as the failure intensities, intensities of operation and maintenance actions, restoration rates, and so on, discussed above. An example is presented in Section 8.6.
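
A small sketch of how such a rate matrix might be assembled in practice (the three states and all rates below are hypothetical, not taken from this chapter): the off-diagonal entries hold the transition rates, and the diagonal is derived from the zero-row-sum property:

```python
import numpy as np

def make_generator(off_diag) -> np.ndarray:
    """Build a CTMC rate matrix Q from non-negative off-diagonal rates,
    setting q_ii = -sum_{j != i} q_ij so that every row sums to zero."""
    Q = np.array(off_diag, dtype=float)
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q

# Hypothetical 3-state system: 0 = good, 1 = good but vulnerable, 2 = failed
rates = [[0.0, 0.2, 0.0],    # a vulnerability is introduced
         [0.5, 0.0, 0.15],   # restoration back to good / successful attack (Eq. 8.3)
         [1.0, 0.0, 0.0]]    # repair: failed -> good
Q = make_generator(rates)
print(Q.sum(axis=1))         # all rows sum to (numerically) zero
```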

8.2.4. Obtaining System Measures

Denote the state of a system at time t by s(t) ∈ S. Denote the probability of being in state i at time t by Xi(t), that is, Xi(t) = P(s(t) = Si). Let

Equation 8.4

$$\mathbf{X}(t) = \{X_1(t), X_2(t), \ldots, X_N(t)\}$$

be the state probability vector. It is then well known [23, 24] that this vector may be obtained as the solution to the set of linear differential equations:

Equation 8.5

$$\frac{d\mathbf{X}(t)}{dt} = \mathbf{X}(t)\,\mathbf{Q}$$

with the general (but not very useful) solution X(t) = X(0)exp(Qt), where X(0) is the initial condition. However, a numerical solution to Eq. 8.5 is typically quite demanding, due to the size of N and to the stiffness of Q caused by the orders-of-magnitude differences between the various rates in the system. Therefore, we concentrate on measures that may be obtained from the asymptotic (i.e., steady-state) behavior of the system. Let
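
For a model small enough to avoid these numerical issues, the transient solution can be evaluated directly with the matrix exponential; the following sketch uses SciPy and the hypothetical three-state Q from the previous sketch:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state rate matrix (rows sum to zero), as built above
Q = np.array([[-0.2,  0.2,  0.0 ],
              [ 0.5, -0.65, 0.15],
              [ 1.0,  0.0, -1.0 ]])

X0 = np.array([1.0, 0.0, 0.0])   # system starts as new in state 0
for t in (1.0, 10.0, 100.0):
    print(t, X0 @ expm(Q * t))   # X(t) = X(0) exp(Qt), the solution of Eq. 8.5
# For large or stiff Q, uniformization or a dedicated stiff ODE solver
# is usually preferred over a direct matrix exponential.
```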

Equation 8.6

$$\mathbf{X} = \{X_1, \ldots, X_N\} = \lim_{t \to \infty} \mathbf{X}(t)$$

The asymptotic state probabilities of the system may now be obtained [23] as the solution to the set of linear equations:

Equation 8.7

$$\mathbf{X}\,\mathbf{Q} = \mathbf{0}_N$$

with one of the equations in Eq. 8.7 replaced by:

Equation 8.8

$$\mathbf{X}\,\mathbf{1}_N = 1$$

In the above equations, 0N and 1N represent vectors of length N whose elements are all 0 and all 1, respectively, the latter being a column vector.
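
Eqs. 8.7 and 8.8 translate directly into a linear solve; a minimal sketch (again with the hypothetical three-state Q) replaces the last balance equation with the normalization condition:

```python
import numpy as np

def steady_state(Q: np.ndarray) -> np.ndarray:
    """Solve X Q = 0_N (Eq. 8.7) with one equation replaced by
    the normalization X 1_N = 1 (Eq. 8.8)."""
    N = Q.shape[0]
    A = Q.T.copy()     # the row-vector equation X Q = 0 becomes Q^T X^T = 0
    A[-1, :] = 1.0     # replace the last equation by sum_i X_i = 1
    b = np.zeros(N)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

Q = np.array([[-0.2,  0.2,  0.0 ],
              [ 0.5, -0.65, 0.15],
              [ 1.0,  0.0, -1.0 ]])
X = steady_state(Q)
print(X, X.sum())      # asymptotic state probabilities, summing to 1
```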

Having obtained the asymptotic state probabilities X in Eq. 8.6, we may obtain operational measures of the system, such as the availability (A), the mean time between failures (MTBF), and the mean time spent in the good states (the mean up time, MUT). The system measures of special interest in our context are:

  • The mean time to first failure (MTFF) for the system (i.e., the expected time from when the system is as new until the first failure occurs).

  • The mean time to failure (MTTF) (i.e., the expected duration until a failure occurs, when we start to observe the system once it is asymptotically in a good state).

See Figure 8.4 for an illustration of these two system measures.

Figure 8.4. Sample behavior of a system alternating between good and failed states, where MTFF = E(TFF) and MTTF = E(TF).


To efficiently compute these measures, we adopt the approach of Buzacott [25]. The state space is partitioned into two disjoint sets S = {SG,SF}, where SG = {S1,...,SK} and SF = {SK+1,...,SN}, so that the states 1,...,K are good states and the states K + 1,...,N are failed states. Since the state set S is ordered, the Q matrix can be written in partitioned form as:

Equation 8.9

$$\mathbf{Q} = \begin{pmatrix} \mathbf{Q}_1 & \mathbf{Q}_2 \\ \mathbf{Q}_3 & \mathbf{Q}_4 \end{pmatrix}$$

where the size of Q1 is K × K, the size of Q2 is K × (N – K), and so forth. To compute the MTFF, one assumes that the system is as new at t = 0; let this be state S1 ∈ SG. Define T = {T1,...,TK}, where Ti is the expected time until the first failure, given that the system starts in state Si. By solving

Equation 8.10

$$\mathbf{Q}_1\,\mathbf{T}^{\mathsf{T}} = -\mathbf{1}_K$$

the MTFF for the system can be computed as:

Equation 8.11

$$\mathrm{MTFF} = T_1$$

To obtain the MTTF, the asymptotic state probabilities (Eq. 8.6) must be known. Since S is partitioned, X can also be partitioned as X = {XG,XF}, where XG = {X1,...,XK} and XF = {XK+1,...,XN}. From the Markov property, the asymptotic probability of being in one of the good states equals the product of the MTTF and the asymptotic frequency of failures, XG Q2 1N–K:

Equation 8.12

$$\mathbf{X}_G\,\mathbf{1}_K = \mathrm{MTTF} \cdot \mathbf{X}_G\,\mathbf{Q}_2\,\mathbf{1}_{N-K}$$

Hence,

Equation 8.13

$$\mathrm{MTTF} = \frac{\mathbf{X}_G\,\mathbf{1}_K}{\mathbf{X}_G\,\mathbf{Q}_2\,\mathbf{1}_{N-K}}$$

Having obtained the asymptotic state probabilities, the availability of the system is straightforwardly obtained as:

Equation 8.14

$$A = \mathbf{X}_G\,\mathbf{1}_K = \sum_{i=1}^{K} X_i$$

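Putting Eqs. 8.9 through 8.14 together, a short sketch (still using the hypothetical three-state chain from the earlier sketches, with states 0 and 1 good and state 2 failed) computes MTFF, MTTF, and availability:

```python
import numpy as np

Q = np.array([[-0.2,  0.2,  0.0 ],
              [ 0.5, -0.65, 0.15],
              [ 1.0,  0.0, -1.0 ]])
K = 2                              # states 0..K-1 are good, the rest failed
Q1, Q2 = Q[:K, :K], Q[:K, K:]      # partition of Eq. 8.9

# Eqs. 8.10-8.11: solve Q1 T^T = -1_K; MTFF = T_1 (system as new in S1)
T = np.linalg.solve(Q1, -np.ones(K))
MTFF = T[0]

# Asymptotic state probabilities (Eqs. 8.7-8.8), as in the earlier sketch
A_sys = Q.T.copy()
A_sys[-1, :] = 1.0
b = np.zeros(Q.shape[0]); b[-1] = 1.0
X = np.linalg.solve(A_sys, b)
XG = X[:K]

MTTF = XG.sum() / (XG @ Q2).sum()  # Eq. 8.13
avail = XG.sum()                   # Eq. 8.14
print(f"MTFF={MTFF:.2f}  MTTF={MTTF:.2f}  A={avail:.4f}")
```
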
8.2.5. Model Parametrization

In order to obtain the measures (MTFF, MTTF), the stochastic model has to be parametrized (i.e., the elements qij ∈ Q need to be evaluated). The procedure for obtaining accidental failure and repair rates has been practiced for many years in traditional dependability analysis and will therefore not be discussed in this chapter. However, choosing the accumulated attack intensities λij(a) remains a challenge. One solution is to let security experts assess the intensities based on subjective expert opinion, empirical data, or a combination of both. An example of empirical data is historical attack data collected from honeypots. The data can also be based on intrusion experiments performed by students in a controlled environment. Empirical data from such an experiment conducted at Chalmers University of Technology in Sweden [26] indicate that the time between successful intrusions during the standard attack phase is exponentially distributed. Another ongoing project, at the Carnegie Mellon CyLab in Pittsburgh, PA [27], aims to collect information from a number of different sources in order to predict attacks. Even though the process of assessing the attack intensities is crucial, and an important research topic in itself, it is not the primary focus of this chapter.
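
As a hypothetical illustration of the empirical route, if the observed times between successful intrusions are assumed to be exponentially distributed (as the Chalmers experiment [26] indicates), the rate can be estimated by the reciprocal of the sample mean:

```python
import numpy as np

# Made-up inter-intrusion times (hours), e.g., as logged by a honeypot
times = np.array([12.0, 7.5, 30.2, 18.9, 9.4])
lam_hat = 1.0 / times.mean()   # maximum-likelihood estimate of an exponential rate
print(f"estimated lambda: {lam_hat:.4f} per hour")
```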

Obtaining realistic πi(a) (i.e., the probabilities that an attacker chooses particular attack actions in certain system states) may be more difficult. In this chapter, we use game theory as a means for computing the expected attacker behavior. The procedure is summarized in Section 8.3.
