CHAPTER 8
Monte Carlo Simulation
Very often a reliability goal function cannot be expressed in a convenient analytical form, which makes even the calculation of reliability decrements practically impossible. Such situations arise, for instance, when system units are mutually dependent or when their reliability depends simultaneously on some common environmental factors (temperature, mechanical impacts, etc.). In these cases, Monte Carlo simulation is usually used to calculate reliability indices. However, a problem arises: how can one use Monte Carlo simulation for optimization?
Roughly speaking, the idea is to observe the process of spare unit expenditure (replacement of failed units) for as long as the specified restrictions allow. This may be a simulation process or an observation of the real deployment of the system. After the stopping moment, we start another realization of the simulation process or observation of the real data. When the appropriate statistical data have been collected, the search for the optimal solution begins.
Avoiding a formal description of the algorithm, let us demonstrate it on numerical examples which will make the idea of the method and its specific technique clearer.
Standard methods do not give a solution if the goal function is the mean time to failure:
T(x) = ∫₀^∞ P(t; x) dt,    (8.1)
or if units are dependent, for instance, via a vector of some external factors g (temperature, humidity, vibration, etc.):
P(x) = ∫_G P(x | g) dF(g),    (8.2)
where F(g) is the distribution function of some external parameter g and G is its domain.
For the solution of the formulated problems, we need data obtained from a real experiment (or system deployment) or from Monte Carlo simulation of the system model. Though this procedure is routine, we will briefly describe it for completeness of presentation. The procedure is as follows.
Consider a series system of n units. (For simplicity of explaining the algorithm, we will assume that the units are independent. However, everything described below can easily be extended to the general case: it will affect only the mechanism of random sequence generation.)
Let us consider the process of spare unit expenditure as the process of changing system states and the total cost of spare units at sequential replacement moments. After a failure, each unit is immediately replaced with a spare one. Let t_k^(j) be the moment of the kth replacement during the jth Monte Carlo experiment. The number of spare units of type i spent by moment t_k^(j) is denoted by x_i(t_k^(j)).
An initial state at t = 0 is x_1(0) = x_2(0) = … = x_n(0) = 0. The total cost of spare units at the initial moment is C(0) = 0. (Sometimes it might be reasonable to take the initial cost of the system with no spare units as C(0), that is, C(0) = c_1 k_1 + c_2 k_2 + … + c_n k_n, where k_i is the number of units of type i within the equipment before reliability improvement and c_i is the cost of one unit of type i.)
Consider the step-by-step procedure of generating trajectories ψ^(s), s = 1, 2, …. We begin with ψ^(1); the corresponding superscript, (1), will be omitted for the sake of convenience.
This completes the first cycle. GOTO step 2, that is, find t_2 = min_{1≤i≤n} b_i^(2), and so on, until the first realization is stopped.
The type of problem to be solved determines the stopping rule of each realization.
Stopping rule for the inverse problem of optimal redundancy: The process is stopped at the moment t_N when the total cost of spare units exceeds the permitted value C0.
Stopping rule for the direct problem of optimal redundancy: The simulation process for each realization stops at the moment t_M such that t_{M−1} < t* ≤ t_M, where t* equals the required operational time t0 (if the reliability index is the probability of failure-free operation) or the required system mean time to failure (MTTF).
After the termination of generating the first trajectory, ψ^(1), we start to generate ψ^(2) by the same rules. The number of needed realizations, N, is determined by the required accuracy of the statistical estimates.
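As a minimal sketch, this trajectory-generation procedure can be written in a few lines of Python. The exponential failure rates, spare costs, and budget below are invented for illustration (they are not the chapter's data), and the stopping rule used is the cost-based one of the inverse problem:

```python
import random

# Hypothetical parameters (not from the chapter): exponential units.
RATES = [1.0, 0.4, 0.2]   # failure rates of the three unit types
COSTS = [2.0, 3.0, 5.0]   # cost of one spare of each type
C_MAX = 30.0              # admissible total cost of spares (budget C0)

def generate_trajectory(rng):
    """One realization: successive replacement moments t_k and the spent-spare
    vector X_k, stopped when the total cost of spares exceeds the budget."""
    clocks = [rng.expovariate(r) for r in RATES]  # next failure moment per unit
    spent = [0] * len(RATES)                      # spares of each type spent so far
    total_cost = 0.0
    trajectory = []
    while total_cost <= C_MAX:
        i = min(range(len(clocks)), key=clocks.__getitem__)  # earliest failure
        spent[i] += 1                              # failed unit replaced by a spare
        total_cost += COSTS[i]
        trajectory.append((clocks[i], tuple(spent)))
        clocks[i] += rng.expovariate(RATES[i])     # schedule unit i's next failure
    return trajectory

rng = random.Random(2024)
trajectories = [generate_trajectory(rng) for _ in range(10)]  # N = 10 realizations
```

Each recorded pair (t_k, X_k) is one point of the random path in the discrete n-dimensional state space described below.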
Thus, each trajectory ψ^(j) represents a set of the following data:

(t_1^(j), X_1^(j)), (t_2^(j), X_2^(j)), …,

where X_k^(j) is the set of spare units spent by moment t_k^(j), that is, X_k^(j) = (x_1(t_k^(j)), x_2(t_k^(j)), …, x_n(t_k^(j))).
After this description of the Monte Carlo simulation process, let us consider the optimization problems themselves. An important remark: previously, all problems were formulated in probabilistic terms, but dealing with statistical (empirical) functions has its own specifics. The problems below are reformulated in an appropriate way.
Assume that we need to supply some system with spare units for a specified period of time. We have no prior knowledge of units' reliability but we have an opportunity to observe a real process (or simulation) of failure occurrence.
Consider the inverse problem of optimal redundancy. What shall we do in this case? We observe the process of spare unit expenditure during time t*. This process can be described as a random path—call it a trajectory—in a discrete n-dimensional space, X. An illustration of such a process in the two-dimensional case is presented in Figure 8.1.
Let us observe N such trajectories, ψ^(j), j = 1, 2, …, N, in an n-dimensional space, where n is the number of unit types. Each realization is stopped when the total cost of spare units exceeds the permitted amount, that is, each trajectory reaches or even penetrates the hyperplane

C(x_1, …, x_n) = c_1 x_1 + c_2 x_2 + … + c_n x_n = C0,    (8.3)

determined by the restriction on the total system cost (an example for the two-dimensional case is given in Fig. 8.2).
After this, in the same n-dimensional space, we construct hypercubes χ_r, r = 1, 2, …, such that each of their vertices lies under the hyperplane. In Figure 8.2 there are two such hypercubes, though there could be many of them: actually, in this case, all pairs (x1, x2) that belong to the hyperplane C(x1, x2) = C0 could be vertices of such hypercubes.
Denote by τ_r^(j) the maximum time that trajectory ψ^(j) spends within the hypercube χ_r. Introduce an indicator

δ_r^(j) = 1 if τ_r^(j) ≥ t0, and δ_r^(j) = 0 otherwise.    (8.4)
Among all hypercubes above, we choose the hypercube χ_r that maximizes the frequency of failure-free operation during the required interval t0 under the cost restriction:

(1/N) Σ_{1≤j≤N} δ_r^(j) → max over {χ_r : C(χ_r) ≤ C0},    (8.5)
where C0 is the admissible redundant group cost.
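To make the search concrete, here is a Python sketch of choosing the hypercube that maximizes the frequency of failure-free operation under the cost restriction. Each trajectory is summarized by the cumulative failure moments of each unit type; the failure moments, costs, budget, and t0 below are invented for illustration:

```python
from itertools import product

# Hypothetical data: for each of N = 3 realizations, the cumulative failure
# moments of the two unit types (invented numbers, not the chapter's tables).
FAILURES = [
    ([0.4, 1.1, 2.7], [0.9, 2.2]),
    ([0.6, 1.9],      [0.5, 1.4, 2.9]),
    ([0.2, 0.8, 1.6], [1.3, 2.5]),
]
COSTS = (2.0, 3.0)   # assumed spare costs
C_MAX = 12.0         # assumed admissible cost C0
T0 = 1.5             # assumed required operating time t0

def exit_time(failures, x):
    """Moment when the trajectory leaves hypercube x: the first moment some
    unit type needs its (x_i + 1)-th spare."""
    return min(
        f[xi] if xi < len(f) else float("inf")
        for f, xi in zip(failures, x)
    )

best = None
for x in product(range(5), repeat=2):              # candidate spare vectors
    if sum(c * xi for c, xi in zip(COSTS, x)) > C_MAX:
        continue                                   # violates the cost hyperplane
    freq = sum(exit_time(f, x) >= T0 for f in FAILURES) / len(FAILURES)
    if best is None or freq > best[0]:
        best = (freq, x)
print(best)
```

In a real application the candidate set need not be a brute-force grid; the chapter's steepest-descent search over embedded hypercubes serves the same purpose.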
Maximization of the system average time to failure is achieved by the hypercube that corresponds to the solution of the following problem of conditional optimization:

(1/N) Σ_{1≤j≤N} τ_r^(j) → max over {χ_r : C(χ_r) ≤ C0}.    (8.6)
Now consider the direct problem of optimal redundancy. In this case, the required time of the system's failure-free operation equals t0. We observe the process of spare unit expenditure N times, each time until the system's failure-free time exceeds t0, and record trajectories ψ^(j), j = 1, 2, …, N, in an n-dimensional space. Afterwards, we construct hypercubes χ_r, r = 1, 2, …, in the same n-dimensional space that include (cover) R0 × 100% of all trajectories, where R0 is the specified level of the reliability index. Among all hypercubes described above, we choose the one characterized by the minimum total cost. In other words, the hypercube must satisfy the solution of the following problem:
C(χ_r) → min over {χ_r : (1/N) Σ_{1≤j≤N} δ_r^(j) ≥ R0}.    (8.7)
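The direct problem admits the same kind of sketch: among the candidate spare vectors that cover at least the required fraction of observed trajectories, take the cheapest. The data, costs, t0, and R0 below are again invented for illustration:

```python
from itertools import product

# Hypothetical data: cumulative failure moments per unit type for N = 4
# observed realizations of a two-unit series system (invented numbers).
FAILURES = [
    ([0.4, 1.1, 2.7], [0.9, 2.2]),
    ([0.6, 1.9],      [0.5, 1.4, 2.9]),
    ([0.2, 0.8, 1.6], [1.3, 2.5]),
    ([0.3, 1.0, 2.1], [0.7, 1.8]),
]
COSTS = (2.0, 3.0)  # assumed spare costs
T0 = 1.5            # assumed required operating time t0
R0 = 0.75           # assumed required fraction of covered trajectories

def survives(failures, x):
    """True if the trajectory stays inside hypercube x through time T0,
    i.e. no unit type needs its (x_i + 1)-th spare before T0."""
    return all(xi >= len(f) or f[xi] >= T0 for f, xi in zip(failures, x))

cheapest = None
for x in product(range(5), repeat=2):              # candidate spare vectors
    freq = sum(survives(f, x) for f in FAILURES) / len(FAILURES)
    if freq < R0:
        continue                                   # does not cover enough trajectories
    cost = sum(c * xi for c, xi in zip(COSTS, x))
    if cheapest is None or cost < cheapest[0]:
        cheapest = (cost, x)
print(cheapest)
```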
Now let the specified requirement be given for the system average time to failure, T*. The hypercube representing the solution must be chosen according to the solution of the problem:
C(χ_r) → min over {χ_r : (1/N) Σ_{1≤j≤N} τ_r^(j) ≥ T*}.    (8.8)
Of course, one should take into account that operating with frequencies differs from operating with probabilities. The proposed solution is asymptotically exact; thus, all these arguments are satisfactory only for a large enough sample size.
We need to find x^opt that satisfies Equation (8.5). The algorithm for the solution is as follows.
| Unit 1 | Unit 2 | Unit 3 |
|---|---|---|
| Realization 1 | | |
| 0.07 | 0.75 | 0.24 |
| 0.6 | | 3.43 |
| 0.66 | | |
| 1.19 | | |
| Realization 2 | | |
| 0.42 | 0.13 | 4.92 |
| 0.58 | 1.28 | |
| 1.03 | | |
| 1.31* | | |
| Realization 3 | | |
| 0.62 | 3.47 | 3.22 |
| 4.28 | 6.2* | 7.14* |
| 9.39* | | |
| Realization 4 | | |
| 1.45 | 5.85 | 0.51 |
| 2.58* | | 9.15 |
| 3.85* | | |
| 4.29* | | |
| Realization 5 | | |
| 0.32 | 0.22 | 0.54 |
| 1.08 | 0.37 | 2.06 |
| | 1.87 | |
| Realization 6 | | |
| 0.11 | 2.03 | 5.54 |
| 1.13 | 2.52* | |
| 2.01* | | |
| 2.41* | | |
| Realization 7 | | |
| 1.22 | 0.11 | 2.69 |
| 3.09* | 1.02 | 2.79* |
| | 3.13* | |
| Realization 8 | | |
| 0.27 | 1.49 | 22.49 |
| 0.71 | 2.02* | |
| 1.45 | | |
| 2.2* | | |
| Realization 9 | | |
| 0.46 | 1.55 | 7.9 |
| 1.52 | 6.35* | |
| 3.42* | | |
| 3.59* | | |
| Realization 10 | | |
| 0.83 | 1.08 | 0.58 |
| 1.23 | 2.84* | 4.33 |
| 2.23* | | |
| x1 | x2 | x3 |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 2 | 1 | 1 |
| 2 | 1 | 1 |
| 2 | 1 | 1 |
| 2 | (1) | 2 |
| 3 | 2 | 2 |
| 3 | 2 | 2 |
| 3 | 2 | 2 |
| (4) | 3 | (2) |
Note: Numbers in parentheses correspond to the first realization, which was not taken into account.
As a direct calculation using tables of the Poisson distribution shows, this vector delivers the probability 0.804. Of course, such a coincidence with the observed frequency of 0.8 in a particular statistical experiment is not a proof of the method. However, multiple results obtained by the proposed method for other examples show proper closeness to the exact solution, even for a relatively small sample size.
The asymptotic convergence of the solution to the optimal one was proved by Gordienko and Ushakov (1978).
We need to find x^opt that satisfies the solution of Equation (8.6). In this case, the algorithm almost completely coincides with the one described above. The only difference is the absence of step 3. At step 4, we directly place τ^(j) in a counter of the survival time. After analyzing all of the trajectories, the estimate of the MTTF for the hypercube χ_1 is calculated as
T̂(χ_1) = (1/N) Σ_{1≤j≤N} τ_1^(j).    (8.9)
After this, we perform analogous calculations for the other hypercubes, finding the one characterized by the maximum estimate of the MTTF. The search for the maximum can be performed in the same way as before.
| Unit 1 | Unit 2 | Unit 3 |
|---|---|---|
| Realization 1 | | |
| 0.07 | 0.75 | 0.24 |
| 0.53 | | 3.19 |
| 0.06 | | |
| 0.53 | | |
| Realization 2 | | |
| 0.42 | 0.13 | 4.92 |
| 0.16 | 1.15 | |
| 0.45 | | |
| 0.28 | | |
| Realization 3 | | |
| 0.62 | 3.47 | 3.22 |
| 3.66 | 2.72 | 3.92 |
| 5.11 | | |
| 3.9 | | |
| Realization 4 | | |
| 1.45 | 5.85 | 0.51 |
| 1.13 | | 8.64 |
| 1.27 | | |
| 0.45 | | |
| Realization 5 | | |
| 0.32 | 0.22 | 0.54 |
| 0.75 | 0.15 | 1.53 |
| | 1.49 | |
| Realization 6 | | |
| 0.11 | 2.03 | 5.54 |
| 1.03 | 0.48 | |
| 0.88 | | |
| 0.39 | | |
| Realization 7 | | |
| 1.22 | 0.11 | 2.69 |
| 1.87 | 0.91 | 0.1 |
| | 2.11 | |
| Realization 8 | | |
| 0.27 | 1.49 | 22.49 |
| 0.44 | 0.53 | |
| 0.74 | | |
| 0.76 | | |
| Realization 9 | | |
| 0.46 | 1.55 | 7.9 |
| 1.06 | 4.8 | |
| 1.9 | | |
| 0.17 | | |
| Realization 10 | | |
| 0.83 | 1.08 | 0.58 |
| 0.4 | 1.76 | 3.76 |
| 1 | | |
| Unit 1 | Unit 2 | Unit 3 |
|---|---|---|
| 5.77 | 16.68 | 48.63 |
| 11.03 | 12.5 | 21.14 |
| 11.41 | 3.6* | 3.9* |
| 2.58* | | |
* Units that are eliminated (x1 = 4, x2 = 3, and x3 = 3).
| Realization number | TTF |
|---|---|
| 1 | 0.66 |
| 2 | 1.03 |
| 3 | 6.2 |
| 4 | 3.85 |
| 5 | 0.37 |
| 6 | 2.01 |
| 7 | 1.02 |
| 8 | 1.45 |
| 9 | 3.42 |
| 10 | 2.23 |
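If the TTF column above lists the observed survival times τ^(j) of the ten realizations for the chosen hypercube, the statistical MTTF estimate is simply their sample mean:

```python
# TTF values of the ten realizations from the table above.
TTF = [0.66, 1.03, 6.2, 3.85, 0.37, 2.01, 1.02, 1.45, 3.42, 2.23]

# Statistical estimate of the system MTTF for this hypercube: the sample mean.
mttf_hat = sum(TTF) / len(TTF)
print(round(mttf_hat, 3))
```

which gives approximately 2.22 for the tabulated values.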
We need to find x^opt, which satisfies the solution of Equation (8.5). The algorithm of the solution in this case is as follows, where subscript k stands for the current step, subscript k − 1 for the previous step, and subscript k + 1 for the next step.
We need to find x^opt, which satisfies the solution of Equation (8.6). We could not find a convenient procedure for solving this particular problem. One might consider an iterative procedure using the sequential solution of the second direct problem considered above. For instance, we can fix some restriction on the system cost, say C^(1), and find the corresponding optimal value T^(1) of the system MTTF. If this value is smaller than the required T*, the system cost must be increased, say, up to some C^(2). If the corresponding T^(2) exceeds T*, one must choose the next cost restriction between C^(1) and C^(2). This procedure continues until a satisfactory solution is obtained. At an intermediate step, for choosing the next cost restriction, one can use the linear extrapolation method. For example, assume that, in the first situation described above, the value T^(2) is still less than T*. Then the value of C^(3) can be chosen from the following equation:

C^(3) = C^(1) + (C^(2) − C^(1)) (T* − T^(1)) / (T^(2) − T^(1)).
Obviously, one can also use a procedure similar to that in the solution of the direct problem. However, one should somehow find an initial hypercube and construct all trajectories within it. (There is no stopping rule in this case.) Then one should construct a system of embedded hypercubes and again use the steepest descent.
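The iterative cost-adjustment described above can be sketched as a secant-type search. The routine solve_inverse below is a hypothetical stand-in for "solve the inverse problem under a given cost restriction and return the attained MTTF"; its response curve is invented for illustration:

```python
# Hypothetical cost -> MTTF response; in practice this would be the solution
# of the inverse problem (maximize MTTF under cost c) from the observed data.
def solve_inverse(c):
    return 2.0 * (1.0 - 0.9 ** c)   # invented monotone curve

T_REQ = 1.5                          # required MTTF T* (assumed)

c1, c2 = 5.0, 10.0                   # two initial cost restrictions (assumed)
t1, t2 = solve_inverse(c1), solve_inverse(c2)
for _ in range(20):
    # Linear inter/extrapolation through (c1, t1) and (c2, t2) at level T*:
    c3 = c1 + (c2 - c1) * (T_REQ - t1) / (t2 - t1)
    t3 = solve_inverse(c3)
    if abs(t3 - T_REQ) < 1e-6:
        break                        # satisfactory solution obtained
    c1, t1, c2, t2 = c2, t2, c3, t3  # slide the pair of points forward
print(round(c3, 2))
```

Each pass re-solves the inverse problem at the extrapolated cost, so the loop stops at the smallest cost (on this curve) whose attainable MTTF reaches T*.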
While solving this problem, one must remember that the condition on the system MTTF can be considered only in a probabilistic sense.
Chronological Bibliography
Ushakov, I. A., and Yasenovets, A. V. 1977. “Statistical methods of solving problems of optimal standby.” Soviet Journal of Computer and System Sciences, no. 6.
Ushakov, I. A., and Gordienko, E. I. 1978. “On statistical simulation approach to solution of some optimization problems.” Elektronische Informationsverarbeitung und Kybernetik, no. 3.
Ushakov, I. A., and Gordienko, E. I. 1978. “Solution of some optimization problems by means of statistical simulation.” Elektronische Informationsverarbeitung und Kybernetik, no. 11.
Mohan, C., and Shanker, K. 1988. “Reliability optimization of complex systems using random search technique.” Microelectronics and Reliability, no. 28.
Boland, P. J., El-Neweihi, E., and Proschan, F. 1992. “Stochastic order for redundancy allocations in series and parallel systems.” Advances in Applied Probability, no. 1.
Zhao, R., and Liu, B. 2003. “Stochastic programming models for general redundancy-optimization problems.” IEEE Transactions on Reliability, no. 52.