Chapter 20

Simulation and Risk Analysis

Abstract

This chapter introduces what Simulation is, more specifically the Monte Carlo method, which has been used in several fields of knowledge involving decision-making problems that present risks and uncertainties. The Monte Carlo method uses the generation of random numbers to assign values to variables. Through numerical experiments, it is possible to have a better understanding of the behavior of the system under analysis, as well as the forecasting of future scenarios.

Keywords

Simulation of systems; Monte Carlo method; Generation of random numbers; Analysis of risks and uncertainties

Taking the risk of failure, of disappointments, of disillusionments, but never giving up the search for love. Those who do not give up searching shall win!

Paulo Coelho

20.1 Introduction to Simulation

Computer simulation of systems, or simply simulation, consists in utilizing certain mathematical techniques used in computers that allow us to replicate the behavior of a real system through a model. A model is a simplified representation of reality that captures what is truly important in the system; however, it is always simpler than the real system itself.

There are several definitions of the term simulation. For Schriber (1974), simulation consists in the modeling of a process or system, in such a way that the model imitates the answers of the real system in a succession of events that happen throughout time.

According to Kelton et al. (1998, 2010), simulation is the process of designing and estimating a model of a real or proposed system on a computer aiming at conducting numerical experiments that can give us a better understanding of the behavior of a certain system given a series of conditions.

According to Pegden et al. (1990), simulation is the process of designing a computer model of a real system and conducting experiments with this model aiming at understanding its behavior and/or assessing strategies for its operation. Therefore, simulation does not limit itself only to the construction of a model. It proposes theories and hypotheses from observations carried out, and it also uses the model to predict future behavior, that is, it assesses the effects produced from changes in the system.

Thus, computer simulation is the representation of a real system through a model by using a computer. It has the advantage of being able to see the system as a whole, program changes, and analyze different scenarios, minimizing costs and time.

For Chwif and Medina (2014), computer simulation can be classified in three categories: the Monte Carlo simulation, continuous simulation, and discrete event simulation. The Monte Carlo simulation, which is the focus of this chapter, uses random number generators to simulate physical or mathematical systems, without explicitly considering time as a variable. In contrast, continuous simulation and discrete event simulation consider the status change of the system over time. Continuous simulation is used to model systems whose status varies continuously over time. On the other hand, discrete event simulation is used to model systems that change their status at discrete moments throughout time, that is, a leap in time occurs when advancing to the following event.

20.2 The Monte Carlo Method

The Monte Carlo method was developed in 1946, based on Stanislaw Ulam’s ideas. After using the traditional combinatorial analysis to calculate the probabilities of success in a game of solitaire, Ulam realized that there was a more practical alternative that consisted in carrying out several games, around 100 or 1000 times, and counting how many times each result occurred (Nasser, 2012; Machado and Ferreira, 2012).

Ulam disseminated the method when he worked in the Manhattan project at the Los Alamos Nuclear Weapons Laboratory, in the United States, during World War II, jointly with researchers Enrico Fermi, John Von Neumann, and Nicholas Metropolis. The project included the direct simulation of probabilistic problems related to the neutron diffusion coefficient in certain materials (Saraiva Jr. et al., 2011; Oliveira Jr. et al., 2013; Machado and Ferreira, 2012).

The name is a homage to the district of Monte Carlo in Monaco, city located in the South of France, known for its famous casinos and gambling that constantly use draws and data.

The method uses the generation of random numbers to assign values to the variables being studied. According to Oliveira Jr et al. (2013), the numerical values are obtained from random devices (as, for example, tables, roulettes, draws) or directly from software packages, by using specific functions. At each iteration, the result is stored and, at the end of all the repetitions, the sequence of results generated is transformed into a frequency distribution that makes it possible to calculate descriptive statistics, such as, the mean (expected value), minimum value, maximum value, and standard deviation, in addition to predicting future scenarios of the system under analysis.

For Nasser (2012), the method makes it unnecessary to write the differential equations that describe the behavior of complex systems. The only demand is that the physical or mathematical system be modeled in terms of probability density functions (PDF). Once these distributions are known, the Monte Carlo Simulation can proceed generating random samplings from them. This process is repeated several times, it may even get to millions of times, and the result desired is obtained through statistical techniques (mean, standard deviation, etc.) over a certain sample.

The Monte Carlo method has been used in several fields of knowledge that go from the simulation of complex physical phenomena to economic ones, involving decision making problems that present risks and uncertainties.

Applying the Monte Carlo Simulation technique, according to Nasser (2012) and Saraiva Jr et al. (2011), requires the following steps:

  1. 1) For each random variable of the problem being studied, define the format of the probability distribution that better represents its behavior, based on past data or on administrators’ subjective estimates.
  2. 2) For each random variable, generate and store pseudorandom values that adhere to the probability distribution function defined. The number of repetitions may interfere in the result of the simulation, for this reason, we recommend a really high number, around 10,000 repetitions.
  3. 3) Transform the sequence of results generated into a frequency distribution that makes it possible to calculate descriptive statistics and probabilities, obtaining an estimate of the problem’s solution.

In general, the result obtained can be translated into a histogram associated to a cumulative probability density curve that will quantify, in percentages, the probability of a certain value finding itself above, below, or between a confidence interval chosen.

20.3 Monte Carlo Simulation in Excel

20.3.1 Generation of Random Numbers and Probability Distributions in Excel

Assume a random variable (r.v.) that has a normal distribution with mean 8 and standard deviation 2, represented by N(8, 2). Now, let’s generate 100 data from this distribution, using Excel. We will also discuss the respective commands in Stata.

In Excel, click on the Data tab, followed by Data Analysis and choose the analysis tool Random Number Generation, conclude by clicking on Ok. Select the number of variables under analysis (1), the number of random numbers to be generated (100), the type of distribution (normal with mean 8 and standard deviation 2), and the output range (cell $A$1), as shown in Fig. 20.1.

Fig. 20.1
Fig. 20.1 Random number generation for an r.v. with distribution N(8, 2) in Excel.

As a result, 100 random numbers from cell $A$1 to cell $A$100 were generated. Part of these data can be found in Fig. 20.2.

Fig. 20.2
Fig. 20.2 Random number generation for an r.v. with distribution N(8, 2).

The evolution of the series of data generated can be seen in the line chart shown in Fig. 20.3. We can see that the highest frequency of data is around the mean (8), with standard deviation 2.

Fig. 20.3
Fig. 20.3 Line chart of the r.v. with distribution N(8, 2).

We have listed some commands that are available, based on the type of distribution, to generate random numbers in Stata:

gen var = runiform()
gen var = rnormal()
gen var = rbinomial(n,p)
gen var = rchi2(df)
gen var = rpoisson()

To generate a random variable var1, which contains pseudorandom numbers normally distributed with mean 8 and standard deviation 2, we must type the following command:

gen var1 = rnormal(8, 2)

To see the frequency distribution of the data resulting from the simulation graphically, we can generate a histogram in Stata (see Fig. 20.4), by using the command histogram or simply hist followed by the variable name:

Fig. 20.4
Fig. 20.4 Histogram generated from the r.v. with distribution N(8, 2).

hist var1

From Fig. 20.4, we can see that the variable analyzed follows a normal distribution, and that the highest frequency of data is located between 6 and 10.

Let’s describe the same procedure, but now considering a uniform distribution. In order to have the same comparison parameters, random values in the interval between 6 and 10 will be generated, represented by U[6, 10]. The steps to generate these random numbers in Excel are shown in Fig. 20.5.

Fig. 20.5
Fig. 20.5 Random number generation for an r.v. with distribution U[6, 10] in Excel.

The evolution of the series of data generated can be seen in the line chart shown in Fig. 20.6. Different from a normal distribution, we can see that the data are distributed uniformly in the interval [6, 10].

Fig. 20.6
Fig. 20.6 Line chart of the r.v. with distribution U[6, 10].

In order to generate pseudorandom numbers distributed uniformly in the interval [6, 10] in Stata, we must create random variable var2 and type the following command:

gen var2 = 6 +(10-6)⁎runiform()

To visualize the results of the simulation, let’s construct a new histogram, as shown in Fig. 20.7.

Fig. 20.7
Fig. 20.7 Histogram generated from the r.v. with distribution U[6, 10].

Different from a normal distribution, notice that the data are not concentrated around the average value, but they are uniformly distributed in the interval [6, 10].

Therefore, through the Monte Carlo Simulation we can estimate the probability distribution of the random variable that is the closest to the real case and, from then on, generate random phenomena for a certain event.

20.3.2 Practical Examples

Case 1: Consumption of Red Wine

We would like to estimate the total quantity of red wine to be consumed at a party for 500 people. In order to do that, we estimate that 80% of the guests will attend the party and that each person consumes an average of 75 mL of red wine. We will use the Monte Carlo simulation to estimate the total quantity of red wine consumed at the party.

The total consumption of red wine (TCONS) is the product between the number of people at the party (NP) and the average consumption per person (CONSP), represented by Expression (20.1).

TCONS=NP×CONSP

si1_e  (20.1)

A specific estimate for this case would be:

TCONS=500×0.80×75=30,000mL=30L

si2_e

However, in this case, we have random variables; so, a specific estimation does not quantify the uncertainties included in the data. If we obtain the probability distribution of each variable, we can estimate the possible results and the respective probabilities.

Therefore, we will estimate the probability distribution that better represents the behavior of variables NP and CONSP. Variable NP follows a binomial probability distribution with parameters n = 500 and P = .80, represented by NP ~ b(500, .80). Conversely, random variable CONSP follows a uniform probability distribution with an interval between 60 and 90, represented by CONSP ~ U[60, 90]. Fig. 20.8 summarizes such information in Excel.

Fig. 20.8
Fig. 20.8 Input data from Case 1.

Aiming at estimating the possible results and probabilities of the total consumption of red wine at the party, let’s start the Monte Carlo simulation in Excel, extracting a random sample with 10,000 elements from each input variable. In order to do that, click once again on the Data tab, then on Data Analysis, and select the analysis tool Random Number Generation. For random variables NP and CONSP, the parameters inserted are represented in Figs. 20.9 and 20.10, respectively.

Fig. 20.9
Fig. 20.9 Random number generation from variable NP.
Fig. 20.10
Fig. 20.10 Random number generation from variable CONSP.

Fig. 20.11 shows part of the results generated in the Monte Carlo simulation for random variables NP and CONSUP, in addition to the calculation of the total consumption (TCONS) as the product between NP and CONSUP.

Fig. 20.11
Fig. 20.11 Part of the sample drawn from each random variable and calculation of TCONS.

To see the results of the simulation more clearly, we can construct a histogram for each random variable (NP and CONSP) and for the output variable analyzed (TCONS), as shown in Figs. 20.12, 20.13, and 20.14, respectively.

Fig. 20.12
Fig. 20.12 Histogram of variable NP.
Fig. 20.13
Fig. 20.13 Histogram of variable CONSP.
Fig. 20.14
Fig. 20.14 Histogram of variable TCONS.

From the histogram of variable TCONS, we can define the total amount of red wine to be consumed at the party. Instead of using the mean (around 30,000 mL or 30 L), this value can be chosen based on a certain percentile of the distribution. For instance, if we sort the values of variable TCONS in descending order, we can select the first 500 cells that represent 5% of the distribution data, as shown in Fig. 20.15. With a 95% confidence interval, we can say that the total consumption of red wine will be less than 35,498.34 mL (35.5 L), that is:

PTCONS<35,498.34=95%

si3_e

Fig. 20.15
Fig. 20.15 Selecting the 5% highest values of variable TCONS.

Case 2: Profit × Loss Forecast

We would like to estimate the probability of loss in the financial analysis of a company that sells coffee makers. Next, we will describe the variables used in the analysis, the probability distributions in the case of random variables and their respective parameters.

Based on historical data, we estimated the monthly demand of the product (DEMAND), which can be represented by a normal distribution with mean 22,000 and standard deviation 4000.

Whereas the sale price (SALEP) follows a uniform distribution with a minimum value of US$ 14.00 and a maximum value of US$ 49.00.

Among the variable costs, we have raw materials costs (RMC) and labor costs (LABORC). The variable RMC follows a uniform distribution with values between US$ 5.00 and US$ 21.00. LABORC can be represented by a normal distribution with mean US$ 8.00 and standard deviation US$ 3.00.

The estimated fixed cost (FIXEDC) is US$ 70,000.00. Fig. 20.16 summarizes this information in Excel.

Fig. 20.16
Fig. 20.16 Input data from Case 2.

Aiming at estimating the possible results and the probability of loss, let’s start the Monte Carlo simulation in Excel, extracting a random sample with 10,000 elements from each input variable. In order to do that, once again, click on the Data tab, followed by Data Analysis, and select the analysis tool Random Number Generation. For random variables DEMAND, SALEP, RMC, and LABORC, the parameters inserted are represented in Figs. 20.17, 20.18, 20.19, and 20.20, respectively.

Fig. 20.17
Fig. 20.17 Random number generation from the variable DEMAND.
Fig. 20.18
Fig. 20.18 Random number generation from the variable SALEP.
Fig. 20.19
Fig. 20.19 Random number generation from the variable RMC.
Fig. 20.20
Fig. 20.20 Random number generation from the variable LABORC.

Fig. 20.21 shows part of the results generated in the Monte Carlo simulation for the random variables, besides the calculation of the profit, whose formula is:

PROFIT=SALEPRMCLABORC×DEMANDFIXEDC

si4_e

Fig. 20.21
Fig. 20.21 Part of the sample drawn from each random variable and calculation of PROFIT.

Now, we can calculate the probability of loss based on the 10,000 scenarios generated. First, we calculated the average profit and the standard deviation of the sample, which correspond to R$ 159,844.27 (cell I15) and R$ 257,685.82 (cell I16), respectively, as shown in Fig. 20.22. To calculate the probability of loss, we use the NORM.DIST function in Excel that gives us the normal cumulative distribution for the mean and standard deviation specified. The complete formula is NORM.DIST(x, mean, standard_dev, cumulative). Since we want to calculate the probability of loss, the value of x must be 0 and the cumulative value is 1 or TRUE. Hence, the loss formula represented in cell I17 in Fig. 20.22 is NORM.DIST(0, I15, I16, 1), and the calculated risk is 26.75%.

Fig. 20.22
Fig. 20.22 Probability of loss for Case 2.

If, instead of using the NORM.DIST function in Excel, we sort the data in column F in ascending order and select the cells with negative values, we can see that P(X < 0) = 29.42%. All the data in the spreadsheet are available in the file Case 2.xls.

20.4 Final Remarks

Computer simulation can be classified in three categories: the Monte Carlo simulation, continuous simulation, and discrete event simulation. The focus of this chapter was the Monte Carlo simulation that uses random number generators to simulate systems, without explicitly considering time as a variable.

The numerical values are obtained from random devices or directly from software packages, through specific functions. At each iteration, the result is stored and, at the end of all the repetitions, the sequence of results generated is transformed into a frequency distribution that makes it possible to calculate descriptive statistics, such as, the mean, minimum value, maximum value and standard deviation, besides predicting future scenarios of the system under analysis.

The Monte Carlo method has been used in several fields of knowledge that go from the simulation of complex physical phenomena to economic phenomena, involving decision-making problems with risks and uncertainties.

20.5 Exercises

  1. 1) Generate 1000 random numbers for a variable that has a normal distribution with mean 120 and standard deviation 34. Construct a line chart to see the evolution of the series of data generated, as well as a histogram to see the frequency distribution of the data resulting from the simulation.
  2. 2) Do the same for a variable with a uniform distribution between 54 and 88.
  3. 3) Do the same for a variable with a binomial distribution with values P = .8 and n = 100.
  4. 4) The daily demand of computers at a certain company is random. It may be represented by a normal distribution with mean 625 and standard deviation 40. The daily production of the product is also a random variable that follows a uniform distribution between 600 and 650 units. The inventory balance equation calculates the inventory at the end of each period. It corresponds to the sum of the final inventory of the previous period and the production of the current period minus the demand of the current period (It = It − 1 + P − Dt). By using the Monte Carlo simulation, determine the probability of having a negative inventory. Consider a random sample with 10,000 elements and I0 = 100.
  5. 5) Among the logistics performance indicators, we can use the transportation costs as a percentage of the sales, which gives us a general overview of the transportation costs within the company’s revenue. We can calculate it as the total transportation costs divided by the net revenue, multiplied by 100. The total transportation costs for a group of companies of a certain sector can be represented by a normal distribution with mean 4.5 million and standard deviation 0.6. Conversely, the net revenue can be represented by a normal distribution with mean 75 million and standard deviation 10. Calculate the probability of this indicator being greater than 7% using the Monte Carlo simulation.

References

Kelton W.D., Sadowski R.P., Swets N.B. Simulation with Arena. first ed. New York: McGraw-Hill; 1998.

Machado N.R.S., Ferreira A.O. Método de Simulação de Monte Carlo em Planilha Excel: Desenvolvimento de uma ferramenta versátil para análise quantitativa de riscos em gestão de projetos. Revista de Ciências Gerenciais. 2012;16(23):223–244.

Nasser R.B. Mccloud service framework: arcabouço para desenvolvimento de serviços baseados na simulação de Monte Carlo na cloud. Pontifícia Universidade Católica do Rio de Janeiro – PUC-RIO; 2012 Dissertação (Mestrado em Informática).

Oliveira Jr. P.A., Dantas M.J.P., Machado R.L. Aplicação da Simulação de Monte Carlo no Gerenciamento de Riscos em Projetos com o Cristal Ball. Simpósio de Administração da Produção, Logística e Operações Internacionais; 2013.

Pegden C.D., Shannon R.E., Sadowski R.P. Introduction to Simulation Using SIMAN. second ed. New York: McGraw-Hill; 1990.

Saraiva Jr. A.F., Tabosa C.M., Costa R.P. Simulação de Monte Carlo aplicada à análise econômica de pedido. Produção. 2011;21(1):149–164.

Schriber T.J. Simulation Using GPSS. Ed. Ft. Belvoir Defense Technical Information New York: Wiley; 1974.


"To view the full reference list for the book, click here"

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset