Chapter 6

Random Variables and Probability Distributions

Abstract

This chapter discusses the concepts related to discrete and continuous random variables. Initially, we calculate the expected value, the variance, and the cumulative distribution function of discrete and continuous random variables. After that, the main types of probability distribution for discrete random variables are described: discrete uniform, Bernoulli, binomial, geometric, negative binomial, hypergeometric, and Poisson. For continuous random variables, the distributions studied are: uniform, normal, exponential, gamma, chi-square (χ2), Student’s t, and Snedecor’s F. Thus, it will be possible to determine the most suitable distribution for a certain set of data.

Keywords

Discrete random variables; Continuous random variables; Cumulative distribution function; Probability distributions for discrete variables; Probability distributions for continuous variables

What we call chance can only be the unknown cause of a known effect.

Voltaire

6.1 Introduction

In Chapters 3 and 4, we discussed several statistics to describe the behavior of quantitative and qualitative data, including sample frequency distributions. In this chapter, we are going to study population probability distributions (for quantitative variables). The frequency distribution of a sample is an estimate of the corresponding population probability distribution. When the sample size is large, the sample frequency distribution approximately follows the population probability distribution (Martins and Domingues, 2011).

According to Martins and Domingues (2011), the study of descriptive statistics is essential for empirical research and for solving several practical problems. However, when the main goal is to study a population's variables, the probability distribution is more suitable.

This chapter discusses the concept of discrete and continuous random variables, the main probability distributions for each type of random variable, and also the calculation of the expected value and the variance of each probability distribution.

For discrete random variables, the most common probability distributions are the discrete uniform, Bernoulli, binomial, geometric, negative binomial, hypergeometric, and Poisson. On the other hand, for continuous random variables, we are going to study the uniform, normal, exponential, gamma, chi-square (χ2), Student’s t, and Snedecor’s F distributions.

6.2 Random Variables

As studied in the previous chapter, the set of all possible results of a random experiment is called the sample space. To describe a random experiment, it is convenient to associate numerical values with the elements of the sample space. A random variable is a variable that assumes a single numerical value for each element of the sample space, and this value is determined randomly.

Assume that ɛ is a random experiment and S is the sample space associated with this experiment. A function X that associates a real number X(s) with each element s ∈ S is called a random variable. Random variables can be discrete or continuous.

6.2.1 Discrete Random Variable

A discrete random variable can only take on a countable number of distinct values, usually integer counts. As examples of discrete random variables, we can mention the number of children in a family, the number of employees in a company, or the number of vehicles produced in a certain factory.

6.2.1.1 Expected Value of a Discrete Random Variable

Let X be a discrete random variable that can take on the values {x1, x2, …, xn} with the respective probabilities {p(x1), p(x2), …, p(xn)}. The function that associates each value xi with its probability of occurrence is called the probability function of the random variable X:

p(xi) = P(X = xi) = pi,  i = 1, 2, …, n    (6.1)

where p(xi) ≥ 0 for every xi and Σ_{i=1}^{n} p(xi) = 1.

The expected or average value of X is given by the expression:

E(X) = Σ_{i=1}^{n} xi · P(X = xi) = Σ_{i=1}^{n} xi · pi    (6.2)

Expression (6.2) is similar to the one used for the mean in Chapter 3, in which, instead of probabilities pi, we had relative frequencies Fri. The difference between pi and Fri is that the former corresponds to the values from an assumed theoretical model, and the latter to the variable values observed. Since pi and Fri have the same interpretation, all of the measures and charts presented in Chapter 3, based on the distribution of Fri, have a corresponding one in the distribution of a random variable. The same interpretation is valid for other measures of position and variability, such as the median and the standard deviation (Bussab and Morettin, 2011).

6.2.1.2 Variance of a Discrete Random Variable

The variance of a discrete random variable X is a weighted mean of the squared distances between the values that X can take on and X's expected value, where the weights are the probabilities of the possible values of X. If X assumes the values {x1, x2, …, xn} with the respective probabilities {p1, p2, …, pn}, then its variance is given by:

Var(X) = σ²(X) = E[(X − E(X))²] = Σ_{i=1}^{n} [xi − E(X)]² · pi    (6.3)

In some cases, it is convenient to use the standard deviation of a random variable as a measure of variability. The standard deviation of X is the square root of the variance:

σ(X) = √Var(X)    (6.4)

Example 6.1

Assume that the monthly real estate sales for a certain real estate agent follow the probability distribution seen in Table 6.E.1. Determine the expected value of monthly sales, as well as its variance.

Table 6.E.1

Monthly Real Estate Sales and Their Respective Probabilities
xi (sales):  0     1     2     3
p(xi):       2/10  4/10  3/10  1/10

Solution

The expected value of monthly sales is:

E(X) = 0 × 0.20 + 1 × 0.40 + 2 × 0.30 + 3 × 0.10 = 1.3

The variance can be calculated as:

Var(X) = (0 − 1.3)² × 0.2 + (1 − 1.3)² × 0.4 + (2 − 1.3)² × 0.3 + (3 − 1.3)² × 0.1 = 0.81
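As a quick numerical illustration (ours, not part of the original text), the short Python sketch below reproduces both calculations from Example 6.1; the variable names are our own:

```python
# E(X) and Var(X) for the discrete distribution of Example 6.1.
values = [0, 1, 2, 3]          # x_i (monthly sales)
probs = [0.2, 0.4, 0.3, 0.1]   # p(x_i)

mean = sum(x * p for x, p in zip(values, probs))
var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
print(mean, var)  # 1.3 0.81
```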

6.2.1.3 Cumulative Distribution Function of a Discrete Random Variable

The cumulative distribution function (c.d.f.) of a random variable X, denoted by F(x), corresponds to the sum of the probabilities of the values xi that are less than or equal to x:

F(x) = P(X ≤ x) = Σ_{xi ≤ x} p(xi)    (6.5)

The following properties are valid for the cumulative distribution function of a discrete random variable:

0 ≤ F(x) ≤ 1    (6.6)

lim_{x→+∞} F(x) = 1    (6.7)

lim_{x→−∞} F(x) = 0    (6.8)

a < b ⇒ F(a) ≤ F(b)    (6.9)

Example 6.2

For the data in Example 6.1, calculate F(0.5), F(1), F(2.5), F(3), F(4), and F(−0.5).

Solution

  a) F(0.5) = P(X ≤ 0.5) = 2/10
  b) F(1) = P(X ≤ 1) = 2/10 + 4/10 = 6/10
  c) F(2.5) = P(X ≤ 2.5) = 2/10 + 4/10 + 3/10 = 9/10
  d) F(3) = P(X ≤ 3) = 2/10 + 4/10 + 3/10 + 1/10 = 1
  e) F(4) = P(X ≤ 4) = 1
  f) F(−0.5) = P(X ≤ −0.5) = 0

In short, the cumulative distribution function of random variable X in Example 6.1 is given by:

F(x) = { 0, if x < 0; 2/10, if 0 ≤ x < 1; 6/10, if 1 ≤ x < 2; 9/10, if 2 ≤ x < 3; 1, if x ≥ 3 }

6.2.2 Continuous Random Variable

A continuous random variable can take on several different values in an interval of real numbers. As examples of continuous random variables, we can mention a family’s income, the revenue of a company, or the height of a certain child.

A continuous random variable X is associated to an f(x) function, called a probability density function (p.d.f.) of X, which meets the following condition:

∫_{−∞}^{+∞} f(x) dx = 1,  f(x) ≥ 0    (6.10)

For any a and b, such that − ∞ < a < b < + ∞, the probability of random variable X taking on values within this interval is:

P(a ≤ X ≤ b) = ∫_a^b f(x) dx    (6.11)

which can be graphically represented as shown in Fig. 6.1.

Fig. 6.1 Probability of X assuming values within the interval [a, b].

6.2.2.1 Expected Value of a Continuous Random Variable

The mathematical expected or average value of a continuous random variable X with a probability density function f(x) is given by the expression:

E(X) = ∫_{−∞}^{+∞} x · f(x) dx    (6.12)

6.2.2.2 Variance of a Continuous Random Variable

The variance of a continuous random variable X with a probability density function f(x) is calculated as:

Var(X) = E(X²) − [E(X)]² = ∫_{−∞}^{+∞} (x − E(X))² · f(x) dx    (6.13)

Example 6.3

The probability density function of a continuous random variable X is given by:

f(x) = { 2x, if 0 < x < 1; 0, for any other values }

Calculate E(X) and Var(X).

Solution

E(X) = ∫_0^1 (x · 2x) dx = ∫_0^1 2x² dx = 2/3

E(X²) = ∫_0^1 (x² · 2x) dx = ∫_0^1 2x³ dx = 1/2

Var(X) = E(X²) − [E(X)]² = 1/2 − (2/3)² = 1/18
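For readers who prefer a numerical check, here is a minimal sketch (our own, assuming scipy is available) that reproduces Example 6.3 by numerical integration:

```python
# E(X) and Var(X) of f(x) = 2x on (0, 1) by numerical integration.
from scipy.integrate import quad

f = lambda x: 2 * x
ex, _ = quad(lambda x: x * f(x), 0, 1)        # E(X) = 2/3
ex2, _ = quad(lambda x: x ** 2 * f(x), 0, 1)  # E(X^2) = 1/2
print(ex, ex2 - ex ** 2)  # 0.666..., 0.0555... = 1/18
```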

6.2.2.3 Cumulative Distribution Function of a Continuous Random Variable

As in the discrete case, we can calculate the probabilities associated to a continuous random variable X from a cumulative distribution function.

Cumulative distribution function F(x) of a continuous random variable X with probability density function f(x) is defined by:

F(x) = P(X ≤ x),  −∞ < x < ∞    (6.14)

Expression (6.14) is similar to the one presented for the discrete case, in Expression (6.5). The difference is that, for continuous variables, the cumulative distribution function is a continuous function, without jumps.

From (6.11) we have:

F(x) = ∫_{−∞}^x f(t) dt    (6.15)

As in the discrete case, the following properties for the cumulative distribution function of a continuous random variable are valid:

0 ≤ F(x) ≤ 1    (6.16)

lim_{x→+∞} F(x) = 1    (6.17)

lim_{x→−∞} F(x) = 0    (6.18)

a < b ⇒ F(a) ≤ F(b)    (6.19)

Example 6.4

Once again, let us consider the probability density function in Example 6.3:

f(x) = { 2x, if 0 < x < 1; 0, for any other values }

Calculate the cumulative distribution function of X.

Solution

F(x) = P(X ≤ x) = ∫_{−∞}^x f(t) dt = ∫_0^x 2t dt = { 0, if x ≤ 0; x², if 0 < x ≤ 1; 1, if x > 1 }

6.3 Probability Distributions for Discrete Random Variables

For discrete random variables, the most common probability distributions are the discrete uniform, Bernoulli, binomial, geometric, negative binomial, hypergeometric, and Poisson.

6.3.1 Discrete Uniform Distribution

It is the simplest discrete probability distribution and receives the name uniform because all of the possible values of the random variable have the same probability of occurrence.

A discrete random variable X that takes on the values x1, x2, …, xn has a discrete uniform distribution with parameter n, denoted by X ~ Ud{x1, x2, …, xn}, if its probability function is given by:

P(X = xi) = p(xi) = 1/n,  i = 1, 2, …, n    (6.20)

which may be graphically represented as shown in Fig. 6.2.

Fig. 6.2 Discrete uniform distribution.

The mathematical expected value of X is given by:

E(X) = (1/n) · Σ_{i=1}^{n} xi    (6.21)

The variance of X is calculated as:

Var(X) = (1/n) · [Σ_{i=1}^{n} xi² − (Σ_{i=1}^{n} xi)²/n]    (6.22)

And the cumulative distribution function (c.d.f.) is:

F(x) = P(X ≤ x) = Σ_{xi ≤ x} 1/n = n(x)/n    (6.23)

where n(x) is the number of xi ≤ x, as shown in Fig. 6.3.

Example 6.5

A fair (balanced) die is thrown and random variable X represents the value on the face facing up. Determine the distribution of X, in addition to X's expected value and variance.

Solution

The distribution of X is shown in Table 6.E.2.

Table 6.E.2

Distribution of X
X:     1    2    3    4    5    6    Sum
f(x):  1/6  1/6  1/6  1/6  1/6  1/6  1

Therefore, we have:

E(X) = (1/6) · (1 + 2 + 3 + 4 + 5 + 6) = 3.5

Var(X) = (1/6) · [(1² + 2² + ⋯ + 6²) − (1 + 2 + ⋯ + 6)²/6] = 35/12 = 2.917

Fig. 6.3 Cumulative distribution function.

6.3.2 Bernoulli Distribution

The Bernoulli trial is a random experiment that only offers two possible results, conventionally called success or failure. As an example of a Bernoulli trial, we can mention tossing a coin, whose only possible results are heads and tails.

For a certain Bernoulli trial, we will consider the random variable X that takes on the value 1 in case of success, and 0 in case of failure. The probability of success is represented by p and the probability of failure by (1 − p) or q. The Bernoulli distribution, therefore, provides the probability of success or failure of variable X when carrying out a single experiment. Therefore, we can say that variable X follows a Bernoulli distribution with parameter p, denoted by X ~ Bern(p), if its probability function is given by:

P(X = x) = p(x) = { q = 1 − p, if x = 0; p, if x = 1 }    (6.24)

which can also be represented in the following way:

P(X = x) = p(x) = p^x · (1 − p)^(1−x),  x = 0, 1    (6.25)

The probability function of random variable X is represented in Fig. 6.4.

Fig. 6.4 Probability function of the Bernoulli distribution.

It is easy to see that the expected value of X is:

E(X) = p    (6.26)

with X’s variance being:

Var(X) = p · (1 − p)    (6.27)

Bernoulli’s cumulative distribution function (c.d.f.) is given by:

F(x) = P(X ≤ x) = { 0, if x < 0; 1 − p, if 0 ≤ x < 1; 1, if x ≥ 1 }    (6.28)

which can be represented by Fig. 6.5.

Fig. 6.5 Bernoulli's distribution c.d.f.

It is important to mention that we are going to use all of this knowledge of the Bernoulli distribution when discussing binary logistic regression models (Chapter 14).

Example 6.6

The Interclub Indoor Soccer Cup final match is going to be between teams A and B. Random variable X represents the team that will win the Cup. We know that the probability of team A winning is 0.60. Determine the distribution of X, in addition to X’s expected value and variance.

Solution

Random variable X can only take on two values:

X = { 1, if team A wins; 0, if team B wins }

Since it is a single game, variable X follows a Bernoulli distribution with parameter p = 0.60, denoted by X ~ Bern(0.6), so:

P(X = x) = p(x) = { q = 0.4, if x = 0 (team B); p = 0.6, if x = 1 (team A) }

We have:

E(X) = p = 0.6

Var(X) = p · (1 − p) = 0.6 × 0.4 = 0.24
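A one-line check of Example 6.6 can be done with scipy.stats (our choice of library, not part of the chapter):

```python
# Bernoulli(0.6) from Example 6.6.
from scipy.stats import bernoulli

X = bernoulli(0.6)
print(X.pmf(1), X.pmf(0))  # 0.6 0.4
print(X.mean(), X.var())   # 0.6 0.24
```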

6.3.3 Binomial Distribution

A binomial experiment consists of n independent repetitions of a Bernoulli trial with probability of success p, a probability that remains constant in all repetitions.

Discrete random variable X of a binomial model corresponds to the number of successes (k) in the n repetitions of the experiment. Therefore, X follows a binomial distribution with parameters n and p, denoted by X ~ b(np), if its probability distribution function is given by:

f(k) = P(X = k) = C(n, k) · p^k · (1 − p)^(n−k),  k = 0, 1, …, n    (6.29)

where C(n, k) = n!/[k! · (n − k)!] is the binomial coefficient.

The mean of X is given by:

E(X) = n · p    (6.30)

On the other hand, the variance of X can be expressed as:

Var(X) = n · p · (1 − p)    (6.31)

Note that the mean and the variance of the binomial distribution are equal to the mean and the variance of the Bernoulli distribution multiplied by n, the number of repetitions of the Bernoulli trial.

Fig. 6.6 shows the probability function of the binomial distribution for n = 10 and p varying between 0.3, 0.5, and 0.7.

Fig. 6.6 Probability function of the binomial distribution for n = 10.

From Fig. 6.6, we can see that, for p = 0.5, the probability function is symmetrical around the mean. If p < 0.5, the distribution is positively asymmetrical (skewed to the right), with a higher frequency of smaller values of k and a longer tail to the right. If p > 0.5, the distribution is negatively asymmetrical (skewed to the left), with a higher frequency of larger values of k and a longer tail to the left.

It is important to mention that we are going to use all of this knowledge of the binomial distribution when studying multinomial logistic regression models (Chapter 14).

6.3.3.1 Relationship Between the Binomial and the Bernoulli Distributions

A binomial distribution with parameter n = 1 is equivalent to a Bernoulli distribution:

X ~ b(1, p) ≡ X ~ Bern(p)

Example 6.7

A certain part is produced in a production line. The probability of a part having no defects is 99%. If 30 parts are produced, what is the probability of at least 28 of them being in good condition? Also determine the random variable's mean and variance.

Solution

We have:

X = random variable that represents the number of successes (parts in good condition) in the 30 repetitions

p = 0.99 = probability of a part being in good condition

q = 0.01 = probability of the part being defective

n = 30 repetitions

k = number of successes

The probability of at least 28 parts not being defective is given by:

P(X ≥ 28) = P(X = 28) + P(X = 29) + P(X = 30)

P(X = 28) = [30!/(28! · 2!)] × (99/100)^28 × (1/100)^2 = 0.0328

P(X = 29) = [30!/(29! · 1!)] × (99/100)^29 × (1/100)^1 = 0.224

P(X = 30) = [30!/(30! · 0!)] × (99/100)^30 × (1/100)^0 = 0.7397

P(X ≥ 28) = 0.0328 + 0.224 + 0.7397 = 0.997

The mean of X is expressed as:

E(X) = n · p = 30 × 0.99 = 29.7

And the variance of X is:

Var(X) = n · p · (1 − p) = 30 × 0.99 × 0.01 = 0.297
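The same result can be verified computationally; the sketch below (ours, assuming scipy is available) uses the survival function, since P(X > 27) = P(X ≥ 28):

```python
# Binomial check of Example 6.7: X ~ b(30, 0.99).
from scipy.stats import binom

X = binom(n=30, p=0.99)
print(X.pmf(28) + X.pmf(29) + X.pmf(30))  # ~0.997
print(X.sf(27))                           # same value: P(X > 27) = P(X >= 28)
print(X.mean(), X.var())                  # 29.7 0.297
```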

6.3.4 Geometric Distribution

The geometric distribution, like the binomial, considers successive independent Bernoulli trials, all of them with probability of success p. However, instead of fixing the number of trials, the experiment is repeated until the first success is obtained. The geometric distribution presents two distinct parameterizations, described here.

The first parameterization considers successive independent Bernoulli trials, with probability of success p in each trial, until a success occurs. In this case, we cannot include zero as a possible result, so the support is the set {1, 2, 3, …}. For example, we can consider how many times we toss a coin until we get the first head, or the number of parts manufactured until a defective one is produced, among others.

The second parameterization of the geometric distribution counts the number of failures or unsuccessful attempts before the first success. Since here it is possible to obtain success in the first Bernoulli trial, we include zero as a possible result, so the support is the set {0, 1, 2, 3, …}.

Let X be the random variable that represents the number of trials until the first success. Variable X has a geometric distribution with parameter p, denoted by X ~ Geo(p), if its probability function is given by:

f(x) = P(X = x) = p · (1 − p)^(x−1),  x = 1, 2, 3, …    (6.32)

For the second case, let us consider Y the random variable that represents the number of failures or unsuccessful attempts before the first success. Variable Y has a geometric distribution with parameter p, denoted by Y ~ Geo(p), if its probability function is given by:

f(y) = P(Y = y) = p · (1 − p)^y,  y = 0, 1, 2, …    (6.33)

In both cases, the sequence of probabilities is a geometric progression.

The probability function of variable X is graphically represented in Fig. 6.7, for p = 0.4.

Fig. 6.7 Probability function of variable X with parameter p = 0.4.

The calculation of X’s expected value and variance is:

E(X) = 1/p    (6.34)

Var(X) = (1 − p)/p²    (6.35)

In a similar way, for variable Y, we have:

E(Y) = (1 − p)/p    (6.36)

Var(Y) = (1 − p)/p²    (6.37)

The geometric distribution is the only discrete distribution that has the memoryless property (in the case of continuous distributions, we will see that the exponential distribution also has this property). This means that if an experiment is repeated before the first success, then, given that the first success has not happened yet, the conditional distribution function of the number of additional trials does not depend on the number of failures that occurred until then.

Thus, for any two positive integers s and t, if X is greater than s, then, the probability of X being greater than s + t is equal to the unconditional probability of X being greater than t:

P(X > s + t | X > s) = P(X > t)    (6.38)

Example 6.8

A company manufactures a certain electronic component and, at the end of the process, each component is tested, one by one. Assume that the probability of one electronic component being defective is 0.05. Determine the probability of the first defect being found in the eighth component tested. Also determine the random variable’s expected value and variance.

Solution

We have:

X = random variable that represents the number of electronic components tested until the first defect is found;

p = 0.05 = probability of the component being defective;

q = 0.95 = probability of the component being in good condition.

The probability of the first defect being found in the eighth component tested is given by:

P(X = 8) = 0.05 × (1 − 0.05)^(8−1) = 0.035

The mean of X is expressed as:

E(X) = 1/p = 20

And the variance of X is:

Var(X) = (1 − p)/p² = 0.95/0.0025 = 380
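As a numerical check (scipy assumed available; not part of the original solution), note that scipy's geom uses the first parameterization, counting trials until the first success:

```python
# Geometric check of Example 6.8: trials until the first defect, p = 0.05.
from scipy.stats import geom

X = geom(0.05)            # support {1, 2, 3, ...}
print(X.pmf(8))           # ~0.0349, i.e., ~0.035
print(X.mean(), X.var())  # 20.0 380.0
```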

6.3.5 Negative Binomial Distribution

The negative binomial distribution, also known as the Pascal distribution, carries out successive independent Bernoulli trials (with a constant probability of success in all the trials) until it achieves a prefixed number of successes (k), that is, the experiment continues until k successes are achieved.

Let X be the random variable that represents the number of attempts carried out (Bernoulli trials) until the k-th success is reached. Variable X has a negative binomial distribution, denoted by X ~ nb(kp), if its probability function is given by:

f(x) = P(X = x) = C(x − 1, k − 1) · p^k · (1 − p)^(x−k),  x = k, k + 1, …    (6.39)

The graphical representation of a negative binomial distribution with parameter k = 2 and p = 0.4 can be found in Fig. 6.8.

Fig. 6.8 Probability function of variable X with parameters k = 2 and p = 0.4.

The expected value of X is:

E(X) = k/p    (6.40)

and the variance is:

Var(X) = k · (1 − p)/p²    (6.41)

6.3.5.1 Relationship Between the Negative Binomial and the Binomial Distributions

The negative binomial distribution is related to the binomial distribution. In the binomial, we must set the sample size (number of Bernoulli trials) and observe the number of successes (random variable). In the negative binomial, we must set the number of successes (k) and observe the number of Bernoulli trials necessary to obtain k successes.

6.3.5.2 Relationship Between the Negative Binomial and the Geometric Distributions

The negative binomial distribution with parameter k = 1 is equivalent to the geometric distribution:

X ~ nb(1, p) ≡ X ~ Geo(p)

Moreover, a negative binomial random variable can be seen as a sum of k independent geometric random variables.

It is important to mention that we are going to use all of this knowledge of the negative binomial distribution when studying the regression models for count data (Chapter 15).

Example 6.9

Assume that a student answers three out of every five test questions correctly. Let X be the number of attempts until the twelfth correct answer. Determine the probability of the student having to answer 20 questions in order to get 12 right.

Solution

We have:

k = 12, p = 3/5 = 0.6, q = 2/5 = 0.4

X = number of attempts until the twelfth correct answer, that is, X ~ nb(12; 0.6). Therefore:

f(20) = P(X = 20) = C(20 − 1, 12 − 1) × 0.6^12 × 0.4^(20−12) = 0.1078 = 10.78%
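A sketch of the same computation with scipy (our assumption; note that scipy's nbinom counts the failures before the k-th success, so the number of trials x maps to x − k failures):

```python
# Negative binomial check of Example 6.9: X ~ nb(12, 0.6), P(X = 20).
from scipy.stats import nbinom

k, p, x = 12, 0.6, 20
print(nbinom.pmf(x - k, k, p))  # ~0.1078
```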

6.3.6 Hypergeometric Distribution

The hypergeometric distribution is also related to Bernoulli trials. However, differently from binomial sampling, in which the probability of success is constant, in the hypergeometric distribution the sampling is done without replacement: as the elements are removed from the population to form the sample, the population size diminishes, making the probability of success vary.

The hypergeometric distribution describes the number of successes in a sample with n elements, drawn from a finite population without replacement. For example, let us consider a population with N elements, from which M have a certain attribute. The hypergeometric distribution describes the probability of exactly k elements having such attribute (k successes and n − k failures), in a sample with n distinct elements randomly drawn from the population without replacement.

Let X be a random variable that represents the number of successes obtained from the n elements drawn from the sample. Variable X follows a hypergeometric distribution with parameters N, M, n, denoted by X ~ Hip(NMn), if its probability function is given by:

f(k) = P(X = k) = [C(M, k) · C(N − M, n − k)] / C(N, n),  0 ≤ k ≤ min(M, n)    (6.42)

The graphical representation of a hypergeometric distribution with parameters N = 200, M = 50, and n = 30 can be found in Fig. 6.9.

Fig. 6.9 Probability function of variable X with parameters N = 200, M = 50, and n = 30.

The mean of X can be calculated as:

E(X) = n · M/N    (6.43)

with variance:

Var(X) = n · (M/N) · [(N − M)/N] · [(N − n)/(N − 1)]    (6.44)

6.3.6.1 Approximation of the Hypergeometric Distribution by the Binomial

Let X be a random variable that follows a hypergeometric distribution with parameters N, M, and n, denoted by X ~ Hip(NMn). If the population is large when compared to the sample size, the hypergeometric distribution can be approximated by a binomial distribution with parameters n and p = M/N (probability of success in a single trial):

X ~ Hip(N, M, n) → X ~ b(n, p), with p = M/N

Example 6.10

A gravity-pick machine contains 15 balls, 5 of which are red. Seven balls are drawn randomly, without replacement. Determine:

  a) The probability of exactly two red balls being drawn.
  b) The probability of at least two red balls being drawn.
  c) The expected number of red balls drawn.
  d) The variance of the number of red balls drawn.

Solution

Let X be the random variable that represents the number of red balls drawn. We have N = 15, M = 5, and n = 7.

  a) P(X = 2) = [C(5, 2) · C(10, 5)] / C(15, 7) = 39.16%
  b) P(X ≥ 2) = 1 − P(X < 2) = 1 − [P(X = 0) + P(X = 1)] = 1 − [C(5, 0) · C(10, 7)] / C(15, 7) − [C(5, 1) · C(10, 6)] / C(15, 7) = 81.82%
  c) E(X) = n · M/N = 7 × 5/15 = 2.33
  d) Var(X) = n · (M/N) · [(N − M)/N] · [(N − n)/(N − 1)] = 7 × (5/15) × (10/15) × (8/14) = 0.8889
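The four answers can be checked with scipy's hypergeom (our choice of library; mind its argument order: population size, number of marked elements, number of draws):

```python
# Hypergeometric check of Example 6.10: N = 15, M = 5 red balls, n = 7 draws.
from scipy.stats import hypergeom

X = hypergeom(15, 5, 7)
print(X.pmf(2))           # ~0.3916
print(1 - X.cdf(1))       # P(X >= 2) ~ 0.8182
print(X.mean(), X.var())  # ~2.333 ~0.8889
```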

6.3.7 Poisson Distribution

The Poisson distribution is used to model the occurrence of rare events, with a very low probability of success (p → 0), in a certain time interval or space.

Unlike the binomial model, which provides the probability of the number of successes in a discrete interval (n repetitions of an experiment), the Poisson model provides the probability of the number of successes in a certain continuous interval (time, area, among other possibilities). As examples of variables that follow a Poisson distribution, we can mention the number of customers that arrive in a line per unit of time, the number of defects per unit of time, the number of accidents per unit of area, among others. Note that the measurement units (time and area, in these situations) are continuous, but the random variable (number of occurrences) is discrete.

The Poisson distribution presents the following hypotheses:

  (i) Events defined in nonoverlapping intervals are independent;
  (ii) In intervals with the same length, the probabilities that the same number of successes will occur are equal;
  (iii) In very small intervals, the probability that more than one success will occur is insignificant;
  (iv) In very small intervals, the probability of one success is proportional to the length of the interval.

Let us consider a discrete random variable X that represents the number of successes (k) in a certain unit of time, unit of area, among other possibilities. Random variable X, with parameter λ ≥ 0, follows a Poisson distribution, denoted by X ~ Poisson (λ), if its probability function is given by:

f(k) = P(X = k) = (e^(−λ) · λ^k) / k!,  k = 0, 1, 2, …    (6.45)

where:

  • e: base of the Napierian (or natural) logarithm, and e ≅ 2.718282;
  • λ: estimated average rate of occurrence of the event we are interested in for a certain exposition (time interval, area, among other examples).

Fig. 6.10 shows the Poisson distribution probability function for λ = 1, 3, and 6.

Fig. 6.10 Poisson probability function.

In the Poisson distribution, the mean is equal to the variance:

E(X) = Var(X) = λ    (6.46)

It is important to mention that we are going to use all of this knowledge of the Poisson distribution when studying the regression models for count data (Chapter 15).

6.3.7.1 Approximation of the Binomial by the Poisson Distribution

Let X be a random variable that follows a binomial distribution with parameters n and p, denoted by X ~ b(np). When the number of repetitions of a random experiment is very high (n → ∞) and the probability of success is very low (p → 0), such that n. p = λ = constant, the binomial distribution gets closer to the Poisson distribution:

X ~ b(n, p) → X ~ Poisson(λ), with λ = n · p

Example 6.11

Assume that the number of customers that arrive at a bank follows a Poisson distribution. We verified that, on average, 12 customers arrive at the bank per minute. Calculate: (a) the probability of 10 customers arriving in the next minute; (b) the probability of 8 customers arriving in the next minute; (c) X's mean and variance.

Solution

We have λ = 12 customers per minute.

  a) P(X = 10) = (e^(−12) × 12^10) / 10! = 0.1048
  b) P(X = 8) = (e^(−12) × 12^8) / 8! = 0.0655
  c) E(X) = Var(X) = λ = 12

Example 6.12

A certain part is produced in a production line. The probability of the part being defective is 0.01. If 300 parts are produced, what is the probability of none of them being defective?

Solution

This example is characterized by a binomial distribution. Since the number of repetitions is high and the probability of success is low, the binomial distribution can be approximated by a Poisson distribution with parameter λ = n. p = 300 × 0.01 = 3, such that:

P(X = 0) = (e^(−3) × 3^0) / 0! = 0.05
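The quality of this approximation can be inspected directly; the sketch below (ours, assuming scipy) compares the exact binomial value with the Poisson value:

```python
# Example 6.12: exact binomial versus Poisson approximation at k = 0.
from scipy.stats import binom, poisson

print(binom.pmf(0, n=300, p=0.01))  # exact: ~0.0490
print(poisson.pmf(0, mu=3))         # approximation: e^-3 ~ 0.0498
```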

6.4 Probability Distributions for Continuous Random Variables

For continuous random variables, we are going to study the uniform, normal, exponential, gamma, chi-square (χ2), Student's t, and Snedecor's F distributions.

6.4.1 Uniform Distribution

The uniform model is the simplest model for continuous random variables. It is used to model the occurrence of events whose probability is constant in intervals with the same range.

A random variable X follows a uniform distribution in the interval [a, b], denoted by X ~ U[a, b], if its probability density function is given by:

f(x) = { 1/(b − a), if a ≤ x ≤ b; 0, otherwise }    (6.47)

which can be graphically represented as seen in Fig. 6.11.

Fig. 6.11 Uniform distribution in the interval [a, b].

The expected value of X is calculated by the expression:

E(X) = ∫_a^b x · 1/(b − a) dx = (a + b)/2    (6.48)

Table 6.1 presents a summary of the discrete distributions studied in Section 6.3, including each random variable's probability function, the distribution parameters, and the calculation of X's expected value and variance.

Table 6.1

Models for Discrete Variables
• Discrete uniform — P(X = xi) = 1/n; parameter: n; E(X) = (1/n) · Σ_{i=1}^{n} xi; Var(X) = (1/n) · [Σ_{i=1}^{n} xi² − (Σ_{i=1}^{n} xi)²/n]
• Bernoulli — P(X = x) = p^x · (1 − p)^(1−x), x = 0, 1; parameter: p; E(X) = p; Var(X) = p · (1 − p)
• Binomial — P(X = k) = C(n, k) · p^k · (1 − p)^(n−k), k = 0, 1, …, n; parameters: n, p; E(X) = n · p; Var(X) = n · p · (1 − p)
• Geometric — P(X = x) = p · (1 − p)^(x−1), x = 1, 2, 3, … (or P(Y = y) = p · (1 − p)^y, y = 0, 1, 2, …); parameter: p; E(X) = 1/p, E(Y) = (1 − p)/p; Var(X) = Var(Y) = (1 − p)/p²
• Negative binomial — P(X = x) = C(x − 1, k − 1) · p^k · (1 − p)^(x−k), x = k, k + 1, …; parameters: k, p; E(X) = k/p; Var(X) = k · (1 − p)/p²
• Hypergeometric — P(X = k) = [C(M, k) · C(N − M, n − k)] / C(N, n), 0 ≤ k ≤ min(M, n); parameters: N, M, n; E(X) = n · M/N; Var(X) = n · (M/N) · [(N − M)/N] · [(N − n)/(N − 1)]
• Poisson — P(X = k) = (e^(−λ) · λ^k) / k!, k = 0, 1, 2, …; parameter: λ; E(X) = λ; Var(X) = λ

And the variance of X is:

Var(X) = E(X²) − [E(X)]² = (b − a)²/12    (6.49)

On the other hand, the cumulative distribution function of the uniform distribution is given by:

F(x) = P(X ≤ x) = ∫_{−∞}^x f(t) dt = { 0, if x < a; (x − a)/(b − a), if a ≤ x < b; 1, if x ≥ b }    (6.50)

Example 6.13

Random variable X represents the time a bank's ATMs are used (in minutes), and it follows a uniform distribution in the interval [1, 5]. Determine:

  a) P(X < 2)
  b) P(X > 3)
  c) P(3 < X < 4)
  d) E(X)
  e) Var(X)

Solution

  a) P(X < 2) = F(2) = (2 − 1)/(5 − 1) = 1/4
  b) P(X > 3) = 1 − P(X < 3) = 1 − F(3) = 1 − (3 − 1)/(5 − 1) = 1/2
  c) P(3 < X < 4) = F(4) − F(3) = (4 − 1)/(5 − 1) − (3 − 1)/(5 − 1) = 1/4
  d) E(X) = (1 + 5)/2 = 3
  e) Var(X) = (5 − 1)²/12 = 4/3
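A minimal sketch of Example 6.13 with scipy (our assumption; scipy parameterizes the uniform by loc = a and scale = b − a):

```python
# Continuous uniform check of Example 6.13: X ~ U[1, 5].
from scipy.stats import uniform

X = uniform(loc=1, scale=4)
print(X.cdf(2))           # P(X < 2) = 0.25
print(1 - X.cdf(3))       # P(X > 3) = 0.5
print(X.mean(), X.var())  # 3.0 1.333...
```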

6.4.2 Normal Distribution

The normal distribution, also known as Gaussian, is the most widely used and important probability distribution, because it allows us to model a myriad of natural phenomena, studies of human behavior, industrial processes, among others, in addition to allowing approximations for calculating the probabilities of many other random variables.

A random variable X, with mean μ ∈ ℝ and standard deviation σ > 0, follows a normal or Gaussian distribution, denoted by X ~ N(μ, σ²), if its probability density function is given by:

f(x) = 1/(σ · √(2π)) · e^(−(x − μ)²/(2σ²)),  −∞ < x < +∞    (6.51)

whose graphical representation is shown in Fig. 6.12.

Fig. 6.12 Normal distribution.

Fig. 6.13 shows the area under the normal curve based on the number of standard deviations.

Fig. 6.13 Area under the normal curve.

From Fig. 6.13, we can see that the curve has the shape of a bell and is symmetrical around parameter μ, and the smaller parameter σ is, the more concentrated the curve is around μ.

Therefore, in a normal distribution, the mean of X is:

E(X) = μ    (6.52)

And the variance of X is:

Var(X) = σ²    (6.53)

In order to obtain the standard normal distribution or the reduced normal distribution, the original variable X is transformed into a new random variable Z, with mean 0 (μ = 0) and variance 1 (σ2 = 1):

Z = (X − μ)/σ ~ N(0, 1)    (6.54)

Score Z represents the number of standard deviations that separates a random variable X from the mean.

This kind of transformation, known as Z-scores, is broadly used to standardize variables, because it does not change the shape of the original variable's normal distribution, and it generates a new variable with mean 0 and variance 1. Therefore, when many variables with different orders of magnitude are being used in a certain type of modeling, the Z-scores standardization process will make all the new standardized variables have the same distribution, with equal orders of magnitude (Fávero et al., 2009).
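A minimal sketch of the Z-scores transformation (illustrative data of our own, assuming numpy is available):

```python
# Z-score standardization: subtract the mean, divide by the standard deviation.
import numpy as np

x = np.array([12.0, 15.0, 9.0, 21.0, 18.0])
z = (x - x.mean()) / x.std()
print(round(z.mean(), 10), round(z.var(), 10))  # 0.0 1.0
```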

The probability density function of random variable Z is reduced to:

f(z) = 1/√(2π) · e^(−z²/2),  −∞ < z < +∞    (6.55)

whose graphical representation is shown in Fig. 6.14.

Fig. 6.14 Standard normal distribution.

The cumulative distribution function F(xc) of a normal random variable X is obtained by integrating Expression (6.51) from −∞ to xc, that is:

F(xc) = P(X ≤ xc) = ∫_{−∞}^{xc} f(x) dx    (6.56)

Integral (6.56) corresponds to the area under f(x) from −∞ to xc, as shown in Fig. 6.15.

Fig. 6.15 Cumulative normal distribution.

In the specific case of the standard normal distribution, the cumulative distribution function is:

F(zc) = P(Z ≤ zc) = ∫_{−∞}^{zc} f(z) dz = (1/√(2π)) · ∫_{−∞}^{zc} e^(−z²/2) dz    (6.57)

For a random variable Z with a standard normal distribution, let us suppose that the main goal now is to calculate P(Z > zc). So, we have:

P(Z > zc) = ∫_{zc}^{+∞} f(z) dz = (1/√(2π)) · ∫_{zc}^{+∞} e^(−z²/2) dz    (6.58)

Fig. 6.16 represents this probability.

Fig. 6.16 Graphical representation of P(Z > zc) for a standardized normal random variable.

Table E in the Appendix shows the value of P(Z > zc), that is, the cumulative probability from zc to +∞ (the gray area under the normal curve).

6.4.2.1 Approximation of the Binomial by the Normal Distribution

Let X be a random variable that has a binomial distribution with parameters n and p, denoted by X ~ b(np). As the average number of successes and the average number of failures tend to infinity (n. p → ∞ and n. (1 − p) → ∞), the binomial distribution gets closer to a normal one with mean μ = n. p and variance σ2 = n. p. (1 − p):

X ~ b(n, p) → X ~ N(μ, σ²), with μ = n · p and σ² = n · p · (1 − p)

Some authors admit that the approximation of the binomial by the normal distribution is good when n. p > 5 and n. (1 − p) > 5, or when n. p. (1 − p) ≥ 3. A better and more conservative rule requires n. p > 10 and n. (1 − p) > 10.

However, since a discrete distribution is being approximated by a continuous one, greater accuracy is obtained by carrying out a continuity correction, which consists in, for example, transforming P(X = x) into the interval probability P(x − 0.5 < X < x + 0.5).
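For instance, a sketch of this correction (our own illustration with n = 40 and p = 0.5, scipy assumed) approximates P(X = 22) by P(21.5 < Y < 22.5):

```python
# Continuity-corrected normal approximation of a binomial probability.
from scipy.stats import binom, norm

n, p = 40, 0.5
Y = norm(n * p, (n * p * (1 - p)) ** 0.5)  # N(np, np(1-p))
print(binom.pmf(22, n, p))                 # exact: ~0.1031
print(Y.cdf(22.5) - Y.cdf(21.5))           # approximation: ~0.1031
```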

6.4.2.2 Approximation of the Poisson by the Normal Distribution

Analogous to the binomial distribution, the Poisson distribution can also be approximated by a normal one. Let X be a random variable that follows a Poisson distribution with parameter λ, denoted by X ~ Poisson(λ). As λ → ∞, the Poisson distribution gets closer to a normal one with mean μ = λ and variance σ² = λ:

X ~ Poisson(λ) → X ~ N(μ, σ²), with μ = λ and σ² = λ

In general, we admit that the approximation of the Poisson distribution by the normal distribution is good when λ > 10.

Once again, we recommend using the continuity correction x − 0.5 and x + 0.5.

Example 6.14

We know that the average thickness of the hose storage units produced in a factory (X) follows a normal distribution with a mean of 3 mm and a standard deviation of 0.4 mm. Determine:

  a) P(X > 4.1)
  b) P(X > 3)
  c) P(X ≤ 3)
  d) P(X ≤ 3.5)
  e) P(X < 2.3)
  f) P(2 ≤ X ≤ 3.8)

Solution

The probabilities will be calculated based on Table E in the Appendix, which provides the value of P(Z > zc):

  a) P(X > 4.1) = P(Z > (4.1 − 3)/0.4) = P(Z > 2.75) = 0.0030
  b) P(X > 3) = P(Z > (3 − 3)/0.4) = P(Z > 0) = 0.5
  c) P(X ≤ 3) = P(Z ≤ 0) = 0.5
  d) P(X ≤ 3.5) = P(Z ≤ (3.5 − 3)/0.4) = P(Z ≤ 1.25) = 1 − P(Z > 1.25) = 1 − 0.1056 = 0.8944
  e) P(X < 2.3) = P(Z < (2.3 − 3)/0.4) = P(Z < −1.75) = P(Z > 1.75) = 0.04
  f) P(2 ≤ X ≤ 3.8) = P((2 − 3)/0.4 ≤ Z ≤ (3.8 − 3)/0.4) = P(−2.5 ≤ Z ≤ 2) = [1 − P(Z > 2)] − P(Z > 2.5) = [1 − 0.0228] − 0.0062 = 0.971
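Instead of the Z table, the same probabilities can be obtained computationally; a sketch with scipy (our assumption):

```python
# Example 6.14 via scipy.stats.norm: X ~ N(3, 0.4^2).
from scipy.stats import norm

X = norm(loc=3, scale=0.4)
print(X.sf(4.1))              # P(X > 4.1) ~ 0.0030
print(X.cdf(3.5))             # P(X <= 3.5) ~ 0.8944
print(X.cdf(3.8) - X.cdf(2))  # P(2 <= X <= 3.8) ~ 0.971
```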

6.4.3 Exponential Distribution

Another important distribution, with applications in system reliability and in queueing theory, is the exponential distribution. Its main characteristic is the memoryless property: the future lifetime (t) of a certain object has the same distribution, regardless of its past lifetime (s), for any s, t > 0, as shown in Expression (6.38), repeated below:

P(X > s + t | X > s) = P(X > t)

A continuous random variable X has an exponential distribution with parameter λ > 0, denoted by X ~ exp(λ), if its probability density function is given by:

f(x) = { λ · e^(−λx), if x ≥ 0; 0, if x < 0 }    (6.59)

Fig. 6.17 represents the probability density function of the exponential distribution for parameters λ = 0.5, λ = 1, and λ = 2.

Fig. 6.17 Exponential distribution for λ = 0.5, λ = 1, and λ = 2.

We can see that the exponential distribution is positively asymmetrical (skewed to the right), with a higher frequency for smaller values of x and a longer tail to the right. The density function assumes the value λ when x = 0 and tends to zero as x → ∞. The higher the value of λ, the more quickly the function tends to zero.

In the exponential distribution, the mean of X is:

E(X) = 1/λ    (6.60)

and the variance of X is:

Var(X) = 1/λ²    (6.61)

And the cumulative distribution function F(x) is given by:

F(x) = P(X ≤ x) = ∫_0^x λ · e^(−λt) dt = { 1 − e^(−λx), if x ≥ 0; 0, if x < 0 }    (6.62)

From (6.62) we can conclude that:

P(X > x) = e^(−λx)    (6.63)

In system reliability, random variable X represents the lifetime, that is, the time during which a component or system remains operational, outside the interval for repairs and above a specified limit (yield, pressure, among other examples). On the other hand, parameter λ represents the failure rate, that is, the number of components or systems that failed in a pre-established time interval:

λ = (number of failures) / (operation time)    (6.64)

The main measures of reliability are: (a) Mean time to failure (MTTF) and (b) Mean time between failures (MTBF). Mathematically, MTTF and MTBF are equal to the mean of the exponential distribution and represent the mean lifetime. Thus, the failure rate can also be calculated as:

λ = 1/MTTF (or 1/MTBF)    (6.65)

In the queueing theory, random variable X represents the mean waiting time until the next arrival (mean time between two customers’ arrivals). On the other hand, parameter λ represents the mean arrivals rate, that is, the expected number of arrivals per unit of time.

6.4.3.1 Relationship Between the Poisson and the Exponential Distribution

If the number of occurrences in a counting process follows a Poisson distribution (λ), then, the random variables “time until the first occurrence” and “time between any successive occurrences” of the aforementioned process have an exp(λ) distribution.

Example 6.15

The life span of an electronic component follows an exponential distribution with a mean lifetime of 120 hours. Determine:

  a) The probability of a component failing in the first 100 hours of use;
  b) The probability of a component lasting more than 150 hours.

Solution

Assume that λ = 1/120 and X ~ exp(1/120). Therefore:

  a) P(X ≤ 100) = ∫_0^100 (1/120) · e^(−x/120) dx = 1 − e^(−100/120) = 0.5654
  b) P(X > 150) = ∫_150^∞ (1/120) · e^(−x/120) dx = e^(−150/120) = 0.2865
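The two integrals correspond to the exponential c.d.f. and survival function; a sketch with scipy (ours; scipy uses the parameterization scale = 1/λ):

```python
# Exponential check of Example 6.15: X ~ exp(1/120).
from scipy.stats import expon

X = expon(scale=120)
print(X.cdf(100))  # P(X <= 100) ~ 0.5654
print(X.sf(150))   # P(X > 150) ~ 0.2865
```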

6.4.4 Gamma Distribution

The gamma distribution is one of the most general distributions; others, such as the Erlang, the exponential, and the chi-square (χ2), are particular cases of it. Like the exponential distribution, it is widely used in system reliability. The gamma distribution also has applications in physical phenomena, meteorological processes, insurance risk theory, and economic theory.

A continuous random variable X has a gamma distribution with parameters α > 0 and λ > 0, denoted by X ~ Gamma(αλ), if its probability density function is given by:

f(x) = { λ^α/Γ(α) · x^(α−1) · e^(−λx), if x ≥ 0; 0, if x < 0 }    (6.66)

where Γ(α) is the Gamma function, given by:

Γ(α) = ∫_0^∞ e^(−x) · x^(α−1) dx,  α > 0    (6.67)

The gamma probability density function for some values of α and λ is represented in Fig. 6.18.

Fig. 6.18 Density function of x for some values of α and λ. (Source: Navidi, W., 2012. Probabilidade e estatística para ciências exatas. Bookman, Porto Alegre.)

We can see that the gamma distribution is positively asymmetrical (skewed to the right), with a higher frequency for smaller values of x and a longer tail to the right. However, as α tends to infinity, the distribution becomes symmetrical. We can also observe that, when α = 1, the gamma distribution is equal to the exponential. Moreover, the greater the value of λ, the more quickly the density function tends to zero.

The expected value of X can be calculated as:

E(X) = α/λ    (6.68)

On the other hand, the variance of X is given by:

Var(X) = α/λ²    (6.69)

The cumulative distribution function is:

F(x) = P(X ≤ x) = ∫_0^x f(t) dt = λ^α/Γ(α) · ∫_0^x t^(α−1) · e^(−λt) dt    (6.70)

6.4.4.1 Special Cases of the Gamma Distribution

A gamma distribution with parameter α, a positive integer, is called an Erlang distribution, such that:

If α is a positive integer ⇒ X ~ Gamma(αλ) ≡ X ~ Erlang(αλ)

As mentioned before, a gamma distribution with parameter α = 1 is called an exponential distribution:

If α = 1 ⇒ X ~ Gamma(αλ) ≡ X ~ exp(λ)

Finally, a gamma distribution with parameters α = n/2 and λ = 1/2 is called a chi-square distribution with ν = n degrees of freedom:

If α = n/2 and λ = 1/2 ⇒ X ~ Gamma(n/2, 1/2) ≡ X ~ χ²ν, with ν = n
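These special cases are easy to verify numerically; the sketch below (ours, assuming scipy, which parameterizes the gamma by a = α and scale = 1/λ) evaluates the densities at an arbitrary point:

```python
# Gamma special cases: alpha = 1 -> exponential; alpha = n/2, lambda = 1/2 -> chi-square.
from scipy.stats import gamma, expon, chi2

lam = 0.5
print(gamma.pdf(2.0, a=1, scale=1 / lam), expon.pdf(2.0, scale=1 / lam))  # equal
print(gamma.pdf(2.0, a=7 / 2, scale=2), chi2.pdf(2.0, df=7))              # equal
```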

6.4.4.2 Relationship Between the Poisson and the Gamma Distribution

In the Poisson distribution, we try to determine the number of occurrences of a certain event within a fixed period. On the other hand, the gamma distribution determines the time necessary to obtain a specified number of occurrences of the event.

6.4.5 Chi-Square Distribution

A continuous random variable X has a chi-square distribution with ν degrees of freedom, denoted by X ~ χν2, if its probability density function is given by:

f(x) = { 1/(2^(ν/2) · Γ(ν/2)) · x^(ν/2 − 1) · e^(−x/2), if x > 0; 0, if x < 0 }    (6.71)

where Γ(α) = ∫_0^∞ e^(−x) · x^(α−1) dx

The χ2 distribution can be constructed from the normal distribution. Consider Z1, Z2, …, Zν independent random variables, each with a standard normal distribution (mean 0 and standard deviation 1). Then, the sum of the squares of these ν random variables follows a chi-square distribution with ν degrees of freedom:

χ²ν = Z1² + Z2² + ⋯ + Zν²    (6.72)

The χ2 distribution has a positively asymmetrical curve. The graphical representation of the χ2 distribution, for different values of ν, is shown in Fig. 6.19.

Fig. 6.19 χ2 distribution for different values of ν.

Since the χ2 distribution comes from the sum of the squares of ν random variables that have a normal distribution with mean zero and variance 1, for high values of ν, the χ2 distribution becomes more similar to a normal distribution, as can be observed in Fig. 6.19 (Fávero et al., 2009). We can also see that the χ2 distribution with 2 degrees of freedom is equivalent to an exponential distribution with λ = 1/2.

The expected value of X can be calculated as:

E(X) = ν    (6.73)

On the other hand, the variance of X is given by:

Var(X) = 2ν    (6.74)

The cumulative distribution function is:

F(xc) = P(X ≤ xc) = ∫_0^{xc} f(x) dx = γ(ν/2, xc/2) / Γ(ν/2)    (6.75)

where γ(a, xc) = ∫_0^{xc} x^(a−1) · e^(−x) dx is the lower incomplete gamma function

If the main goal is to calculate P(X > xc), we have:

P(X > xc) = ∫_{xc}^{+∞} f(x) dx    (6.76)

which can be represented by Fig. 6.20.

Fig. 6.20 Graphical representation of P(X > xc) for a random variable with a χ2 distribution.

The χ2 distribution has several applications in statistical inference. Due to its importance, the χ2 distribution can be found in Table D in the Appendix, for different values of parameter ν. This table provides the critical values of xc such that P(X > xc) = α. In other words, we can obtain the probabilities and the cumulative distribution function for different values of x of random variable X.

Example 6.16

Assume that random variable X follows a chi-square distribution (χ2) with 13 degrees of freedom. Determine:

  a) P(X > 5)
  b) The x value such that P(X ≤ x) = 0.95
  c) The x value such that P(X > x) = 0.95

Solution

Through the χ2 distribution table (Table D in the Appendix), for ν = 13, we have:

  a) P(X > 5) = 97.5%
  b) 22.362
  c) 5.892
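The same three values can be read off scipy's chi2 instead of Table D (a sketch of our own):

```python
# Example 6.16 via scipy.stats.chi2, nu = 13 degrees of freedom.
from scipy.stats import chi2

X = chi2(df=13)
print(X.sf(5))      # P(X > 5) ~ 0.975
print(X.ppf(0.95))  # x such that P(X <= x) = 0.95 -> ~22.362
print(X.isf(0.95))  # x such that P(X > x) = 0.95 -> ~5.892
```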

6.4.6 Student’s t Distribution

Student’s t distribution was developed by William Sealy Gosset, and it is one of the main probability distributions, with several applications in statistical inference.

We are going to assume a random variable Z that has a normal distribution with mean 0 and standard deviation 1, and a random variable X with a chi-square distribution with ν degrees of freedom, such that Z and X are independent. The continuous random variable T is then defined as:

T = Z / √(X/ν)    (6.77)

We can say that variable T follows Student’s t distribution with ν degrees of freedom, denoted by T ~ tν, if its probability density function is given by:

f(t) = Γ((ν + 1)/2) / [Γ(ν/2) · √(πν)] · (1 + t²/ν)^(−(ν+1)/2),  −∞ < t < ∞    (6.78)

where Γ(α) = ∫_0^∞ e^(−x) · x^(α−1) dx

Fig. 6.21 shows the behavior of Student’s t distribution probability density function for different degrees of freedom ν, in comparison to the standardized normal distribution.

Fig. 6.21 Student's t distribution p.d.f. for different values of ν and comparison to the standardized normal distribution.

Note that Student’s t distribution is symmetrical around the mean, with the shape of a bell, and is similar to a standard normal distribution. However, it has wider tails, which can generate more extreme values than a normal distribution.

Parameter ν (number of degrees of freedom) defines and characterizes Student’s t distribution’s shape. The greater the value of ν, the more Student’s t distribution gets closer to a standardized normal distribution.

The expected value of T is given by:

E(T) = 0    (6.79)

On the other hand, the variance of T can be calculated as:

Var(T) = ν/(ν − 2),  ν > 2    (6.80)

The cumulative distribution function is:

F(tc) = P(T ≤ tc) = ∫_{−∞}^{tc} f(t) dt    (6.81)

If the main goal is to calculate P(T > tc), we have:

P(T > tc) = ∫_{tc}^{+∞} f(t) dt    (6.82)

as shown in Fig. 6.22.

Fig. 6.22 Graphical representation of Student's t distribution.

Just as the normal and chi-square (χ2) distributions, Student's t distribution has several applications in statistical inference, and there is a table for obtaining the probabilities, based on different values of parameter ν (Table B in the Appendix). This table provides the critical values of tc such that P(T > tc) = α. In other words, we can obtain the probabilities and the cumulative distribution function for different values of t of random variable T.

We are going to use Student's t distribution when studying simple and multiple regression models (Chapter 13).

Example 6.17

Assume that random variable T follows Student’s t distribution with 7 degrees of freedom. Determine:

  a) P(T > 3.5)
  b) P(T < 3)
  c) P(T < − 0.711)
  d) The t value such that P(T ≤ t) = 0.95
  e) The t value such that P(T > t) = 0.10

Solution

  a) 0.5%
  b) 99%
  c) 25%
  d) 1.895
  e) 1.415
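A sketch reproducing these answers with scipy's t distribution (our assumption):

```python
# Example 6.17 via scipy.stats.t, nu = 7 degrees of freedom.
from scipy.stats import t

T = t(df=7)
print(T.sf(3.5))      # P(T > 3.5) ~ 0.005
print(T.cdf(3))       # P(T < 3) ~ 0.99
print(T.cdf(-0.711))  # P(T < -0.711) ~ 0.25
print(T.ppf(0.95))    # ~1.895
print(T.isf(0.10))    # ~1.415
```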

6.4.7 Snedecor’s F Distribution

Snedecor's F distribution, also known as Fisher’s distribution, is frequently used in tests associated to the analysis of variance (ANOVA), to compare the means of more than two populations.

Let us consider continuous random variables Y1 and Y2, such that:

  •  Y1 and Y2 are independent;
  •  Y1 follows a chi-square distribution with ν1 degrees of freedom, denoted by Y1 ~ χν12;
  •  Y2 follows a chi-square distribution with ν2 degrees of freedom, denoted by Y2 ~ χν22.

We are going to define a new continuous random variable X such that:

X = (Y1/ν1) / (Y2/ν2)    (6.83)

So, we say that X has a Snedecor’s F distribution with ν1 and ν2 degrees of freedom, denoted by X ~ Fν1, ν2, if its probability density function is given by:

f(x) = [Γ((ν1 + ν2)/2) · (ν1/ν2)^(ν1/2) · x^(ν1/2 − 1)] / {Γ(ν1/2) · Γ(ν2/2) · [(ν1/ν2) · x + 1]^((ν1 + ν2)/2)},  x > 0    (6.84)

where

Γ(α) = ∫_0^∞ e^(−x) · x^(α−1) dx

Fig. 6.23 shows the behavior of Snedecor’s F distribution probability density function, for different values of ν1 and ν2.

Fig. 6.23 Probability density function for F4, 12 and F30, 30.

We can see that Snedecor's F distribution is positively asymmetrical (skewed to the right), with a higher frequency for smaller values of x and a longer tail to the right. However, as ν1 and ν2 tend to infinity, the distribution becomes symmetrical.

The expected value of X is calculated as:

E(X) = ν2/(ν2 − 2),  for ν2 > 2    (6.85)

On the other hand, the variance of X is given by:

Var(X) = [2 · ν2² · (ν1 + ν2 − 2)] / [ν1 · (ν2 − 4) · (ν2 − 2)²],  for ν2 > 4    (6.86)

Just as the normal, χ2, and Student's t distributions, Snedecor's F distribution has several applications in statistical inference, and there is a table from which we can obtain the probabilities and the cumulative distribution function, based on different values of parameters ν1 and ν2 (Table A in the Appendix). This table provides the critical values of Fc such that P(X > Fc) = α (Fig. 6.24).

Fig. 6.24 Critical values of Snedecor's F distribution.

We are going to use Snedecor’s F distribution when studying simple and multiple regression models (Chapter 13).

6.4.7.1 Relationship Between Student’s t and Snedecor’s F Distribution

Let us consider a random variable T that follows Student's t distribution with ν degrees of freedom. Then, the square of T follows Snedecor's F distribution with ν1 = 1 and ν2 = ν degrees of freedom, as shown by Fávero et al. (2009). Thus:

If T ~ tν, then T² ~ F1, ν

Example 6.18

Assume that random variable X follows Snedecor’s F distribution with ν1 = 6 degrees of freedom in the numerator, and ν2 = 12 degrees of freedom in the denominator, that is, X ~ F6, 12. Determine:

  a) P(X > 3)
  b) F6, 12 with α = 10%
  c) The x value such that P(X ≤ x) = 0.975

Solution

Through Snedecor’s F distribution table (Table A in the Appendix), for ν1 = 6 and ν2 = 12, we have:

  a) P(X > 3) = 5%
  b) 2.33
  c) 3.73
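And a final sketch with scipy's f distribution (ours, not part of the chapter):

```python
# Example 6.18 via scipy.stats.f, nu1 = 6 and nu2 = 12 degrees of freedom.
from scipy.stats import f

X = f(dfn=6, dfd=12)
print(X.sf(3))       # P(X > 3) ~ 0.05
print(X.isf(0.10))   # critical value for alpha = 10% -> ~2.33
print(X.ppf(0.975))  # x such that P(X <= x) = 0.975 -> ~3.73
```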

Table 6.2 shows a summary of the continuous distributions studied in this section, including each random variable's probability density function, the distribution parameters, and the calculation of X's expected value and variance.

Table 6.2

Models for Continuous Variables
• Uniform — f(x) = 1/(b − a), a ≤ x ≤ b; parameters: a, b; E(X) = (a + b)/2; Var(X) = (b − a)²/12
• Normal — f(x) = 1/(σ · √(2π)) · e^(−(x − μ)²/(2σ²)), −∞ < x < +∞; parameters: μ, σ; E(X) = μ; Var(X) = σ²
• Exponential — f(x) = λ · e^(−λx), x ≥ 0; parameter: λ; E(X) = 1/λ; Var(X) = 1/λ²
• Gamma — f(x) = λ^α/Γ(α) · x^(α−1) · e^(−λx), x ≥ 0; parameters: α, λ; E(X) = α/λ; Var(X) = α/λ²
• Chi-square (χ2) — f(x) = 1/(2^(ν/2) · Γ(ν/2)) · x^(ν/2 − 1) · e^(−x/2), x > 0; parameter: ν; E(X) = ν; Var(X) = 2ν
• Student's t — f(t) = Γ((ν + 1)/2)/[Γ(ν/2) · √(πν)] · (1 + t²/ν)^(−(ν+1)/2), −∞ < t < ∞; parameter: ν; E(T) = 0; Var(T) = ν/(ν − 2)
• Snedecor's F — f(x) = [Γ((ν1 + ν2)/2) · (ν1/ν2)^(ν1/2) · x^(ν1/2 − 1)]/{Γ(ν1/2) · Γ(ν2/2) · [(ν1/ν2) · x + 1]^((ν1 + ν2)/2)}, x > 0; parameters: ν1, ν2; E(X) = ν2/(ν2 − 2); Var(X) = [2 · ν2² · (ν1 + ν2 − 2)]/[ν1 · (ν2 − 4) · (ν2 − 2)²]

6.5 Final Remarks

This chapter discussed the main probability distributions used in statistical inference, including the distributions for discrete random variables (discrete uniform, Bernoulli, binomial, geometric, negative binomial, hypergeometric, and Poisson) and for continuous random variables (uniform, normal, exponential, gamma, chi-square (χ2), Student's t, and Snedecor’s F).

When characterizing probability distributions, it is extremely important to use measures that indicate the most relevant aspects of the distribution, such as measures of position (mean, median, and mode), measures of dispersion (variance and standard deviation), and measures of skewness and kurtosis.

Understanding the concepts related to probability and to probability distributions helps the researcher in the study of topics related to statistical inference, including parametric and nonparametric hypothesis tests, multivariate analysis through exploratory techniques, and the estimation of regression models.

6.6 Exercises

  1) In a shoe production line, the probability of a defective item being produced is 2%. For a batch with 150 items, determine the probability of a maximum of two items being defective. Also determine the mean and the variance.
  2) The probability of a student solving a certain problem is 12%. If 10 students are selected randomly, what is the probability of exactly one of them being successful?
  3) A telemarketing salesman sells one product every 8 customers he contacts. The salesman prepares a list of customers. Determine the probability of the first product being sold in the fifth call, in addition to the expected sales value and the respective variance.
  4) The probability of a player scoring a penalty is 95%. Determine the probability of the player having to take a penalty kick 33 times to score 30 goals, besides the mean of penalty kicks.
  5) Assume that, in a certain hospital, 3 patients undergo stomach surgery daily, following a Poisson distribution. Calculate the probability of 28 patients undergoing surgery next week (7 business days).
  6) Assume that a certain random variable X follows a normal distribution with μ = 8 and σ2 = 36. Determine the following probabilities:
    a) P(X ≤ 12)
    b) P(X < 5)
    c) P(X > 2)
    d) P(6 < X ≤ 11)
  7) Consider random variable Z with a standardized normal distribution. Determine the critical value zc such that P(Z > zc) = 80%.
  8) When tossing 40 balanced coins, determine the following probabilities:
    a) Of getting exactly 22 heads.
    b) Of getting more than 25 heads.

    Solve this exercise by approximating the distribution through a normal distribution.
  9) The time until a certain electronic device fails follows an exponential distribution with a failure rate per hour of 0.028. Determine the probability of a device chosen randomly remaining operational for:
    a) 120 hours;
    b) 60 hours.
  10) A certain type of device follows an exponential distribution with a mean lifetime of 180 hours. Determine:
    a) The probability of the device lasting more than 220 hours;
    b) The probability of the device lasting a maximum of 150 hours.
  11) The arrival of patients at a lab follows an exponential distribution with an average rate of 1.8 clients per minute. Determine:
    a) The probability of the next client's arrival taking more than 30 seconds;
    b) The probability of the next client's arrival taking a maximum of 1.5 minutes.
  12) The time between clients' arrivals at a restaurant follows an exponential distribution with a mean of 3 minutes. Determine:
    a) The probability of more than 3 clients arriving in 6 minutes;
    b) The probability of the time until the fourth client arrives being less than 10 minutes.
  13) A random variable X has a chi-square distribution with ν = 12 degrees of freedom. What is the critical value xc such that P(X > xc) = 90%?
  14) Now, assume that X follows a chi-square distribution with ν = 16 degrees of freedom. Determine:
    a) P(X > 25)
    b) P(X ≤ 32)
    c) P(25 < X ≤ 32)
    d) The x value such that P(X ≤ x) = 0.975
    e) The x value such that P(X > x) = 0.975
  15) A random variable T follows Student's t distribution with ν = 20 degrees of freedom. Determine:
    a) The critical value tc such that P(− tc < t < tc) = 95%
    b) E(T)
    c) Var(T)
  16) Now, assume that T follows Student's t distribution with ν = 14 degrees of freedom. Determine:
    a) P(T > 3)
    b) P(T ≤ 2)
    c) P(1.5 < T ≤ 2)
    d) The t value such that P(T ≤ t) = 0.90
    e) The t value such that P(T > t) = 0.025
  17) Consider a random variable X that follows Snedecor's F distribution with ν1 = 4 and ν2 = 16 degrees of freedom, that is, X ~ F4, 16. Determine:
    a) P(X > 3)
    b) F4, 16 with α = 2.5%
    c) The x value such that P(X ≤ x) = 0.99
    d) E(X)
    e) Var(X)

References

Bussab W.O., Morettin P.A. Estatística básica. seventh ed. São Paulo: Saraiva; 2011.

Fávero L.P., Belfiore P., Silva F.L., Chan B.L. Análise de dados: modelagem multivariada para tomada de decisões. Rio de Janeiro: Campus Elsevier; 2009.

Martins G.A., Domingues O. Estatística geral e aplicada. fourth ed. São Paulo: Atlas; 2011.


"To view the full reference list for the book, click here"

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset