2.5 Probability Distributions

Earlier we discussed the probability values of an event. We now explore the properties of probability distributions. We see how popular distributions, such as the normal, Poisson, binomial, and exponential probability distributions, can save us time and effort. Since a random variable may be discrete or continuous, we consider discrete probability distributions and continuous probability distributions separately.

Probability Distribution of a Discrete Random Variable

When we have a discrete random variable, there is a probability value assigned to each event. These values must be between 0 and 1, and they must sum to 1. Let’s look at an example.

The 100 students in Pat Shannon’s statistics class have just completed a math quiz that he gives on the first day of class. The quiz consists of five very difficult algebra problems. The grade on the quiz is the number of correct answers, so the grades theoretically could range from 0 to 5. However, no one in this class received a score of 0, so the grades ranged from 1 to 5. The random variable X is defined to be the grade on this quiz, and the grades are summarized in Table 2.7. This discrete probability distribution was developed using the relative frequency approach presented earlier.

Table 2.7 Probability Distribution for Quiz Scores

RANDOM VARIABLE X (SCORE)    NUMBER OF STUDENTS    PROBABILITY P(X)
5                            10                    0.1 = 10/100
4                            20                    0.2 = 20/100
3                            30                    0.3 = 30/100
2                            30                    0.3 = 30/100
1                            10                    0.1 = 10/100
Total                        100                   1.0 = 100/100

The distribution follows the three rules required of all probability distributions: (1) the events are mutually exclusive and collectively exhaustive, (2) the individual probability values are between 0 and 1 inclusive, and (3) the total of the probability values is 1.
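As an illustration only, the relative frequency approach and rules (2) and (3) can be checked with a few lines of Python. This is a minimal sketch rather than part of the original example; the function names relative_frequencies and is_valid_distribution are hypothetical, and the counts are those of Table 2.7.

```python
import math

# Number of students earning each score, taken from Table 2.7.
counts = {5: 10, 4: 20, 3: 30, 2: 30, 1: 10}


def relative_frequencies(counts):
    """Convert counts to probabilities using the relative frequency approach."""
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def is_valid_distribution(probs, tol=1e-9):
    """Check rule (2): each probability is in [0, 1]; rule (3): they sum to 1.

    Rule (1) is assumed to hold by construction: every student receives
    exactly one score, so the events cannot overlap and none is left out.
    """
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one


probs = relative_frequencies(counts)
print(probs)
print(is_valid_distribution(probs))
```

A set of values such as {1: 0.5, 2: 0.6} would fail the check because the probabilities sum to more than 1.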

Although listing the probability distribution as we did in Table 2.7 is adequate, it can be difficult to get an idea about characteristics of the distribution. To overcome this problem, the probability values are often presented in graph form. The graph of the distribution in Table 2.7 is shown in Figure 2.4.

A bar graph that provides a graphic summary of the probability distribution of scores on Dr. Shannon’s algebra quiz.

Figure 2.4 Probability Distribution for Dr. Shannon’s Class

The graph of this probability distribution gives us a picture of its shape. It helps us identify the central tendency of the distribution, called the mean or expected value, and the amount of variability or spread of the distribution, called the variance.

Expected Value of a Discrete Probability Distribution

Once we have established a probability distribution, the first characteristic that is usually of interest is the central tendency of the distribution. The expected value, a measure of central tendency, is computed as the weighted average of the values of the random variable:

$$E(X) = \sum_{i=1}^{n} X_i P(X_i) = X_1 P(X_1) + X_2 P(X_2) + \cdots + X_n P(X_n) \tag{2-6}$$

where

  • $X_i$ = random variable's possible values

  • $P(X_i)$ = probability of each possible value of the random variable

  • $\sum_{i=1}^{n}$ = summation sign indicating we are adding all n possible values

  • $E(X)$ = expected value or mean of the random variable

The expected value or mean of any discrete probability distribution can be computed by multiplying each possible value of the random variable, $X_i$, by the probability, $P(X_i)$, that the outcome will occur and summing the results, $\sum X_i P(X_i)$. Here is how the expected value can be computed for the quiz scores:

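$$
\begin{aligned}
E(X) &= \sum_{i=1}^{5} X_i P(X_i) \\
     &= X_1 P(X_1) + X_2 P(X_2) + X_3 P(X_3) + X_4 P(X_4) + X_5 P(X_5) \\
     &= (5)(0.1) + (4)(0.2) + (3)(0.3) + (2)(0.3) + (1)(0.1) \\
     &= 2.9
\end{aligned}
$$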

The expected value of 2.9 is the mean score on the quiz.
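The same weighted average can be sketched in Python. This is only an illustration of Equation 2-6 applied to the quiz probabilities; the function name expected_value is a hypothetical helper, not something defined in the text.

```python
# Probability distribution of quiz scores from Table 2.7: score -> P(score).
quiz = {5: 0.1, 4: 0.2, 3: 0.3, 2: 0.3, 1: 0.1}


def expected_value(dist):
    """Weighted average of the values, each weighted by its probability."""
    return sum(x * p for x, p in dist.items())


mean = expected_value(quiz)
print(round(mean, 2))  # 2.9
```

Rounding is applied only for display, since floating-point arithmetic may leave a tiny error in the last digits.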

Variance of a Discrete Probability Distribution

In addition to the central tendency of a probability distribution, most people are interested in the variability or the spread of the distribution. If the variability is low, it is much more likely that the outcome of an experiment will be close to the average or expected value. On the other hand, if the variability of the distribution is high, which means that the probability is spread out over the various random variable values, there is less chance that the outcome of an experiment will be close to the expected value.

The variance of a probability distribution is a number that reveals the overall spread or dispersion of the distribution. For a discrete probability distribution, it can be computed using the following equation:

$$\sigma^2 = \text{Variance} = \sum_{i=1}^{n} [X_i - E(X)]^2 P(X_i) \tag{2-7}$$

where

  • $X_i$ = random variable's possible values

  • $E(X)$ = expected value of the random variable

  • $[X_i - E(X)]$ = difference between each value of the random variable and the expected value

  • $P(X_i)$ = probability of each possible value of the random variable

To compute the variance, the expected value is subtracted from each value of the random variable, the difference is squared, and the result is multiplied by the probability of occurrence of that value. The results are then summed to obtain the variance. Here is how this procedure is done for Dr. Shannon’s quiz scores:

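$$
\begin{aligned}
\text{Variance} &= \sum_{i=1}^{5} [X_i - E(X)]^2 P(X_i) \\
&= (5 - 2.9)^2 (0.1) + (4 - 2.9)^2 (0.2) + (3 - 2.9)^2 (0.3) + (2 - 2.9)^2 (0.3) + (1 - 2.9)^2 (0.1) \\
&= (2.1)^2 (0.1) + (1.1)^2 (0.2) + (0.1)^2 (0.3) + (-0.9)^2 (0.3) + (-1.9)^2 (0.1) \\
&= 0.441 + 0.242 + 0.003 + 0.243 + 0.361 \\
&= 1.29
\end{aligned}
$$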

A related measure of dispersion or spread is the standard deviation. This quantity is also used in many computations involved with probability distributions. The standard deviation is just the square root of the variance:

$$\sigma = \sqrt{\text{Variance}} = \sqrt{\sigma^2} \tag{2-8}$$

where

  • $\sqrt{\phantom{x}}$ = square root

  • $\sigma$ = standard deviation

The standard deviation for the random variable X in the example is

$$\sigma = \sqrt{\text{Variance}} = \sqrt{1.29} \approx 1.14$$
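For readers who prefer code to a spreadsheet, Equations 2-7 and 2-8 can be sketched in Python as follows. This is an illustrative sketch only; the helper names expected_value and variance are hypothetical.

```python
import math

# Probability distribution of quiz scores from Table 2.7: score -> P(score).
quiz = {5: 0.1, 4: 0.2, 3: 0.3, 2: 0.3, 1: 0.1}


def expected_value(dist):
    """Equation 2-6: sum of each value times its probability."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Equation 2-7: probability-weighted squared deviations from the mean."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


var = variance(quiz)
std_dev = math.sqrt(var)  # Equation 2-8

print(round(var, 2), round(std_dev, 2))  # 1.29 1.14
```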

These calculations are easily performed in Excel. Program 2.1A provides the output for this example. Program 2.1B shows the inputs and formulas in Excel for calculating the mean, variance, and standard deviation in this example.

Probability Distribution of a Continuous Random Variable

There are many examples of continuous random variables. The time it takes to finish a project, the number of ounces in a barrel of butter, the high temperature during a given day, the exact length of a given type of lumber, and the weight of a railroad car of coal are all examples of continuous random variables. Since continuous random variables can take on an infinite number of values, the fundamental probability rules for continuous random variables must be modified.

As with discrete probability distributions, the sum of the probability values must equal 1. Because the random variable can take on an infinite number of values, however, the probability of each individual value must be 0. If each individual value had a probability greater than 0, the sum would be infinitely large.

A screenshot showing how Excel calculated the standard deviation for the algebra quiz grades.

Program 2.1A Excel 2016 Output for the Dr. Shannon Example

Screenshot of Excel formulas used to attain data calculations.

Program 2.1B Formulas in an Excel Spreadsheet for the Dr. Shannon Example

With a continuous probability distribution, there is a continuous mathematical function that describes the probability distribution. This function is called the probability density function or simply the probability function. It is usually represented by f(X). When working with continuous probability distributions, the probability function can be graphed, and the area underneath the curve represents probability. Thus, to find any probability, we simply find the area under the curve associated with the range of interest.

We now look at the sketch of a sample density function in Figure 2.5. This curve represents the probability density function for the weight of a particular machined part. The weight could vary from 5.06 to 5.30 grams, with weights around 5.18 grams being the most likely. The shaded area represents the probability the weight is between 5.22 and 5.26 grams.

Graph illustrating the probability for the weight of a machined part; the X axis is labelled as Weight in grams, and the Y axis is labelled Probability.

Figure 2.5 Graph of Sample Density Function

If we wanted to know the probability of a part weighing exactly 5.1300000 grams, for example, we would have to compute the area of a line of width 0. Of course, this would be 0. This result may seem strange, but if we insist on enough decimal places of accuracy, we are bound to find that the weight differs from 5.1300000 grams exactly, be the difference ever so slight.

This is important because it means that, for any continuous distribution, the probability does not change if a single point is added to the range of values that is being considered. In Figure 2.5, this means the following probabilities are all exactly the same:

$$P(5.22 < X < 5.26) = P(5.22 < X \le 5.26) = P(5.22 \le X < 5.26) = P(5.22 \le X \le 5.26)$$

The inclusion or exclusion of either or both endpoints (5.22 or 5.26) has no impact on the probability.
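To make the idea of probability as area concrete, here is a minimal Python sketch. The text does not give a formula for the curve in Figure 2.5, so the code assumes, purely for illustration, a triangular density that is 0 at 5.06 and 5.30 grams and peaks at 5.18 grams. The names density and area are hypothetical, and the area is approximated with the trapezoid rule.

```python
# ASSUMPTION: a triangular density stands in for the curve in Figure 2.5.
# The text specifies only the range (5.06 to 5.30) and the most likely
# value (about 5.18); the actual shape of f(X) is not given.
A, MODE, B = 5.06, 5.18, 5.30


def density(x):
    """Assumed triangular probability density function f(X)."""
    if x < A or x > B:
        return 0.0
    if x <= MODE:
        return 2 * (x - A) / ((B - A) * (MODE - A))
    return 2 * (B - x) / ((B - A) * (B - MODE))


def area(f, lo, hi, steps=1000):
    """Approximate the area under f between lo and hi (trapezoid rule)."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


total_area = area(density, A, B)        # whole curve: should be close to 1
p_interval = area(density, 5.22, 5.26)  # shaded region in Figure 2.5
p_point = area(density, 5.13, 5.13)     # a single point has width 0

print(round(total_area, 4), round(p_interval, 4), p_point)
```

Under this assumed density the area over a single point is exactly 0, which is why adding or removing an endpoint leaves the interval probability unchanged. The numerical value of the interval probability depends entirely on the assumed shape and should not be read as a result from the text.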

In this section, we have investigated the fundamental characteristics and properties of probability distributions in general. In the next five sections, we introduce three important continuous distributions—the normal distribution, the F distribution, and the exponential distribution—and two discrete distributions—the Poisson distribution and the binomial distribution.
