CHAPTER
8

Probability Distributions

In This Chapter

  • Defining a random variable and probability distribution
  • Calculating the mean and variance of a discrete probability distribution

Well, we’ve finally arrived at our third and last chapter on general probability concepts. This chapter sets the stage for the last three chapters in Part 2, which will focus on specific types of probability distributions. Before you know it, we’ll be knee-deep in inferential statistics.

In this chapter, we will look at the basic concepts of probability distributions and then move on to calculate the mean and variance of probability distributions.

Basic Concepts of Probability Distributions

Now let’s introduce you to probability distributions and prepare you for the last three chapters of Part 2. However, first we need to discuss the topic of random variables, which will lay the groundwork for specific probability distributions in Chapters 9, 10, and 11.

Random Variables

Random variables are variables whose values cannot be determined with certainty before conducting the experiment. They can take on different values and result from experiments. Examples of experiments could be rolling a single die, weighing three randomly selected candy bars, or measuring the wait time for customers in a checkout line.

Random variables can be discrete or continuous. (Does this ring a bell? We talked about the difference in Chapter 2.) A discrete random variable can assume only whole numbers and results from counting the outcomes of an experiment. Discrete random variables come from experiments like counting the number of customers waiting in the checkout line or the outcomes of flipping a coin or rolling a die. A continuous random variable, on the other hand, can assume any numerical value within an interval and results from measuring the outcome of an experiment. Examples of continuous random variables are the wait time of customers in a checkout line, the weight of your favorite candy bar, or the time it takes to fly from Philadelphia to Chicago.

DEFINITION

A random variable is an outcome that takes on a numerical value as a result of an experiment. Random variables can be discrete or continuous. A discrete random variable can assume only whole numbers and results from counting the outcomes of an experiment. A continuous random variable can assume any numerical value within an interval and results from measuring the outcome of an experiment.
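To make the discrete/continuous distinction concrete, here is a minimal Python sketch that simulates one discrete experiment (rolling a die) and one continuous experiment (measuring a wait time). The 0-to-10-minute range and the seed value are assumptions chosen purely for illustration; they do not come from any real checkout line.

```python
import random

rng = random.Random(8)  # fixed seed (arbitrary) so the sketch is repeatable

# Discrete: the result of counting. A die roll can only be 1, 2, 3, 4, 5, or 6.
die_roll = rng.randint(1, 6)

# Continuous: the result of measuring. A wait time can be any value in an
# interval, assumed here (hypothetically) to run from 0 to 10 minutes.
wait_minutes = rng.uniform(0.0, 10.0)

print(f"die roll: {die_roll}")
print(f"wait time: {wait_minutes:.3f} minutes")
```

The die roll always comes back as a whole number, while the wait time can land anywhere in the interval.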

Probability Distributions

Random variables have probability distributions associated with them. A probability distribution is nothing more than a table with two columns: the values of the random variable in one column and the probability associated with each value in the second column. For example, in rolling a single die, we know the outcomes and the probability of getting each one. I can put them in a table as follows:

Outcome    Probability
1          ⅙ = 0.167
2          ⅙ = 0.167
3          ⅙ = 0.167
4          ⅙ = 0.167
5          ⅙ = 0.167
6          ⅙ = 0.167

This is the probability distribution. Simple, isn’t it?

DEFINITION

A probability distribution is a table that lists all the values of the random variable and the probability associated with each value.
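If you like to see things in code, one way to sketch a probability distribution is as a lookup table that maps each value of the random variable to its probability. The Python below does this for a single fair die; the name die_distribution is just an illustrative choice, and a dictionary is only one of several reasonable representations.

```python
from fractions import Fraction

# Probability distribution for one roll of a fair six-sided die:
# each outcome maps to its probability.
die_distribution = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

# Print the two-column table: outcome, then probability.
for outcome, probability in die_distribution.items():
    print(f"{outcome}  {probability} = {float(probability):.3f}")
```

Using Fraction keeps each probability at exactly one-sixth, so the rounded 0.167 appears only when the table is printed.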

A probability distribution can be discrete or continuous, depending on the type of random variable used. If the random variable is discrete, then the probability distribution is called a discrete probability distribution. If the random variable is continuous, then the probability distribution is called a continuous probability distribution. We will see examples of each type in the next three chapters. I know it sounds simple; statistics is not difficult, after all!

Chapters 9 and 10 give examples of common discrete probability distributions, whereas Chapter 11 gives an example of a very familiar continuous probability distribution. For now, we will focus solely on discrete random variables.

Let’s look at another example. Bob’s oldest daughter, Christin, was a very accomplished competitive swimmer between the ages of 7 and 13, but her talent certainly didn’t come from Bob’s side of the family. Christin could not only swim, but she also could swim fast.

The following table is a relative frequency distribution showing the number of first-, second-, third-, fourth-, and fifth-place finishes that Christin earned during 50 races.

Place    Number of Races    Relative Frequency (Probability)
1        27                 27/50 = 0.54
2        12                 12/50 = 0.24
3        7                  7/50 = 0.14
4        3                  3/50 = 0.06
5        1                  1/50 = 0.02
Total    50                 1.00

If we define the random variable x = the place Christin finished in a race, the previous table would be the discrete probability distribution for the variable x. From this table, we can state the probability that Christin will finish first as follows:

P(x = 1) = 0.54

Figure 8.1 shows the discrete probability distribution for x graphically.

Figure 8.1: The discrete probability distribution for Christin’s races.
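The step from race counts to probabilities is just a division by the total number of races. Here is a minimal Python sketch of that step using Christin's 50 races; the variable names are illustrative assumptions, not anything defined in this chapter.

```python
# Christin's finishes over 50 races: place -> number of races.
race_counts = {1: 27, 2: 12, 3: 7, 4: 3, 5: 1}

total_races = sum(race_counts.values())

# The relative frequency of each place serves as its probability P(x).
place_distribution = {
    place: count / total_races for place, count in race_counts.items()
}

for place, probability in place_distribution.items():
    print(f"P(x = {place}) = {probability:.2f}")
```

The same pattern works for any frequency table: divide each count by the grand total.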

Rules for Discrete Probability Distributions

Any discrete probability distribution needs to meet the following requirements:

  • Each outcome in the distribution needs to be mutually exclusive—that is, the value of the random variable cannot fall into more than one of the frequency distribution classes. For example, it is not possible for Christin to take first and second place in the same race.
  • The probability of each outcome, P(x), must be between 0 and 1; that is, 0 ≤ P(x) ≤ 1 for all values of x. In the previous example, P(x = 3) = 0.14, which falls between 0 and 1.
  • The probabilities for all the outcomes in the distribution need to add up to 1; that is, $\sum_{x} P(x) = 1$. In the swimming example, the Relative Frequency (Probability) column in the previous table adds up to 1.
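The two numerical requirements above are easy to check by machine. The following Python sketch is one possible checker, with is_valid_distribution being a hypothetical helper name. Storing the distribution as a dictionary gives each outcome exactly one entry, but note that the code cannot verify that the outcomes of the underlying experiment are truly mutually exclusive; that part is still up to you.

```python
import math

def is_valid_distribution(distribution):
    """Check the numerical rules for a dict of outcome -> P(x)."""
    probabilities = distribution.values()
    # Rule: every probability lies between 0 and 1.
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    # Rule: the probabilities add up to 1 (allowing for rounding error).
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=1e-9)
    return each_in_range and sums_to_one

christin = {1: 0.54, 2: 0.24, 3: 0.14, 4: 0.06, 5: 0.02}
print(is_valid_distribution(christin))
```

The small tolerance is there because decimal probabilities such as 0.14 are not stored exactly in floating point.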

The Mean of Discrete Probability Distributions

The mean of a discrete probability distribution is simply a weighted average (discussed in Chapter 4) calculated using the following formula:

$$\mu = \sum_{i=1}^{n} x_i \, P(x_i)$$

where:

μ = the mean of the discrete probability distribution

x_i = the value of the random variable for the ith outcome

P(x_i) = the probability that the ith outcome will occur

n = the number of outcomes in the distribution

The table that follows revisits Christin’s swimming probability distribution.

Place (x_i)    Probability P(x_i)
1              0.54
2              0.24
3              0.14
4              0.06
5              0.02

The mean of this discrete probability distribution is as follows:

μ = (1)(0.54) + (2)(0.24) + (3)(0.14) + (4)(0.06) + (5)(0.02) = 1.78

This mean is telling us that Christin’s average finish for a race is 1.78 place! How does she do that? Obviously, this will never be the result of any one particular race. Rather, it represents the average finish of many races. The mean of a discrete probability distribution does not have to equal one of the values of the random variable (1, 2, 3, 4, or 5 in this case).
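The weighted average above takes one line of Python. This is a minimal sketch, assuming the distribution is stored as a dictionary of place to probability; the function name mean is an illustrative choice.

```python
christin = {1: 0.54, 2: 0.24, 3: 0.14, 4: 0.06, 5: 0.02}

def mean(distribution):
    """Weighted average: each value times its probability, summed."""
    return sum(x * p for x, p in distribution.items())

mu = mean(christin)
print(f"mean = {mu:.2f}")  # prints "mean = 1.78"
```

Notice that 1.78 is not a key in the dictionary, which is the code's way of showing that the mean need not be one of the possible outcomes.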

Another term for describing the mean of a probability distribution is the expected value, E(x). Therefore:

$$E(x) = \mu = \sum_{i=1}^{n} x_i \, P(x_i)$$

DEFINITION

An expected value is the mean of a probability distribution.

Didn’t I say statisticians love all sorts of notation to describe the same concept?

The Variance and Standard Deviation of Discrete Probability Distributions

Just when you thought it was safe to get back into the water, along comes another variance! Well, if you’ve seen one variance calculation, you’ve seen them all. You can calculate the variance for a discrete probability distribution as follows:

$$\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \, P(x_i)$$

where:

σ² = the variance of the discrete probability distribution

As before, the standard deviation of the distribution is as follows:

$$\sigma = \sqrt{\sigma^2}$$

To demonstrate the use of these equations, we’ll rely on Christin’s swimming distribution. The calculations are summarized in the following table.

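x_i    P(x_i)    μ       x_i − μ    (x_i − μ)²    (x_i − μ)² P(x_i)
1      0.54      1.78    −0.78      0.608         0.329
2      0.24      1.78    0.22       0.048         0.012
3      0.14      1.78    1.22       1.488         0.208
4      0.06      1.78    2.22       4.928         0.296
5      0.02      1.78    3.22       10.368        0.207
Total                                             σ² = 1.052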

The standard deviation of this distribution is:

$$\sigma = \sqrt{1.052} = 1.026$$
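Here is a short Python sketch of the same variance and standard deviation calculation for Christin's distribution, following the definition directly: square each deviation from the mean, weight it by its probability, and add up the results. The variable names are illustrative assumptions.

```python
import math

christin = {1: 0.54, 2: 0.24, 3: 0.14, 4: 0.06, 5: 0.02}

# Mean: each value weighted by its probability.
mu = sum(x * p for x, p in christin.items())

# Variance: squared deviations from the mean, weighted by probability.
variance = sum((x - mu) ** 2 * p for x, p in christin.items())

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(f"variance = {variance:.3f}")
print(f"standard deviation = {std_dev:.3f}")
```

Because the code does not round the variance before taking the square root, it reports a standard deviation of about 1.025 rather than 1.026; the hand calculation starts from the rounded value 1.052.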

A more efficient way to calculate the variance of a discrete probability distribution is:

$$\sigma^2 = \left( \sum_{i=1}^{n} x_i^2 \, P(x_i) \right) - \mu^2$$

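The following table summarizes these calculations using Christin’s swimming example.

x_i    P(x_i)    x_i²    x_i² P(x_i)
1      0.54      1       0.54
2      0.24      4       0.96
3      0.14      9       1.26
4      0.06      16      0.96
5      0.02      25      0.50
Total                    4.22

$$\sigma^2 = 4.22 - (1.78)^2 = 4.22 - 3.168 = 1.052$$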

As you can see, the result is the same, but with less effort!
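If you'd like to convince yourself that the two approaches agree, this Python sketch computes the variance both ways for Christin's distribution and compares them. The shortcut is written here as the probability-weighted sum of the squared values minus the squared mean; the variable names are illustrative assumptions.

```python
import math

christin = {1: 0.54, 2: 0.24, 3: 0.14, 4: 0.06, 5: 0.02}

mu = sum(x * p for x, p in christin.items())

# Definition: probability-weighted squared deviations from the mean.
variance_definition = sum((x - mu) ** 2 * p for x, p in christin.items())

# Shortcut: probability-weighted sum of x squared, minus the mean squared.
variance_shortcut = sum(x ** 2 * p for x, p in christin.items()) - mu ** 2

print(f"definition: {variance_definition:.3f}")
print(f"shortcut:   {variance_shortcut:.3f}")
print(math.isclose(variance_definition, variance_shortcut))
```

The shortcut saves effort by hand because you never have to subtract the mean from each value; in code the two versions are about equally short.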

Practice Problems

1. Stock “A” has the following probability distribution for its rate of return:

Rate of Return for Stock A (RA)    Probability
0.1                                0.3
0.15                               0.5
0.1                                0.2

Calculate the mean, variance, and standard deviation for the rate of return on stock “A.”

2. A survey of 450 families was conducted to find how many cats were owned by each respondent. The following table summarizes the results.

Number of Cats    Number of Families
0                 137
1                 160
2                 112
3                 31
4                 10

Develop a probability distribution for this data and calculate the mean, variance, and standard deviation.

The Least You Need to Know

  • A random variable is an outcome that takes on a numerical value as a result of an experiment. The value is not known with certainty before the experiment.
  • A random variable is discrete if it is limited to assuming only specific integer values as a result of counting the outcome of an experiment. A random variable is continuous if it can assume any numerical value within an interval as a result of measuring the outcome of an experiment.
  • A probability distribution is a listing of all the possible outcomes of an experiment along with the relative frequency or probability of each outcome.
  • You find the mean, or expected value, of a discrete probability distribution as follows:

$$\mu = \sum_{i=1}^{n} x_i \, P(x_i)$$

  • You find the variance of a discrete probability distribution as follows:

$$\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \, P(x_i)$$
