4.2 Probability Distributions for Discrete Random Variables

A complete description of a discrete random variable requires that we specify all the values the random variable can assume and the probability associated with each value. To illustrate, consider Example 4.4.

Example 4.4 Finding a Probability Distribution—Coin-Tossing Experiment

Figure 4.2

Venn diagram for the two-coin-toss experiment

Problem

  1. Recall the experiment of tossing two coins (p. 118), and let x be the number of heads observed. Find the probability associated with each value of the random variable x, assuming that the two coins are fair.

Solution

  1. The sample space and sample points for this experiment are reproduced in Figure 4.2. Note that the random variable x can assume values 0, 1, 2. Recall (from Chapter 3) that the probability associated with each of the four sample points is 1/4. Then, identifying the probabilities of the sample points associated with each of these values of x, we have

    \[
    \begin{aligned}
    P(x = 0) &= P(TT) = \frac{1}{4} \\
    P(x = 1) &= P(TH) + P(HT) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \\
    P(x = 2) &= P(HH) = \frac{1}{4}
    \end{aligned}
    \]

    Thus, we now know the values the random variable can assume (0, 1, 2) and how the probability is distributed over those values (1/4, 1/2, 1/4). This dual specification completely describes the random variable and is referred to as the probability distribution, denoted by the symbol p(x).* The probability distribution for the coin-toss example is shown in tabular form in Table 4.1 and in graphic form in Figure 4.3. Since the probability distribution for a discrete random variable is concentrated at specific points (values of x), the graph in Figure 4.3a represents the probabilities as the heights of vertical lines over the corresponding values of x. Although the representation of the probability distribution as a histogram, as in Figure 4.3b, is less precise (since the probability is spread over a unit interval), the histogram representation will prove useful when we approximate probabilities of certain discrete random variables in Section 4.4.

    Figure 4.3

    Probability distribution for coin-toss experiment: graphical form

    Table 4.1 Probability Distribution for Coin-Toss Experiment: Tabular Form

    x    p(x)
    0    1/4
    1    1/2
    2    1/4
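The enumeration used in this solution can also be sketched in code. The Python snippet below is an illustration only, not part of the original example: it lists the equally likely sample points for fair coins, counts the heads in each, and converts the counts to probabilities. The function name `heads_distribution` is our own choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins=2):
    """Return {x: p(x)}, where x is the number of heads in n_coins fair tosses."""
    # Every sequence of H/T is an equally likely sample point for fair coins.
    outcomes = list(product("HT", repeat=n_coins))
    # Count how many sample points give each value of x.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


dist = heads_distribution(2)
for x, p in dist.items():
    print(x, p)
```

For two coins the printed probabilities are 1/4, 1/2, and 1/4, in agreement with Table 4.1.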

Look Ahead

We could also present the probability distribution for x as a formula, but this would unnecessarily complicate a very simple example. We give the formulas for the probability distributions of some common discrete random variables later in the chapter.

Now Work Exercise 4.24

The probability distribution of a discrete random variable is a graph, table, or formula that specifies the probability associated with each possible value that the random variable can assume.

Two requirements must be satisfied by all probability distributions for discrete random variables:

Requirements for the Probability Distribution of a Discrete Random Variable x

  1. p(x) ≥ 0 for all values of x.

  2. Σ p(x) = 1

where the summation of p(x) is over all possible values of x.*
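As a quick illustration, the two requirements can be checked mechanically. The sketch below assumes the distribution is stored as a Python dictionary mapping each value of x to p(x); the function name `is_valid_distribution` and the tolerance `tol` (which allows for rounding in decimal probabilities) are our own assumptions.

```python
import math


def is_valid_distribution(p, tol=1e-9):
    """Check both requirements: every p(x) >= 0 and the p(x) values sum to 1."""
    probs = list(p.values())
    nonnegative = all(prob >= 0 for prob in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return nonnegative and sums_to_one


coin_toss = {0: 0.25, 1: 0.50, 2: 0.25}   # Table 4.1
too_large = {0: 0.50, 1: 0.60}            # probabilities sum to 1.10
negative = {0: -0.20, 1: 1.20}            # sums to 1, but p(0) < 0

print(is_valid_distribution(coin_toss))   # True
print(is_valid_distribution(too_large))   # False
print(is_valid_distribution(negative))    # False
```

Note that the third case fails only the first requirement: the probabilities sum to 1, yet one of them is negative.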

Example 4.5 Probability Distribution from a Graph—Playing Craps

Problem

  1. Craps is a popular casino game in which a player throws two dice and bets on the outcome (the sum total of the dots showing on the upper faces of the two dice). Consider a $5 wager. On the first toss (called the come-out roll), if the total is 7 or 11 the roller wins $5. If the outcome is a 2, 3, or 12, the roller loses $5 (i.e., the roller wins −$5). For any other outcome (4, 5, 6, 8, 9, or 10), a point is established and no money is lost or won on that roll (i.e., the roller wins $0). In a computer simulation of repeated tosses of two dice, the outcome x of the come-out roll wager (−$5, $0, or +$5) was recorded. A relative frequency histogram summarizing the results is shown in Figure 4.4. Use the histogram to find the approximate probability distribution of x.

Solution

  1. The histogram shows that the relative frequencies of the outcomes x = −$5, x = $0, and x = +$5 are .1, .65, and .25, respectively. For example, in repeated tosses of two dice, 25% of the outcomes resulted in a sum of 7 or 11 (a $5 win for the roller). Based on our long-run definition of probability given in Chapter 3, these relative frequencies estimate the probabilities of the three outcomes. Consequently, the approximate probability distribution of x, the outcome of the come-out wager in craps, is p(−$5) = .1, p($0) = .65, and p(+$5) = .25. Note that these probabilities sum to 1.

    Figure 4.4

    MINITAB Histogram for $5 Wager on Come-Out Roll in Craps

Look Back

When two dice are tossed, there is a total of 36 possible outcomes. (Can you list these outcomes, or sample points?) Of these, 4 result in a sum of 2, 3, or 12; 24 result in a sum of 4, 5, 6, 8, 9, or 10; and 8 result in a sum of 7 or 11. Using the rules of probability established in Chapter 3, you can show that the actual probability distribution for x is p(−$5) = 4/36 = .1111, p($0) = 24/36 = .6667, and p(+$5) = 8/36 = .2222.
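The count of sample points described here can be reproduced by brute-force enumeration. The following Python sketch is illustrative only; it assumes two fair six-sided dice, and the helper name `come_out_payoff` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def come_out_payoff(total):
    """Dollar outcome of a $5 wager on the come-out roll for a given dice total."""
    if total in (7, 11):
        return 5       # roller wins $5
    if total in (2, 3, 12):
        return -5      # roller loses $5
    return 0           # a point is established; nothing won or lost


# All 36 equally likely (die 1, die 2) sample points.
rolls = list(product(range(1, 7), repeat=2))
counts = Counter(come_out_payoff(a + b) for a, b in rolls)
dist = {x: Fraction(n, len(rolls)) for x, n in sorted(counts.items())}

for x, p in dist.items():
    print(x, p, round(float(p), 4))
```

The exact probabilities are 4/36, 24/36, and 8/36 (printed in lowest terms), matching the values .1111, .6667, and .2222 quoted in the text.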

Now Work Exercise 4.21

Examples 4.4 and 4.5 illustrate how the probability distribution for a discrete random variable can be derived, but for many practical situations the task is much more difficult. Fortunately, numerous experiments and associated discrete random variables observed in nature possess identical characteristics. Thus, you might observe a random variable in a psychology experiment that would possess the same probability distribution as a random variable observed in an engineering experiment or a social sample survey. We classify random variables according to type of experiment, derive the probability distribution for each of the different types, and then use the appropriate probability distribution when a particular type of random variable is observed in a practical situation. The probability distributions for most commonly occurring discrete random variables have already been derived. This fact simplifies the problem of finding the probability distributions for random variables, as the next example illustrates.

Example 4.6 Probability Distribution Using a Formula—Texas Droughts

Problem

  1. A drought is a period of abnormally dry weather that causes serious problems in the farming industry of the region. University of Arizona researchers used historical annual data to study the severity of droughts in Texas (Journal of Hydrologic Engineering, Sept./Oct. 2003). The researchers showed that the distribution of x, the number of consecutive years that must be sampled until a dry (drought) year is observed, can be modeled using the formula

    \[ p(x) = (.3)(.7)^{x-1}, \quad x = 1, 2, 3, \ldots \]

    Find the probability that exactly 3 years must be sampled before a drought year occurs.

Solution

  1. We want to find the probability that x=3. Using the formula, we have

    \[ p(3) = (.3)(.7)^{3-1} = (.3)(.7)^{2} = (.3)(.49) = .147 \]

    Thus, there is about a 15% chance that exactly 3 years must be sampled before a drought year occurs in Texas.

Look Back

The probability of interest can also be derived using the principles of probability developed in Chapter 3. The event of interest is N₁ ∩ N₂ ∩ D₃, where N₁ represents no drought occurs in the first sampled year, N₂ represents no drought occurs in the second sampled year, and D₃ represents a drought occurs in the third sampled year. The researchers discovered that the probability of a drought occurring in any sampled year is .3 (and, consequently, the probability of no drought occurring in any sampled year is .7). Using the multiplicative rule of probability for independent events, the probability of interest is (.7)(.7)(.3) = .147.
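The formula in Example 4.6 is easy to evaluate by computer. The sketch below is a minimal illustration, with the function name `drought_pmf` and the parameter `p_dry` chosen by us; it evaluates p(3) and also checks numerically that the probabilities accumulate toward 1 as more values of x are included.

```python
def drought_pmf(x, p_dry=0.3):
    """p(x) = p_dry * (1 - p_dry)**(x - 1) for x = 1, 2, 3, ..."""
    if x < 1 or int(x) != x:
        raise ValueError("x must be a positive integer")
    return p_dry * (1 - p_dry) ** (x - 1)


p3 = drought_pmf(3)
# Same event via the multiplicative rule: no drought, no drought, drought.
p3_by_rule = 0.7 * 0.7 * 0.3
# Partial sum of p(x) over x = 1, ..., 200; the remaining tail is negligible.
partial_sum = sum(drought_pmf(x) for x in range(1, 201))

print(round(p3, 3), round(p3_by_rule, 3), round(partial_sum, 6))
```

Both routes give approximately .147, and the partial sum is 1 to within rounding error, as the second requirement for a probability distribution demands.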

Now Work Exercise 4.36

Since probability distributions are analogous to the relative frequency distributions of Chapter 2, it should be no surprise that the mean and standard deviation are useful descriptive measures.

If a discrete random variable x were observed a very large number of times and the data generated were arranged in a relative frequency distribution, the relative frequency distribution would be indistinguishable from the probability distribution for the random variable. Thus, the probability distribution for a random variable is a theoretical model for the relative frequency distribution of a population. To the extent that the two distributions are equivalent (and we will assume that they are), the probability distribution for x possesses a mean μ and a variance σ² that are identical to the corresponding descriptive measures for the population. The procedure for finding μ and σ² of a random variable follows.

Examine the probability distribution for x (the number of heads observed in the toss of two fair coins) in Figure 4.5. Try to locate the mean of the distribution intuitively. We may reason that the mean μ of this distribution is equal to 1 as follows: In a large number of experiments (say, 100,000), 1/4 (or 25,000) should result in x = 0 heads, 1/2 (or 50,000) in x = 1 head, and 1/4 (or 25,000) in x = 2 heads. Therefore, the average number of heads is

\[
\begin{aligned}
\mu &= \frac{0(25{,}000) + 1(50{,}000) + 2(25{,}000)}{100{,}000} = 0\left(\tfrac{1}{4}\right) + 1\left(\tfrac{1}{2}\right) + 2\left(\tfrac{1}{4}\right) \\
&= 0 + \tfrac{1}{2} + \tfrac{1}{2} = 1
\end{aligned}
\]

Note that to get the population mean of the random variable x, we multiply each possible value of x by its probability p(x), and then we sum this product over all possible values of x. The mean of x is also referred to as the expected value of x, denoted E(x).

Figure 4.5

Probability distribution for a two-coin toss

The mean, or expected value, of a discrete random variable x is

\[ \mu = E(x) = \sum x\,p(x) \]

Expected is a mathematical term and should not be interpreted as it is typically used. Specifically, a random variable might never be equal to its “expected value.” Rather, the expected value is the mean of the probability distribution, or a measure of its central tendency. You can think of μ as the mean value of x in a very large (actually, infinite) number of repetitions of the experiment in which the values of x occur in proportions equivalent to the probabilities of x.

Example 4.7 Finding an Expected Value—An Insurance Application

Problem

  1. Suppose you work for an insurance company and you sell a $10,000 one-year term insurance policy at an annual premium of $290. Actuarial tables show that the probability of death during the next year for a person of your customer’s age, sex, health, etc., is .001. What is the expected gain (amount of money made by the company) for a policy of this type?

Solution

  1. The experiment is to observe whether the customer survives the upcoming year. The probabilities associated with the two sample points, Live and Die, are .999 and .001, respectively. The random variable you are interested in is the gain x, which can assume the values shown in the following table:

    Gain x     Sample Point      Probability
    $290       Customer lives    .999
    −$9,710    Customer dies     .001

    If the customer lives, the company gains the $290 premium as profit. If the customer dies, the gain is negative because the company must pay $10,000, for a net “gain” of $(290 − 10,000) = −$9,710. The expected gain is therefore

    \[ \mu = E(x) = \sum x\,p(x) = (290)(.999) + (-9{,}710)(.001) = \$280 \]

    In other words, if the company were to sell a very large number of $10,000 one-year policies to customers possessing the characteristics described, it would (on the average) net $280 per sale in the next year.

Look Back

Note that E(x) need not equal a possible value of x. That is, the expected value is $280, but x will equal either $290 or −$9,710 each time the experiment is performed (a policy is sold and a year elapses). The expected value is a measure of central tendency (in this case, the average over a very large number of one-year policies), but it is not a possible value of x.
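The expected-gain calculation can be written as a short, reusable routine. The sketch below is illustrative and assumes the distribution is stored as a dictionary mapping each value of x to p(x); the name `expected_value` is ours. Exact fractions are used so that no rounding error enters the arithmetic.

```python
from fractions import Fraction


def expected_value(dist):
    """Return the sum of x * p(x) over all values x in the distribution."""
    return sum(x * p for x, p in dist.items())


# Gain to the company (in dollars) and its probability, from Example 4.7.
gain = {
    290: Fraction(999, 1000),    # customer lives: company keeps the premium
    -9710: Fraction(1, 1000),    # customer dies: 290 - 10,000
}

mu = expected_value(gain)
print(mu)            # 280
print(mu in gain)    # False: the expected value is not a possible value of x
```

The second line of output restates the point of this Look Back: the mean gain of $280 is never the gain on any single policy.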

We learned in Chapter 2 that the mean and other measures of central tendency tell only part of the story about a set of data. The same is true about probability distributions: We need to measure variability as well. Since a probability distribution can be viewed as a representation of a population, we will use the population variance to measure its variability.

The population variance σ² is defined as the average of the squared distance of x from the population mean μ. Since x is a random variable, the squared distance, (x − μ)², is also a random variable. Using the same logic we employed to find the mean value of x, we calculate the mean value of (x − μ)² by multiplying all possible values of (x − μ)² by p(x) and then summing over all possible x values.* This quantity,

\[ E[(x - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 p(x) \]

is also called the expected value of the squared distance from the mean; that is, σ² = E[(x − μ)²]. The standard deviation of x is defined as the square root of the variance σ².

The variance of a random variable x is

\[ \sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2 p(x) = \sum x^2 p(x) - \mu^2 \]

The standard deviation of a discrete random variable is equal to the square root of the variance, or σ = √(σ²).
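The last equality in the variance formula is stated without proof. One way to see it, sketched here as a supplement to the text, is to expand the square and use the facts that Σ x p(x) = μ and Σ p(x) = 1:

```latex
\begin{aligned}
\sigma^2 &= \sum (x - \mu)^2 p(x) \\
         &= \sum x^2 p(x) - 2\mu \sum x\,p(x) + \mu^2 \sum p(x) \\
         &= \sum x^2 p(x) - 2\mu(\mu) + \mu^2(1) \\
         &= \sum x^2 p(x) - \mu^2
\end{aligned}
```

As a check with the two-coin distribution (μ = 1), the shortcut gives Σ x² p(x) − μ² = [0²(1/4) + 1²(1/2) + 2²(1/4)] − 1² = 3/2 − 1 = 1/2, and the definition gives (0 − 1)²(1/4) + (1 − 1)²(1/2) + (2 − 1)²(1/4) = 1/2. Both routes yield σ² = 1/2, so σ = √(1/2) ≈ .71.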

Knowing the mean μ and standard deviation σ of the probability distribution of x, in conjunction with Chebyshev’s rule (Rule 2.1) and the empirical rule (Rule 2.2), we can make statements about the likelihood of values of x falling within the intervals μ±σ,μ±2σ, and μ±3σ. These probabilities are given in the following box:

Probability Rules for a Discrete Random Variable

Let x be a discrete random variable with probability distribution p(x), mean μ, and standard deviation σ. Then, depending on the shape of p(x), the following probability statements can be made:

                          Chebyshev’s Rule                  Empirical Rule
                          Applies to any probability        Applies to probability distributions that are
                          distribution (see Figure 4.6a)    mound shaped and symmetric (see Figure 4.6b)
P(μ − σ < x < μ + σ)      ≥ 0                               ≈ .68
P(μ − 2σ < x < μ + 2σ)    ≥ 3/4                             ≈ .95
P(μ − 3σ < x < μ + 3σ)    ≥ 8/9                             ≈ 1.00

Figure 4.6

Shapes of two probability distributions for a discrete random variable x

Example 4.8 Finding μ and σ—Skin Cancer Treatment

Problem

  1. Medical research has shown that a certain type of chemotherapy is successful 70% of the time when used to treat skin cancer. Suppose five skin cancer patients are treated with this type of chemotherapy, and let x equal the number of successful cures out of the five. The probability distribution for the number x of successful cures out of five is given in the following table:

    x   0   1   2   3   4   5
    p(x) .002 .029 .132 .309 .360 .168
    1. Find μ=E(x). Interpret the result.

    2. Find σ = √E[(x − μ)²]. Interpret the result.

    3. Graph p(x). Locate μ and the interval μ±2σ on the graph. Use either Chebyshev’s rule or the empirical rule to approximate the probability that x falls into this interval. Compare your result with the actual probability.

    4. Would you expect to observe fewer than two successful cures out of five?

Solution

  1. Applying the formula for μ, we obtain

    \[
    \begin{aligned}
    \mu &= E(x) = \sum x\,p(x) \\
    &= 0(.002) + 1(.029) + 2(.132) + 3(.309) + 4(.360) + 5(.168) = 3.50
    \end{aligned}
    \]

    On average, the number of successful cures out of five skin cancer patients treated with chemotherapy will equal 3.5. Remember that this expected value has meaning only when the experiment—treating five skin cancer patients with chemotherapy—is repeated a large number of times.

  2. Now we calculate the variance of x:

    \[
    \begin{aligned}
    \sigma^2 &= E[(x - \mu)^2] = \sum (x - \mu)^2 p(x) \\
    &= (0 - 3.5)^2(.002) + (1 - 3.5)^2(.029) + (2 - 3.5)^2(.132) \\
    &\quad + (3 - 3.5)^2(.309) + (4 - 3.5)^2(.360) + (5 - 3.5)^2(.168) \\
    &= 1.05
    \end{aligned}
    \]

    Thus, the standard deviation is

    \[ \sigma = \sqrt{\sigma^2} = \sqrt{1.05} = 1.02 \]

    This value measures the spread of the probability distribution of x, the number of successful cures out of five. A more useful interpretation is obtained by answering parts c and d.

  3. The graph of p(x) is shown in Figure 4.7, with the mean μ and the interval μ±2σ=3.50±2(1.02)=3.50±2.04=(1.46,5.54) also indicated. Note particularly that μ=3.5 locates the center of the probability distribution. Since this distribution is a theoretical relative frequency distribution that is moderately mound shaped (see Figure 4.7), we expect (from Chebyshev’s rule) at least 75% and, more likely (from the empirical rule), approximately 95% of observed x values to fall between 1.46 and 5.54. You can see from the figure that the actual probability that x falls in the interval μ±2σ includes the sum of p(x) for the values x=2,x=3,x=4, and x=5. This probability is p(2)+p(3)+p(4)+p(5)=.132+.309+.360+.168=.969. Therefore, 96.9% of the probability distribution lies within two standard deviations of the mean. This percentage is consistent with both Chebyshev’s rule and the empirical rule.

    Figure 4.7

    Graph of p(x) for Example 4.8

  4. Fewer than two successful cures out of five implies that x = 0 or x = 1. Both of these values of x lie outside the interval μ ± 2σ, and the empirical rule tells us that such a result is unlikely (approximate probability of .05). The exact probability, P(x ≤ 1), is p(0) + p(1) = .002 + .029 = .031. Consequently, in a single experiment in which five skin cancer patients are treated with chemotherapy, we would not expect to observe fewer than two successful cures.
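The calculations in Example 4.8 can be reproduced with a few lines of code. The Python sketch below is an illustration under our own variable names; it computes μ, σ², and σ from the table, then the exact probability that x falls within μ ± 2σ and the probability of fewer than two cures. Because the code carries unrounded values, σ² comes out as about 1.048 (rounded to 1.05 in the text) and σ as about 1.024.

```python
import math

# Probability distribution of x, the number of successful cures out of five.
dist = {0: 0.002, 1: 0.029, 2: 0.132, 3: 0.309, 4: 0.360, 5: 0.168}

mu = sum(x * p for x, p in dist.items())                 # mean, E(x)
var = sum((x - mu) ** 2 * p for x, p in dist.items())    # variance, E[(x - mu)^2]
sigma = math.sqrt(var)                                   # standard deviation

low, high = mu - 2 * sigma, mu + 2 * sigma
# Exact probability that x lies strictly inside mu +/- 2 sigma.
prob_within = sum(p for x, p in dist.items() if low < x < high)
# Exact probability of fewer than two successful cures.
prob_fewer_than_two = sum(p for x, p in dist.items() if x < 2)

print(round(mu, 2), round(var, 3), round(sigma, 3))
print(round(low, 2), round(high, 2))
print(round(prob_within, 3), round(prob_fewer_than_two, 3))
```

The interval contains x = 2, 3, 4, and 5, so the two probabilities are approximately .969 and .031, the values obtained in parts c and d.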

Now Work Exercise 4.46

Exercises 4.17–4.47

Understanding the Principles

  1. 4.17 Give three different ways of representing the probability distribution of a discrete random variable.

Learning the Mechanics

  1. 4.18 What does the expected value of a random variable represent?

  2. 4.19 Will E(x) always be equal to a specific value of the random variable x?

  3. 4.20 For a mound-shaped, symmetric distribution, what is the probability that x falls into the interval μ±2σ?

  4. 4.21 A discrete random variable x can assume five possible values: 20, 21, 22, 23, and 24. The MINITAB histogram on the next page shows the likelihood of each value.

    1. What is p(22)?

    2. What is the probability that x equals 20 or 24?

    3. What is P(x23)?

    Histogram for Exercise 4.21

  5. 4.22 Explain why each of the following is or is not a valid probability distribution for a discrete random variable x:

    1. x   0   1 2   3
      p(x) .2  .3 .3 .2
    2. x 2 1   0
      p(x) .25 .50 .20
    3. x   4 9 20
      p(x) .3 1.0 .3
    4. x 2   3   5   6
      p(x) .15 .20 .40 .35
  6. 4.23 The random variable x has the following discrete probability distribution:

    x 10 11 12 13 14
    p(x) .2 .3 .2 .1 .2

    Since the values that x can assume are mutually exclusive events, the event {x ≤ 12} is the union of three mutually exclusive events:

    \[ \{x = 10\} \cup \{x = 11\} \cup \{x = 12\} \]
    1. Find P(x ≤ 12).

    2. Find P(x>12).

    3. Find P(x14).

    4. Find P(x=14).

    5. Find P(x11 or x>12).

  7. 4.24 Toss three fair coins, and let x equal the number of heads observed.

    1. Identify the sample points associated with this experiment, and assign a value of x to each sample point.

    2. Calculate p(x) for each value of x.

    3. Construct a probability histogram for p(x).

    4. What is P(x = 2 or x = 3)?

  8. 4.25 Consider the probability distribution for the random variable x shown here:

    x 1 2 4 10
    p(x) .2 .4 .2 .2
    1. Find μ=E(x).

    2. Find σ² = E[(x − μ)²].

    3. Find σ.

    4. Interpret the value you obtained for μ.

    5. In this case, can the random variable x ever assume the value μ? Explain.

    6. In general, can a random variable ever assume a value equal to its expected value? Explain.

  9. 4.26 Consider the probability distribution shown here:

    x −4 −3 −2 −1  0  1  2  3  4
    p(x) .02 .07 .10 .15 .30 .18 .10 .06 .02
    1. Calculate μ, σ², and σ.

    2. Graph p(x). Locate μ, μ − 2σ, and μ + 2σ on the graph.

    3. What is the probability that x will fall into the interval μ±2σ?

  10. 4.27 Consider the probability distributions shown here:

    x 0 1 2
    p(x) .3 .4 .3

    y 0 1 2
    p(y) .1 .8 .1
    1. Use your intuition to find the mean for each distribution. How did you arrive at your choice?

    2. Which distribution appears to be more variable? Why?

    3. Calculate μ and σ² for each distribution. Compare these answers with your answers in parts a and b.

Applet Exercise 4.1

Use the applet entitled Random Numbers to generate a list of 25 numbers between 1 and 3, inclusive. Let x represent a number chosen from this list.

    1. What are the possible values of x?

    2. Give the probability distribution for x in table form.

    3. Let y be a number randomly chosen from the set {1,2,3}. Give the probability distribution for y in table form.

    4. Compare the probability distributions of x and y in parts b and c. Why should these distributions be approximately the same?

Applet Exercise 4.2

Run the applet entitled Simulating the Probability of a Head with a Fair Coin 10 times with n=2, resetting between runs, to simulate flipping two coins 10 times. Count and record the number of heads each time. Let x represent the number of heads on a single flip of the coins.

    1. What are the possible values of x?

    2. Use the results of the simulation to find the probability distribution for x in table form.

    3. Explain why the probability of exactly two heads should be close to .25.

Applying the Concepts—Basic

  1. 4.28 Size of TV households. According to Nielsen’s Television Audience Report, 2010 & 2011, 26% of all U.S. TV households have a size of 1 person, 32% have a household size of 2, 17% have a household size of 3, and 25% have a household size of 4 or more. Let x represent the size (number of people) in a randomly selected TV household.

    1. List the possible values of the discrete random variable x.

    2. What is the probability that x=1?

    3. What is the probability that x ≥ 4?

  2. 4.29 Do social robots walk or roll? Refer to the International Conference on Social Robotics (Vol. 6414, 2010) study of the trend in the design of social robots, Exercise 2.7 (p. 38). Recall that in a random sample of 106 social (or service) robots designed to entertain, educate, and care for human users, 63 were built with legs only, 20 with wheels only, 8 with both legs and wheels, and 15 with neither legs nor wheels. Assume the following: Of the 63 robots with legs only, 50 have two legs, 5 have three legs, and 8 have four legs; of the 8 robots with both legs and wheels, all 8 have two legs. Suppose one of the 106 social robots is randomly selected. Let x equal the number of legs on the robot.

    1. List the possible values of x.

    2. Find the probability distribution of x.

    3. Find E(x) and give a practical interpretation of its value.

  3. 4.30 NHTSA crash tests. Refer to the National Highway Traffic Safety Administration (NHTSA) crash tests of new car models, presented in Exercise 4.6 (p. 170). A summary of the driver-side star ratings for the 98 cars is reproduced in the MINITAB printout below. Assume that 1 of the 98 cars is selected at random, and let x equal the number of stars in the car’s driver-side star rating.

    1. a. Use the information in the printout to find the probability distribution for x.

    2. b. Find P(x=5).

    3. c. Find P(x2).

    4. d. Find μ=E(x) and interpret the result.

  4. 4.31 Downloading apps to your cell phone. According to an August 2011 survey by the Pew Internet & American Life Project, nearly 40% of adult cell phone owners have downloaded an application (“app”) to their cell phone. The accompanying table gives the probability distribution for x, the number of apps used at least once a week by cell phone owners who have downloaded an app to their phone. (The probabilities in the table are based on information from the Pew Internet & American Life Project survey.)

    Number of Apps Used, x p(x)
     0  .17
     1  .10
     2  .11
     3  .11
     4  .10
     5  .10
     6  .07
     7  .05
     8  .03
     9  .02
    10  .02
    11  .02
    12  .02
    13  .02
    14  .01
    15  .01
    16  .01
    17  .01
    18  .01
    19 .005
    20 .005
    1. Show that the properties of a probability distribution for a discrete random variable are satisfied.

    2. Find P(x10).

    3. Find the mean and variance of x.

    4. Give an interval that will contain the value of x with a probability of at least .75.

  5. 4.32 Controlling the water hyacinth. An insect that naturally feeds on the water hyacinth, one of the world’s worst aquatic weeds, is the delphacid. Female delphacids lay anywhere from one to four eggs onto a water hyacinth blade. The Annals of the Entomological Society of America (Jan. 2005) published a study of the life cycle of a South American delphacid species. The following table gives the percentages of water hyacinth blades that have one, two, three, and four delphacid eggs:

    One Egg Two Eggs Three Eggs Four Eggs
    Percentage  of Blades 40 54 2 4

    Source: Sosa, A. J., et al. “Life history of Megamelus scutellaris with description of immature stages,” Annals of the Entomological Society of America, Vol. 98, No. 1, Jan. 2005 (adapted from Table 1).

    1. One of the water hyacinth blades in the study is randomly selected, and x, the number of delphacid eggs on the blade, is observed. Give the probability distribution of x.

    2. What is the probability that the blade has at least three delphacid eggs?

    3. Find E(x) and interpret the result.

  6. 4.33 Gender in two-child families. Human Biology (Feb. 2009) published a study on the gender of children in two-child families. In populations where it is just as likely to have a boy as a girl, the probabilities of having two girls, two boys, or a boy and a girl are well known. Let x represent the number of boys in a two-child family.

    1. List the possible ways (sample points) in which a two-child family can be gender-configured. (For example, BG represents the event that the first child is a boy and the second is a girl.)

    2. Assuming boys are just as likely as girls to be born, assign probabilities to the sample points in part a.

    3. Use the probabilities from part b to find the probability distribution for x.

    4. The article reported on the results of the National Health Interview Survey (NHIS) of almost 43,000 two-child families. The table gives the proportion of families with each gender configuration. Use this information to revise the probability distribution for x.

      Gender Configuration Proportion
      Girl-girl (GG) .222
      Boy-girl (BG) .259
      Girl-boy (GB) .254
      Boy-boy (BB) .265
    5. Refer to part d. Find E(x) and give a practical interpretation of its value.

  7. 4.34 Environmental vulnerability of amphibians. Many species of amphibians are declining due to dramatic changes in the environment. The environmental vulnerability of amphibian species native to Mexico was examined in Amphibian & Reptile Conservation (Aug. 2013). The researchers quantified this decline using a modified Environmental Vulnerability Score (EVS) algorithm. Scores ranged from 3 to 20 points, with higher scores indicating a greater level of environmental vulnerability. EVS scores for 382 snake species were determined and are summarized in the table below. Let x represent the EVS score for a randomly selected snake species in Mexico.

    EVS Score 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Total
    Number of Snake Species 1 1 7 10 9 19 17 30 25 31 46 52 50 44 24 9 7 0 382

    Table for Exercise 4.34

    Source: Wilson, L. D., et al. “A conservation reassessment of the amphibians of Mexico based on the EVS measure.” Amphibian & Reptile Conservation, Vol. 7, No. 1, Aug. 2013 (Table 4).

    1. Specify the probability distribution for EVS score x.

    2. EVS scores greater than 14 are indicative of species that are endangered due to high environmental vulnerability. Find P(x>14).

    3. Find the mean of x and interpret the result.

Applying the Concepts—Intermediate

  1. 4.35 The “last name” effect in purchasing. The Journal of Consumer Research (Aug. 2011) published a study demonstrating the “last name” effect, i.e., the tendency for consumers with last names that begin with a later letter of the alphabet to purchase an item before consumers with last names that begin with earlier letters. To facilitate the analysis, the researchers assigned a number, x, to each consumer based on the first letter of the consumer’s last name. For example, last names beginning with “A” were assigned x = 1, last names beginning with “B” were assigned x = 2, and last names beginning with “Z” were assigned x = 26.

    1. If the first letters of consumers’ last names are equally likely, find the probability distribution for x.

    2. Do you believe the probability distribution in part a is realistic? Explain. How might you go about estimating the true probability distribution for x?

  2. 4.36 Solar energy cells. According to the Earth Policy Institute (July 2013), 60% of the world’s solar energy cells are manufactured in China. Consider a random sample of 5 solar energy cells, and let x represent the number in the sample that are manufactured in China. In Section 4.4, we show that the probability distribution for x is given by the formula

    \[ p(x) = \frac{(5!)(.6)^{x}(.4)^{5-x}}{(x!)(5-x)!}, \quad \text{where } n! = (n)(n-1)(n-2)\cdots(2)(1) \]
    1. Explain why x is a discrete random variable.

    2. Find p(x) for x=0,1,2,3,4, and 5.

    3. Show that the properties for a discrete probability distribution are satisfied.

    4. Find the probability that at least 4 of the 5 solar energy cells in the sample are manufactured in China.

  3. 4.37 Contaminated gun cartridges. A weapons manufacturer uses a liquid propellant to produce gun cartridges. During the manufacturing process, the propellant can get mixed with another liquid to produce a contaminated cartridge. A University of South Florida statistician hired by the company to investigate the level of contamination in the stored cartridges found that 23% of the cartridges in a particular lot were contaminated. Suppose you randomly sample (without replacement) gun cartridges from this lot until you find a contaminated one. Let x be the number of cartridges sampled until a contaminated one is found. It is known that the probability distribution for x is given by the formula

    \[ p(x) = (.23)(.77)^{x-1}, \quad x = 1, 2, 3, \ldots \]

    1. Find p(1). Interpret this result.

    2. Find p(5). Interpret this result.

    3. Find P(x ≥ 2). Interpret this result.
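As a rough numerical aid for Exercise 4.37, the sketch below evaluates the stated formula and shows how a tail probability can be obtained from the complement rule. The helper names `p` and `tail` are hypothetical; only the values .23 and .77 come from the exercise.

```python
def p(x, pi=0.23):
    """p(x) = (.23)(.77)^(x - 1): first contaminated cartridge found on draw x."""
    return pi * (1 - pi) ** (x - 1)

def tail(k):
    """P(x >= k), computed as 1 minus the probabilities of x = 1, ..., k - 1."""
    return 1 - sum(p(x) for x in range(1, k))

p1 = p(1)
p5 = p(5)

# The infinite sum of p(x) equals 1; a long partial sum gets very close to it
partial_sum = sum(p(x) for x in range(1, 201))

print(round(p1, 4), round(p5, 4), round(tail(2), 4), round(partial_sum, 6))
```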

  4. 4.38 Variable speed limit control for freeways. A common transportation problem in large cities is congestion on the freeways. In the Canadian Journal of Civil Engineering (Jan. 2013), civil engineers investigated the use of variable speed limits (VSL) to control the congestion problem. The study site was an urban freeway in Edmonton, Canada. A portion of the freeway was divided equally into three sections, and variable speed limits were posted (independently) in each section. Simulation was used to find the optimal speed limits based on various traffic patterns and weather conditions. Probability distributions of the speed limits for the three sections were determined. For example, one possible set of distributions is as follows (probabilities in parentheses). Section 1: 30 mph (.05), 40 mph (.25), 50 mph (.25), 60 mph (.45); Section 2: 30 mph (.10), 40 mph (.25), 50 mph (.35), 60 mph (.30); Section 3: 30 mph (.15), 40 mph (.20), 50 mph (.30), 60 mph (.35).

    1. Verify that the properties of discrete probability distributions are satisfied for each individual section of the freeway.

    2. Consider a vehicle that will travel through the three sections of the freeway at a steady (fixed) speed. Let x represent this speed. Find the probability distribution for x.

    3. Refer to part b. What is the probability that the vehicle can travel at least 50 mph through the three sections of the freeway?

  5. 4.39 Choosing portable grill displays. Refer to the Journal of Consumer Research (Mar. 2003) marketing study of influencing consumer choices by offering undesirable alternatives, presented in Exercise 3.29 (p. 129). Recall that each of 124 college students selected showroom displays for portable grills. Five different displays (representing five different-sized grills) were available, but the students were instructed to select only three displays in order to maximize purchases of Grill #2 (a smaller grill). The table that follows shows the grill display combinations and number of each selected by the 124 students. Suppose 1 of the 124 students is selected at random. Let x represent the sum of the grill numbers selected by that student. (This sum is an indicator of the size of the grills selected.)

    1. Find the probability distribution for x.

    2. What is the probability that x exceeds 10?

    3. Find the mean and variance of x. Interpret the results.

    Grill Display Combination Number of Students
    1–2–3 35
    1–2–4  8
    1–2–5 42
    2–3–4  4
    2–3–5  1
    2–4–5 34

    Based on Hamilton, R. W. “Why do people suggest what they do not want? Using context effects to influence others’ choices.” Journal of Consumer Research, Vol. 29, Mar. 2003 (Table 1).
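One way to organize the work in Exercise 4.39 is to convert the frequency table into a distribution for the sum x. This is a sketch under the assumption that each of the 124 students is equally likely to be selected, as the exercise states; the variable names are illustrative. Exact fractions are used to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction

# Grill display combinations and student counts, copied from the table
counts = {
    (1, 2, 3): 35,
    (1, 2, 4): 8,
    (1, 2, 5): 42,
    (2, 3, 4): 4,
    (2, 3, 5): 1,
    (2, 4, 5): 34,
}
n = sum(counts.values())  # should equal 124

# x is the sum of the grill numbers; accumulate relative frequencies by sum
dist = defaultdict(Fraction)
for combo, count in counts.items():
    dist[sum(combo)] += Fraction(count, n)

mean = sum(x * prob for x, prob in dist.items())
variance = sum((x - mean) ** 2 * prob for x, prob in dist.items())
p_exceeds_10 = sum(prob for x, prob in dist.items() if x > 10)

for x in sorted(dist):
    print(x, dist[x])
print(float(mean), float(variance), float(p_exceeds_10))
```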

  6. 4.40 Reliability of a manufacturing network. A team of industrial management university professors investigated the reliability of a manufacturing system that involves multiple production lines (Journal of Systems Sciences & Systems Engineering, Mar. 2013). An example of such a network is a system for producing integrated circuit (IC) cards with two production lines set up in sequence. Items (IC cards) first pass through Line 1 and then are processed by Line 2. The probability distribution of the maximum capacity level of each line is shown in the table below. Assume the lines operate independently.

    Line  Maximum Capacity, x  p(x)
    1      0                   .01
    1     12                   .02
    1     24                   .02
    1     36                   .95
    2      0                   .002
    2     35                   .002
    2     70                   .996

    1. Verify that the properties of discrete probability distributions are satisfied for each line in the system.

    2. Find the probability that the maximum capacity level for Line 1 will exceed 30 items.

    3. Repeat part b for Line 2.

    4. Now consider the network of two production lines. What is the probability that a maximum capacity level of 30 items is maintained throughout the network?

    5. Find the mean maximum capacity for each line. Interpret the results practically.

    6. Find the standard deviation of the maximum capacity for each line. Interpret the results practically.
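The mean and standard deviation requested in Exercise 4.40 follow from the definitions E(x) = Σ x p(x) and σ² = Σ (x − μ)² p(x). The sketch below applies those definitions to the two tables; the function `summarize` and the dictionary layout are illustrative assumptions, not part of the exercise.

```python
from math import sqrt

# Maximum capacity distributions, copied from the table: {capacity: probability}
line1 = {0: 0.01, 12: 0.02, 24: 0.02, 36: 0.95}
line2 = {0: 0.002, 35: 0.002, 70: 0.996}

def summarize(dist):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance, sqrt(variance)

mean1, var1, sd1 = summarize(line1)
mean2, var2, sd2 = summarize(line2)

print("Line 1:", round(mean1, 2), round(sd1, 3))
print("Line 2:", round(mean2, 2), round(sd2, 3))
```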

  1. 4.41 Beach erosional hot spots. Refer to the U.S. Army Corps of Engineers’ study of beach erosional hot spots, presented in Exercise 3.123 (p. 160). The data on the nearshore bar condition for six beach hot spots are reproduced in the accompanying table. Suppose you randomly select two of these six beaches and count x, the total number in the sample with a planar nearshore bar condition.

    Beach Hot Spot Nearshore Bar Condition
    Miami Beach, FL Single, shore parallel
    Coney Island, NY Other
    Surfside, CA Single, shore parallel
    Monmouth Beach, NJ Planar
    Ocean City, NJ Other
    Spring Lake, NJ Planar

    Based on Identification and Characterization of Erosional Hotspots, William & Mary Virginia Institute of Marine Science, U.S. Army Corps of Engineers Project Report, Mar. 18, 2002.

    1. List all possible pairs of beach hot spots that can be selected from the six.

    2. Assign probabilities to the outcomes in part a.

    3. For each outcome in part a, determine the value of x.

    4. Form a probability distribution table for x.

    5. Find the expected value of x. Interpret the result.

  2. 4.42 Expected Lotto winnings. The chance of winning Florida’s Pick-6 Lotto game is 1 in approximately 23 million. Suppose you buy a $1 Lotto ticket in anticipation of winning the $7 million grand prize. Calculate your expected net winnings for this single ticket. Interpret the result.

  3. 4.43 Expected winnings in roulette. In the popular casino game of roulette, you can bet on whether the ball will fall in an arc on the wheel colored red, black, or green. You showed (Exercise 3.127, p. 161) that the probability of a red outcome is 18/38, that of a black outcome is 18/38, and that of a green outcome is 2/38. Suppose you make a $5 bet on red. Find your expected net winnings for this single bet. Interpret the result.
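Expected net winnings in Exercises 4.42 and 4.43 are instances of E(x) = Σ x p(x) with two outcomes, a win and a loss. The sketch below is one possible setup and rests on assumptions that are not stated in the exercises: the roulette bet is taken to pay even money (win $5 or lose $5), "approximately 23 million" is treated as exactly 23,000,000, and the $7 million prize is taken as a single payment. The function `expected_value` is a hypothetical helper.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Return the sum of x * p(x) over (x, p) pairs, after checking the p's sum to 1."""
    assert sum(p for _, p in outcomes) == 1
    return sum(x * p for x, p in outcomes)

# Exercise 4.43: $5 on red, ASSUMING an even-money payoff
roulette = [(5, Fraction(18, 38)), (-5, Fraction(20, 38))]

# Exercise 4.42: $1 ticket, ASSUMING exactly 1 chance in 23,000,000
p_win = Fraction(1, 23_000_000)
lotto = [(7_000_000 - 1, p_win), (-1, 1 - p_win)]

ev_roulette = expected_value(roulette)
ev_lotto = expected_value(lotto)

print(float(ev_roulette), float(ev_lotto))
```

Under these assumptions both expected values are negative, which is the usual long-run interpretation of a game that favors the house.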

Applying the Concepts—Advanced

  1. 4.44 Parlay card betting. Odds makers try to predict which professional and college football teams will win and by how much (the spread). If the odds makers do this accurately, adding the spread to the underdog’s score should make the final score a tie. Suppose a bookie will give you $6 for every $1 you risk if you pick the winners in three different football games (adjusted by the spread) on a “parlay” card. What is the bookie’s expected earnings per dollar wagered? Interpret this value.

  2. 4.45 Voter preferences for a committee. A “Condorcet” committee of, say, 3 members is a committee that is preferred by voters over any other committee of 3 members. A scoring function was used to determine “Condorcet” committees in Mathematical Social Sciences (Nov. 2013). Consider a committee with members A, B, and C. Suppose there are 10 voters who each have a preference for a 3-member committee. For example, one voter may prefer a committee made up of members A, C, and G. Then this voter’s preference score for the {A, B, C} committee is 2, because 2 of the members (A and C) are on this voter’s preferred list. For a 3-member committee, voter preference scores range from 0 (no members on the preferred list) to 3 (all members on the preferred list). The table below shows the preferred lists of 10 voters for a 3-member committee selected from potential members A, B, C, D, E, F, and G.

    Voter: 1 2 3 4 5 6 7 8 9 10
    A B A B D C A B A A
    C C B C E E C C B C
    D E D F F G G D C G
    1. Find the preference score for committee {A, B, C} for each voter.

    2. For a randomly selected voter, let x represent the preference score for committee {A, B, C}. Determine the probability distribution for x.

    3. What is the probability that the preference score, x, exceeds 2?

    4. Is {A, B, C} a “Condorcet” committee?
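The preference score described in Exercise 4.45 is the size of the overlap between a voter's preferred list and the committee, so it can be computed with a set intersection. The sketch below reads each column of the table as one voter's list; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

committee = {"A", "B", "C"}

# Preferred lists for voters 1 through 10, read column by column from the table
preferred = [
    set("ACD"), set("BCE"), set("ABD"), set("BCF"), set("DEF"),
    set("CEG"), set("ACG"), set("BCD"), set("ABC"), set("ACG"),
]

# Score = number of committee members that appear on the voter's list
scores = [len(committee & voter_list) for voter_list in preferred]

# Relative frequency of each score gives p(x) for a randomly selected voter
counts = Counter(scores)
dist = {x: Fraction(counts[x], len(scores)) for x in range(4)}
p_exceeds_2 = sum(prob for x, prob in dist.items() if x > 2)

print(scores)
print({x: str(prob) for x, prob in dist.items()}, p_exceeds_2)
```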

  3. 4.46 Punnett square for earlobes. Geneticists use a grid—called a Punnett square—to display all possible gene combinations in genetic crosses. (The grid is named for Reginald Punnett, a British geneticist who developed the method in the early 1900s.) The figure below is a Punnett square for a cross involving human earlobes. In humans, free earlobes (E) are dominant over attached earlobes (e). Consequently, the gene pairs EE and Ee will result in free earlobes, while the gene pair ee results in attached earlobes. Consider a couple with genes as shown in the Punnett square below. Suppose the couple has seven children. Let x represent the number of children with attached earlobes (i.e., with the gene pair ee). Find the probability distribution of x.

  4. 4.47 Robot-sensor system configuration. Engineers at Broadcom Corp. and Simon Fraser University collaborated on research involving a robot-sensor system in an unknown environment (The International Journal of Robotics Research, Dec. 2004). As an example, the engineers presented the three-point, single-link robotic system shown in the accompanying figure. Each point (A, B, or C) in the physical space of the system has either an “obstacle” status or a “free” status. There are two single links in the system: AB and BC. A link has a “free” status if and only if both points in the link are “free”; otherwise the link has an “obstacle” status. Of interest is the random variable x: the total number of links in the system that are “free.”

    1. List the possible values of x for the system.

    2. The researchers stated that the probability of any point in the system having a “free” status is .5. Assuming that the three points in the system operate independently, find the probability distribution for x.
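With only three points, the sample space in Exercise 4.47 has 2³ = 8 outcomes and can be enumerated directly. The sketch below assumes, as the exercise states, that each point is "free" with probability .5 independently of the others; the names `P_FREE` and `dist` are illustrative.

```python
from fractions import Fraction
from itertools import product

P_FREE = Fraction(1, 2)  # probability that a single point is "free"

dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}

# Enumerate every (A, B, C) status combination; True means "free"
for a, b, c in product([True, False], repeat=3):
    prob = Fraction(1)
    for is_free in (a, b, c):
        prob *= P_FREE if is_free else 1 - P_FREE
    # Link AB is free only if A and B are free; likewise for link BC
    x = int(a and b) + int(b and c)
    dist[x] += prob

for x, prob in dist.items():
    print(x, prob)
```

Enumeration keeps the dependence between the two links visible: both links share point B, so the link statuses are not independent even though the points are.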
