5.3 Confidence Interval for a Population Mean: Student’s t-Statistic

Federal legislation requires pharmaceutical companies to perform extensive tests on new drugs before they can be marketed. Initially, a new drug is tested on animals. If the drug is deemed safe after this first phase of testing, the pharmaceutical company is then permitted to begin human testing on a limited basis. During this second phase, inferences must be made about the safety of the drug on the basis of information obtained from very small samples.

Suppose a pharmaceutical company must estimate the average increase in blood pressure of patients who take a certain new drug. Assume that only six patients (randomly selected from the population of all patients) can be used in the initial phase of human testing. The use of a small sample in making an inference about μ presents two immediate problems when we attempt to use the standard normal z as a test statistic.

Problem 1

  1. The shape of the sampling distribution of the sample mean x¯ (and the z-statistic) now depends on the shape of the population that is sampled. We can no longer assume that the sampling distribution of x¯ is approximately normal because the Central Limit Theorem ensures normality only for samples that are sufficiently large.

Solution to Problem 1

  1. The sampling distribution of x¯ (and z) is exactly normal even for relatively small samples if the sampled population is normal. It is approximately normal if the sampled population is approximately normal.

Problem 2

  1. The population standard deviation σ is almost always unknown. Although it is still true that $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, the sample standard deviation s may provide a poor approximation for σ when the sample size is small.

Solution to Problem 2

  1. Instead of using the standard normal statistic

    $$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

    which requires knowledge of, or a good approximation to, σ, we define and use the statistic

    $$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

    in which the sample standard deviation s replaces the population standard deviation σ.

If we are sampling from a normal distribution, the t-statistic has a sampling distribution very much like that of the z-statistic: mound shaped, symmetric, and with mean 0. The primary difference between the sampling distributions of t and z is that the t-statistic is more variable than the z, a property that follows intuitively when you realize that t contains two random quantities ( x¯ and s), whereas z contains only one ( x¯).
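The claim that t is more variable than z can be checked with a small simulation. The sketch below is illustrative only: the function name `simulate_z_and_t`, the population (mean 0, standard deviation 1), the sample size of 6, the 5,000 repetitions, and the seed are all assumptions chosen for the demonstration, not values taken from the text.

```python
import math
import random
import statistics


def simulate_z_and_t(n=6, mu=0.0, sigma=1.0, reps=5000, seed=1):
    """Return lists of z- and t-statistics from `reps` normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_values, t_values = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation, divisor n - 1
        z_values.append((xbar - mu) / (sigma / math.sqrt(n)))  # uses the known sigma
        t_values.append((xbar - mu) / (s / math.sqrt(n)))      # uses s in place of sigma
    return z_values, t_values


z_values, t_values = simulate_z_and_t()
print("variance of simulated z:", round(statistics.variance(z_values), 2))
print("variance of simulated t:", round(statistics.variance(t_values), 2))
```

Both sets of simulated values should center near 0, but the t-values should show the larger spread, because s varies from sample to sample while σ does not.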

Biography William S. Gosset (1876–1937)

Student’s t-Distribution

At the age of 23, William Gosset earned a degree in chemistry and mathematics at prestigious Oxford University. He was immediately hired by the Guinness Brewing Company in Dublin, Ireland, for his expertise in chemistry. However, Gosset’s mathematical skills allowed him to solve numerous practical problems associated with brewing beer. For example, Gosset applied the Poisson distribution to model the number of yeast cells per unit volume in the fermentation process. His most important discovery was the t-distribution in 1908. Since most applied researchers worked with small samples, Gosset was interested in the behavior of the mean in the case of small samples. He tediously took numerous small sets of numbers, calculated the mean and standard deviation of each, obtained their t-ratio, and plotted the results on graph paper. The shape of the distribution was always the same: the t-distribution. Under company policy, employees were forbidden to publish their research results, so Gosset used the pen name Student to publish a paper on the subject. Hence, the distribution has been called Student’s t-distribution.

The actual amount of variability in the sampling distribution of t depends on the sample size n. A convenient way of expressing this dependence is to say that the t-statistic has $(n - 1)$ degrees of freedom (df).* Recall that the quantity $(n - 1)$ is the divisor that appears in the formula for $s^2$. This number plays a key role in the sampling distribution of $s^2$ and appears in discussions of other statistics in later chapters. In particular, the smaller the number of degrees of freedom associated with the t-statistic, the more variable will be its sampling distribution.

In Figure 5.9, we show the sampling distribution of z together with the sampling distributions of t-statistics with 4 and 20 df. You can see that the t-distribution is more variable than the z-distribution and that this variability increases as the degrees of freedom decrease. You can also see that the increased variability of the t-statistic means that the t-value, $t_\alpha$, that locates an area α in the upper tail of the t-distribution is larger than the corresponding value, $z_\alpha$. For any given value of α, the t-value $t_\alpha$ increases as the number of degrees of freedom (df) decreases. Values of t that will be used in forming small-sample confidence intervals for μ can be obtained using technology (e.g., statistical software) or from Table III of Appendix B. A partial reproduction of this table is shown in Table 5.3. Note that $t_\alpha$ values are listed for various degrees of freedom, where α refers to the tail area under the t-distribution to the right of $t_\alpha$.

Figure 5.9

Standard normal (z) distribution and t-distributions

For example, if we want the t-value with an area of .025 to its right and 4 df, we look in the table under the column $t_{.025}$ for the entry in the row corresponding to 4 df. This entry is $t_{.025} = 2.776$, shaded in Table 5.3 and shown in Figure 5.10. The corresponding standard normal z-score is $z_{.025} = 1.96$. Note that the last row of Table III, where df = ∞ (infinity), contains the standard normal z-values. This follows from the fact that as the sample size n grows very large, s becomes closer to σ and thus t becomes closer in distribution to z. In fact, when df = 29, there is little difference between corresponding tabulated values of z and t. Thus, we choose the arbitrary cutoff of n = 30 (df = 29) to distinguish between large-sample and small-sample inferential techniques when σ is unknown.

Table 5.3 Reproduction of Part of Table III in Appendix B

Degrees of Freedom  t.100  t.050  t.025  t.010  t.005  t.001  t.0005
 1 3.078 6.314 12.706 31.821 63.657 318.31 636.62
 2 1.886 2.920 4.303 6.965 9.925 22.326 31.598
 3 1.638 2.353 3.182 4.541 5.841 10.213 12.924
 4 1.533 2.132 2.776 3.747 4.604 7.173 8.610
 5 1.476 2.015 2.571 3.365 4.032 5.893 6.869
 6 1.440 1.943 2.447 3.143 3.707 5.208 5.959
 7 1.415 1.895 2.365 2.998 3.499 4.785 5.408
 8 1.397 1.860 2.306 2.896 3.355 4.501 5.041
 9 1.383 1.833 2.262 2.821 3.250 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.852 4.221
14 1.345 1.761 2.145 2.624 2.977 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.733 4.073
 ∞ 1.282 1.645 1.960 2.326 2.576 3.090 3.291

Figure 5.10

The t.025 value in a t-distribution with 4 df, and the corresponding z.025 value
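The table lookup described above can be sketched in code. The dictionary below copies only a few rows and columns of Table 5.3; the names `T_TABLE`, `Z_ROW`, and `t_critical` are hypothetical helpers introduced for illustration.

```python
# Upper-tail critical values copied from selected rows of Table 5.3.
# Outer keys are degrees of freedom; inner keys are the upper-tail area alpha.
T_TABLE = {
    1:  {0.100: 3.078, 0.050: 6.314, 0.025: 12.706, 0.010: 31.821, 0.005: 63.657},
    4:  {0.100: 1.533, 0.050: 2.132, 0.025: 2.776, 0.010: 3.747, 0.005: 4.604},
    5:  {0.100: 1.476, 0.050: 2.015, 0.025: 2.571, 0.010: 3.365, 0.005: 4.032},
    10: {0.100: 1.372, 0.050: 1.812, 0.025: 2.228, 0.010: 2.764, 0.005: 3.169},
    15: {0.100: 1.341, 0.050: 1.753, 0.025: 2.131, 0.010: 2.602, 0.005: 2.947},
}
# Last row of the table (df = infinity): the standard normal z-values.
Z_ROW = {0.100: 1.282, 0.050: 1.645, 0.025: 1.960, 0.010: 2.326, 0.005: 2.576}


def t_critical(alpha, df):
    """Look up t_alpha for the given df; raises KeyError if the entry was not copied."""
    return T_TABLE[df][alpha]


print("t_.025 with 4 df:", t_critical(0.025, 4))
for df in sorted(T_TABLE):
    # Each t-value exceeds z_.025 and shrinks toward it as df grows.
    print(f"df = {df:2d}: t_.025 = {t_critical(0.025, df)}  (z_.025 = {Z_ROW[0.025]})")
```

Reading down the printed column shows the pattern stated in the text: for a fixed α, the t-value decreases toward the z-value as the degrees of freedom increase.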

Example 5.4 Confidence Interval for Mean Blood Pressure Increase, t-Statistic 

Problem

  1. Consider the pharmaceutical company that desires an estimate of the mean increase in blood pressure of patients who take a new drug. The blood pressure increases (measured in points) for the n=6 patients in the human testing phase are shown in Table 5.4. Use this information to construct a 95% confidence interval for  μ, the mean increase in blood pressure associated with the new drug for all patients in the population.

    Table 5.4 Blood Pressure Increases (Points) for Six Patients

    1.7 3.0 .8 3.4 2.7 2.1

    Data Set: BPINCR

Solution

  1. First, note that we are working with a sample too small to assume, by the Central Limit Theorem, that the sample mean x¯ is approximately normally distributed. That is, we do not get the normal distribution of x¯ “automatically” from the Central Limit Theorem when the sample size is small. Instead, we must assume that the measured variable, in this case the increase in blood pressure, is normally distributed in order for the distribution of x¯ to be normal.

    Second, unless we are fortunate enough to know the population standard deviation σ, which in this case represents the standard deviation of all the patients’ increases in blood pressure when they take the new drug, we cannot use the standard normal z-statistic to form our confidence interval for μ. Instead, we must use the t-distribution, with $(n - 1)$ degrees of freedom.

    In this case, $n - 1 = 5$ df, and the t-value is found in Table 5.3 to be

    $$t_{.025} = 2.571 \text{ with 5 df}$$

    Recall that the large-sample confidence interval would have been of the form

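    $$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = \bar{x} \pm z_{.025}\left(\frac{\sigma}{\sqrt{n}}\right)$$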

    where 95% is the desired confidence level. To form the interval for a small sample from a normal distribution, we simply substitute t for z and s for σ in the preceding formula, yielding

    $$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$

    An SPSS printout showing descriptive statistics for the six blood pressure increases is displayed in Figure 5.11. Note that x¯=2.283 and s=.950. Substituting these numerical values into the confidence interval formula, we get

    $$2.283 \pm (2.571)\left(\frac{.950}{\sqrt{6}}\right) = 2.283 \pm .997$$

    or 1.286 to 3.280 points. Note that this interval agrees (except for rounding) with the confidence interval highlighted on the SPSS printout in Figure 5.11.

    We interpret the interval as follows: We can be 95% confident that the mean increase in blood pressure associated with taking this new drug is between 1.286 and 3.28 points. As with our large-sample interval estimates, our confidence is in the process, not in this particular interval. We know that if we were to repeatedly use this estimation procedure, 95% of the confidence intervals produced would contain the true mean μ, assuming that the probability distribution of changes in blood pressure from which our sample was selected is normal. The latter assumption is necessary for the small-sample interval to be valid.
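The arithmetic in this solution can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name `t_interval` is a hypothetical helper, and the critical value 2.571 is typed in from Table 5.3 rather than computed.

```python
import math
import statistics


def t_interval(data, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation, divisor n - 1
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


bp_increase = [1.7, 3.0, 0.8, 3.4, 2.7, 2.1]  # Table 5.4
T_025_DF5 = 2.571                             # t_.025 with 5 df, from Table 5.3

lower, upper = t_interval(bp_increase, T_025_DF5)
print(f"mean = {statistics.mean(bp_increase):.3f}, s = {statistics.stdev(bp_increase):.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

Because the script carries the unrounded mean and standard deviation through the calculation, its endpoints can differ from the hand calculation in the last decimal place.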

Figure 5.11

SPSS confidence interval for mean blood pressure increase

Look Back

What price did we pay for having to utilize a small sample to make the inference? First, we had to assume that the underlying population is normally distributed, and if the assumption is invalid, our interval might also be invalid.* Second, we had to form the interval by using a t value of 2.571 rather than a z value of 1.96, resulting in a wider interval to achieve the same 95% level of confidence. If the interval from 1.286 to 3.28 is too wide to be of much use, we know how to remedy the situation: Increase the number of patients sampled in order to decrease the width of the interval (on average).
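The second cost mentioned above, the wider interval, can be quantified by comparing the two margins of error. The sketch below reuses the rounded summary values from Example 5.4 (n = 6, s = .950) and the tabled critical values; the variable names are illustrative only.

```python
import math

n = 6
s = 0.950          # sample standard deviation reported in Example 5.4
t_025_df5 = 2.571  # Table 5.3, row for 5 df
z_025 = 1.960      # Table 5.3, last row (df = infinity)

standard_error = s / math.sqrt(n)
margin_t = t_025_df5 * standard_error  # half-width actually used
margin_z = z_025 * standard_error      # half-width if z could have been used

print(f"t-based margin of error: {margin_t:.3f}")
print(f"z-based margin of error: {margin_z:.3f}")
print(f"ratio of widths (t to z): {margin_t / margin_z:.2f}")
```

The ratio equals 2.571/1.96, so with 5 df the t-interval is roughly 30% wider than a z-interval built from the same standard error.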

Now Work Exercise 5.31

The procedure for forming a small-sample confidence interval is summarized in the accompanying boxes.

Small-Sample 100(1 − α)% Confidence Interval for μ, t-Statistic

σ unknown: $\bar{x} \pm t_{\alpha/2}\left(s/\sqrt{n}\right)$

where $t_{\alpha/2}$ is the t-value corresponding to an area α/2 in the upper tail of the Student’s t-distribution based on $(n - 1)$ degrees of freedom.

σ known: $\bar{x} \pm z_{\alpha/2}\left(\sigma/\sqrt{n}\right)$

Conditions Required for a Valid Small-Sample Confidence Interval for μ

  1. A random sample is selected from the target population.

  2. The population has a relative frequency distribution that is approximately normal.

Example 5.5 A Small-Sample Confidence Interval for μ—Destructive Sampling 

Problem

  1. Some quality control experiments require destructive sampling (i.e., the test to determine whether the item is defective destroys the item) in order to measure a particular characteristic of the product. The cost of destructive sampling often dictates small samples. Suppose a manufacturer of printers for personal computers wishes to estimate the mean number of characters printed before the printhead fails. The printer manufacturer tests n = 15 printheads and records the number of characters printed until failure for each. These 15 measurements (in millions of characters) are listed in Table 5.5, followed by a MINITAB summary statistics printout in Figure 5.12.

    1. Form a 99% confidence interval for the mean number of characters printed before the printhead fails. Interpret the result.

    2. What assumption is required for the interval you found in part a to be valid? Is that assumption reasonably satisfied?

      Table 5.5 Number of Characters (in Millions) for  n=15 Printhead Tests

      1.13 1.55 1.43 .92 1.25 1.36 1.32 .85 1.07 1.48 1.20 1.33 1.18 1.22 1.29

      Data Set: PRNTHD

      Figure 5.12

      MINITAB printout with descriptive statistics and 99% confidence interval for Example 5.5

Solution

  1. For this small sample (n = 15), we use the t-statistic to form the confidence interval. We use a confidence coefficient of .99 and $n - 1 = 14$ degrees of freedom to find $t_{\alpha/2}$ in Table III:

    $$t_{\alpha/2} = t_{.005} = 2.977$$

    [Note: The small sample forces us to extend the interval almost three standard deviations (of x¯) on each side of the sample mean in order to form the 99% confidence interval.] From the MINITAB printout shown in Figure 5.12, we find that x¯=1.24 and s=.19. Substituting these (rounded) values into the confidence interval formula, we obtain

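    $$\bar{x} \pm t_{.005}\left(\frac{s}{\sqrt{n}}\right) = 1.24 \pm 2.977\left(\frac{.19}{\sqrt{15}}\right) = 1.24 \pm .15 \quad \text{or} \quad (1.09, 1.39)$$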

    This interval is highlighted in Figure 5.12.

    Our interpretation is as follows: The manufacturer can be 99% confident that the printhead has a mean life of between 1.09 and 1.39 million characters. If the manufacturer were to advertise that the mean life of its printheads is (at least) 1 million characters, the interval would support such a claim. Our confidence is derived from the fact that 99% of the intervals formed in repeated applications of this procedure will contain μ.

    Figure 5.13

    MINITAB stem-and-leaf display of data in Table 5.5

  2. Since n is small, we must assume that the number of characters printed before the printhead fails is a random variable from a normal distribution. That is, we assume that the population from which the sample of 15 measurements is selected is distributed normally. One way to check this assumption is to graph the distribution of data in Table 5.5. If the sample data are approximately normally distributed, then the population from which the sample is selected is very likely to be normal. A MINITAB stem-and-leaf plot for the sample data is displayed in Figure 5.13. The distribution is mound shaped and nearly symmetric. Therefore, the assumption of normality appears to be reasonably satisfied.
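The 99% interval from part a can be reproduced from the raw data in Table 5.5. The sketch below uses only the Python standard library; `t_interval` is a hypothetical helper name, and the critical value 2.977 is typed in from Table 5.3 (14 df) rather than computed.

```python
import math
import statistics


def t_interval(data, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation, divisor n - 1
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Characters printed before failure, in millions (Table 5.5).
printhead = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
             1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]
T_005_DF14 = 2.977  # t_.005 with 14 df, from Table 5.3

lower, upper = t_interval(printhead, T_005_DF14)
print(f"99% confidence interval: ({lower:.2f}, {upper:.2f}) million characters")
print("entire interval lies above 1 million:", lower > 1.0)
```

The final line mirrors the interpretation in the solution: a claimed mean life of at least 1 million characters is supported when the whole interval lies above 1.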

Look Back

Other checks for normality, such as a normal probability plot and the ratio IQR/s, may also be used to verify the normality condition.
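As one illustration of the IQR/s check, the sketch below computes the ratio for the printhead data. For a normal distribution the interquartile range is about 1.35 standard deviations, so a ratio in the neighborhood of 1.3 is consistent with approximate normality. Note that software packages define sample quartiles in slightly different ways; the `statistics.quantiles` default used here is only one such convention, so the ratio may not match a hand calculation exactly.

```python
import statistics

# Characters printed before failure, in millions (Table 5.5).
printhead = [1.13, 1.55, 1.43, 0.92, 1.25, 1.36, 1.32, 0.85,
             1.07, 1.48, 1.20, 1.33, 1.18, 1.22, 1.29]

# Quartiles using the module's default ("exclusive") interpolation rule.
q1, median, q3 = statistics.quantiles(printhead, n=4)
iqr = q3 - q1
s = statistics.stdev(printhead)
ratio = iqr / s

print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
print(f"s = {s:.3f}, IQR/s = {ratio:.2f}")
```

With only 15 observations the ratio is a rough guide at best, which is why it is used alongside a graph of the data rather than in place of one.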

Now Work Exercise 5.37

We have emphasized throughout this section that an assumption that the population is approximately normally distributed is necessary for making small-sample inferences about μ when the t-statistic is used. Although many phenomena do have approximately normal distributions, it is also true that many random phenomena have distributions that are not normal or even mound shaped. Empirical evidence acquired over the years has shown that the t-distribution is rather insensitive to moderate departures from normality. That is, the use of the t-statistic when sampling from slightly skewed mound-shaped populations generally produces credible results; however, for cases in which the distribution is distinctly nonnormal, we must either take a large sample or use a nonparametric method.

What Do You Do When the Population Relative Frequency Distribution Departs Greatly from Normality?

Answer: Use the nonparametric statistical method of Optional Section 6.8.

Statistics in Action Revisited

Estimating the Mean Overpayment

Refer to the Medicare fraud investigation described in the Statistics in Action method of Optional Section 6.8. Recall that the United States Department of Justice (USDOJ) obtained a random sample of 52 claims from a population of 1,000 Medicare claims. For each claim, the amount paid, the amount disallowed (denied) by the auditor, and the amount that should have been paid (allowed) were recorded and saved in the MCFRAUD file. The USDOJ wants to use these data to calculate an estimate of the overpayment for all 1,000 claims in the population.

One way to do this is to first use the sample data to estimate the mean overpayment per claim for the population, then use the estimated mean to extrapolate the overpayment amount to the population of all 1,000 claims. The difference between the amount paid and the amount allowed by the auditor represents the overpayment for each claim. This value is recorded as the amount denied in the MCFRAUD file. These overpayment amounts are listed in Table SIA5.1.

MINITAB software is used to find a 95% confidence interval for μ, the mean overpayment amount. The MINITAB printout is displayed in Figure SIA5.1. The 95% confidence interval for μ (highlighted on the printout) is (16.51, 27.13). Thus, the USDOJ can be 95% confident that the mean overpayment amount for the population of 1,000 claims is between $16.51 and $27.13.

Figure SIA5.1

MINITAB confidence interval for mean overpayment

Table SIA5.1 Overpayment Amounts for Sample of 52 Claims

$0.00 $31.00 $0.00 $37.20 $37.20 $0.00 $43.40 $0.00
$37.20 $43.40 $0.00 $37.20 $0.00 $24.80 $0.00 $0.00
$37.20 $0.00 $37.20 $0.00 $37.20 $37.20 $37.20 $0.00
$37.20 $37.20 $0.00 $0.00 $37.20 $0.00 $43.40 $37.20
$0.00 $37.20 $0.00 $37.20 $37.20 $0.00 $37.20 $37.20
$0.00 $37.20 $43.40 $0.00 $37.20 $37.20 $37.20 $0.00
$43.40 $0.00 $43.40 $0.00

Data Set: MCFRAUD

Now, let $x_i$ represent the overpayment amount for the $i$th claim. If the true mean μ were known, then the total overpayment amount for all 1,000 claims would be equal to

$$\sum_{i=1}^{1,000} x_i = (1,000)\left[\frac{\sum_{i=1}^{1,000} x_i}{1,000}\right] = (1,000)\mu$$

Consequently, to estimate the total overpayment amount for the 1,000 claims, the USDOJ will simply multiply the end points of the interval by 1,000. This yields the 95% confidence interval ($16,510, $27,130).* Typically, the USDOJ is willing to give the Medicare provider in question the benefit of the doubt by demanding a repayment equal to the lower 95% confidence bound—in this case, $16,510.
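The two steps, estimating the mean overpayment and then scaling up to all 1,000 claims, can be sketched from the data in Table SIA5.1. One value here is an assumption rather than something taken from the text: Table 5.3 stops well short of 51 df, so the critical value 2.008 is an approximate figure from a standard t table. Statistical software such as MINITAB computes this value directly.

```python
import math
import statistics

# Overpayment amounts in dollars, copied row by row from Table SIA5.1.
overpayments = [
    0.00, 31.00, 0.00, 37.20, 37.20, 0.00, 43.40, 0.00,
    37.20, 43.40, 0.00, 37.20, 0.00, 24.80, 0.00, 0.00,
    37.20, 0.00, 37.20, 0.00, 37.20, 37.20, 37.20, 0.00,
    37.20, 37.20, 0.00, 0.00, 37.20, 0.00, 43.40, 37.20,
    0.00, 37.20, 0.00, 37.20, 37.20, 0.00, 37.20, 37.20,
    0.00, 37.20, 43.40, 0.00, 37.20, 37.20, 37.20, 0.00,
    43.40, 0.00, 43.40, 0.00,
]
POPULATION_SIZE = 1000
T_025_DF51 = 2.008  # ASSUMPTION: approximate t_.025 for 51 df (not listed in Table 5.3)

n = len(overpayments)
xbar = statistics.mean(overpayments)
s = statistics.stdev(overpayments)
margin = T_025_DF51 * s / math.sqrt(n)

# Interval for the mean overpayment per claim.
mean_lower, mean_upper = xbar - margin, xbar + margin
# Interval for the total: multiply both endpoints by the number of claims.
total_lower = POPULATION_SIZE * mean_lower
total_upper = POPULATION_SIZE * mean_upper

print(f"mean overpayment: ({mean_lower:.2f}, {mean_upper:.2f}) dollars per claim")
print(f"total overpayment: ({total_lower:,.0f}, {total_upper:,.0f}) dollars")
```

The totals printed by this sketch come from unrounded endpoints, so they can differ by a few dollars from the text's figures, which multiply the rounded endpoints 16.51 and 27.13 by 1,000.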

Exercises 5.29–5.51

Understanding the Principles

  1. 5.29 State the two problems (and corresponding solutions) that arise with using a small sample to estimate μ.

  2. 5.30 Compare the shapes of the z- and t-distributions.

  3. 5.31 Explain the differences in the sampling distributions of x¯ for large and small samples under the following assumptions:

    1. The variable of interest, x, is normally distributed.

    2. Nothing is known about the distribution of the variable x.

Applet Exercise 5.3

Use the applet entitled Confidence Intervals for a Mean (the impact of not knowing the standard deviation) to compare proportions of z-intervals and t-intervals that contain the mean for a population that is normally distributed.

    1. Using n=5 and the normal distribution with mean 50 and standard deviation 10, run the applet several times. How do the proportions of z-intervals and t-intervals that contain the mean compare?

    2. Repeat part a first for n=10 and then for n=20. Compare your results with those you obtained in part a.

    3. Describe any patterns you observe between the proportion of z-intervals that contain the mean and the proportion of t-intervals that contain the mean as the sample size increases.

Applet Exercise 5.4

Use the applet entitled Confidence Intervals for a Mean (the impact of not knowing the standard deviation) to compare proportions of z-intervals and t-intervals that contain the mean for a population with a skewed distribution.

    1. Using n=5 and the right-skewed distribution with mean 50 and standard deviation 10, run the applet several times. How do the proportions of z-intervals and t-intervals that contain the mean compare?

    2. Repeat part a first for n=10 and then for n=20. Compare your results with those you obtained in part a.

    3. Describe any patterns you observe between the proportion of z-intervals that contain the mean and the proportion of t-intervals that contain the mean as the sample size increases.

    4. How does the skewness of the underlying distribution affect the proportions of z-intervals and t-intervals that contain the mean?

Learning the Mechanics

  1. 5.32 Suppose you have selected a random sample of n=7 measurements from a normal distribution. Compare the standard normal z-values with the corresponding t-values if you were forming the following confidence intervals:

    1. 80% confidence interval

    2. 90% confidence interval

    3. 95% confidence interval

    4. 98% confidence interval

    5. 99% confidence interval

    6. Use the table values you obtained in parts a–e to sketch the z- and t-distributions. What are the similarities and differences?

  2. 5.33 Let t0 be a specific value of t. Use technology or Table III in Appendix B to find t0 values such that the following statements are true:

    1. P(tt0)=.025, where df=10

    2. P(tt0)=.01, where df=17

    3. P(tt0)=.005, where df=6

    4. P(tt0)=.05, where df=13

  3. 5.34 Let t0 be a particular value of t. Use technology or Table III of Appendix B to find t0 values such that the following statements are true:

    1. $P(-t_0 < t < t_0) = .95$, where df = 16

    2. $P(t \le -t_0 \text{ or } t \ge t_0) = .05$, where df = 16

    3. P(tt0)=.05, where df=16

    4. $P(t \le -t_0 \text{ or } t \ge t_0) = .10$, where df = 12

    5. $P(t \le -t_0 \text{ or } t \ge t_0) = .01$, where df = 8

  4. 5.35 The following random sample was selected from a normal distribution: 4, 6, 3, 5, 9, 3.

    1. Construct a 90% confidence interval for the population mean μ.

    2. Construct a 95% confidence interval for the population mean μ.

    3. Construct a 99% confidence interval for the population mean μ.

    4. Assume that the sample mean x¯ and sample standard deviation s remain exactly the same as those you just calculated, but that they are based on a sample of n=25 observations rather than n=6 observations. Repeat parts a–c. What is the effect of increasing the sample size on the width of the confidence intervals?

  5. L05036 5.36 The following sample of 16 measurements was selected from a population that is approximately normally distributed:

    91 80  99 110  95 106 78 121 106 100
    97 82 100  83 115 104
    1. Construct an 80% confidence interval for the population mean.

    2. Construct a 95% confidence interval for the population mean, and compare the width of this interval with that of part a.

    3. Carefully interpret each of the confidence intervals, and explain why the 80% confidence interval is narrower.

Applying the Concepts—Basic

  1. PAI 5.37 Music performance anxiety. Refer to the British Journal of Music Education (Mar. 2014) study of performance anxiety by music students, Exercise 2.39 (p. 50). Recall that the Performance Anxiety Inventory (PAI) was used to measure music performance anxiety on a scale from 20 to 80 points. The table below gives PAI values for participants in eight different studies.

    54 42 51 39 41 43 55 40

    Source: Patston, T. “Teaching stage fright?—Implications for music educators.” British Journal of Music Education, Vol. 31, No. 1, Mar. 2014 (adapted from Figure 1).

    1. Compute the mean PAI value, x¯, for the sample of 8 studies. (See your answer to Exercise 2.62a.)

    2. Compute the standard deviation of the PAI values, s, for the sample of 8 studies. (See your answer to Exercise 2.87b.)

    3. Use the results, parts a and b, to form a 95% confidence interval for μ, the true mean PAI value for the population of all similar music performance anxiety studies.

    4. For the interval, part c, to be valid, how should the population of PAI values for all music performance anxiety studies be distributed?

    5. If you were to repeatedly sample eight music performance anxiety studies and form a 95% confidence interval for μ for each sample, what proportion of the intervals will actually contain the true value of μ?

  2. 5.38 Giraffes have excellent vision. Due to habitat, giraffes travel in small groups. Hence, they require excellent vision in order to detect predators. The eyesight of giraffes was studied in African Zoology (Oct. 2013). The researchers measured a variety of eye characteristics for 27 giraffes native to Zimbabwe, Africa. One variable measured was eye mass (in grams). The study reported $\bar{x} = 53.4$ g and $s = 8.6$ g.

    1. Use this information to find a 99% confidence interval for the true mean eye mass of giraffes native to Zimbabwe.

    2. Suppose it is known that the mean eye mass of an African water buffalo is μ=31 grams. Is it likely that the true mean eye mass of a giraffe is larger or smaller than this mean? Explain.

    3. Suppose it is known that the mean eye mass of an African elephant is μ=58 grams. Is it likely that the true mean eye mass of a giraffe is larger or smaller than this mean? Explain.

  3. 5.39 Radon exposure in Egyptian tombs. Many ancient Egyptian tombs were cut from limestone rock that contained uranium. Since most tombs are not well-ventilated, guards, tour guides, and visitors may be exposed to deadly radon gas. In Radiation Protection Dosimetry (Dec. 2010), a study of radon exposure in tombs in the Valley of Kings, Luxor, Egypt (recently opened for public tours), was conducted. The radon levels, measured in becquerels per cubic meter (Bq/m³), in the inner chambers of a sample of 12 tombs were determined. For these data, assume that $\bar{x} = 3{,}643$ Bq/m³ and $s = 1{,}187$ Bq/m³. Use this information to estimate, with 95% confidence, the true mean level of radon exposure in tombs in the Valley of Kings. Interpret the resulting interval.

  4. ANTS 5.40 Rainfall and desert ants. Refer to the Journal of Biogeography (Dec. 2003) study of ants and their habitat in the desert of Central Asia, presented in Exercise 2.68 (p. 63). Recall that botanists randomly selected five sites in the Dry Steppe region and six sites in the Gobi Desert where ants were observed. One of the variables of interest is the annual rainfall (in millimeters) at each site. Summary statistics for the annual rainfall at each site are provided in the SAS printout below.

    1. Give a point estimate for the average annual rainfall amount at ant sites in the Dry Steppe region of Central Asia.

    2. Give the t-value used in a small-sample 90% confidence interval for the true average annual rainfall amount at ant sites in the Dry Steppe region.

    3. Use the result you obtained in part b and the values of x¯ and s shown on the SAS printout to form a 90% confidence interval for the target parameter.

    4. Give a practical interpretation for the interval you found in part c.

    5. Use the data in the ANTS file to check the validity of the confidence interval you found in part c.

    6. Repeat parts a–e for the Gobi Desert region of Central Asia.

  5. TRAPS 5.41 Lobster trap placement. An observational study of teams fishing for the red spiny lobster in Baja California Sur, Mexico, was conducted and the results published in Bulletin of Marine Science (Apr. 2010). One of the variables of interest was the average distance separating traps—called trap spacing—deployed by the same team of fishermen. Trap spacing measurements (in meters) for a sample of seven teams of red spiny lobster fishermen are shown in the accompanying table. Of interest is the mean trap spacing for the population of red spiny lobster fishermen fishing in Baja California Sur, Mexico.

    93 99 105 94 82 70 86

    Based on Shester, G. G. “Explaining catch variation among Baja California lobster fishers through spatial analysis of trap-placement decisions.” Bulletin of Marine Science, Vol. 86, No. 2, Apr. 2010 (Table 1), pp. 479 - 498.

    1. Identify the target parameter for this study.

    2. Compute a point estimate of the target parameter.

    3. What is the problem with using the normal (z) statistic to find a confidence interval for the target parameter?

    4. Find a 95% confidence interval for the target parameter.

    5. Give a practical interpretation of the interval, part d.

    6. What conditions must be satisfied for the interval, part d, to be valid?

  6. TURTLES 5.42 Shell lengths of sea turtles. Refer to the Aquatic Biology (Vol. 9, 2010) study of green sea turtles inhabiting the Grand Cayman South Sound lagoon, Exercise 5.24 (p. 264). Time-depth recorders were deployed on 6 of the 76 captured turtles. The time-depth recorders allowed the environmentalists to track the movement of the sea turtles in the lagoon. These 6 turtles had a mean shell length of 52.9 cm with a standard deviation of 6.8 cm.

    1. Use the information on the 6 tracked turtles to estimate, with 99% confidence, the true mean shell length of all green sea turtles in the lagoon. Interpret the result.

    2. What assumption about the distribution of shell lengths must be true in order for the confidence interval, part a, to be valid? Is this assumption reasonably satisfied? (Use the data saved in the TURTLES file to help you answer this question.)

  7. DAYLT 5.43 Duration of daylight in western Pennsylvania. What area of the United States has the least amount of daylight, on average? Having grown up in western Pennsylvania, co-author Sincich wonders if it is his hometown of Sharon, PA. Data on the number of minutes of daylight per day in Sharon, PA, for 12 randomly selected days (one each month) in a recent year were obtained from the Naval Oceanography Portal Web site (aa.usno.navy.mil/USNO/astronomical-applications/data-services). The data are listed in the table. Descriptive statistics and a 95% confidence interval for the mean are produced in the SPSS printout in the next column.

    1. Locate the confidence interval on the printout and give the value of the confidence coefficient.

    2. Use the descriptive statistics on the printout to calculate the 95% confidence interval. Be sure your answer agrees with the interval shown on the printout.

    3. Practically interpret the confidence interval.

    4. Comment on the method of sampling. Do you think the sample is representative of the target population?

      SPSS Output for Exercise 5.43

    591 618 704 831 896 909 865 839 748 672 583 565

    Based on Astronomical Applications Dept., U.S. Naval Observatory, Washington, DC.


Applying the Concepts—Intermediate

  1. ROBOTS 5.44 Do social robots walk or roll? Refer to the International Conference on Social Robotics (Vol. 6414, 2010) study on the current trend in the design of social robots, Exercise 2.7 (p. 38). Recall that in a random sample of social robots obtained through a Web search, 28 were built with wheels. The number of wheels on each of the 28 robots is reproduced in the accompanying table.

    1. Estimate μ, the average number of wheels used on all social robots built with wheels, with 99% confidence.

    2. Practically interpret the interval, part a.

    3. Refer to part a. In repeated sampling, what proportion of all similarly constructed confidence intervals will contain the true mean, μ?

      4 4 3 3 3 6 4 2 2 2 1 3 3 3
      3 4 4 3 2 8 2 2 3 4 3 3 4 2

      Source: Chew, S., et al. “Do social robots walk or roll?” International Conference on Social Robotics, Vol. 6414, 2010 (adapted from Figure 2).

  2. 5.45 Pitch memory of amusiacs. A team of psychologists and neuroscientists tested the pitch memory of individuals diagnosed with amusia (a disorder that impacts one’s perception of music) and reported their results in Advances in Cognitive Psychology (Vol. 6, 2010). Each in a sample of 17 amusiacs listened to a series of tone pairs and then was asked to determine if the tones were the same or different. In one trial, the tones were separated by 1 second. In a second trial, the tones were separated by 5 seconds. Scores in the two trials were compared for each amusiac. The mean score difference was .11 with a standard deviation of .19. Use this information to form a 90% confidence interval for the true mean score difference for all amusiacs. Interpret the result. What assumption about the population of score differences must hold true for the interval to be valid?

  3. SHAFTS 5.46 Shaft graves in ancient Greece. Refer to the American Journal of Archaeology (Jan. 2014) study of shaft graves in ancient Greece, Exercise 2.37 (p. 50). Recall that shaft graves are named for the beautifully decorated sword shafts that are buried along with the bodies. The table on p. 274 gives the number of shafts buried at each of 13 recently discovered grave sites.

    1 2 3 1 5 6 2 4 1 2 4 2 9

    Source: Harrell, K. “The fallen and their swords: A new explanation for the rise of the shaft graves.” American Journal of Archaeology, Vol. 118, No. 1, Jan. 2014 (Figure 1).

    1. Estimate the average number of shafts buried in ancient Greece graves using a 90% confidence interval. Give a practical interpretation of the interval.

    2. What assumption about the data on shaft graves is required for the inference, part a, to be valid?

  4. BUBBLE 5.47 Oxygen bubbles in molten salt. Molten salt is used in an electro-refiner to treat nuclear fuel waste. Eventually, the salt needs to be purified (for reuse) or disposed of. A promising method of purification involves oxidation. Such a method was investigated in Chemical Engineering Research and Design (Mar. 2013). An important aspect of the purification process is the rising velocity of oxygen bubbles in the molten salt. An experiment was conducted in which oxygen was inserted (at a designated sparging rate) into molten salt and photographic images of the bubbles taken. A random sample of 25 images yielded the data on bubble velocity (measured in meters per second) shown in the table. [Note: These data are simulated based on information provided in the article.]

    0.275 0.261 0.209 0.266 0.265 0.312 0.285 0.317 0.229
    0.251 0.256 0.339 0.213 0.178 0.217 0.307 0.264 0.319
    0.298 0.169 0.342 0.270 0.262 0.228 0.220

    1. Use statistical software to find a 95% confidence interval for the mean bubble rising velocity of the population. Interpret the result.

    2. The researchers discovered that the mean bubble rising velocity is μ = .338 when the sparging rate of oxygen is 3.33×10⁻⁶. Do you believe that the data in the table were generated at this sparging rate? Explain.

  5. SKID 5.48 Minimizing tractor skidding distance. In planning for a new forest road to be used for tree harvesting, planners must select the location that will minimize tractor skidding distance. In the Journal of Forest Engineering (July 1999), researchers wanted to estimate the true mean skidding distance along a new road in a European forest. The skidding distances (in meters) were measured at 20 randomly selected road sites. These values are given in the accompanying table.

    488 350 457 199 285 409 435 574 439 546
    385 295 184 261 273 400 311 312 141 425

    Based on Tujek, J., & Pacola, E. “Algorithms for skidding distance modeling on a raster Digital Terrain Model.” Journal of Forest Engineering, Vol. 10, No. 1, July 1999 (Table 1).

    1. Estimate, with a 95% confidence interval, the true mean skidding distance of the road.

    2. Give a practical interpretation of the interval you found in part a.

    3. What conditions are required for the inference you made in part b to be valid? Are these conditions reasonably satisfied?

    4. A logger working on the road claims that the mean skidding distance is at least 425 meters. Do you agree?

  6. 5.49 Reproduction of bacteria-infected spider mites. Zoologists in Japan investigated the reproductive traits of spider mites with a bacterial infection (Heredity, Jan. 2007). Male and female pairs of infected spider mites were mated in a laboratory and the number of eggs produced by each female recorded. Summary statistics for several samples are provided in the accompanying table. Note that, in some samples, one or both infected spider mites were treated with antibiotic prior to mating.

    1. For each type of female–male pair, construct and interpret a 90% confidence interval for the population mean number of eggs produced by the female spider mite.

    2. Identify the type of female–male pair that appears to produce the highest mean number of eggs.

    Female–Male Pairs   Sample Size   Mean # of Eggs   Standard Deviation
    Both untreated      29            20.9             3.34
    Male treated        23            20.3             3.50
    Female treated      18            22.9             4.37
    Both treated        21            18.6             2.11

    Based on Gotoh, T., Noda, H., & Ito, S. “Cardinium symbionts cause cytoplasmic incompatibility in spider mites.” Heredity, Vol. 98, No. 1, Jan. 2007 (Table 2).

  7. 5.50 Antigens for a parasitic roundworm in birds. Ascaridia galli is a parasitic roundworm that attacks the intestines of birds, especially chickens and turkeys. Scientists are working on a synthetic vaccine (antigen) for the parasite. The success of the vaccine hinges on the characteristics of DNA in peptide (protein) produced by the antigen. In the journal Gene Therapy and Molecular Biology (June 2009), scientists tested alleles of antigen-produced protein for level of peptide. For a sample of 4 alleles, the mean peptide score was 1.43 and the standard deviation was .13.

    1. Use this information to construct a 90% confidence interval for the true mean peptide score in alleles of the antigen-produced protein.

    2. Interpret the interval for the scientists.

    3. What is meant by the phrase “90% confidence”?
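For exercises that supply raw data, such as Exercise 5.47, the interval x¯ ± t(s/√n) can be computed in a few lines. The following is a minimal sketch in Python using only the standard library. The function name t_interval and the constant T_CRIT_95_DF24 are illustrative choices, not part of the text, and the critical value 2.064 is assumed to be read from a standard t-table (95% confidence, 24 degrees of freedom) rather than computed by the code.

```python
import math
import statistics

# Bubble rising velocities (meters per second) from the table in Exercise 5.47.
velocities = [
    0.275, 0.261, 0.209, 0.266, 0.265, 0.312, 0.285, 0.317, 0.229,
    0.251, 0.256, 0.339, 0.213, 0.178, 0.217, 0.307, 0.264, 0.319,
    0.298, 0.169, 0.342, 0.270, 0.262, 0.228, 0.220,
]

# ASSUMPTION: t critical value for 95% confidence with n - 1 = 24 degrees
# of freedom, taken from a standard t-table (not computed here).
T_CRIT_95_DF24 = 2.064


def t_interval(data, t_crit):
    """Return (lower, upper) for the interval mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


lower, upper = t_interval(velocities, T_CRIT_95_DF24)
print(f"95% CI for mean bubble velocity: ({lower:.3f}, {upper:.3f})")
```

The same function applies to the other raw-data exercises once the data list and the critical value are swapped for the appropriate confidence level and degrees of freedom. As in the text, the result is valid only if the sampled population is approximately normal.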

Applying the Concepts—Advanced

  1. 5.51 Study on waking sleepers early. Scientists have discovered increased levels of the hormone adrenocorticotropin in people just before they awake from sleeping (Nature, Jan. 7, 1999). In the study described, 15 subjects were monitored during their sleep after being told that they would be woken at a particular time. One hour prior to the designated wake-up time, the adrenocorticotropin level (pg/mL) was measured in each, with the following results:

    x¯ = 37.3, s = 13.9

    1. Use a 95% confidence interval to estimate the true mean adrenocorticotropin level of sleepers one hour prior to waking.

    2. Interpret the interval you found in part a in the words of the problem.

    3. The researchers also found that if the subjects were woken three hours earlier than they anticipated, the average adrenocorticotropin level was 25.5 pg/mL. Assume that μ=25.5 for all sleepers who are woken three hours earlier than expected. Use the interval from part a to make an inference about the mean adrenocorticotropin level of sleepers under two conditions: one hour before the anticipated wake-up time and three hours before the anticipated wake-up time.
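When only summary statistics are reported, as in Exercise 5.51, the interval can be built directly from n, x¯, and s, and a hypothesized mean can then be checked against it. The sketch below is one way to do this in Python. The helper t_interval_from_summary and the variable names are hypothetical, and the critical value 2.145 is assumed to come from a standard t-table (95% confidence, 14 degrees of freedom).

```python
import math


def t_interval_from_summary(n, mean, sd, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin


# Summary statistics reported in Exercise 5.51 (levels in pg/mL).
n, sample_mean, sample_sd = 15, 37.3, 13.9

# ASSUMPTION: t critical value for 95% confidence with n - 1 = 14 degrees
# of freedom, taken from a standard t-table (not computed here).
T_CRIT_95_DF14 = 2.145

lower, upper = t_interval_from_summary(n, sample_mean, sample_sd, T_CRIT_95_DF14)

# Mean level assumed in part c for sleepers woken three hours early.
hypothesized_mean = 25.5
inside = lower <= hypothesized_mean <= upper

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"Is {hypothesized_mean} inside the interval? {inside}")
```

Checking whether a stated value of μ falls inside the interval is the comparison that part c asks for. The practical interpretation of that comparison is left to the reader.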
