5.6 Confidence Interval for a Population Variance (Optional)

In the previous sections, we considered interval estimation for population means or proportions. In this optional section, we discuss a confidence interval for a population variance, $\sigma^2$.

Recall Example 1.4 (p. 10) and the U.S. Army Corps of Engineers study of contaminated fish in the Tennessee River, Alabama. It is important for the Corps of Engineers to know how stable the weights of the contaminated fish are. That is, how large is the variation in the fish weights? The keyword “variation” indicates that the target population parameter is $\sigma^2$, the variance of the weights of all contaminated fish inhabiting the Tennessee River. Of course, the exact value of $\sigma^2$ will be unknown. Consequently, the Corps of Engineers wants to estimate its value with a high level of confidence.

Intuitively, it seems reasonable to use the sample variance, $s^2$, to estimate $\sigma^2$. However, unlike with sample means and proportions, the sampling distribution of $s^2$ does not follow a normal (z) distribution or a Student’s t distribution. Rather, when certain assumptions are satisfied (we discuss these later), the rescaled sample variance $(n-1)s^2/\sigma^2$ has approximately a chi-square ($\chi^2$) distribution. The chi-square probability distribution, like the t distribution, is characterized by a quantity called the degrees of freedom (df) associated with the distribution. Several chi-square distributions with different df values are shown in Figure 5.20. You can see that, unlike the z and t distributions, the chi-square distribution is not symmetric: it takes only nonnegative values and is skewed to the right.

Figure 5.20

Several $\chi^2$ Probability Distributions

Table 5.7 Reproduction of Part of Table IV in Appendix B: Critical Values of Chi Square

Degrees of Freedom  $\chi^2_{.100}$  $\chi^2_{.050}$  $\chi^2_{.025}$  $\chi^2_{.010}$  $\chi^2_{.005}$
 1 2.70554 3.84146 5.02389 6.63490 7.87944
 2 4.60517 5.99147 7.37776 9.21034 10.5966
 3 6.25139 7.81473 9.34840 11.3449 12.8381
 4 7.77944 9.48773 11.1433 13.2767 14.8602
 5 9.23635 11.0705 12.8325 15.0863 16.7496
 6 10.6446 12.5916 14.4494 16.8119 18.5476
 7 12.0170 14.0671 16.0128 18.4753 20.2777
 8 13.3616 15.5073 17.5346 20.0902 21.9550
 9 14.6837 16.9190 19.0228 21.6660 23.5893
10 15.9871 18.3070 20.4831 23.2093 25.1882
11 17.2750 19.6751 21.9200 24.7250 26.7569
12 18.5494 21.0261 23.3367 26.2170 28.2995
13 19.8119 22.3621 24.7356 27.6883 29.8194
14 21.0642 23.6848 26.1190 29.1413 31.3193
15 22.3072 24.9958 27.4884 30.5779 32.8013
16 23.5418 26.2962 28.8454 31.9999 34.2672
17 24.7690 27.5871 30.1910 33.4087 35.7185
18 25.9894 28.8693 31.5264 34.8053 37.1564
19 27.2036 30.1435 32.8523 36.1908 38.5822

The upper-tail areas for this distribution have been tabulated and are given in Table IV in Appendix B, a portion of which is reproduced in Table 5.7. The table gives the values of $\chi^2$, denoted as $\chi^2_{\alpha}$, that locate an area of α in the upper tail of the chi-square distribution; that is, $P(\chi^2 > \chi^2_{\alpha}) = \alpha$. As with the t-statistic, the degrees of freedom associated with $s^2$ are $(n-1)$. Thus, for n = 10 and an upper-tail value α = .05, you will have n − 1 = 9 df and $\chi^2_{.05} = 16.9190$ (highlighted in Table 5.7).

[Note: Values of $\chi^2_{\alpha}$ can also be obtained using the inverse chi-square option of statistical software.]
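As a rough illustration of the note above, the following Python sketch computes chi-square critical values using only the standard library. The helper names `chi2_cdf` and `chi2_upper_critical` are hypothetical and introduced here for illustration; they are not part of MINITAB or any statistical package. The method (a series for the incomplete gamma function combined with bisection) is one simple choice among several, not the algorithm any particular software is known to use.

```python
import math


def chi2_cdf(x, df):
    """Return P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a e^(-z) / Gamma(a) * sum_{k>=0} z^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_upper_critical(alpha, df):
    """Return c such that P(X > c) = alpha, located by bisection on the CDF."""
    target = 1.0 - alpha
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < target:  # widen the bracket until it contains c
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# n = 10 gives 9 df; upper-tail area .05 should be close to the 16.9190 in Table 5.7.
print(round(chi2_upper_critical(0.05, 9), 4))
```

Passing an upper-tail area such as .975 to the same function gives the lower-tail critical values needed later in this section.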

The chi-square distribution is used to find a confidence interval for $\sigma^2$, as shown in the box. An illustrative example follows.

A 100(1 − α)% Confidence Interval for $\sigma^2$

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}$$

where $\chi^2_{\alpha/2}$ and $\chi^2_{(1-\alpha/2)}$ are values corresponding to an area of α/2 in the right (upper) and left (lower) tails, respectively, of the chi-square distribution based on $(n-1)$ degrees of freedom.

Conditions Required for a Valid Confidence Interval for $\sigma^2$

  1. A random sample is selected from the target population.

  2. The population of interest has a relative frequency distribution that is approximately normal.
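The interval in the box can be traced back to the fact that, for a random sample from a normal population, the quantity $(n-1)s^2/\sigma^2$ has a chi-square distribution with $(n-1)$ degrees of freedom. A brief sketch of the algebra, using the same symbols as the box, is:

```latex
\begin{align*}
P\!\left(\chi^2_{(1-\alpha/2)} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}\right) &= 1-\alpha \\[4pt]
P\!\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}\right) &= 1-\alpha
\end{align*}
```

Inverting the inequalities is what places the larger critical value, $\chi^2_{\alpha/2}$, in the denominator of the lower endpoint and the smaller one, $\chi^2_{(1-\alpha/2)}$, in the denominator of the upper endpoint.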

FISHDDT Example 5.11 Estimating the Weight Variance, $\sigma^2$, of Contaminated Fish

Problem

  1. Refer to the U.S. Army Corps of Engineers study of contaminated fish in the Tennessee River. The Corps of Engineers has collected data for a random sample of 144 fish contaminated with DDT. (The engineers made sure to capture contaminated fish in several different randomly selected streams and tributaries of the river.) The fish weights (in grams) are saved in the FISHDDT file. The Army Corps of Engineers wants to estimate the true variation in fish weights in order to determine whether the fish are stable enough to allow further testing for DDT contamination.

    1. Use the sample data to find a 95% confidence interval for the parameter of interest.

    2. Determine whether the confidence interval, part a, is valid.

Solution

  1. Here the target parameter is $\sigma^2$, the variance of the population of weights of contaminated fish. First, we need to find the sample variance, $s^2$, to compute the interval estimate. The MINITAB printout in Figure 5.21 gives descriptive statistics for the sample weights. You can see that the variance (highlighted) is $s^2 = 141{,}787$.

    Figure 5.21

    MINITAB Descriptive Statistics for Fish Weights, Example 5.11

    For α = .05 (a 95% confidence interval), we require the critical values $\chi^2_{\alpha/2} = \chi^2_{.025}$ and $\chi^2_{(1-\alpha/2)} = \chi^2_{.975}$ for a chi-square distribution with $(n-1) = 143$ degrees of freedom. Examining Table IV in Appendix B, we see that these values are given for df = 100 and df = 150, but not for df = 143. We could approximate these critical chi-square values using the entries for df = 150 (the row closest to df = 143). Or we could use statistical software to obtain the exact values. The exact values, obtained using MINITAB (and shown in Figure 5.22), are $\chi^2_{.025} = 178.0$ and $\chi^2_{.975} = 111.79$.

    Figure 5.22

    MINITAB Output with critical $\chi^2$ values, Example 5.11

    Substituting the appropriate values into the formula given in the box, we obtain:

    $$\frac{(143)(141{,}787)}{178} \le \sigma^2 \le \frac{(143)(141{,}787)}{111.79}$$

    Or

    $$113{,}907 \le \sigma^2 \le 181{,}371$$

    Thus, the Army Corps of Engineers can be 95% confident that the variance in weights of the population of contaminated fish ranges between 113,907 and 181,371. [Note: You can obtain this interval directly using statistical software as well. This interval is shown (highlighted) on the MINITAB printout, Figure 5.23. Our calculated values agree, except for rounding.]

    Figure 5.23

    MINITAB Output with 95% Confidence Interval for $\sigma^2$, Example 5.11

  2. According to the box, two conditions are required for the confidence interval to be valid. First, the sample must be randomly selected from the population. The Army Corps of Engineers did, indeed, collect a random sample of contaminated fish, making sure to sample fish from different locations in the Tennessee River. Second, the population data (the fish weights) must be approximately normally distributed. A MINITAB histogram for the sampled fish weights (with a normal curve superimposed) is displayed in Figure 5.24. Clearly, the data appear to be approximately normally distributed. Thus, the confidence interval is valid.

Figure 5.24

MINITAB histogram of fish weights, Example 5.11
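The arithmetic in part a of the solution can be checked with a short Python sketch. The function name `variance_interval` and its parameter names are hypothetical, chosen here for illustration; the inputs are the sample size, sample variance, and MINITAB critical values quoted in the example.

```python
def variance_interval(n, s2, chi2_upper, chi2_lower):
    """Confidence interval for a population variance.

    chi2_upper: critical value with area alpha/2 in the upper tail.
    chi2_lower: critical value with area alpha/2 in the lower tail.
    Both are assumed to be based on n - 1 degrees of freedom.
    """
    df = n - 1
    # The larger critical value yields the lower endpoint, and vice versa.
    return df * s2 / chi2_upper, df * s2 / chi2_lower


# Values from Example 5.11: n = 144, s^2 = 141,787, critical values 178.0 and 111.79.
lower, upper = variance_interval(n=144, s2=141_787, chi2_upper=178.0, chi2_lower=111.79)
print(f"{lower:.1f} <= sigma^2 <= {upper:.1f}")
```

The endpoints should agree with the interval in the text to within rounding of the critical values.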

Look Ahead

Will this confidence interval be practically useful in helping the Corps of Engineers decide whether the weights of the fish are stable? Only if it is clear what a weight variance of, say, 150,000 grams² implies. Most likely, the Corps of Engineers will want the interval in the same units as the weight measurement (grams). Consequently, a confidence interval for σ, the standard deviation of the population of fish weights, is desired. We demonstrate how to obtain this interval estimate in the next example.

Now Work Exercise 5.107a

FISHDDT Example 5.12 Estimating the Weight Standard Deviation, σ, of Contaminated Fish

Problem

  1. Refer to Example 5.11. Find a 95% confidence interval for σ, the true standard deviation of the contaminated fish weights.

Solution

  1. A confidence interval for σ is obtained by taking the square roots of the lower and upper endpoints of a confidence interval for $\sigma^2$. Consequently, the 95% confidence interval for σ is:

    $$\sqrt{113{,}907} \le \sigma \le \sqrt{181{,}371}$$

    Or,

    $$337.5 \le \sigma \le 425.9$$

    [Note: This interval is also shown on the MINITAB printout, Figure 5.23.]

    Thus, the engineers can be 95% confident that the true standard deviation of fish weights is between 337.5 grams and 425.9 grams.
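The square-root step in this solution can be sketched in Python as follows. The function name `sd_interval` is hypothetical and used only for illustration; the inputs are the variance endpoints from Example 5.11.

```python
import math


def sd_interval(var_lower, var_upper):
    """Convert a confidence interval for a variance into one for a standard deviation."""
    if not 0 <= var_lower <= var_upper:
        raise ValueError("endpoints must satisfy 0 <= var_lower <= var_upper")
    # The square root is increasing on nonnegative values, so endpoint order is preserved.
    return math.sqrt(var_lower), math.sqrt(var_upper)


sd_lower, sd_upper = sd_interval(113_907, 181_371)
print(round(sd_lower, 1), round(sd_upper, 1))  # 337.5 425.9
```

Because the square root is an increasing function, the confidence level of the interval is unchanged by the transformation.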

Look Back

Suppose the Corps of Engineers’ threshold is σ = 500 grams. That is, if the standard deviation in fish weights is 500 grams or higher, further DDT contamination tests will be suspended due to the instability of the fish weights. Since the 95% confidence interval for σ lies entirely below 500 grams, the engineers will continue the DDT contamination tests on the fish.

Now Work Exercise 5.107b

Caution

The procedure for estimating either $\sigma^2$ or σ requires an assumption regardless of whether the sample size n is large or small (see the conditions in the box). The sampled data must come from a population that has an approximately normal distribution. Unlike small-sample confidence intervals for μ based on the t-distribution, which tolerate such departures, the chi-square confidence interval for $\sigma^2$ is rendered invalid by even slight to moderate departures from normality.
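The caution above can be explored with a small simulation. The Python sketch below is an illustration under stated assumptions, not an analysis from the text: it uses n = 10, a standard normal population, and an exponential population with rate 1 (a strong departure from normality, chosen only to make the effect easy to see). The function name `coverage` is hypothetical. The critical value 19.0228 is taken from Table 5.7; the value 2.70039 for upper-tail area .975 with 9 df is not in the reproduced portion of the table and is taken from a standard chi-square table.

```python
import random
import statistics

# Critical values for 9 df (n = 10) and a 95% interval.
CHI2_025 = 19.0228  # upper-tail area .025 (Table 5.7)
CHI2_975 = 2.70039  # upper-tail area .975 (assumed from a standard chi-square table)


def coverage(draw, true_var, n=10, reps=4000, seed=1):
    """Estimate how often the chi-square interval captures the true variance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance with divisor n - 1
        lower = (n - 1) * s2 / CHI2_025
        upper = (n - 1) * s2 / CHI2_975
        hits += lower <= true_var <= upper
    return hits / reps


# Normal population with variance 1: the normality condition holds.
normal_cov = coverage(lambda rng: rng.gauss(0.0, 1.0), true_var=1.0)
# Exponential population with rate 1 (variance 1): the condition is violated.
skewed_cov = coverage(lambda rng: rng.expovariate(1.0), true_var=1.0)

print(f"normal population:      {normal_cov:.3f}")
print(f"exponential population: {skewed_cov:.3f}")
```

In simulations of this kind, the estimated coverage for the normal population should land near the nominal .95, while the coverage for the skewed population should fall noticeably short of it.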

Exercises 5.99–5.117

Understanding the Principles

  1. 5.99 What sampling distribution is used to find an interval estimate for $\sigma^2$?

  2. 5.100 What conditions are required for a valid confidence interval for $\sigma^2$?

  3. 5.101 How many degrees of freedom are associated with a chi-square sampling distribution for a sample of size n?

Learning the Mechanics

  1. 5.102 For each of the following combinations of α and degrees of freedom (df), use either Table IV in Appendix B or statistical software to find the values of $\chi^2_{\alpha/2}$ and $\chi^2_{(1-\alpha/2)}$ that would be used to form a confidence interval for $\sigma^2$.

    1. α = .05, df = 7

    2. α = .10, df = 16

    3. α = .01, df = 20

    4. α = .05, df = 20

  2. 5.103 Given the following values of $\bar{x}$, s, and n, form a 90% confidence interval for $\sigma^2$.

    1. $\bar{x}$ = 21, s = 2.5, n = 50

    2. $\bar{x}$ = 1.3, s = .02, n = 15

    3. $\bar{x}$ = 167, s = 31.6, n = 22

    4. $\bar{x}$ = 9.4, s = 1.5, n = 5

  3. 5.104 Refer to Exercise 5.103. For each part, a–d, form a 90% confidence interval for σ.

  4. L05105 5.105 A random sample of n = 6 observations from a normal distribution resulted in the data shown in the table. Compute a 95% confidence interval for $\sigma^2$.

    8 2 3 7 11 6

Applying the Concepts—Basics

  1. SUSTAIN 5.106 Corporate sustainability of CPA firms. Refer to the Business and Society (Mar. 2011) study on the sustainability behaviors of CPA corporations, Exercise 5.18 (p. 262). Recall that the level of support for corporate sustainability (measured on a quantitative scale ranging from 0 to 160 points) was obtained for each in a sample of 992 senior managers at CPA firms. The accompanying MINITAB printout gives 90% confidence intervals for both the variance and standard deviation of level of support for all senior managers at CPA firms.

    1. Locate the 90% confidence interval for $\sigma^2$ on the printout.

    2. Use the sample variance on the printout to calculate the 90% confidence interval for $\sigma^2$. Does your result agree with the interval shown on the printout?

    3. Locate the 90% confidence interval for σ on the printout.

    4. Use the result, part a, to calculate the 90% confidence interval for σ. Does your result agree with the interval shown on the printout?

    5. Give a practical interpretation of the 90% confidence interval for σ.

    6. What assumption about the distribution of level of support is required for the inference, part e, to be valid? Is this assumption reasonably satisfied?

  2. ROCKS 5.107 Characteristics of a rockfall. Refer to the Environmental Geology (Vol. 58, 2009) simulation study of how far a block from a collapsing rock wall will bounce down a soil slope, Exercise 2.61 (p. 61). Rebound lengths (in meters) were estimated for 13 rock bounces. The data are repeated in the table. A MINITAB analysis of the data is shown in the printout below.

    10.94 13.71 11.38 7.26 17.83 11.92 11.87
    5.44 13.35 4.90 5.85 5.10 6.77

    Based on Paronuzzi, P. “Rockfall-induced block propagation on a soil slope, northern Italy.” Environmental Geology, Vol. 58, 2009 (Table 2).

    1. Locate a 95% confidence interval for $\sigma^2$ on the printout. Interpret the result.

    2. Locate a 95% confidence interval for σ on the printout. Interpret the result.

    3. What conditions are required for the intervals, parts a and b, to be valid?

  3. 5.108 Motivation of drug dealers. Refer to the Applied Psychology in Criminal Justice (Sept. 2009) study of the personality characteristics of convicted drug dealers, Exercise 5.17 (p. 262). A random sample of 100 drug dealers had a mean Wanting Recognition (WR) score of 39 points, with a standard deviation of 6 points. The researchers also are interested in $\sigma^2$, the variation in WR scores for all convicted drug dealers.

    1. Identify the target parameter, in symbols and words.

    2. Compute a 99% confidence interval for $\sigma^2$.

    3. What does it mean to say that the target parameter lies within the interval with “99% confidence”?

    4. What assumption about the data must be satisfied in order for the confidence interval to be valid?

    5. To obtain a practical interpretation of the interval, part b, explain why a confidence interval for the standard deviation, σ, is desired.

    6. Use the results, part b, to compute a 99% confidence interval for σ. Give a practical interpretation of the interval.

  4. 5.109 Facial structure of CEOs. Refer to the Psychological Science (Vol. 22, 2011) study of a chief executive officer’s facial structure, Exercise 5.21 (p. 263). Recall that the facial width-to-height ratio (WHR) was determined by computer analysis for each in a sample of 55 CEOs at publicly traded Fortune 500 firms, with the following results: $\bar{x}$ = 1.96, s = .15.

    1. Find a 95% confidence interval for the standard deviation, σ, of the facial WHR values for all CEOs at publicly traded Fortune 500 firms. Interpret the result.

    2. For the interval, part a, to be valid, the population of WHR values should be distributed how? Draw a sketch of the required distribution to support your answer.

  5. 5.110 Antigens for a parasitic roundworm in birds. Refer to the Gene Therapy and Molecular Biology (June 2009) study of DNA in peptide (protein) produced by antigens for a parasitic roundworm in birds, Exercise 5.50 (p. 274). Recall that scientists tested each in a sample of 4 alleles of antigen-produced protein for level of peptide. The results were: $\bar{x}$ = 1.43 and s = .13. Use this information to construct a 90% confidence interval for the true variation in peptide scores for alleles of the antigen-produced protein. Interpret the interval for the scientists.

  6. 5.111 Oil content of fried sweet potato chips. The characteristics of sweet potato chips fried at different temperatures were investigated in the Journal of Food Engineering (Sept. 2013). A sample of 6 sweet potato slices were fried at 130° using a vacuum fryer. One characteristic of interest to the researchers was internal oil content (measured in grams per gram, g/g). The results were: $\bar{x}$ = .178 g/g and s = .011 g/g. Use this information to construct a 95% confidence interval for the true standard deviation of the internal oil content distribution for the sweet potato chips. Interpret the result practically.

Applying the Concepts—Intermediate

  1. 5.112 Radon exposure in Egyptian tombs. Refer to the Radiation Protection Dosimetry (Dec. 2010) study of radon exposure in tombs carved from limestone in the Egyptian Valley of Kings, Exercise 5.39 (p. 272). The radon levels in the inner chambers of a sample of 12 tombs were determined, yielding the following summary statistics: $\bar{x}$ = 3,643 Bq/m³ and s = 4,487 Bq/m³. Use this information to estimate, with 95% confidence, the true standard deviation of radon levels in tombs in the Valley of Kings. Interpret the resulting interval. Be sure to give the units of measurement in your interpretation.

  2. MOLARS 5.113 Cheek teeth of extinct primates. Refer to the American Journal of Physical Anthropology (Vol. 142, 2010) study of the characteristics of cheek teeth (e.g., molars) in an extinct primate species, Exercise 2.38 (p. 50). Recall that the researchers recorded the dentary depth of molars (in millimeters) for a sample of 18 cheek teeth extracted from skulls. The data are repeated in the table. Estimate the true standard deviation in molar depths for the population of cheek teeth in extinct primates using a 95% confidence interval. Give a practical interpretation of the result. Are the conditions required for a valid confidence interval reasonably satisfied?

    18.12 16.55
    19.48 15.70
    19.36 17.83
    15.94 13.25
    15.83 16.12
    19.70 18.13
    15.76 14.02
    17.00 14.04
    13.96 16.20

    Based on Boyer, D. M., Evans, A. R., and Jernvall, J. “Evidence of dietary differentiation among Late Paleocene–Early Eocene Plesiadapids (Mammalia, Primates).” American Journal of Physical Anthropology, Vol. 142, 2010 (Table A3).

  3. TURTLES 5.114 Shell lengths of sea turtles. Refer to the Aquatic Biology (Vol. 9, 2010) study of green sea turtles inhabiting the Grand Cayman South Sound lagoon, Exercise 5.24 (p. 264). Recall that the data on shell length, measured in centimeters, for 76 captured turtles are saved in the TURTLES file. Use the sample data to estimate the true variance in shell lengths of all green sea turtles in the lagoon with 90% confidence. Interpret the result.

  4. TRAPS 5.115 Lobster trap placement. Refer to the Bulletin of Marine Science (Apr. 2010) study of red spiny lobster trap placement, Exercise 5.41 (p. 273). Trap spacing measurements (in meters) for a sample of seven teams of red spiny lobster fishermen are repeated in the table below. The researchers want to know how variable the trap spacing measurements are for the population of red spiny lobster fishermen fishing in Baja California Sur, Mexico. Provide the researchers with an estimate of the target parameter using a 99% confidence interval.

    93 99 105 94 82 70 86

    Based on Shester, G. G. “Explaining catch variation among Baja California lobster fishers through spatial analysis of trap-placement decisions.” Bulletin of Marine Science, Vol. 86, No. 2, Apr. 2010 (Table 1), pp. 479–498.

  5. COUGH 5.116 Is honey a cough remedy? Refer to the Archives of Pediatrics and Adolescent Medicine (Dec. 2007) study of honey as a remedy for coughing, Exercise 2.40 (p. 51). Recall that the 105 ill children in the sample were randomly divided into groups. One group received a dosage of an over-the-counter cough medicine (DM); another group received a dosage of honey (H). The coughing improvement scores (as determined by the children’s parents) for the patients in the two groups are reproduced in the table on p. 295. The pediatric researchers desire information on the variation in coughing improvement scores for each of the two groups.

    1. Find a 90% confidence interval for the standard deviation in improvement scores for the honey dosage group.

    2. Repeat part a for the DM dosage group.

    3. Based on the results, parts a and b, what conclusions can the pediatric researchers draw about which group has the smaller variation in improvement scores? (We demonstrate a more statistically valid method for comparing variances in Chapter 9.)

    Honey Dosage: 12 11 15 11 10 13 10 4 15 16 9 14 10 6 10 8 11 12 12 8 12 9 11 15 10 15 9 13 8 12 10 8 9 5 12
    DM Dosage: 4 6 9 4 7 7 7 9 12 10 11 6 3 4 9 12 7 6 8 12 12 4 12 13 7 10 13 9 4 4 10 15 9

    Based on Paul, I. M., et al. “Effect of honey, dextromethorphan, and no treatment on nocturnal cough and sleep quality for coughing children and their parents.” Archives of Pediatrics and Adolescent Medicine, Vol. 161, No. 12, Dec. 2007 (data simulated).

  6. PHISH 5.117 Phishing attacks to e-mail accounts. Refer to the Chance (Summer 2007) study of an actual phishing attack against an organization, Exercise 4.164 (p. 236). Recall that phishing describes an attempt to extract personal/financial information from unsuspecting people through fraudulent e-mail. The interarrival times (in seconds) for 267 fraud box e-mail notifications are saved in the accompanying file. As in Exercise 4.164, consider these interarrival times to represent the population of interest.

    1. Obtain a random sample of n=10 interarrival times from the population.

    2. Use the sample, part a, to obtain an interval estimate of the population variance of the interarrival times. What is the measure of reliability for your estimate?

    3. Find the true population variance for the data. Does the interval, part b, contain the true variance? Give one reason why it may not.
