5.2 Confidence Interval for a Population Mean: Normal (z) Statistic

Suppose a large hospital wants to estimate the average length of time patients remain in the hospital. Hence, the hospital’s target parameter is the population mean μ. To accomplish this objective, the hospital administrators plan to randomly sample 100 of all previous patients’ records and to use the sample mean x¯ of the lengths of stay to estimate μ, the mean of all patients’ visits. The sample mean x¯ represents a point estimator of the population mean μ. How can we assess the accuracy of this large-sample point estimator?

According to the Central Limit Theorem, the sampling distribution of the sample mean is approximately normal for large samples, as shown in Figure 5.1. Let us calculate the interval estimator:

Figure 5.1

Sampling distribution of x¯

$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$

That is, we form an interval from 1.96 standard deviations below the sample mean to 1.96 standard deviations above the mean. Prior to drawing the sample, what are the chances that this interval will enclose μ, the population mean?

To answer this question, refer to Figure 5.1. If the 100 measurements yield a value of x¯ that falls between the two lines on either side of μ (i.e., within 1.96 standard deviations of μ), then the interval x¯±1.96σx¯ will contain μ; if x¯ falls outside these boundaries, the interval x¯±1.96σx¯ will not contain μ. From Section 4.9, we know that the area under the normal curve (the sampling distribution of x¯) between these boundaries is approximately .95. Thus, the interval x¯±1.96σx¯ will contain μ with a probability approximately equal to .95.
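The .95 figure can be checked numerically. The short sketch below uses Python's standard-library NormalDist class, a tool chosen here only for illustration, to compute the area under the standard normal curve between -1.96 and 1.96.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
z = NormalDist()

# Area under the curve within 1.96 standard deviations of the mean.
area = z.cdf(1.96) - z.cdf(-1.96)
print(f"P(-1.96 < z < 1.96) = {area:.4f}")
```

The printed area agrees with .95 to four decimal places, which is why 1.96 is the multiplier attached to a confidence coefficient of .95.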

Example 5.1 Estimating the Mean Hospital Length of Stay, σ Known

Problem

  1. Consider the large hospital that wants to estimate the average length of stay of its patients, μ. The hospital randomly samples n=100 of its patients and finds that the sample mean length of stay is x¯=4.5 days. Also, suppose it is known that the standard deviation of the length of stay for all hospital patients is σ=4 days. Use the interval estimator x¯±1.96σx¯ to calculate a confidence interval for the target parameter, μ.

Solution

  1. Substituting x¯=4.5 and σ=4 into the interval estimator formula, we obtain:

    $$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm (1.96)\frac{\sigma}{\sqrt{n}} = 4.5 \pm (1.96)\left(\frac{4}{\sqrt{100}}\right) = 4.5 \pm .78$$

    Or, (3.72, 5.28). We can also obtain this confidence interval using statistical software, as shown (highlighted) on the MINITAB printout, Figure 5.2.

    Figure 5.2

    MINITAB Output Showing 95% Confidence Interval for μ, σ Known

Look Back

Since we know the probability that the interval x¯±1.96σx¯ will contain μ is .95, we call the interval estimator a 95% confidence interval for μ.

Now Work Exercise 5.9a
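The arithmetic of Example 5.1 can be reproduced in a few lines. This is a minimal sketch; the function name z_interval_95 is hypothetical and simply packages the formula x¯ ± 1.96σ/√n.

```python
from math import sqrt

def z_interval_95(xbar, sigma, n):
    """Return (lower, upper) for xbar +/- 1.96 * sigma / sqrt(n)."""
    half_width = 1.96 * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from Example 5.1: xbar = 4.5 days, sigma = 4 days, n = 100.
lower, upper = z_interval_95(xbar=4.5, sigma=4, n=100)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```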

The interval x¯±1.96σx¯ in Example 5.1 is called a large-sample 95% confidence interval for the population mean μ. The term large-sample refers to the sample being of sufficiently large size that we can apply the Central Limit Theorem and the normal (z) statistic to determine the form of the sampling distribution of x¯. Empirical research suggests that a sample size n exceeding a value between 20 and 30 will usually yield a sampling distribution of x¯ that is approximately normal. This result led many practitioners to adopt the rule of thumb that a sample size of n ≥ 30 is required to use large-sample confidence interval procedures. Keep in mind, though, that 30 is not a magical number and, in fact, is quite arbitrary.
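The effect of sample size on the shape of the sampling distribution can be explored by simulation. The sketch below assumes a right-skewed (exponential) population, chosen purely for illustration, and measures the skewness of many simulated sample means for several values of n. The function names, the number of repetitions, and the seed are all hypothetical choices.

```python
import random
from math import sqrt

def skewness(values):
    """Moment-based skewness: average cubed standardized deviation."""
    m = sum(values) / len(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def sample_mean_skewness(n, reps=5000, seed=1):
    """Skewness of `reps` sample means, each from n exponential draws."""
    rng = random.Random(seed)
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n
             for _ in range(reps)]
    return skewness(means)

for n in (2, 10, 30, 100):
    print(f"n = {n:3d}: skewness of sample means = {sample_mean_skewness(n):.2f}")
```

A skewness near 0 indicates a roughly symmetric, bell-like sampling distribution. In runs of this kind the skewness shrinks steadily as n grows, with no abrupt change at n = 30.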

Also, note that the large-sample interval estimator requires knowing the value of the population standard deviation, σ. In most (if not nearly all) practical applications, however, the value of σ will be unknown. For large samples, the fact that σ is unknown poses only a minor problem since the sample standard deviation s provides a very good approximation to σ. The next example illustrates the more realistic large-sample confidence interval procedure.

HOSPLOS Example 5.2 Estimating the Mean Hospital Length of Stay, σ Unknown

Problem

  1. Refer to Example 5.1 and the problem of estimating μ, the average length of stay of a hospital’s patients. The lengths of stay (in days) for the n=100 sampled patients are shown in Table 5.1. Use the data to find a 95% confidence interval for μ and interpret the result.

    Table 5.1 Lengths of Stay (in Days) for 100 Patients

    2 3 8 6 4 4 6 4 2 5
    8 10 4 4 4 2 1 3 2 10
    1 3 2 3 4 3 5 2 4 1
    2 9 1 7 17 9 9 9 4 4
    1 1 1 3 1 6 3 3 2 5
    1 3 3 14 2 3 9 6 6 3
    5 1 4 6 11 22 1 9 6 5
    2 2 5 4 3 6 1 5 1 6
    17 1 2 4 5 4 4 3 2 3
    3 5 2 3 3 2 10 2 4 2

    Data Set: HOSPLOS

Solution

  1. The hospital almost surely does not know the true standard deviation, σ, of the population of lengths of stay. However, since the sample size is large, we will use the sample standard deviation, s, as an estimate for σ in the confidence interval formula. An SAS printout of summary statistics for the sample of 100 lengths of stay is shown at the top of Figure 5.3. From the shaded portion of the printout, we find x¯=4.53 days and s=3.68 days. Substituting these values into the interval estimator formula, we obtain:

    $$\bar{x} \pm (1.96)\frac{\sigma}{\sqrt{n}} \approx \bar{x} \pm (1.96)\frac{s}{\sqrt{n}} = 4.53 \pm (1.96)\frac{3.68}{\sqrt{100}} = 4.53 \pm .72$$

    Or, (3.81, 5.25). That is, we estimate the mean length of stay for all hospital patients to fall within the interval 3.81 to 5.25 days.

    Figure 5.3

    SAS printout with summary statistics and 95% confidence interval for data on 100 hospital stays

Look Back

The confidence interval is also shown at the bottom of the SAS printout, Figure 5.3. Note that the endpoints of the interval vary slightly from those computed in the example. This is due to the fact that when σ is unknown and n is large, the sampling distribution of x¯ will deviate slightly from the normal (z) distribution. In practice, these differences can be ignored.

Now Work Exercise 5.15
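The summary statistics and interval of Example 5.2 can also be computed directly from the data in Table 5.1. The sketch below uses Python's standard library rather than SAS; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Lengths of stay (in days) for the 100 sampled patients, from Table 5.1.
los = [
    2, 3, 8, 6, 4, 4, 6, 4, 2, 5,
    8, 10, 4, 4, 4, 2, 1, 3, 2, 10,
    1, 3, 2, 3, 4, 3, 5, 2, 4, 1,
    2, 9, 1, 7, 17, 9, 9, 9, 4, 4,
    1, 1, 1, 3, 1, 6, 3, 3, 2, 5,
    1, 3, 3, 14, 2, 3, 9, 6, 6, 3,
    5, 1, 4, 6, 11, 22, 1, 9, 6, 5,
    2, 2, 5, 4, 3, 6, 1, 5, 1, 6,
    17, 1, 2, 4, 5, 4, 4, 3, 2, 3,
    3, 5, 2, 3, 3, 2, 10, 2, 4, 2,
]

n = len(los)
xbar = mean(los)
s = stdev(los)  # sample standard deviation (divisor n - 1)

# Large-sample 95% interval, with s standing in for the unknown sigma.
half_width = 1.96 * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```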

Can we be sure that μ, the true mean, is in the interval from 3.81 to 5.25 in Example 5.2? We cannot be certain, but we can be reasonably confident that it is. This confidence is derived from the knowledge that if we were to draw repeated random samples of 100 measurements from this population and form the interval x¯±1.96σx¯ each time, 95% of the intervals would contain μ. We have no way of knowing (without looking at all the patients’ records) whether our sample interval is one of the 95% that contain μ or one of the 5% that do not, but the odds certainly favor its containing μ. The probability, .95, that measures the confidence we can place in the interval estimate is called a confidence coefficient. The percentage, 95%, is called the confidence level for the interval estimate.

The confidence coefficient is the probability that an interval estimator encloses the population parameter—that is, the relative frequency with which the interval estimator encloses the population parameter when the estimator is used repeatedly a very large number of times. The confidence level is the confidence coefficient expressed as a percentage.

Now we have seen how an interval can be used to estimate a population mean. When we use an interval estimator, we can usually calculate the probability that the estimation process will result in an interval that contains the true value of the population mean. That is, the probability that the interval contains the parameter in repeated usage is usually known. Figure 5.4 shows what happens when 10 different samples are drawn from a population and a confidence interval for μ is calculated from each. The location of μ is indicated by the vertical line in the figure. Ten confidence intervals, each based on one of 10 samples, are shown as horizontal line segments. Note that the confidence intervals move from sample to sample, sometimes containing μ and other times missing μ. If our confidence level is 95%, then in the long run, 95% of our sample confidence intervals will contain μ.

Figure 5.4

Confidence intervals for μ: 10 samples
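The long-run behavior described here can be imitated with a small simulation. The sketch below assumes a normal population with μ = 4.5 and σ = 4 (hypothetical values borrowed from Example 5.1 for illustration only), draws many samples of n = 100, and counts how often the interval x¯ ± 1.96σ/√n contains μ. The function name, trial count, and seed are arbitrary choices.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, trials, z=1.96, seed=1):
    """Fraction of intervals xbar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rate = coverage(mu=4.5, sigma=4, n=100, trials=4000)
print(f"Proportion of intervals containing mu: {rate:.3f}")
```

The proportion should settle close to .95 as the number of trials grows, even though any single interval either contains μ or does not.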

Suppose you wish to choose a confidence coefficient other than .95. Notice in Figure 5.1 that the confidence coefficient .95 is equal to the total area under the sampling distribution, less .05 of the area, which is divided equally between the two tails. Using this idea, we can construct a confidence interval with any desired confidence coefficient by increasing or decreasing the area (call it α) assigned to the tails of the sampling distribution. (See Figure 5.5.) For example, if we place the area α/2 in each tail and if zα/2 is the z value such that α/2 will lie to its right, then the confidence interval with confidence coefficient (1 − α) is

Figure 5.5

Locating zα/2 on the standard normal curve

$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$$

Biography Jerzy Neyman (1894–1981)

Speaking Statistics with a Polish Accent

Polish-born Jerzy Neyman was educated at the University of Kharkov (Russia) in elementary mathematics, but taught himself graduate mathematics by studying journal articles on the subject. After receiving his doctorate in 1924 from the University of Warsaw (Poland), Neyman accepted a position at University College (London). There, he developed a friendship with Egon Pearson; Neyman and Pearson together developed the theory of hypothesis testing (Chapter 8). In a 1934 talk to the Royal Statistical Society, Neyman first proposed the idea of interval estimation, which he called “confidence intervals.” (It is interesting that Neyman rarely receives credit in textbooks as the originator of the confidence interval procedure.) In 1938, he emigrated to the United States and went to the University of California at Berkeley, where he built one of the strongest statistics departments in the country. Jerzy Neyman is considered one of the great founders of modern statistics. He was a superb teacher and innovative researcher who loved his students, always sharing his ideas with them. Neyman’s influence on those he met is best expressed by a quote from prominent statistician David Salsburg: “We have all learned to speak statistics with a Polish accent.”

The value zα is defined as the value of the standard normal random variable z such that the area α will lie to its right. In other words, P(z>zα)=α.

To illustrate, for a confidence coefficient of .90, we have (1 − α) = .90, α = .10, and α/2 = .05; z.05 is the z value that locates area .05 in the upper tail of the sampling distribution. Recall that Table II in Appendix B gives the areas between the mean and a specified z-value. Since the total area to the right of the mean is .5, we find that z.05 will be the z value corresponding to an area of .5 − .05 = .45 to the right of the mean. (See Figure 5.6.) This z value is z.05=1.645. The same result may also be obtained using technology. The MINITAB printout in Figure 5.7 shows that the z-value that cuts off an upper tail area of .05 is approximately z.05=1.645.

Figure 5.6

The z value (z.05) corresponding to an area equal to .05 in the upper tail of the z-distribution

Figure 5.7

MINITAB Output for Finding z.05

Confidence coefficients used in practice usually range from .90 to .99. The most commonly used confidence coefficients with corresponding values of α and zα/2 are shown in Table 5.2.

Table 5.2 Commonly Used Values of zα/2

Confidence Level
100(1 − α)%     α      α/2     zα/2
90%             .10    .05     1.645
95%             .05    .025    1.960
98%             .02    .01     2.326
99%             .01    .005    2.575
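The z values in Table 5.2 can be recovered from the inverse of the standard normal cumulative distribution function. The sketch below uses Python's standard-library NormalDist as one possible tool; the helper name z_upper is hypothetical.

```python
from statistics import NormalDist

def z_upper(tail_area):
    """z-value with the given area to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.98, 0.99):
    alpha = 1 - level
    print(f"{level:.0%}  alpha = {alpha:.2f}  alpha/2 = {alpha / 2:.3f}  "
          f"z = {z_upper(alpha / 2):.3f}")
```

Software reports the 99% value as 2.576 when rounded to three decimals, whereas the table lists 2.575; the discrepancy comes from rounding and has no practical effect on the resulting intervals.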

Now Work Exercise 5.7

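Large-Sample 100(1 − α)% Confidence Interval for μ, Based on a Normal (z) Statistic

$$\sigma \text{ known:}\quad \bar{x} \pm (z_{\alpha/2})\,\sigma_{\bar{x}} = \bar{x} \pm (z_{\alpha/2})\left(\frac{\sigma}{\sqrt{n}}\right)$$

$$\sigma \text{ unknown:}\quad \bar{x} \pm (z_{\alpha/2})\,\sigma_{\bar{x}} \approx \bar{x} \pm (z_{\alpha/2})\left(\frac{s}{\sqrt{n}}\right)$$

where zα/2 is the z-value corresponding to an area α/2 in the tail of a standard normal distribution (see Figure 5.5), σx¯ is the standard deviation of the sampling distribution of x¯, σ is the standard deviation of the population, and s is the standard deviation of the sample.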

Conditions Required for a Valid Large-Sample Confidence Interval for μ

  1. A random sample is selected from the target population.

  2. The sample size n is large (i.e., n ≥ 30). (Due to the Central Limit Theorem, this condition guarantees that the sampling distribution of x¯ is approximately normal. Also, for large n, s will be a good estimator of σ.)

NOSHOW Example 5.3 A Large-Sample Confidence Interval for μ—Mean Number of Unoccupied Seats per Flight 

Problem

  1. Unoccupied seats on flights cause airlines to lose revenue. Suppose a large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected, and the number of unoccupied seats is noted for each of the sampled flights. (The data are saved in the NOSHOW file.) Descriptive statistics for the data are displayed in the MINITAB printout of Figure 5.8.

    Estimate μ, the mean number of unoccupied seats per flight during the past year, using a 90% confidence interval.

Solution

  1. The form of a large-sample 90% confidence interval for a population mean (based on the z-statistic) is:

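    $$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{.05}\,\sigma_{\bar{x}} = \bar{x} \pm 1.645\left(\frac{\sigma}{\sqrt{n}}\right)$$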

    From Figure 5.8, we find (after rounding) that x¯=11.6. Since we do not know the value of σ (the standard deviation of the number of unoccupied seats per flight for all flights of the year), we use our best approximation—the sample standard deviation, s=4.1, shown on the MINITAB printout. Then the 90% confidence interval is approximately

    $$11.6 \pm 1.645\left(\frac{4.1}{\sqrt{225}}\right) = 11.6 \pm .45$$

    or from 11.15 to 12.05. That is, at the 90% confidence level, we estimate the mean number of unoccupied seats per flight to be between 11.15 and 12.05 during the sampled year. This result is verified (except for rounding) on the right side of the MINITAB printout in Figure 5.8.

    Figure 5.8

    MINITAB printout with descriptive statistics and 90% confidence interval for Example 5.3

Look Back

We stress that the confidence level for this example, 90%, refers to the procedure used. If we were to apply that procedure repeatedly to different samples, approximately 90% of the intervals would contain μ. Although we do not know for sure whether this particular interval (11.15, 12.05) is one of the 90% that contain μ or one of the 10% that do not, our knowledge of probability gives us “confidence” that the interval contains μ.

Now Work Exercise 5.16a
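The calculation in Example 5.3 generalizes to any confidence level. The following sketch wraps the large-sample formula in a function; the name large_sample_ci and its parameters are hypothetical, and the sample standard deviation is passed in place of the unknown σ, as in the example.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(xbar, sd, n, level):
    """Large-sample z interval: xbar +/- z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sd / sqrt(n)
    return xbar - half_width, xbar + half_width

# Example 5.3: xbar = 11.6 unoccupied seats, s = 4.1, n = 225 flights.
lower, upper = large_sample_ci(xbar=11.6, sd=4.1, n=225, level=0.90)
print(f"90% confidence interval: ({lower:.2f}, {upper:.2f})")
```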

The interpretation of confidence intervals for a population mean is summarized in the next box.

Interpretation of a Confidence Interval for a Population Mean

When we form a 100(1 − α)% confidence interval for μ, we usually express our confidence in the interval with a statement such as “We can be 100(1 − α)% confident that μ lies between the lower and upper bounds of the confidence interval,” where, for a particular application, we substitute the appropriate numerical values for the level of confidence and for the lower and upper bounds. The statement reflects our confidence in the estimation process, rather than in the particular interval that is calculated from the sample data. We know that repeated application of the same procedure will result in different lower and upper bounds on the interval. Furthermore, we know that 100(1 − α)% of the resulting intervals will contain μ. There is (usually) no way to determine whether any particular interval is one of those that contain μ or one of those that do not. However, unlike point estimators, confidence intervals have some measure of reliability (the confidence coefficient) associated with them. For that reason, they are generally preferred to point estimators.

Sometimes, the estimation procedure yields a confidence interval that is too wide for our purposes. In this case, we will want to reduce the width of the interval to obtain a more precise estimate of μ. One way to accomplish that is to decrease the confidence coefficient, 1 − α. For example, consider the problem of estimating the mean length of stay, μ, for hospital patients. Recall that for a sample of 100 patients, x¯=4.53 days and s=3.68 days. A 90% confidence interval for μ is

$$\bar{x} \pm 1.645\left(\frac{\sigma}{\sqrt{n}}\right) \approx 4.53 \pm (1.645)\frac{3.68}{\sqrt{100}} = 4.53 \pm .61$$

or (3.92, 5.14). You can see that this interval is narrower than the previously calculated 95% confidence interval, (3.81, 5.25). Unfortunately, we also have “less confidence” in the 90% confidence interval. An alternative method used to decrease the width of an interval without sacrificing “confidence” is to increase the sample size n. We demonstrate this method in Section 5.5.
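The trade-off between confidence and width can be tabulated for the hospital data. The sketch below computes the half-width z·s/√n at several confidence levels, and then at a hypothetical sample size of n = 400, under the assumption that s stays at 3.68. The function name half_width is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def half_width(sd, n, level):
    """Half-width z_{alpha/2} * sd / sqrt(n) of a large-sample interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sd / sqrt(n)

xbar, s = 4.53, 3.68  # summary statistics from Example 5.2

for level in (0.90, 0.95, 0.99):
    h = half_width(s, 100, level)
    print(f"{level:.0%}, n = 100: {xbar} +/- {h:.2f}")

# Hypothetical larger sample, assuming the same s: quadrupling n halves the width.
print(f"95%, n = 400: {xbar} +/- {half_width(s, 400, 0.95):.2f}")
```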

Exercises 5.1–5.28

Understanding the Principles

  1. 5.1 Define the target parameter.

  2. 5.2 What is the confidence coefficient in a 90% confidence interval for μ?

  3. 5.3 Explain the difference between an interval estimator and a point estimator for μ.

  4. 5.4 Explain what is meant by the statement “We are 95% confident that an interval estimate contains μ.”

  5. 5.5 Will a large-sample confidence interval be valid if the population from which the sample is taken is not normally distributed? Explain.

  6. 5.6 What conditions are required to form a valid large-sample confidence interval for μ?

Learning the Mechanics

  1. 5.7 Find zα/2 for each of the following:

    1. α=.10

    2. α=.01

    3. α=.05

    4. α=.20

  2. 5.8 What is the confidence level of each of the following confidence intervals for μ?

    1. x¯ ± 1.96(σ/√n)

    2. x¯ ± 1.645(σ/√n)

    3. x¯ ± 2.575(σ/√n)

    4. x¯ ± 1.28(σ/√n)

    5. x¯ ± .99(σ/√n)

  3. 5.9 A random sample of n measurements was selected from a population with unknown mean μ and standard deviation σ=20. Calculate a 95% confidence interval for μ for each of the following situations:

    1. n=75,x¯=28

    2. n=200,x¯=102

    3. n=100,x¯=15

    4. n=100,x¯=4.05

    5. Is the assumption that the underlying population of measurements is normally distributed necessary to ensure the validity of the confidence intervals in parts a–d? Explain.

  4. 5.10 A random sample of 90 observations produced a mean x¯=25.9 and a standard deviation s=2.7.

    1. Find a 95% confidence interval for the population mean μ.

    2. Find a 90% confidence interval for μ.

    3. Find a 99% confidence interval for μ.

  5. 5.11 A random sample of 100 observations from a normally distributed population possesses a mean equal to 83.2 and a standard deviation equal to 6.4.

    1. Find a 95% confidence interval for μ.

    2. What do you mean when you say that a confidence coefficient is .95?

    3. Find a 99% confidence interval for μ.

    4. What happens to the width of a confidence interval as the value of the confidence coefficient is increased while the sample size is held fixed?

    5. Would your confidence intervals of parts a and c be valid if the distribution of the original population were not normal? Explain.

  6. 5.12 The mean and standard deviation of a random sample of n measurements are equal to 33.9 and 3.3, respectively.

    1. Find a 95% confidence interval for μ if n=100.

    2. Find a 95% confidence interval for μ if n=400.

    3. Find the widths of the confidence intervals you calculated in parts a and b. What is the effect on the width of a confidence interval of quadrupling the sample size while holding the confidence coefficient fixed?

Applet Exercise 5.1

Use the applet entitled Confidence Intervals for a Mean (the impact of confidence level) to investigate the situation in Exercise 5.11 further. For this exercise, assume that μ=83.2 is the population mean and σ=6.4 is the population standard deviation.

    1. Using n=100 and the normal distribution with mean and standard deviation as just given, run the applet one time. How many of the 95% confidence intervals contain the mean? How many would you expect to contain the mean? How many of the 99% confidence intervals contain the mean? How many would you expect to contain the mean?

    2. Which confidence level has a greater frequency of intervals that contain the mean? Is this result what you would expect? Explain.

    3. Without clearing, run the applet several more times. What happens to the proportion of 95% confidence intervals that contain the mean as you run the applet more and more? What happens to the proportion of 99% confidence intervals that contain the mean as you run the applet more and more? Interpret these results in terms of the meanings of the 95% confidence interval and the 99% confidence interval.

    4. Change the distribution to right skewed, clear, and run the applet several more times. Do you get the same results as in part c? Would you change your answer to part e of Exercise 5.11? Explain.

Applet Exercise 5.2

Use the applet entitled Confidence Intervals for a Mean (the impact of confidence level) to investigate the effect of the sample size on the proportion of confidence intervals that contain the mean when the underlying distribution is skewed. Set the distribution to right skewed, the mean to 10, and the standard deviation to 1.

    1. Using n=30, run the applet several times without clearing. What happens to the proportion of 95% confidence intervals that contain the mean as you run the applet more and more? What happens to the proportion of 99% confidence intervals that contain the mean as you run the applet more and more? Do the proportions seem to be approaching the values that you would expect?

    2. Clear and run the applet several times, using n=100. What happens to the proportions of 95% confidence intervals and 99% confidence intervals that contain the mean this time? How do these results compare with your results in part a?

    3. Clear and run the applet several times, using n=1000. How do the results compare with your results in parts a and b?

    4. Describe the effect of sample size on the likelihood that a confidence interval contains the mean for a skewed distribution.

Applying the Concepts—Basic

  1. 5.13 Heart rate variability of police officers. The heart rate variability (HRV) of police officers was the subject of research published in the American Journal of Human Biology (Jan. 2014). HRV is defined as the variation in the time intervals between heartbeats. A measure of HRV was obtained for each in a sample of 355 Buffalo, NY, police officers. (The lower the measure of HRV, the more susceptible the officer is to cardiovascular disease.) For the 73 officers diagnosed with hypertension, a 95% confidence interval for the mean HRV was (4.1, 124.5). For the 282 officers that are not hypertensive, a 95% confidence interval for the mean HRV was (148.0, 192.6).

    1. What confidence coefficient was used to generate the confidence intervals?

    2. Give a practical interpretation of both of the 95% confidence intervals. Use the phrase “95% confident” in your answer.

    3. When you say you are “95% confident,” what do you mean?

    4. If you want to reduce the width of each confidence interval, should you use a smaller or larger confidence coefficient? Explain.

  2. ISR 5.14 Irrelevant speech effects. Refer to the Acoustical Science & Technology (Vol. 35, 2014) study of irrelevant speech effects, Exercise 2.34 (p. 49). Recall that subjects performed a memorization task under two conditions: (1) with irrelevant background speech and (2) in silence. The difference in the error rates for the two conditions, called the relative difference in error rate (RDER), was computed for each subject. Descriptive statistics for the RDER values are reproduced in the following SAS printout. Suppose you want to estimate the average difference in error rates for all subjects who perform the memorization tasks.

    1. Define the target parameter in words and in symbols.

    2. In Exercise 2.104b (p. 77), you computed the interval x¯±2s. Explain why this formula should not be used as an interval estimate for the target parameter.

    3. Form a 98% confidence interval for the target parameter. Interpret the result.

    4. Explain what the phrase “98% confident” implies in your answer to part c.

    5. Refer to the histogram of the sample RDER values shown in Exercise 2.34 and note that the distribution is not symmetric. Consequently, it is likely that the population of RDER values is not normally distributed. Does this compromise the validity of the interval estimate, part c? Explain.

  3. 5.15 Latex allergy in health care workers. Health care workers who use latex gloves with glove powder may develop a latex allergy. Symptoms of a latex allergy include conjunctivitis, hand eczema, nasal congestion, a skin rash, and shortness of breath. Each in a sample of 46 hospital employees who were diagnosed with latex allergy reported on their exposure to latex gloves (Current Allergy & Clinical Immunology, Mar. 2004). Summary statistics for the number of latex gloves used per week are x¯=19.3 and s=11.9.

    1. Give a point estimate for the average number of latex gloves used per week by all health care workers with a latex allergy.

    2. Form a 95% confidence interval for the average number of latex gloves used per week by all health care workers with a latex allergy.

    3. Give a practical interpretation of the interval you found in part b.

    4. Give the conditions required for the interval in part b to be valid.

  4. HYPER 5.16 Lipid profiles of hypertensive patients. People with high blood pressure suffer from hypertension. A study of the lipid profiles of hypertensive patients was carried out and the results published in Biology and Medicine (Vol. 2, 2010). Data on fasting blood sugar (milligrams/deciliter) and magnesium (milligrams/deciliter) in blood specimens collected from 50 patients diagnosed with hypertension were collected. The accompanying MINITAB printout gives 90% confidence intervals for the mean fasting blood sugar (FBS) and mean magnesium level (MAG).

    1. Locate and interpret the 90% confidence interval for mean fasting blood sugar on the printout.

    2. Locate and interpret the 90% confidence interval for mean magnesium level on the printout.

    3. If the confidence level is increased to 95%, what will happen to the width of the intervals?

    4. If the sample of hypertensive patients is increased from 50 to 100, what will likely happen to the width of the intervals?

  5. 5.17 Motivation of drug dealers. Refer to the Applied Psychology in Criminal Justice (Sept. 2009) study of the personality characteristics of drug dealers, Exercise 2.102 (p. 77). Recall that each in a sample of 100 convicted drug dealers was scored on the Wanting Recognition (WR) Scale, which provides a quantitative measure of a person’s level of need for approval and sensitivity to social situations. (Higher scores indicate a greater need for approval.) The sample of drug dealers had a mean WR score of 39, with a standard deviation of 6. Use this information to find an interval estimate of the mean WR score for all convicted drug dealers. Use a confidence level of 99%. Interpret the result.

  6. SUSTAIN 5.18 Corporate sustainability of CPA firms. Corporate sustainability refers to business practices designed around social and environmental considerations. Refer to the Business and Society (Mar. 2011) study on the sustainability behaviors of CPA corporations, Exercise 2.105 (p. 77). Recall that the level of support for corporate sustainability (measured on a quantitative scale ranging from 0 to 160 points) was obtained for each in a sample of 992 senior managers at CPA firms. Higher point values indicate a higher level of support for sustainability. The accompanying MINITAB printout gives a 90% confidence interval for the mean level of support for all senior managers at CPA firms.

    1. Locate the 90% confidence interval on the printout.

    2. Use the sample mean and standard deviation on the printout to calculate the 90% confidence interval. Does your result agree with the interval shown on the printout?

    3. Give a practical interpretation of the 90% confidence interval.

    4. Suppose the CEO of a CPA firm claims that the true mean level of support for sustainability is 75 points. Do you believe this claim? Explain.

  7. PONDICE 5.19 Albedo of ice melt ponds. Refer to the National Snow and Ice Data Center (NSIDC) collection of data on the albedo, depth, and physical characteristics of ice-melt ponds in the Canadian Arctic, presented in Exercise 2.15 (p. 40). Albedo is the ratio of the light reflected by the ice to that received by it. (High albedo values give a white appearance to the ice.) Visible albedo values were recorded for a sample of 504 ice-melt ponds located in the Barrow Strait in the Canadian Arctic; these data are saved in the PONDICE file.

    1. Find a 90% confidence interval for the true mean visible albedo value of all Canadian Arctic ice ponds.

    2. Give both a practical and a theoretical interpretation of the interval.

    3. Recall from Exercise 2.15 that the type of ice for each pond was classified as first-year ice, multiyear ice, or landfast ice. Find 90% confidence intervals for the mean visible albedo for each of the three types of ice. Interpret the intervals.

Applying the Concepts—Intermediate

  1. 5.20 Evaporation from swimming pools. A new formula for estimating the water evaporation from occupied swimming pools was proposed and analyzed in the journal Heating/Piping/Air Conditioning Engineering (Apr. 2013). The key components of the new formula are number of pool occupants, area of pool’s water surface, and the density difference between room air temperature and the air at the pool’s surface. Data were collected from a wide range of pools where the evaporation level was known. The new formula was applied to each pool in the sample, yielding an estimated evaporation level. The absolute value of the deviation between the actual and estimated evaporation level was then recorded as a percentage. The researchers reported the following summary statistics for absolute deviation percentage: x¯=18,s=20. Assume that the sample contained n=500 swimming pools.

    1. Estimate the true mean absolute deviation percentage for the new formula with a 90% confidence interval.

    2. The American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) handbook also provides a formula for estimating pool evaporation. Suppose the ASHRAE mean absolute deviation percentage is μ=34%. (This value was reported in the article.) On average, is the new formula “better” than the ASHRAE formula? Explain.

  2. 5.21 Facial structure of CEOs. In Psychological Science (Vol. 22, 2011), researchers reported that a chief executive officer’s facial structure can be used to predict a firm’s financial performance. The study involved measuring the facial width-to-height ratio (WHR) for each in a sample of 55 CEOs at publicly traded Fortune 500 firms. These WHR values (determined by computer analyzing a photo of the CEO’s face) had a mean of x¯=1.96 and a standard deviation of s=.15.

    1. Find and interpret a 95% confidence interval for μ, the mean facial WHR for all CEOs at publicly traded Fortune 500 firms.

    2. The researchers found that CEOs with wider faces (relative to height) tended to be associated with firms that had greater financial performance. They based their inference on an equation that uses facial WHR to predict financial performance. Suppose an analyst wants to predict the financial performance of a Fortune 500 firm based on the value of the true mean facial WHR of CEOs. The analyst wants to use the value of μ=2.2. Do you recommend he use this value?

  3. BLKFRI 5.22 Shopping on Black Friday. The day after Thanksgiving—called Black Friday—is one of the largest shopping days in the United States. Winthrop University researchers conducted interviews with a sample of 38 women shopping on Black Friday to gauge their shopping habits and reported the results in the International Journal of Retail and Distribution Management (Vol. 39, 2011). One question was “How many hours do you usually spend shopping on Black Friday?” Data for the 38 shoppers are listed in the accompanying table.

    1. Describe the population of interest to the researchers.

    2. What is the quantitative variable of interest to the researchers?

    3. Use the information in the table to estimate the population mean number of hours spent shopping on Black Friday with a 95% confidence interval.

    4. Give a practical interpretation of the interval.

    5. A retail store advertises that the true mean number of hours spent shopping on Black Friday is 5.5 hours. Should the store be sued for false advertising? Explain.

    6 6 4 4 3 16 4 4  5  6 6 5 5  4
    6 5 6 4 5  4 4 4  7 12 5 8 6 10
    5 8 8 3 3  8 5 6 10 11

    Source: Thomas, J. B., and Peters, C. “An exploratory investigation of Black Friday consumption rituals.” International Journal of Retail and Distribution Management, Vol. 39, No. 7, 2011 (Table I).

  4. PERAGGR 5.23 Personality and aggressive behavior. A team of university psychologists conducted a review of studies that examined the relationship between personality and aggressive behavior (Psychological Bulletin, Vol. 132, 2006). One variable of interest was the difference between the aggressive behavior level of individuals in the study who scored high on a personality test and those who scored low on the test. This variable, standardized to be between −7 and 7, was called “effect size.” (A large positive effect size indicates that those who score high on the personality test are more aggressive than those who score low.) The researchers collected the effect sizes for a sample of n=109 studies published in psychology journals. These data are saved in the PERAGGR file. A dot plot and summary statistics for effect size are shown in the MINITAB printouts. Of interest to the researchers is the true mean effect size μ for all psychological studies of personality and aggressive behavior.

    MINITAB Output for Exercise 5.23

    1. Identify the parameter of interest to the researchers.

    2. Examine the dot plot. Does effect size have a normal distribution? Explain why your answer is irrelevant to the subsequent analysis.

    3. Locate a 95% confidence interval for μ on the MINITAB printout. Interpret the result.

    4. If the true mean effect size exceeds 0, then the researchers will conclude that in the population, those who score high on a personality test are more aggressive than those who score low. Can the researchers draw this conclusion? Explain.

  5. TURTLES 5.24 Shell lengths of sea turtles. Refer to the Aquatic Biology (Vol. 9, 2010) study of green sea turtles inhabiting the Grand Cayman South Sound lagoon, Exercise 2.85 (p. 69). The data on curved carapace (shell) length, measured in centimeters, for 76 captured turtles are displayed in the table. Environmentalists want to estimate the true mean shell length of all green sea turtles in the lagoon.

    33.96 30.37 32.57 31.50 36.46 35.54 36.16 35.32 35.99
    39.55 44.33 42.73 42.15 42.43 49.96 46.04 48.76 47.78
    45.81 49.05 49.65 49.71 54.29 52.01 51.15 54.42 52.62
    53.27 54.07 50.40 53.69 51.30 54.29 54.58 55.11 57.65
    56.35 55.68 58.40 58.06 57.79 56.54 57.03 57.64 59.27
    64.79 61.96 60.08 62.34 63.84 60.61 64.91 60.35 62.63
    63.33 63.00 64.55 60.03 64.75 60.24 69.01 65.07 65.77
    65.30 68.24 65.28 67.54 68.49 66.98 65.67 70.26 70.94
    70.52 72.01 74.34 81.63

    1. Define the parameter of interest to the environmentalists.

    2. Use the data to find a point estimate of the target parameter.

    3. Compute a 95% confidence interval for the target parameter. Interpret the result.

    4. Suppose a biologist claims that the mean shell length of all green sea turtles in the lagoon is 60 cm. Make an inference about the validity of this claim.

  6. 5.25 Colored string preferred by chickens. Animal behaviorists have discovered that the more domestic chickens peck at objects placed in their environment, the healthier the chickens seem to be. White string has been found to be a particularly attractive pecking stimulus. In one experiment, 72 chickens were exposed to a string stimulus. Instead of white string, blue string was used. The number of pecks each chicken took at the blue string over a specified interval of time was recorded. Summary statistics for the 72 chickens were x¯=1.13 pecks and s=2.21 pecks (Applied Animal Behaviour Science, Oct. 2000).

    1. Use a 99% confidence interval to estimate the population mean number of pecks made by chickens pecking at blue string. Interpret the result.

    2. Previous research has shown that μ=7.5 pecks if chickens are exposed to white string. Based on the interval you found in part 1, is there evidence that chickens are more apt to peck at white string than blue string? Explain.

  7. SPRINT 5.26 Speed training in football. Researchers at Northern Kentucky University designed and tested a speed-training program for junior varsity and varsity high school football players (The Sport Journal, Winter 2004). Each in a sample of 38 high school athletes was timed in a 40-yard sprint prior to the start of the training program and timed again after completing the program. The decreases in times (measured in seconds) are listed in the table. [Note: A negative decrease implies that the athlete’s time after completion of the program was higher than his time prior to training.] The goal of the research is to demonstrate that the training program is effective in improving 40-yard sprint times.

    .01 .1 .1 .24 .25 .05 .28 .25 .2 .14
    .32 .34 .3 .09 .05 0 .04 .17 0 .21
    .15 .3 .02 .12 .14 .1 .08 .5 .36 .1
    .01 .9 .34 .38 .44 .08 0 0

    Based on Gray, M., & Sauerbeck, J. A. “Speed training program for high school football players.” The Sport Journal, Vol. 7, No. 1, Winter 2004 (Table 2).

    1. Find a 95% confidence interval for the true mean decrease in sprint times for the population of all football players who participate in the speed-training program.

    2. Based on the confidence interval, is the training program really effective in improving the mean 40-yard sprint time of high school football players? Explain.
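
Most of the exercises above reduce to the large-sample interval x¯±zσx¯, with s standing in for σ when σ is unknown. The following is a minimal Python sketch of that calculation, not a prescribed solution method. The function names z_interval and z_interval_from_data are hypothetical, and the sketch assumes n is large enough for the Central Limit Theorem to apply. The usage line plugs in the summary statistics given in Exercise 5.21.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_interval(xbar, s, n, confidence=0.95):
    """Large-sample confidence interval for mu: xbar +/- z * s / sqrt(n)."""
    # z value that leaves (1 - confidence) / 2 in each tail of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin


def z_interval_from_data(data, confidence=0.95):
    """Same interval, with xbar and s computed from the raw measurements."""
    return z_interval(mean(data), stdev(data), len(data), confidence)


# Summary statistics from Exercise 5.21: n = 55, xbar = 1.96, s = .15
low, high = z_interval(1.96, 0.15, 55)
print(f"95% CI for mean WHR: ({low:.3f}, {high:.3f})")
```

For exercises that list raw data (for example, the BLKFRI, TURTLES, or SPRINT tables), the measurements can be passed as a list to z_interval_from_data instead.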

Applying the Concepts—Advanced

  1. 5.27 Study of undergraduate problem drinking. In Alcohol & Alcoholism (Jan./Feb. 2007), psychologists at the University of Pennsylvania compared the levels of alcohol consumption of male and female freshman students. Each student was asked to estimate the amount of alcohol (beer, wine, or liquor) they consume in a typical week. Summary statistics for 128 males and 184 females are provided in the accompanying table.

    1. For each gender, find a 95% confidence interval for mean weekly alcohol consumption.

    2. Prior to sampling, what is the probability that at least one of the two confidence intervals will not contain the population mean it estimates? Assume that the two intervals are independent.

    3. Based on the two confidence intervals, what inference can you make about which gender consumes the most alcohol, on average, per week? [Caution: In Chapter 9, we will learn about a more valid method of comparing population means.]

                             Males    Females
    Sample size, n             128        184
    Mean (ounces), x¯        16.79      10.79
    Standard deviation, s    13.57      11.53

    Based on Leeman, R. F., Fenton, M., & Volpicelli, J. R. “Impaired control and undergraduate problem drinking.” Alcohol & Alcoholism, Vol. 42, No. 1, Jan./Feb. 2007 (Table 1).

  2. 5.28 The Raid test kitchen. According to scientists, the cockroach has had 300 million years to develop a resistance to destruction. In a study conducted by researchers for S. C. Johnson & Son, Inc. (manufacturers of Raid® and Off®), 5,000 roaches (the expected number in a roach-infested house) were released in the Raid test kitchen. One week later, the kitchen was fumigated and 16,298 dead roaches were counted, a gain of 11,298 roaches for the 1-week period. Assume that none of the original roaches died during the 1-week period and that the standard deviation of x, the number of roaches produced per roach in a 1-week period, is 1.5. Use the number of roaches produced by the sample of 5,000 roaches to find a 95% confidence interval for the mean number of roaches produced per week for each roach in a typical roach-infested house.
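
Exercise 5.27 asks about two intervals at once. When the intervals are independent, the probability that every interval contains its target is the product of the individual confidence levels, so the probability that at least one misses is one minus that product. The sketch below illustrates this reasoning; the function name prob_at_least_one_miss is hypothetical, and the independence assumption is taken from the exercise statement.

```python
def prob_at_least_one_miss(confidence, k):
    """P(at least one of k independent intervals fails to contain its target)."""
    # Independence lets the individual coverage probabilities multiply.
    return 1 - confidence ** k


p = prob_at_least_one_miss(0.95, 2)
print(f"P(at least one of two 95% intervals misses) = {p:.4f}")
```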
