7.4 Determining the Sample Size

You can find the appropriate sample size to estimate the difference between a pair of parameters with a specified sampling error (SE) and degree of reliability by using the method described in Section 5.5. That is, to estimate the difference between a pair of parameters correct to within SE units with confidence level $(1 - \alpha)$, set $z_{\alpha/2}$ standard deviations of the sampling distribution of the estimator equal to SE. Then solve for the sample size. To do this, you have to solve the problem for a specific ratio between $n_1$ and $n_2$. Most often, you will want equal sample sizes, that is, $n_1 = n_2 = n$. We will illustrate the procedure with two examples.

Example 7.6 Finding the Sample Sizes for Estimating $(\mu_1 - \mu_2)$: Comparing Mean Crop Yields

Problem

  1. New fertilizer compounds are often advertised with the promise of increased crop yields. Suppose we want to compare the mean yield $\mu_1$ of wheat when a new fertilizer is used with the mean yield $\mu_2$ from a fertilizer in common use. The estimate of the difference in mean yield per acre is to be correct to within .25 bushel with a confidence coefficient of .95. If the sample sizes are to be equal, find $n_1 = n_2 = n$, the number of 1-acre plots of wheat assigned to each fertilizer.

Solution

  1. To solve the problem, you need to know something about the variation in the bushels of yield per acre. Suppose that, from past records, you know that the yields of wheat possess a range of approximately 10 bushels per acre. You could then approximate $\sigma_1 = \sigma_2 = \sigma$ by letting the range equal $4\sigma$. Thus,

    $$4\sigma \approx 10 \text{ bushels} \quad\Rightarrow\quad \sigma \approx 2.5 \text{ bushels}$$

    The next step is to solve the equation

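    $$z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = SE, \quad\text{or}\quad z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = SE$$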

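    for $n$, where $n = n_1 = n_2$. Since we want our estimate to lie within $SE = .25$ of $(\mu_1 - \mu_2)$ with confidence coefficient equal to .95, we have $z_{\alpha/2} = z_{.025} = 1.96$. Then, letting $\sigma_1 = \sigma_2 = 2.5$ and solving for $n$, we get

    $$1.96\sqrt{\frac{(2.5)^2}{n} + \frac{(2.5)^2}{n}} = .25 \quad\Rightarrow\quad 1.96\sqrt{\frac{2(2.5)^2}{n}} = .25 \quad\Rightarrow\quad n = 768.32 \approx 769 \ (\text{rounding up})$$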

    Consequently, you will have to sample 769 acres of wheat for each fertilizer to estimate the difference in mean yield per acre to within .25 bushel.

Look Back

Since $n = 769$ would necessitate extensive and costly experimentation, you might decide to allow a larger sampling error (say, $SE = .50$ or $SE = 1$) in order to reduce the sample size, or you might decrease the confidence coefficient. The point is that we can obtain an idea of the experimental effort necessary to achieve a specified precision in our final estimate by determining the approximate sample size before the experiment is begun.

Now Work Exercise 7.56
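The trade-off described in the Look Back of Example 7.6 can be explored numerically. The following is a minimal Python sketch, not part of the original example: the function name `equal_sample_size` is hypothetical, and the 90% value $z_{.05} = 1.645$ is an assumed illustration of "decreasing the confidence coefficient" (the example itself does not name an alternative level).

```python
import math


def equal_sample_size(z, sigma1, sigma2, se):
    """Per-group n for estimating (mu1 - mu2) with n1 = n2, rounded up.

    Implements n = z^2 (sigma1^2 + sigma2^2) / SE^2.
    """
    n = z**2 * (sigma1**2 + sigma2**2) / se**2
    return math.ceil(n)


# Range/4 approximation from Example 7.6: range of about 10 bushels per acre.
SIGMA = 10 / 4

# (z, SE) scenarios. The first row is Example 7.6 itself; the SE values
# .50 and 1 come from the Look Back; z = 1.645 (90% confidence) is an
# assumed alternative confidence level.
scenarios = [
    (1.96, 0.25),
    (1.96, 0.50),
    (1.96, 1.00),
    (1.645, 0.25),
]

results = {(z, se): equal_sample_size(z, SIGMA, SIGMA, se) for z, se in scenarios}

for (z, se), n in results.items():
    print(f"z = {z}, SE = {se}: n = {n} plots per fertilizer")
```

Under these assumptions, doubling the sampling error from .25 to .50 cuts the requirement from 769 to 193 plots per fertilizer, because $n$ is proportional to $1/(SE)^2$. Lowering the confidence level has a much smaller effect.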

Example 7.7 Finding the Sample Sizes for Estimating $\mu_d$: Comparing Two Measuring Instruments

Problem

  1. A laboratory manager wishes to compare the difference in the mean reading of two instruments, A and B, designed to measure the potency (in parts per million) of an antibiotic. To conduct the experiment, the manager plans to select $n_d$ specimens of the antibiotic from a vat and to measure each specimen with both instruments. The difference $(\mu_A - \mu_B)$ will be estimated based on the $n_d$ paired differences $(x_A - x_B)$ obtained in the experiment. If preliminary measurements suggest that the differences will range between plus or minus 10 parts per million, how many differences will be needed to estimate $(\mu_A - \mu_B)$ correct to within 1 part per million with confidence equal to .99?

Solution

  1. The estimator for $(\mu_A - \mu_B)$, based on a paired difference experiment, is $\bar{x}_d = (\bar{x}_A - \bar{x}_B)$ and

    $$\sigma_{\bar{x}_d} = \frac{\sigma_d}{\sqrt{n_d}}$$

    Thus, the number $n_d$ of pairs of measurements needed to estimate $(\mu_A - \mu_B)$ to within 1 part per million can be obtained by solving for $n_d$ in the equation

    $$z_{\alpha/2}\left(\frac{\sigma_d}{\sqrt{n_d}}\right) = SE$$

    where $z_{.005} = 2.58$ and $SE = 1$. To solve this equation for $n_d$, we need to have an approximate value for $\sigma_d$.

    We are given the information that the differences are expected to range from $-10$ to $10$ parts per million. Letting the range equal $4\sigma_d$, we find

    $$\text{Range} = 20 \approx 4\sigma_d \quad\Rightarrow\quad \sigma_d \approx 5$$

    Substituting $\sigma_d = 5$, $SE = 1$, and $z_{.005} = 2.58$ into the equation and solving for $n_d$, we obtain

    $$2.58\left(\frac{5}{\sqrt{n_d}}\right) = 1 \quad\Rightarrow\quad n_d = [(2.58)(5)]^2 = 166.41$$

    Therefore, rounding up, it will require $n_d = 167$ pairs of measurements to estimate $(\mu_A - \mu_B)$ correct to within 1 part per million using the paired difference experiment.

Now Work Exercise 7.68
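The arithmetic of Example 7.7 can be checked with a few lines of Python. This is only a sketch: the function name `paired_sample_size` is hypothetical, and the critical value 2.58 is the rounded table value used in the example rather than a computed quantile.

```python
import math


def paired_sample_size(z, sigma_d, se):
    """Number of pairs n_d for estimating mu_d to within SE, rounded up.

    Implements n_d = (z * sigma_d / SE)^2.
    """
    return math.ceil((z * sigma_d / se) ** 2)


# Differences are expected to range from -10 to 10 parts per million.
diff_range = 10 - (-10)

# Range/4 approximation for the standard deviation of the differences.
sigma_d = diff_range / 4

# z_.005 = 2.58 (table value from the example) and SE = 1 part per million.
n_d = paired_sample_size(2.58, sigma_d, 1.0)
print(n_d)  # 167
```

The unrounded result is 166.41, so the sample size is rounded up to the next whole pair to guarantee the stated precision.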

The procedures for determining sample sizes necessary for estimating $(\mu_1 - \mu_2)$ for the case $n_1 = n_2$ and for $\mu_d$ in a paired difference experiment are given in the following boxes:

Determination of Sample Size for Estimating $(\mu_1 - \mu_2)$: Equal Sample Size Case

To estimate $(\mu_1 - \mu_2)$ to within a given sampling error SE and with confidence level $(1 - \alpha)$, use the following formula to solve for equal sample sizes that will achieve the desired reliability:

$$n_1 = n_2 = \frac{(z_{\alpha/2})^2 (\sigma_1^2 + \sigma_2^2)}{(SE)^2}$$

You will need to substitute estimates for the values of $\sigma_1^2$ and $\sigma_2^2$ before solving for the sample size. These estimates might be the sample variances $s_1^2$ and $s_2^2$ from prior sampling (e.g., a pilot study), or they might come from an educated (and conservatively large) guess based on the range $R$, that is, $s \approx R/4$.

Determination of Sample Size for Estimating $\mu_d$

To estimate $\mu_d$ to within a given sampling error SE and with confidence level $(1 - \alpha)$, use the following formula to solve for the number of pairs, $n_d$, that will achieve the desired reliability:

$$n_d = \frac{(z_{\alpha/2})^2 \sigma_d^2}{(SE)^2}$$

You will need to substitute an estimate for the value of $\sigma_d$, the standard deviation of the paired differences, before solving for the sample size.
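Both boxed formulas can be wrapped in small functions that take the confidence level directly instead of a table value of $z_{\alpha/2}$. The sketch below is one possible approach, not the text's procedure: the function names are hypothetical, and the critical value is computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Upper alpha/2 standard normal quantile for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def n_equal(confidence, var1, var2, se):
    """Equal per-group sample size for estimating (mu1 - mu2)."""
    z = z_critical(confidence)
    return math.ceil(z**2 * (var1 + var2) / se**2)


def n_pairs(confidence, var_d, se):
    """Number of pairs for estimating mu_d in a paired difference experiment."""
    z = z_critical(confidence)
    return math.ceil(z**2 * var_d / se**2)


print(round(z_critical(0.95), 4))            # 1.96
print(n_equal(0.95, 2.5**2, 2.5**2, 0.25))   # 769, as in Example 7.6
print(n_pairs(0.99, 5**2, 1.0))              # 166
```

Note that the last result is 166 rather than the 167 obtained in Example 7.7. The example uses the rounded table value $z_{.005} = 2.58$, whereas the computed quantile is about 2.5758. Since the variance estimates are themselves rough approximations, a difference of one pair is immaterial.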

Note: When estimating $(\mu_1 - \mu_2)$, you may desire one sample size to be a multiple of the other, e.g., $n_2 = a(n_1)$, where $a$ is an integer. For example, you may want to sample twice as many experimental units in the second sample as in the first. Then $a = 2$ and $n_2 = 2(n_1)$. For this unequal sample size case, slight adjustments are made to the computing formula. This formula (proof omitted) is provided below for convenience.

Adjustment to Sample Size Formula for Estimating $(\mu_1 - \mu_2)$ When $n_2 = a(n_1)$

$$n_1 = \frac{(z_{\alpha/2})^2 (a\sigma_1^2 + \sigma_2^2)}{a(SE)^2}, \qquad n_2 = a(n_1)$$
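The adjusted formula follows from substituting $n_2 = a(n_1)$ into the standard error, which gives $\sqrt{(a\sigma_1^2 + \sigma_2^2)/(a\,n_1)}$, and then solving $z_{\alpha/2}$ times this quantity equal to SE for $n_1$. A minimal Python sketch of the calculation is given below. The function name `unequal_sample_sizes` is hypothetical, and the $a = 2$ scenario reuses the values of Example 7.6 purely as an illustration; it does not appear in the text.

```python
import math


def unequal_sample_sizes(z, sigma1, sigma2, se, a):
    """Sample sizes (n1, n2) for estimating (mu1 - mu2) when n2 = a * n1.

    Implements n1 = z^2 (a sigma1^2 + sigma2^2) / (a SE^2), rounded up.
    """
    n1 = math.ceil(z**2 * (a * sigma1**2 + sigma2**2) / (a * se**2))
    return n1, a * n1


# With a = 1 the formula reduces to the equal sample size case (Example 7.6).
print(unequal_sample_sizes(1.96, 2.5, 2.5, 0.25, a=1))  # (769, 769)

# Hypothetical scenario: twice as many plots for the second fertilizer.
print(unequal_sample_sizes(1.96, 2.5, 2.5, 0.25, a=2))  # (577, 1154)
```

In this illustration the unequal allocation needs 1,731 plots in total, compared with 1,538 for equal sample sizes. When the two variances are equal, equal allocation gives the smallest total sample size for a given sampling error.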

Exercises 7.53–7.65

Understanding the Principles

  1. 7.53 In determining the sample sizes for estimating $(\mu_1 - \mu_2)$, how do you obtain estimates of the population variances $\sigma_1^2$ and $\sigma_2^2$ used in the calculations?

  2. 7.54 When determining the sample size for estimating $\mu_d$, how do you obtain an estimate of the population variance $\sigma_d^2$ used in the calculations?

  3. 7.55 If the sample-size calculation yields a value of n that is too large to be practical, how should you proceed?

Learning the Mechanics

  1. 7.56 Find the appropriate values of $n_1$ and $n_2$ (assume that $n_1 = n_2$) needed to estimate $(\mu_1 - \mu_2)$ with

    1. A sampling error equal to 3.2 with 95% confidence. From prior experience, it is known that $\sigma_1 \approx 15$ and $\sigma_2 \approx 17$.

    2. A sampling error equal to 8 with 99% confidence. The range of each population is 60.

    3. A 90% confidence interval of width 1.0. Assume that $\sigma_1^2 \approx 5.8$ and $\sigma_2^2 \approx 7.5$.

  2. 7.57 Suppose you want to estimate the difference between two population means correct to within 2.2 with probability .95. If prior information suggests that the population variances are approximately equal to $\sigma_1^2 = \sigma_2^2 = 15$ and you want to select independent random samples of equal size from the populations, how large should the sample sizes, $n_1$ and $n_2$, be?

  3. 7.58 Enough money has been budgeted to collect $n = 100$ paired observations from populations 1 and 2 in order to estimate $\mu_d = (\mu_1 - \mu_2)$. Prior information indicates that $\sigma_d = 12$. Have sufficient funds been allocated to construct a 90% confidence interval for $\mu_d$ of width 5 or less? Justify your answer.

Applying the Concepts—Basic

  1. 7.59 Hygiene of handshakes, high fives, and fist bumps. Refer to the American Journal of Infection Control (Aug. 2014) study of the hygiene of hand greetings, Exercise 7.22 (p. 385). The number of bacteria transferred from a gloved hand dipped into a culture of bacteria to a second gloved hand contacted by either a handshake, high five, or fist bump was recorded. Recall that the experiment was replicated only five times for each contact method and the data used to compare the mean percentage of bacteria transferred for any two contact methods. Therefore, for this independent-samples design, $n_1 = n_2 = 5$. Suppose you want to estimate the difference between the mean percentage of bacteria transferred for the handshake and fist bump greetings to within 10% using a 95% confidence interval.

    1. Define the parameter of interest in this study.

    2. Give the value of $z_{\alpha/2}$ for the confidence interval.

    3. What is the desired sampling error, SE?

    4. From the data provided in Exercise 7.22, find estimates of the variances, $\sigma_1^2$ and $\sigma_2^2$, for the two contact methods.

    5. Use the information in parts a–d to calculate the number of replicates for each contact method required to obtain the desired reliability. Assume an equal number of replicates.

  2. 7.60 Laughter among deaf signers. Refer to the Journal of Deaf Studies and Deaf Education (Fall 2006) paired difference study on vocalized laughter among deaf users of sign language, presented in Exercise 7.42 (p. 396). Suppose you want to estimate $\mu_d = (\mu_S - \mu_A)$, the difference between the population mean number of laugh episodes of deaf speakers and deaf audience members, using a 90% confidence interval with a sampling error of .75. Find the number of pairs of deaf people required to obtain such an estimate, assuming that the variance of the paired differences is $\sigma_d^2 = 3$.

  3. 7.61 Bulimia study. Refer to the American Statistician (May 2001) study comparing the “fear of negative evaluation” (FNE) scores for bulimic and normal female students, presented in Exercise 7.17 (p. 383). Suppose you want to estimate $(\mu_B - \mu_N)$, the difference between the population means of the FNE scores for bulimic and normal female students, using a 95% confidence interval with a sampling error of two points. Find the sample sizes required to obtain such an estimate. Assume equal sample sizes and that $\sigma_B^2 = \sigma_N^2 = 25$.

Applying the Concepts—Intermediate

  1. 7.62 Last name and acquisition timing. Refer to the Journal of Consumer Research (Aug. 2011) study of the last name effect in acquisition timing, Exercise 7.12 (p. 382). Recall that the mean response times (in minutes) to acquire basketball tickets were compared for two groups of MBA students: those students with last names beginning with one of the first 9 letters of the alphabet and those with last names beginning with one of the last 9 letters of the alphabet. How many MBA students from each group would need to be selected to estimate the difference in mean times to within 2 minutes of its true value with 95% confidence? (Assume equal sample sizes will be selected for each group and that the response time standard deviation for both groups is $\sigma \approx 9$ minutes.)

  2. SOLAR 7.63 Solar energy generation along highways. Refer to the International Journal of Energy and Environmental Engineering (Dec. 2013) study of solar energy generation along highways, Exercise 7.45 (p. 397). Recall that the researchers compared the mean monthly amount of solar energy generated by east-west and north-south oriented solar panels using a matched-pairs experiment. However, a small sample of only five months was used for the analysis. How many more months would need to be selected in order to estimate the difference in means to within 25 kilowatt hours with a 90% confidence interval? Use the information provided in the SOLAR file to find an estimate of the standard error required to carry out the calculation.

  3. 7.64 Do video game players have superior visual attention skills? Refer to the Journal of Articles in Support of the Null Hypothesis (Vol. 6, 2009) study comparing the visual attention skill of video game and non–video game players, Exercise 7.20 (p. 384). Recall that there was no significant difference between the mean score on the attentional blink test of video game players and the corresponding mean for non–video game players. It is possible that selecting larger samples would yield a significant difference. How many video game and non–video game players would need to be selected in order to estimate the difference in mean score for the two groups to within 5 points with 95% confidence? (Assume equal sample sizes will be selected from the two groups and that the score standard deviation for both groups is $\sigma \approx 9$.)

  4. 7.65 Scouting an NFL free agent. In seeking a free-agent NFL running back, a general manager is looking for a player with high mean yards gained per carry and a small standard deviation. Suppose the GM wishes to compare the mean yards gained per carry for two free agents, on the basis of independent random samples of their yards gained per carry. Data from last year’s pro football season indicate that $\sigma_1 = \sigma_2 \approx 5$ yards. If the GM wants to estimate the difference in means correct to within 1 yard with a confidence level of .90, how many runs would have to be observed for each player? (Assume equal sample sizes.)
