7.4 Determining the Sample Size

You can find the appropriate sample size to estimate the difference between a pair of parameters with a specified sampling error (SE) and degree of reliability by using the method described in Section 5.5. That is, to estimate the difference between a pair of parameters correct to within SE units with confidence level $(1 - α)$, let $z_{α / 2}$ standard deviations of the sampling distribution of the estimator equal SE. Then solve for the sample size. To do this, you have to solve the problem for a specific ratio between $n_{1}$ and $n_{2}$. Most often, you will want to have equal sample sizes, that is, $n_{1} = n_{2} = n$. We will illustrate the procedure with two examples.

Example 7.6 Finding the Sample Sizes for Estimating $(μ_{1} - μ_{2})$: Comparing Mean Crop Yields

Problem

New fertilizer compounds are often advertised with the promise of increased crop yields. Suppose we want to compare the mean yield $μ_{1}$ of wheat when a new fertilizer is used with the mean yield $μ_{2}$ from a fertilizer in common use. The estimate of the difference in mean yield per acre is to be correct to within .25 bushel with a confidence coefficient of .95. If the sample sizes are to be equal, find $n_{1} = n_{2} = n$, the number of 1-acre plots of wheat assigned to each fertilizer.

Solution

To solve the problem, you need to know something about the variation in the bushels of yield per acre. Suppose that, from past records, you know that the yields of wheat possess a range of approximately 10 bushels per acre. You could then approximate $σ_{1} = σ_{2} = σ$ by letting the range equal $4 σ$. Thus,

$\begin{array}{rcl} 4 σ & \approx & 10 \text{ bushels} \\ σ & \approx & 2.5 \text{ bushels} \end{array}$

The next step is to solve the equation

$z_{α / 2} σ_{({\bar{x}}_{1} - {\bar{x}}_{2})} = SE, \quad \text{or} \quad z_{α / 2} \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}} = SE$

for n, where $n = n_{1} = n_{2}$. Since we want our estimate to lie within $SE = .25$ of $(μ_{1} - μ_{2})$ with confidence coefficient equal to .95, we have $z_{α / 2} = z_{.025} = 1.96$. Then, letting $σ_{1} = σ_{2} = 2.5$ and solving for n, we get

$\begin{array}{rcl} 1.96 \sqrt{\frac{{(2.5)}^{2}}{n} + \frac{{(2.5)}^{2}}{n}} & = & .25 \\ 1.96 \sqrt{\frac{2 {(2.5)}^{2}}{n}} & = & .25 \\ n & = & 768.32 \approx 769 \text{ (rounding up)} \end{array}$

Consequently, you will have to sample 769 acres of wheat for each fertilizer to estimate the difference in mean yield per acre to within .25 bushel.
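The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch, not part of the original text: the function name `equal_sample_size` and its parameter names are hypothetical, and the value z = 1.96 is passed in by hand from the normal table rather than computed.

```python
import math


def equal_sample_size(z, sigma1, sigma2, se):
    """Solve z * sqrt(sigma1^2/n + sigma2^2/n) = se for n (equal sample sizes).

    Returns the exact solution and the value rounded up to a whole plot.
    """
    n_exact = z ** 2 * (sigma1 ** 2 + sigma2 ** 2) / se ** 2
    return n_exact, math.ceil(n_exact)


# Values from Example 7.6: sigma is approximated by range / 4 = 10 / 4.
sigma = 10 / 4
n_exact, n = equal_sample_size(z=1.96, sigma1=sigma, sigma2=sigma, se=0.25)
print(round(n_exact, 2), n)  # 768.32 769
```

The result is always rounded up, because rounding down would leave the sampling error slightly larger than the target.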

Look Back

Since $n = 769$ would necessitate extensive and costly experimentation, you might decide to allow a larger sampling error (say, $SE = .50$ or $SE = 1$) in order to reduce the sample size, or you might decrease the confidence coefficient. The point is that we can obtain an idea of the experimental effort necessary to achieve a specified precision in our final estimate by determining the approximate sample size before the experiment is begun.
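To see how strongly the required effort depends on these choices, the sketch below tabulates the per-fertilizer sample size for the sampling errors mentioned above at two confidence levels. It is an illustration only: the helper name is hypothetical, σ = 2.5 is the range-based approximation from the example, and the 90% level with table value z = 1.645 is an assumed alternative, not one chosen in the text.

```python
import math


def equal_sample_size(z, sigma1, sigma2, se):
    """Per-group n, rounded up, for estimating (mu1 - mu2) to within se."""
    return math.ceil(z ** 2 * (sigma1 ** 2 + sigma2 ** 2) / se ** 2)


SIGMA = 2.5  # range / 4 = 10 / 4, as in Example 7.6

# Normal-table values (assumed): z_.025 = 1.96 and z_.05 = 1.645.
for conf, z in [(0.95, 1.96), (0.90, 1.645)]:
    for se in [0.25, 0.50, 1.00]:
        n = equal_sample_size(z, SIGMA, SIGMA, se)
        print(f"confidence={conf:.2f}  SE={se:.2f}  n per fertilizer={n}")
```

Because n is proportional to $1 / {(SE)}^{2}$, doubling the sampling error cuts the sample size to roughly one quarter: at 95% confidence, 769 plots per fertilizer for SE = .25 drops to 193 for SE = .50 and to 49 for SE = 1.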

Now Work Exercise 7.56

Example 7.7 Finding the Sample Sizes for Estimating $μ_{d}$: Comparing Two Measuring Instruments

Problem

A laboratory manager wishes to compare the difference in the mean reading of two instruments, A and B, designed to measure the potency (in parts per million) of an antibiotic. To conduct the experiment, the manager plans to select $n_{d}$ specimens of the antibiotic from a vat and to measure each specimen with both instruments. The difference $(μ_{A} - μ_{B})$ will be estimated based on the $n_{d}$ paired differences $(x_{A} - x_{B})$ obtained in the experiment. If preliminary measurements suggest that the differences will range between plus or minus 10 parts per million, how many differences will be needed to estimate $(μ_{A} - μ_{B})$ correct to within 1 part per million with confidence equal to .99?

Solution

The estimator for $(μ_{A} - μ_{B})$, based on a paired difference experiment, is ${\bar{x}}_{d} = ({\bar{x}}_{A} - {\bar{x}}_{B})$ and

$σ_{{\bar{x}}_{d}} = \frac{σ_{d}}{\sqrt{n_{d}}}$

Thus, the number $n_{d}$ of pairs of measurements needed to estimate $(μ_{A} - μ_{B})$ to within 1 part per million can be obtained by solving for $n_{d}$ in the equation

$z_{α / 2} \left(\frac{σ_{d}}{\sqrt{n_{d}}}\right) = SE$

where $z_{.005} = 2.58$ and $SE = 1$. To solve this equation for $n_{d}$, we need to have an approximate value for $σ_{d}$.

We are given the information that the differences are expected to range from $-10$ to 10 parts per million. Letting the range equal $4 σ_{d}$, we find

$\begin{array}{rcl} \text{Range} & = & 20 \approx 4 σ_{d} \\ σ_{d} & \approx & 5 \end{array}$

Substituting $σ_{d} = 5$, $SE = 1$, and $z_{.005} = 2.58$ into the equation and solving for $n_{d}$, we obtain

$\begin{array}{rcl} 2.58 \left(\frac{5}{\sqrt{n_{d}}}\right) & = & 1 \\ n_{d} & = & {[(2.58) (5)]}^{2} \\ & = & 166.41 \end{array}$

Therefore, rounding up, it will require $n_{d} = 167$ pairs of measurements to estimate $(μ_{A} - μ_{B})$ correct to within 1 part per million using the paired difference experiment.
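The same calculation can be sketched in Python. The function name `paired_sample_size` and its parameters are hypothetical, and z = 2.58 is entered by hand as the rounded table value used in the example.

```python
import math


def paired_sample_size(z, sigma_d, se):
    """Solve z * sigma_d / sqrt(n_d) = se for the number of pairs n_d.

    Returns the exact solution and the value rounded up to a whole pair.
    """
    n_exact = (z * sigma_d / se) ** 2
    return n_exact, math.ceil(n_exact)


# Values from Example 7.7: differences range from -10 to 10, so range = 20.
sigma_d = 20 / 4
n_exact, n_d = paired_sample_size(z=2.58, sigma_d=sigma_d, se=1.0)
print(round(n_exact, 2), n_d)  # 166.41 167
```

The answer is sensitive to how z is rounded. With the unrounded value $z_{.005} \approx 2.576$, the exact solution is about 165.9, which rounds up to 166 pairs rather than 167. Given that $σ_{d}$ is itself only a rough range-based approximation, a one-pair difference is immaterial in practice.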

Now Work Exercise 7.68

The procedures for determining sample sizes necessary for estimating $(μ_{1} - μ_{2})$ for the case $n_{1} = n_{2}$ and for $μ_{d}$ in a paired difference experiment are given in the following boxes:

Determination of Sample Size for Estimating $(μ_{1} - μ_{2})$: Equal Sample Size Case

To estimate $(μ_{1} - μ_{2})$ to within a given sampling error SE and with confidence level $(1 - α)$, use the following formula to solve for equal sample sizes that will achieve the desired reliability:

$n_{1} = n_{2} = \frac{{(z_{α / 2})}^{2} (σ_{1}^{2} + σ_{2}^{2})}{{(SE)}^{2}}$

You will need to substitute estimates for the values of $σ_{1}^{2}$ and $σ_{2}^{2}$ before solving for the sample size. These estimates might be sample variances $s_{1}^{2}$ and $s_{2}^{2}$ from prior sampling (e.g., a pilot study) or an educated (and conservatively large) guess based on the range, that is, $s \approx R / 4$.
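The equal-sample-size formula follows directly from the equation solved in Example 7.6. As a brief sketch of the algebra, set $n_{1} = n_{2} = n$, square both sides, and isolate n:

```latex
\begin{aligned}
z_{α/2} \sqrt{\frac{σ_{1}^{2}}{n} + \frac{σ_{2}^{2}}{n}} &= SE \\
(z_{α/2})^{2} \, \frac{σ_{1}^{2} + σ_{2}^{2}}{n} &= (SE)^{2} \\
n &= \frac{(z_{α/2})^{2} \, (σ_{1}^{2} + σ_{2}^{2})}{(SE)^{2}}
\end{aligned}
```

With $σ_{1} = σ_{2} = 2.5$, $z_{.025} = 1.96$, and $SE = .25$, the last line gives $n = (3.8416)(12.5) / .0625 = 768.32$, in agreement with Example 7.6.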

Determination of Sample Size for Estimating $μ_{d}$

To estimate $μ_{d}$ to within a given sampling error SE and with confidence level $(1 - α)$, use the following formula to solve for the number of pairs, $n_{d}$, that will achieve the desired reliability:

$n_{d} = \frac{{(z_{α / 2})}^{2} σ_{d}^{2}}{{(SE)}^{2}}$

You will need to substitute an estimate for the value of $σ_{d}$, the standard deviation of the paired differences, before solving for the sample size.

Note: When estimating $(μ_{1} - μ_{2})$, you may desire one sample size to be a multiple of the other, e.g., $n_{2} = a (n_{1})$, where a is an integer. For example, you may want to sample twice as many experimental units in the second sample as in the first. Then $a = 2$ and $n_{2} = 2 (n_{1})$. For this unequal sample size case, slight adjustments are made to the computing formula. This formula (proof omitted) is provided below for convenience.

Adjustment to Sample Size Formula for Estimating $(μ_{1} - μ_{2})$ When $n_{2} = a (n_{1})$

$n_{1} = \frac{{(z_{α / 2})}^{2} (a σ_{1}^{2} + σ_{2}^{2})}{a {(SE)}^{2}}, \qquad n_{2} = a (n_{1})$
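Substituting $n_{2} = a (n_{1})$ into $z_{α / 2} \sqrt{σ_{1}^{2} / n_{1} + σ_{2}^{2} / n_{2}} = SE$ and solving for $n_{1}$ gives the product form $n_{1} = {(z_{α / 2})}^{2} (a σ_{1}^{2} + σ_{2}^{2}) / [a {(SE)}^{2}]$. The sketch below implements that form; the function name is hypothetical, and the inputs reuse the wheat-yield values from Example 7.6 with an assumed ratio a = 2, which is an illustration rather than a case worked in the text.

```python
import math


def unequal_sample_sizes(z, sigma1, sigma2, se, a):
    """Sample sizes for estimating (mu1 - mu2) to within se when n2 = a * n1.

    n1 is rounded up first, then n2 is set to a * n1 so the ratio is exact.
    """
    n1_exact = z ** 2 * (a * sigma1 ** 2 + sigma2 ** 2) / (a * se ** 2)
    n1 = math.ceil(n1_exact)
    return n1, a * n1


# Illustrative inputs (assumed): Example 7.6 values with twice as many
# plots in the second sample as in the first.
n1, n2 = unequal_sample_sizes(z=1.96, sigma1=2.5, sigma2=2.5, se=0.25, a=2)
print(n1, n2)  # 577 1154

# With a = 1 the formula reduces to the equal-sample-size case.
print(unequal_sample_sizes(1.96, 2.5, 2.5, 0.25, a=1))  # (769, 769)
```

Under these inputs the unequal design needs 577 + 1154 = 1731 plots in total, compared with 2(769) = 1538 for equal samples, which suggests why equal sample sizes are usually preferred when the two variances are similar.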

Exercises 7.53–7.65

Understanding the Principles

7.53 In determining the sample sizes for estimating $μ_{1} - μ_{2}$, how do you obtain estimates of the population variances $σ_{1}^{2}$ and $σ_{2}^{2}$ used in the calculations?
7.54 When determining the sample size for estimating $μ_{d}$, how do you obtain an estimate of the population variance $σ_{d}^{2}$ used in the calculations?
7.55 If the sample-size calculation yields a value of n that is too large to be practical, how should you proceed?

Learning the Mechanics

7.56 Find the appropriate values of $n_{1}$ and $n_{2}$ (assume that $n_{1} = n_{2}$) needed to estimate $(μ_{1} - μ_{2})$ with
1. A sampling error equal to 3.2 with 95% confidence. From prior experience, it is known that $σ_{1} \approx 15$ and $σ_{2} \approx 17$.
2. A sampling error equal to 8 with 99% confidence. The range of each population is 60.
3. A 90% confidence interval of width 1.0. Assume that $σ_{1}^{2} \approx 5.8$ and $σ_{2}^{2} \approx 7.5$.
7.57 Suppose you want to estimate the difference between two population means correct to within 2.2 with probability .95. If prior information suggests that the population variances are approximately equal to $σ_{1}^{2} = σ_{2}^{2} = 15$ and you want to select independent random samples of equal size from the populations, how large should the sample sizes, $n_{1}$ and $n_{2}$, be?
7.58 Enough money has been budgeted to collect $n = 100$ paired observations from populations 1 and 2 in order to estimate $μ_{d} = (μ_{1} - μ_{2})$. Prior information indicates that $σ_{d} = 12$. Have sufficient funds been allocated to construct a 90% confidence interval for $μ_{d}$ of width 5 or less? Justify your answer.

Applying the Concepts—Basic

7.59 Hygiene of handshakes, high fives, and fist bumps. Refer to the American Journal of Infection Control (Aug. 2014) study of the hygiene of hand greetings, Exercise 7.22 (p. 385). The number of bacteria transferred from a gloved hand dipped into a culture of bacteria to a second gloved hand contacted by either a handshake, high five, or fist bump was recorded. Recall that the experiment was replicated only five times for each contact method and the data used to compare the mean percentage of bacteria transferred for any two contact methods. Therefore, for this independent-samples design, $n_{1} = n_{2} = 5$. Suppose you want to estimate the difference between the mean percentage of bacteria transferred for the handshake and fist bump greetings to within 10% using a 95% confidence interval.
1. Define the parameter of interest in this study.
2. Give the value of $z_{α / 2}$ for the confidence interval.
3. What is the desired sampling error, SE?
4. From the data provided in Exercise 7.22, find estimates of the variances, $σ_{1}^{2}$ and $σ_{2}^{2}$, for the two contact methods.
5. Use the information in parts 1–4 to calculate the number of replicates for each contact method required to obtain the desired reliability. Assume an equal number of replicates.
7.60 Laughter among deaf signers. Refer to the Journal of Deaf Studies and Deaf Education (Fall 2006) paired difference study on vocalized laughter among deaf users of sign language, presented in Exercise 7.42 (p. 396). Suppose you want to estimate $μ_{d} = (μ_{S} - μ_{A})$, the difference between the population mean number of laugh episodes of deaf speakers and deaf audience members, using a 90% confidence interval with a sampling error of .75. Find the number of pairs of deaf people required to obtain such an estimate, assuming that the variance of the paired differences is $σ_{d}^{2} = 3$.
7.61 Bulimia study. Refer to the American Statistician (May 2001) study comparing the “fear of negative evaluation” (FNE) scores for bulimic and normal female students, presented in Exercise 7.17 (p. 383). Suppose you want to estimate $(μ_{B} - μ_{N})$, the difference between the population means of the FNE scores for bulimic and normal female students, using a 95% confidence interval with a sampling error of two points. Find the sample sizes required to obtain such an estimate. Assume equal sample sizes and $σ_{B}^{2} = σ_{N}^{2} = 25$.

Applying the Concepts—Intermediate

7.62 Last name and acquisition timing. Refer to the Journal of Consumer Research (Aug. 2011) study of the last name effect in acquisition timing, Exercise 7.12 (p. 382). Recall that the mean response times (in minutes) to acquire basketball tickets were compared for two groups of MBA students: those students with last names beginning with one of the first 9 letters of the alphabet and those with last names beginning with one of the last 9 letters of the alphabet. How many MBA students from each group would need to be selected to estimate the difference in mean times to within 2 minutes of its true value with 95% confidence? (Assume equal sample sizes will be selected for each group and that the response time standard deviation for both groups is $σ \approx 9$ minutes.)
SOLAR 7.63 Solar energy generation along highways. Refer to the International Journal of Energy and Environmental Engineering (Dec. 2013) study of solar energy generation along highways, Exercise 7.45 (p. 397). Recall that the researchers compared the mean monthly amount of solar energy generated by east-west and north-south oriented solar panels using a matched-pairs experiment. However, a small sample of only five months was used for the analysis. How many more months would need to be selected in order to estimate the difference in means to within 25 kilowatt hours with a 90% confidence interval? Use the information provided in the SOLAR file to find an estimate of the standard deviation of the paired differences, $σ_{d}$, required to carry out the calculation.
7.64 Do video game players have superior visual attention skills? Refer to the Journal of Articles in Support of the Null Hypothesis (Vol. 6, 2009) study comparing the visual attention skill of video game and non–video game players, Exercise 7.20 (p. 384). Recall that there was no significant difference between the mean score on the attentional blink test of video game players and the corresponding mean for non–video game players. It is possible that selecting larger samples would yield a significant difference. How many video game and non–video game players would need to be selected in order to estimate the difference in mean score for the two groups to within 5 points with 95% confidence? (Assume equal sample sizes will be selected from the two groups and that the score standard deviation for both groups is $σ \approx 9$.)
7.65 Scouting an NFL free agent. In seeking a free-agent NFL running back, a general manager is looking for a player with high mean yards gained per carry and a small standard deviation. Suppose the GM wishes to compare the mean yards gained per carry for two free agents, on the basis of independent random samples of their yards gained per carry. Data from last year’s pro football season indicate that $σ_{1} = σ_{2} \approx 5$ yards. If the GM wants to estimate the difference in means correct to within 1 yard with a confidence level of .90, how many runs would have to be observed for each player? (Assume equal sample sizes.)
