Chapter 26

Ten Easy Ways to Estimate How Many Subjects You Need

In This Chapter

Quickly estimating sample size for several basic kinds of tests

Adjusting for different levels of power and alpha

Adjusting for unequal group sizes and for attrition during the study

Sample-size calculations tend to frighten researchers and send them running to the nearest statistician. But if you’re brainstorming a possible research project and you need a ballpark idea of how many subjects to enroll, you can use the ten quick and (fairly) easy rules of thumb in this chapter.

Remember: Before you begin, look at Chapter 3, especially the sections on hypothesis testing and the power of a test, so that you have the basic idea of what power and sample-size calculations are all about. Think about the effect size of importance (such as the difference in some variable between two groups, or the degree of correlation between two variables) that you want to be able to detect. Then find the rule for the statistical test that’s appropriate for the primary objective of your study.

The first six sections tell you how many analyzable subjects you need in order to have an 80 percent chance of getting a p value that’s less than 0.05 when you run the test. Those parameters (80 percent power at 0.05 alpha) are widely used in biological research. The remaining four sections tell you how to modify this figure for other power or alpha values and how to adjust for unequal group sizes and dropouts from the study.

Comparing Means between Two Groups

Applies to: Unpaired Student t test, Mann-Whitney U test, or Wilcoxon Sum-of-Ranks test (see Chapter 12).

Effect size (E): The difference between the means of two groups divided by the standard deviation (SD) of the values within a group. (See Chapter 8 for details on means and SD.)

Rule: You need 16/E² subjects in each group, or 32/E² subjects altogether.

For example, if you’re comparing a blood pressure (BP) drug to a placebo, an improvement of 10 millimeters of mercury (mmHg) is important, and the SD of the BP changes is known to be 20 mmHg, then E = 10/20, or 0.5, and you need 16/(0.5)², or 64 subjects in each group (128 subjects altogether).
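If you’d rather let a computer do the arithmetic, the rule can be sketched in a few lines of Python. The function and parameter names below are made up for this illustration; the only thing taken from the rule is the 16/E² formula, which assumes 80 percent power at 0.05 alpha.

```python
def n_per_group_two_means(difference, sd):
    """Ballpark subjects per group for comparing two means.

    Rule of thumb: n = 16 / E**2, where E = difference / sd.
    Assumes 80% power at alpha = 0.05. Names are illustrative only.
    """
    effect_size = difference / sd
    return 16 / effect_size ** 2


# BP example: a 10 mmHg difference with a within-group SD of 20 mmHg
per_group = n_per_group_two_means(10, 20)
print(round(per_group), round(2 * per_group))  # 64 per group, 128 altogether
```

The function returns an unrounded figure on purpose: this is a ballpark estimate, so how you round it is up to you.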

Comparing Means among Three, Four, or Five Groups

Applies to: One-way Analysis of Variance (ANOVA) or Kruskal-Wallis test (see Chapter 12).

Effect size (E): The difference between the largest and smallest means among the groups divided by the within-group SD.

Rule: You need 20/E² subjects in each group.

Continuing the example from the preceding section, if you’re comparing two BP drugs and a placebo (for a total of three groups), and if a difference of 10 mmHg between any pair of groups is important, then E is still 10/20, or 0.5, but you now need 20/(0.5)², or 80 subjects in each group (240 subjects altogether).
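Here’s a minimal Python sketch of the same calculation. The function name, its parameters, and the check that the number of groups is three, four, or five are assumptions added for this illustration; only the 20/E² formula comes from the rule.

```python
def n_for_anova(max_mean_difference, sd, n_groups):
    """Ballpark (per-group, total) subjects for 3 to 5 groups.

    Rule of thumb: n per group = 20 / E**2, where E is the largest
    difference between group means divided by the within-group SD.
    Assumes 80% power at alpha = 0.05. Names are illustrative only.
    """
    if n_groups not in (3, 4, 5):
        raise ValueError("this rule of thumb covers three, four, or five groups")
    effect_size = max_mean_difference / sd
    per_group = 20 / effect_size ** 2
    return per_group, per_group * n_groups


# Two BP drugs plus placebo: 10 mmHg difference, SD of 20 mmHg
per_group, total = n_for_anova(10, 20, n_groups=3)
print(round(per_group), round(total))  # 80 per group, 240 altogether
```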

Comparing Paired Values

Applies to: Paired Student t test or Wilcoxon Signed-Ranks test.

Effect size (E): The average of the paired differences divided by the SD of the paired differences.

Rule: You need 8/E² subjects (pairs of values).

So, if you’re studying test scores before and after tutoring, a six-point improvement is important, and the SD of the changes is ten points, then E = 6/10, or 0.6, and you need 8/(0.6)², or about 22 students, each of whom provides a “before” score and an “after” score.
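The paired rule translates to Python just as easily. As before, the function and parameter names are hypothetical; the 8/E² formula is the only part that comes from the rule.

```python
def n_pairs_needed(mean_change, sd_of_changes):
    """Ballpark number of subjects (pairs of values) for a paired test.

    Rule of thumb: n = 8 / E**2, where E = mean_change / sd_of_changes.
    Assumes 80% power at alpha = 0.05. Names are illustrative only.
    """
    effect_size = mean_change / sd_of_changes
    return 8 / effect_size ** 2


# Tutoring example: 6-point improvement, SD of the changes is 10 points
print(round(n_pairs_needed(6, 10)))  # about 22 students
```

The unrounded answer is a little over 22, so a cautious planner might round up to 23 rather than down.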

Comparing Proportions between Two Groups

Applies to: Chi-square test of association or Fisher Exact test (see Chapter 13).

Effect size (D): The difference between the two proportions, P₁ and P₂, that you’re comparing. You also have to calculate the average of the two proportions: P = (P₁ + P₂)/2.

Rule: You need 16 × P × (1 – P)/D² subjects in each group.

For example, if a disease has a 60 percent mortality rate but you think your drug can cut this rate in half (to 30 percent), then P = (0.6 + 0.3)/2, or 0.45, and D = 0.6 – 0.3, or 0.3. You need 16 × 0.45 × (1 – 0.45)/(0.3)², or 44 subjects in each group (88 subjects altogether).
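Because this rule has two moving parts (the difference D and the average P), it’s a natural candidate for a small helper function. The sketch below uses invented names; the formula itself is the rule as stated, with its 80 percent power and 0.05 alpha assumptions.

```python
def n_per_group_two_proportions(p1, p2):
    """Ballpark subjects per group for comparing two proportions.

    Rule of thumb: n = 16 * P * (1 - P) / D**2, where
    P = (p1 + p2) / 2 and D = p1 - p2.
    Assumes 80% power at alpha = 0.05. Names are illustrative only.
    """
    p_avg = (p1 + p2) / 2
    diff = p1 - p2
    return 16 * p_avg * (1 - p_avg) / diff ** 2


# Mortality example: 60 percent on placebo versus 30 percent on the drug
per_group = n_per_group_two_proportions(0.6, 0.3)
print(round(per_group), round(2 * per_group))  # 44 per group, 88 altogether
```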

Testing for a Significant Correlation

Applies to: Pearson correlation test (see Chapter 17) and is also a good approximation for the nonparametric Spearman correlation test.

Effect size: The correlation coefficient (r) you want to be able to detect.

Rule: You need 8/r² subjects (pairs of values).

So, if you’re studying the association between weight and blood pressure, and you want the correlation test to come out significant if these two variables have a true correlation coefficient of at least 0.2, then you need to study 8/(0.2)², or 200 subjects.
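A short Python sketch shows how quickly the required sample size grows as the correlation you want to detect gets smaller. The function name and the list of r values are assumptions chosen for this illustration; the 8/r² formula is the rule itself.

```python
def n_for_correlation(r):
    """Ballpark number of subjects to detect a correlation of size r.

    Rule of thumb: n = 8 / r**2 (80% power, alpha = 0.05).
    The name is illustrative only.
    """
    if r == 0:
        raise ValueError("r must be nonzero")
    return 8 / r ** 2


# Smaller correlations need many more subjects
for r in (0.5, 0.3, 0.2, 0.1):
    print(r, round(n_for_correlation(r)))
```

For r = 0.2 the function gives 200 subjects, matching the weight and blood pressure example.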

Comparing Survival between Two Groups

Applies to: Log-rank test or Cox proportional-hazard regression (see Chapter 23).

Effect size: The hazard ratio (HR) you want to be able to detect.

Rule: The required total number of observed deaths (or events) = 32/(natural log of HR)².

Here’s how the formula works out for several values of HR:

Hazard Ratio    Total Number of Events
1.1             3,523
1.2             963
1.3             465
1.4             283
1.5             195
1.75            102
2.0             67
2.5             38
3.0             27
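The tabulated values can be reproduced with a few lines of Python. This is only a sketch: the function name is made up, and rounding to the nearest whole event is an assumption chosen because it matches the numbers in the table.

```python
import math


def events_needed(hazard_ratio):
    """Ballpark total number of events for a two-group survival comparison.

    Rule of thumb: events = 32 / ln(HR)**2 (80% power, alpha = 0.05).
    The name is illustrative only.
    """
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard ratio must be positive and different from 1")
    return 32 / math.log(hazard_ratio) ** 2


# Rebuild the table, rounding to the nearest whole event
for hr in (1.1, 1.2, 1.3, 1.4, 1.5, 1.75, 2.0, 2.5, 3.0):
    print(f"{hr:<6}{round(events_needed(hr)):>6,}")
```

Keep in mind that the result is a count of events, not of enrolled subjects.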

Warning: Your enrollment must be large enough, and your follow-up must be long enough, to ensure that you get the required number of events. The required enrollment may be difficult to estimate beforehand, because it involves recruitment rates, censoring rates, the shape of the survival curve, and other things that are difficult to foresee and difficult to handle mathematically. So some protocols provide only a tentative estimate of the expected enrollment (for planning and budgeting purposes), and state that enrollment and/or follow-up will continue until the required number of events has been observed.

Scaling from 80 Percent to Some Other Power

Here’s how you take a sample-size estimate that provides 80 percent power (from one of the preceding rules) and scale it up or down to provide some other power:

For 50 percent power: Use only half as many subjects (multiply by 0.5).

For 90 percent power: Increase the sample size by 33 percent (multiply by 1.33).

For 95 percent power: Increase the sample size by 66 percent (multiply by 1.66).

For example, if you know (from some power calculation) that a study with 70 subjects provides 80 percent power to test its primary objective, then a study that has 1.33 × 70, or 93 subjects, will have about 90 percent power to test that objective.
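If you’re curious where multipliers like these come from, one common explanation is the normal approximation, in which sample size is proportional to the square of the sum of two normal quantiles (one for alpha, one for power). The sketch below assumes that approximation and a two-sided test; neither assumption is spelled out in the rules above, and the function name is hypothetical. It uses `NormalDist` from Python’s standard `statistics` module.

```python
from statistics import NormalDist


def power_scale_factor(new_power, alpha=0.05, base_power=0.80):
    """Approximate multiplier for moving from base_power to new_power.

    Assumes the normal approximation for a two-sided test, where the
    sample size is proportional to (z_alpha + z_power)**2.
    """
    z = NormalDist().inv_cdf          # standard normal quantile function
    z_alpha = z(1 - alpha / 2)        # two-sided critical value
    return ((z_alpha + z(new_power)) / (z_alpha + z(base_power))) ** 2


for power in (0.50, 0.90, 0.95):
    print(power, round(power_scale_factor(power), 2))

# Scaling a 70-subject study from 80% to 90% power
print(round(70 * power_scale_factor(0.90)))
```

Under these assumptions the factors come out close to the rounded multipliers in the list (about 0.49, 1.34, and 1.66), so the 70-subject study scales to roughly 94 subjects rather than 93. A difference that small doesn’t matter for a ballpark estimate.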

Scaling from 0.05 to Some Other Alpha Level

Here’s how you take a sample-size estimate that was based on testing at the 0.05 alpha level and scale it up or down to correspond to testing at some other alpha level:

For 0.10 alpha: Decrease the sample size by 20 percent (multiply by 0.8).

For 0.025 alpha: Increase the sample size by 20 percent (multiply by 1.2).

For 0.01 alpha: Increase the sample size by 50 percent (multiply by 1.5).

For example, if you’ve calculated a sample size of 100 subjects based on using p < 0.05 as your criterion for significance, and then your boss says you have to apply a two-fold Bonferroni correction (see Chapter 5) and use p < 0.025 as your criterion instead, you need to increase your sample size to 100 × 1.2, or 120 subjects, to have the same power at the new alpha level.
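The alpha adjustments are simple enough to keep in a lookup table. The following sketch stores the three multipliers from the list (plus an obvious 1.0 entry for 0.05 alpha, added here for convenience) and refuses to guess for any alpha level that isn’t tabulated. The names are invented for this illustration.

```python
# Multipliers relative to a sample size calculated at alpha = 0.05
ALPHA_MULTIPLIERS = {0.10: 0.8, 0.05: 1.0, 0.025: 1.2, 0.01: 1.5}


def rescale_for_alpha(n_at_alpha_05, new_alpha):
    """Rescale a sample size from alpha = 0.05 to another tabulated alpha."""
    if new_alpha not in ALPHA_MULTIPLIERS:
        raise ValueError(f"no rule-of-thumb multiplier for alpha = {new_alpha}")
    return n_at_alpha_05 * ALPHA_MULTIPLIERS[new_alpha]


# Two-fold Bonferroni correction: test at 0.05 / 2 = 0.025 instead of 0.05
print(round(rescale_for_alpha(100, 0.05 / 2)))  # 120
```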

Making Adjustments for Unequal Group Sizes

When comparing means or proportions between two groups, it’s usually most efficient (that is, you get the best power for a given total sample size) if both groups are the same size. If you want to have unbalanced groups, you need more subjects overall in order to preserve the statistical power of the study. Here’s how to adjust the size of the two groups to keep the same statistical power:

If you want one group twice as large as the other: Increase one group by 50 percent and reduce the other group by 25 percent. This increases the total sample size by about 13 percent.

If you want one group three times as large as the other: Reduce one group by a third and double the other group. This increases the total sample size by about 33 percent.

If you want one group four times as large as the other: Reduce one group by 38 percent and increase the other group by a factor of 2.5. This increases the total sample size by about 56 percent.

Suppose, for example, you’re comparing two equal-size groups (drug and placebo), and you’ve calculated that you need 64 subjects (two groups of 32). But then you decide you want to randomize drug and placebo subjects in a 2:1 ratio. To keep the same power, you’ll need 32 × 1.5, or 48 drug subjects (an increase of 50 percent), and 32 × 0.75, or 24 placebo subjects (a decrease of 25 percent), for a total of 72 subjects altogether.
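The adjustments for 2:1, 3:1, and 4:1 ratios follow a pattern that you can generalize to any ratio. The sketch below assumes that power is preserved when the quantity 1/n₁ + 1/n₂ stays the same as in the balanced design, which is the usual reasoning for two groups with similar variability; that assumption, and the function name, are not part of the rules above.

```python
def unbalanced_group_sizes(n_per_equal_group, ratio):
    """Return (larger group, smaller group) sizes for a ratio:1 allocation.

    Assumes power is preserved when 1/n1 + 1/n2 matches the balanced
    design, which gives: smaller = n * (ratio + 1) / (2 * ratio).
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    smaller = n_per_equal_group * (ratio + 1) / (2 * ratio)
    larger = ratio * smaller
    return larger, smaller


# Drug versus placebo, starting from two groups of 32, randomized 2:1
larger, smaller = unbalanced_group_sizes(32, 2)
print(larger, smaller, larger + smaller)  # 48.0 24.0 72.0
```

With a ratio of 3 the smaller group shrinks by a third and the larger one doubles; with a ratio of 4 the smaller group shrinks by 37.5 percent and the larger one grows by a factor of 2.5, in line with the adjustments listed.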

Allowing for Attrition

Most sample-size calculations (including the quick formulas shown in this chapter) tell you how many analyzable subjects you need. But you have to enroll more than that number because some subjects will drop out of the study or be unanalyzable for other reasons. Here’s how to scale up the number of analyzable subjects (from a power calculation) to get the number of subjects you need to enroll:

Enrollment = Number Analyzable × 100/(100 – %Attrition)

Here are the enrollment scale-ups for several attrition rates:

Expected Attrition    Increase the Enrollment by
5%                    5%
10%                   11%
15%                   18%
20%                   25%
25%                   33%
33%                   50%
50%                   100%

So, if a power calculation indicates that you need a total of 60 analyzable subjects and you expect a 25 percent attrition rate, you need to enroll 60 × 1.33, or 80 subjects. That way, you’ll still have 60 subjects left after a quarter of the original 80 subjects have dropped out.
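The enrollment formula is a one-liner in Python. In this sketch the function name is hypothetical, and rounding up to a whole subject is an assumption added here because you can’t enroll a fraction of a person.

```python
import math


def enrollment_needed(n_analyzable, attrition_percent):
    """Number of subjects to enroll so n_analyzable remain after attrition.

    Enrollment = n_analyzable * 100 / (100 - attrition_percent),
    rounded up to a whole subject (the rounding is an added assumption).
    """
    if not 0 <= attrition_percent < 100:
        raise ValueError("attrition_percent must be at least 0 and below 100")
    return math.ceil(n_analyzable * 100 / (100 - attrition_percent))


# 60 analyzable subjects with an expected 25 percent attrition rate
print(enrollment_needed(60, 25))  # 80
```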
