Chapter 26
Ten Easy Ways to Estimate How Many Subjects You Need
In This Chapter
Quickly estimating sample size for several basic kinds of tests
Adjusting for different levels of power and alpha
Adjusting for unequal group sizes and for attrition during the study
Sample-size calculations tend to frighten researchers and send them running to the nearest statistician. But if you’re brainstorming a possible research project and you need a ballpark idea of how many subjects to enroll, you can use the ten quick and (fairly) easy rules of thumb in this chapter.
The first six sections tell you how many analyzable subjects you need in order to have an 80 percent chance of getting a p value that’s less than 0.05 when you run the test. Those parameters (80 percent power at 0.05 alpha) are widely used in biological research. The remaining four sections tell you how to modify this figure for other power or alpha values and how to adjust for unequal group sizes and dropouts from the study.
Comparing Means between Two Groups
Applies to: Unpaired Student t test, Mann-Whitney U test, or Wilcoxon Sum-of-Ranks test (see Chapter 12).
Effect size (E): The difference between the means of two groups divided by the standard deviation (SD) of the values within a group. (See Chapter 8 for details on means and SD.)
Rule: You need 16/E² subjects in each group, or 32/E² subjects altogether.
For example, if you’re comparing a blood pressure (BP) drug to a placebo, an improvement of 10 millimeters of mercury (mmHg) is important, and the SD of the BP changes is known to be 20 mmHg, then E = 10/20, or 0.5, and you need 16/(0.5)², or 64 subjects in each group (128 subjects altogether).
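The rule is simple enough to wrap in a few lines of Python. The following is only a sketch of the rule of thumb above (80 percent power, 0.05 alpha); the function name and its arguments are made up for illustration and don't come from any statistics package.

```python
import math

def subjects_per_group_two_means(difference, sd):
    """Rule of thumb: 16 / E**2 subjects per group, where E = difference / SD."""
    effect_size = difference / sd
    # Round up, because you can't enroll a fraction of a subject.
    return math.ceil(16 / effect_size ** 2)

# The blood pressure example: a 10 mmHg difference with an SD of 20 mmHg.
per_group = subjects_per_group_two_means(difference=10, sd=20)
print(per_group, 2 * per_group)  # 64 128
```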
Comparing Means among Three, Four, or Five Groups
Applies to: One-way Analysis of Variance (ANOVA) or Kruskal-Wallis test (see Chapter 12).
Effect size (E): The difference between the largest and smallest means among the groups divided by the within-group SD.
Rule: You need 20/E² subjects in each group.
Continuing the example from the preceding section, if you’re comparing two BP drugs and a placebo (for a total of three groups), and if a difference of 10 mmHg between any pair of groups is important, then E is still 10/20, or 0.5, but you now need 20/(0.5)², or 80 subjects in each group (240 subjects altogether).
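Here is the same rule as a short Python sketch. The function name is hypothetical, and the check on the number of groups simply mirrors the three-to-five range this section covers.

```python
import math

def subjects_for_anova(largest_difference, sd, n_groups):
    """Rule of thumb: 20 / E**2 subjects per group for three to five groups."""
    if not 3 <= n_groups <= 5:
        raise ValueError("this rule of thumb covers three, four, or five groups")
    effect_size = largest_difference / sd
    per_group = math.ceil(20 / effect_size ** 2)
    # Return the per-group size and the total across all groups.
    return per_group, per_group * n_groups

# Two BP drugs and a placebo, 10 mmHg difference, SD of 20 mmHg.
print(subjects_for_anova(largest_difference=10, sd=20, n_groups=3))  # (80, 240)
```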
Comparing Paired Values
Applies to: Paired Student t test or Wilcoxon Signed-Ranks test.
Effect size (E): The average of the paired differences divided by the SD of the paired differences.
Rule: You need 8/E² subjects (pairs of values).
So, if you’re studying test scores before and after tutoring, a six-point improvement is important, and the SD of the changes is ten points, then E = 6/10, or 0.6, and you need 8/(0.6)², or about 22 students, each of whom provides a “before” score and an “after” score.
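As a quick check of the arithmetic, here is a minimal Python sketch of the paired rule. The function name is hypothetical. It returns the unrounded estimate so that you can see the difference between "about 22" and the more cautious choice of rounding up.

```python
import math

def pairs_needed(mean_difference, sd_of_differences):
    """Rule of thumb: 8 / E**2 pairs, where E = mean difference / SD of differences."""
    effect_size = mean_difference / sd_of_differences
    return 8 / effect_size ** 2

# The tutoring example: a 6-point improvement with an SD of 10 points.
estimate = pairs_needed(mean_difference=6, sd_of_differences=10)
print(round(estimate, 1))   # 22.2
print(math.ceil(estimate))  # 23
```

Rounding up to 23 is the conservative reading; for a ballpark figure, 22 and 23 are interchangeable.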
Comparing Proportions between Two Groups
Applies to: Chi-square test of association or Fisher Exact test (see Chapter 13).
Effect size (D): The difference between the two proportions, P1 and P2, that you’re comparing. You also have to calculate the average of the two proportions: P = (P1 + P2)/2.
Rule: You need 16 × P × (1 – P)/D² subjects in each group.
For example, if a disease has a 60 percent mortality rate but you think your drug can cut this rate in half (to 30 percent), then P = (0.6 + 0.3)/2, or 0.45, and D = 0.6 – 0.3, or 0.3. You need 16 × 0.45 × (1 – 0.45)/(0.3)², or 44 subjects in each group (88 subjects altogether).
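The two-step calculation (average the proportions, then apply the rule) can be sketched in Python as follows. The function name is an assumption for illustration. The small rounding step before `math.ceil` is there only so that floating-point noise doesn't push an exact answer like 44 up to 45.

```python
import math

def subjects_per_group_two_proportions(p1, p2):
    """Rule of thumb: 16 * P * (1 - P) / D**2 subjects per group."""
    average = (p1 + p2) / 2   # P in the text
    difference = p1 - p2      # D in the text
    estimate = 16 * average * (1 - average) / difference ** 2
    # Strip floating-point noise first, then round up to a whole subject.
    return math.ceil(round(estimate, 6))

# The mortality example: 60 percent versus 30 percent.
per_group = subjects_per_group_two_proportions(0.6, 0.3)
print(per_group, 2 * per_group)  # 44 88
```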
Testing for a Significant Correlation
Applies to: Pearson correlation test (see Chapter 17) and is also a good approximation for the nonparametric Spearman correlation test.
Effect size: The correlation coefficient (r) you want to be able to detect.
Rule: You need 8/r² subjects (pairs of values).
So, if you’re studying the association between weight and blood pressure, and you want the correlation test to come out significant if these two variables have a true correlation coefficient of at least 0.2, then you need to study 8/(0.2)², or 200 subjects.
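A short Python sketch of this rule, with a hypothetical function name, shows how quickly the required sample grows as the correlation you want to detect shrinks.

```python
import math

def subjects_for_correlation(r):
    """Rule of thumb: 8 / r**2 subjects (pairs of values)."""
    if r == 0 or abs(r) > 1:
        raise ValueError("r must be nonzero and between -1 and 1")
    # Strip floating-point noise first, then round up to a whole subject.
    return math.ceil(round(8 / r ** 2, 6))

# The weight-and-blood-pressure example, plus two other correlations.
for r in (0.2, 0.4, 0.1):
    print(r, subjects_for_correlation(r))
```

Halving the target correlation quadruples the number of subjects, because r is squared in the denominator.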
Comparing Survival between Two Groups
Applies to: Log-rank test or Cox proportional-hazard regression (see Chapter 23).
Effect size: The hazard ratio (HR) you want to be able to detect.
Rule: The required total number of observed deaths (or events) = 32/(natural log of HR)².
Here’s how the formula works out for several values of HR:
Hazard Ratio | Total Number of Events
1.1 | 3,523
1.2 | 963
1.3 | 465
1.4 | 283
1.5 | 195
1.75 | 102
2.0 | 67
2.5 | 38
3.0 | 27
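If the hazard ratio you care about isn't in the table, the formula is easy to evaluate yourself. The following Python sketch uses a hypothetical function name and rounds to the nearest whole event, which is how the tabulated values appear to have been rounded.

```python
import math

def events_needed(hazard_ratio):
    """Rule of thumb: total events = 32 / (natural log of HR)**2."""
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard ratio must be positive and different from 1")
    return round(32 / math.log(hazard_ratio) ** 2)

for hr in (1.1, 1.2, 1.3, 1.4, 1.5, 1.75, 2.0, 2.5, 3.0):
    print(f"{hr:<5} {events_needed(hr):>6,}")
```

Keep in mind that the result is a number of events, not a number of subjects; how many subjects you must enroll depends on what fraction of them you expect to have the event during the study.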
Scaling from 80 Percent to Some Other Power
Here’s how you take a sample-size estimate that provides 80 percent power (from one of the preceding rules) and scale it up or down to provide some other power:
For 50 percent power: Use only half as many subjects (multiply by 0.5).
For 90 percent power: Increase the sample size by 33 percent (multiply by 1.33).
For 95 percent power: Increase the sample size by 66 percent (multiply by 1.66).
For example, if you know (from some power calculation) that a study with 70 subjects provides 80 percent power to test its primary objective, then a study that has 1.33 × 70, or 93 subjects, will have about 90 percent power to test that objective.
Scaling from 0.05 to Some Other Alpha Level
Here’s how you take a sample-size estimate that was based on testing at the 0.05 alpha level and scale it up or down to correspond to testing at some other alpha level:
For 0.10 alpha: Decrease the sample size by 20 percent (multiply by 0.8).
For 0.025 alpha: Increase the sample size by 20 percent (multiply by 1.2).
For 0.01 alpha: Increase the sample size by 50 percent (multiply by 1.5).
For example, if you’ve calculated a sample size of 100 subjects based on using p < 0.05 as your criterion for significance, and then your boss says you have to apply a two-fold Bonferroni correction (see Chapter 5) and use p < 0.025 as your criterion instead, you need to increase your sample size to 100 × 1.2, or 120 subjects, to have the same power at the new alpha level.
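If you need a power or alpha level that isn't listed, you can approximate the multiplier yourself. The sketch below is not taken from this chapter: it assumes a two-sided test and the usual normal approximation, in which the required sample size is proportional to the square of (z for alpha plus z for power). The function name is hypothetical; `NormalDist` is part of Python's standard `statistics` module (Python 3.8 and later).

```python
from statistics import NormalDist

def sample_size_multiplier(power=0.80, alpha=0.05):
    """Approximate factor for scaling an 80-percent-power, 0.05-alpha sample size.

    Assumes a two-sided test and a normal approximation.
    """
    z = NormalDist().inv_cdf  # standard normal quantile function
    reference = z(1 - 0.05 / 2) + z(0.80)
    target = z(1 - alpha / 2) + z(power)
    return (target / reference) ** 2

for power in (0.50, 0.90, 0.95):
    print(f"power {power:.2f}: multiply by {sample_size_multiplier(power=power):.2f}")
for alpha in (0.10, 0.025, 0.01):
    print(f"alpha {alpha}: multiply by {sample_size_multiplier(alpha=alpha):.2f}")
```

Under these assumptions, the computed factors land within a couple of hundredths of the rounded multipliers quoted in this section and the preceding one.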
Making Adjustments for Unequal Group Sizes
When comparing means or proportions between two groups, it’s usually most efficient (that is, you get the best power for a given total sample size) if both groups are the same size. If you want to have unbalanced groups, you need more subjects overall in order to preserve the statistical power of the study. Here’s how to adjust the size of the two groups to keep the same statistical power:
If you want one group twice as large as the other: Increase one group by 50 percent and reduce the other group by 25 percent. This increases the total sample size by about 13 percent.
If you want one group three times as large as the other: Reduce one group by a third and double the other group. This increases the total sample size by about 33 percent.
If you want one group four times as large as the other: Reduce one group by 38 percent and increase the other group by a factor of 2.5. This increases the total sample size by about 56 percent.
Suppose, for example, you’re comparing two equal-size groups (drug and placebo), and you’ve calculated that you need 64 subjects (two groups of 32). But then you decide you want to randomize drug and placebo subjects in a 2:1 ratio. To keep the same power, you’ll need 32 × 1.5, or 48 drug subjects (an increase of 50 percent), and 32 × 0.75, or 24 placebo subjects (a decrease of 25 percent), for a total of 72 subjects.
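The adjustments above follow a pattern that you can extend to other ratios. The Python sketch below rests on an assumption that this chapter doesn't state explicitly: power stays the same as long as 1/n1 + 1/n2 is unchanged. For a k:1 ratio, that gives a larger group of n × (k + 1)/2 and a smaller group of n × (k + 1)/(2k), which reproduces the 2:1, 3:1, and 4:1 figures listed here. The function name is hypothetical.

```python
import math

def unbalanced_group_sizes(equal_group_size, ratio):
    """Group sizes for a ratio:1 allocation with roughly the same power.

    Assumes power is preserved when 1/n1 + 1/n2 stays the same.
    """
    larger = equal_group_size * (ratio + 1) / 2
    smaller = larger / ratio
    # Round each group up to a whole subject.
    return math.ceil(larger), math.ceil(smaller)

# The 2:1 drug-versus-placebo example, starting from two groups of 32.
drug, placebo = unbalanced_group_sizes(equal_group_size=32, ratio=2)
print(drug, placebo, drug + placebo)  # 48 24 72
```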
Allowing for Attrition
Most sample-size calculations (including the quick formulas shown in this chapter) tell you how many analyzable subjects you need. But you have to enroll more than that number because some subjects will drop out of the study or be unanalyzable for other reasons. Here’s how to scale up the number of analyzable subjects (from a power calculation) to get the number of subjects you need to enroll:
Enrollment = Number Analyzable × 100/(100 – %Attrition)
Here are the enrollment scale-ups for several attrition rates:
Expected Attrition | Increase the Enrollment by
5% | 5%
10% | 11%
15% | 18%
20% | 25%
25% | 33%
33% | 50%
50% | 100%
So, if a power calculation indicates that you need a total of 60 analyzable subjects and you expect a 25 percent attrition rate, you need to enroll 60 × 1.33, or 80 subjects. That way, you’ll still have 60 subjects left after a quarter of the original 80 subjects have dropped out.
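The enrollment formula translates directly into Python. This is a minimal sketch with a hypothetical function name; it rounds up because you can only enroll whole subjects.

```python
import math

def enrollment_needed(analyzable, attrition_percent):
    """Enrollment = Number Analyzable x 100 / (100 - %Attrition)."""
    if not 0 <= attrition_percent < 100:
        raise ValueError("attrition must be at least 0 and less than 100 percent")
    estimate = analyzable * 100 / (100 - attrition_percent)
    # Strip floating-point noise first, then round up to a whole subject.
    return math.ceil(round(estimate, 6))

# 60 analyzable subjects with 25 percent expected attrition.
print(enrollment_needed(analyzable=60, attrition_percent=25))  # 80
```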