Chapter 5

Conducting Clinical Research

In This Chapter

arrow Planning and carrying out a clinical research study

arrow Protecting the subjects

arrow Collecting, validating, and analyzing research data

This chapter and the next one provide a closer look at a special kind of biological research — the clinical trial. This chapter describes some aspects of conducting clinical research; Chapter 6 gives you the “big picture” of pharmaceutical drug trials — an example of a high-profile, high-stakes, highly regulated research endeavor. Although you may never be involved in something as massive as a drug trial, the principles are just as relevant, even if you’re only trying to show whether drinking a fruit smoothie every day gives you more energy.

Designing a Clinical Study

Clinical studies should conform to the highest standards of scientific rigor, and that starts with the design of the study. The following sections note some aspects of good experimental design you should keep in mind at the start of any research project.

Identifying aims, objectives, hypotheses, and variables

The aims or goals of a study are short general statements (often just one statement) of the overall purpose of the trial. For example, the aim of a study may be “to assess the safety and efficacy of drug XYZ in patients with moderate hyperlipidemia.”

The objectives are much more specific than the aims. Objectives usually refer to the effect of the product on specific safety and efficacy variables, at specific points in time, in specific groups of subjects. An efficacy study may have many individual efficacy objectives, as well as one or two safety objectives; a safety study may or may not have efficacy objectives.

remember.eps You should identify one or two primary objectives — those that are most directly related to the aim of the study and determine whether the product passes or fails in the study. You may then identify up to several dozen secondary objectives, which may involve different variables or the same variables at different time points or in different subsets of the study population. You may also list a set of exploratory objectives, which are less important, but still interesting. Finally, you list one or more safety objectives (if this is an efficacy study) or some efficacy objectives (if this is a safety study).

A typical set of primary, secondary, exploratory, and safety objectives (this example shows one of each type) for an efficacy study might look like this:

check.png Primary efficacy objective: To compare the effect of drug XYZ, relative to placebo, on changes in serum total cholesterol from baseline to week 12, in patients with moderate hyperlipidemia.

check.png Secondary efficacy objective: To compare the effect of drug XYZ, relative to placebo, on changes in serum total cholesterol and serum triglycerides from baseline to weeks 4 and 8, in patients with moderate hyperlipidemia.

check.png Exploratory efficacy objective: To compare the effect of drug XYZ, relative to placebo, on changes in serum lipids from baseline to weeks 4, 8, and 12, in male and female subsets of patients with moderate hyperlipidemia.

check.png Safety objective: To evaluate the safety of drug XYZ, relative to placebo, in terms of the occurrence of adverse events, changes from baseline in vital signs (blood pressure and heart rate), and safety laboratory results (chemistry, hematology, and so on), in patients with moderate hyperlipidemia.

Hypotheses usually correspond to the objectives but are worded in a way that directly relates to the statistical testing to be performed. So the preceding primary objective may correspond to the following hypothesis: “The mean 12-week reduction in total cholesterol will be greater in the XYZ group than in the placebo group.” Alternatively, the hypothesis may be expressed in a more formal mathematical notation and as a null and alternate pair (see Chapters 2 and 3 for details on these terms and the mathematical notation used):

HNull: ΔXYZ – ΔPlacebo = 0

HAlt: ΔXYZ – ΔPlacebo > 0

where Δ = mean of (TCholBaseline – TCholWeek 12), so that a positive Δ represents a reduction in total cholesterol.

remember.eps Identifying the variables to collect in your study should be straightforward after you’ve enumerated all the objectives. Generally, you should plan on collecting some or all of the following kinds of data:

check.png Basic demographic information (such as date of birth, gender, race, and ethnicity)

check.png Information about the subject’s participation in the study (for instance, date of enrollment, whether the subject met each inclusion and exclusion criterion, date of each visit, measures of compliance, and final status, such as completed, withdrew, or lost to follow-up)

check.png Basic baseline measurements (height, weight, vital signs, safety laboratory tests, and so forth)

check.png Subject and family medical history, including diseases, hospitalizations, smoking and other substance use, and current and past medications

check.png Laboratory and other test results (ECGs, X-rays, and so forth) related to the study’s objectives

check.png Responses from questionnaires and other subjective assessments

check.png Occurrence of adverse events

Some of this information needs to be recorded only once (like birthdate, gender, and family history); other information (such as vital signs, dosing, and test results) may be acquired at scheduled or unscheduled visits, and some may be recorded only at unpredictable times, if at all (like adverse events).

tip.eps For very simple studies, you may be able to record all your data on a single (albeit large) sheet of ruled paper, with a row for each subject and a column for each variable. But in formal clinical studies, you need to design a Case Report Form (CRF). A CRF is often a booklet or binder with one page for the one-time data and a set of identical pages for each kind of recurring data. Many excellent CRF templates can be downloaded from the web, for example from globalhealthtrials.tghn.org/articles/downloadable-templates-and-tools-clinical-research/ (or just enter “CRF templates” in your browser). See the later section Collecting and validating data for more information on CRFs.

Deciding who will be in the study

Because you can’t examine the entire population of people with the condition you’re studying, you must select a representative sample from that population (see Chapter 3 for an introduction to populations and samples). You do this by explicitly defining the conditions that determine whether or not a subject is suitable to be in the study.

check.png Inclusion criteria are used during the screening process to identify potential subjects and usually involve subject characteristics that define the population you want to draw conclusions about. A reasonable inclusion criterion for a study of a lipid-lowering treatment would be, “Subject must have a documented diagnosis of hyperlipidemia, defined as Total Cholesterol > 200 mg/dL and LDL > 130 mg/dL at screening.”

check.png Exclusion criteria are used to identify subjects for whom participation would be unsafe or those whose participation would compromise the scientific integrity of the study (due to preexisting conditions, an inability to understand instructions, and so on). The following usually appears in the list of exclusion criteria: “The subject is, in the judgment of the investigator, unlikely to be able to understand and comply with the treatment regimen prescribed by the protocol.”

check.png Withdrawal criteria describe situations that could arise during the study that would prevent the subject’s further participation for safety or other reasons (such as an intolerable adverse reaction or a serious noncompliance). A typical withdrawal criterion may be “The subject has missed two consecutive scheduled clinic visits.”

Choosing the structure of the study

Most clinical trials involving two or more test products have one of the following structures (or designs), each of which has both pros and cons:

check.png Parallel: Each subject receives one of the products. Parallel designs are simpler, quicker, and easier for each subject, but you need more subjects. Trials with very long treatment periods usually have to be parallel. The statistical analysis of parallel trials is generally simpler than for crossover trials (see the next bullet).

check.png Crossover: Each subject receives all the products in sequence during consecutive treatment periods (called phases) separated by washout intervals (lasting from several days to several weeks). Crossover designs can be more efficient, because each subject serves as his own control, eliminating subject-to-subject variability. But you can use crossover designs only if you’re certain that at the end of each washout period the subject will have been restored to the same condition as at the start of the study; this may be impossible for studies of progressive diseases, like cancer or emphysema.

Using randomization

Randomized controlled trials (RCTs) are the gold standard for clinical research. In an RCT, the subjects are randomly allocated into treatment groups (in a parallel trial) or into treatment-sequence groups (in a crossover design). Randomization provides several advantages:

check.png It tends to eliminate selection bias — preferentially giving certain treatments to certain subjects (assigning a placebo to the less “likeable” subjects) — and confounding, where the treatment groups differ with respect to some characteristic that influences the outcome.

check.png It permits the application of statistical methods to the analysis of the data.

check.png It facilitates blinding. Blinding (also called masking) refers to concealing the identity of the treatment from subjects and researchers, and can be one of two types:

Single-blinding: The subjects don’t know what treatment they’re receiving, but the investigators do.

Double-blinding: Neither the subjects nor the investigators know which subjects are receiving which treatments.

Blinding eliminates bias resulting from the placebo effect, whereby subjects often respond favorably to any treatment (even a placebo), especially when the efficacy variables are subjective, such as pain level. Double-blinding also eliminates deliberate and subconscious bias in the investigator’s evaluation of a subject’s condition.

The simplest kind of randomization involves assigning each newly enrolled subject to a treatment group by the flip of a coin or a similar method. But simple randomization may produce an unbalanced pattern, like the one shown in Figure 5-1 for a small study of 12 subjects and two treatments: Drug (D) and Placebo (P).


Illustration by Wiley, Composition Services Graphics

Figure 5-1: Simple randomization.

If you were hoping to have six subjects in each group, you won’t like having only three subjects receiving the drug and nine receiving the placebo, but unbalanced patterns like this arise quite often from 12 coin flips. (Try it if you don’t believe me.)
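In fact, you can simulate it. This minimal Python sketch (the seed and trial count are arbitrary choices of mine) flips 12 fair coins 10,000 times and counts how often the split comes out exactly 6/6:

```python
import random

random.seed(1)  # arbitrary seed, for a reproducible run

trials = 10_000
balanced = 0
for _ in range(trials):
    flips = [random.choice("DP") for _ in range(12)]  # 12 coin-flip assignments
    if flips.count("D") == 6:
        balanced += 1

# The exact probability of a 6/6 split is C(12,6)/2^12, about 22.6 percent
print(f"Balanced 6 D / 6 P split in {balanced / trials:.1%} of simulated studies")
```

Only about one simulated study in five lands exactly balanced; the rest show some degree of imbalance like the pattern in Figure 5-1.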

A better approach is to require six subjects in each group, but to shuffle those six Ds and six Ps around randomly, as shown in Figure 5-2:


Figure 5-2: Random shuffling.

This arrangement is better (there are exactly six drug and six placebo subjects), but this particular random shuffle happens to assign more drugs to the earlier subjects and more placebos to the later subjects (again, bad luck of the draw). If these 12 subjects were enrolled over a period of five or six months, seasonal effects might be mistaken for treatment effects (an example of confounding).

To make sure that both treatments are evenly spread across the entire recruitment period, you can use blocked randomization, in which you divide your subjects into consecutive blocks and shuffle the assignments within each block. Often the block size is set to twice the number of treatment groups (for instance, a two-group study would use a block size of four), as shown in Figure 5-3.

tip.eps You can create simple and blocked randomization lists in Excel using the RAND built-in function to shuffle the assignments. You can also use the web page at graphpad.com/quickcalcs/randomize1.cfm to generate blocked randomization lists quickly and easily.


Figure 5-3: Blocked randomization.
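If you’d rather script the list than build it in Excel, blocked randomization takes only a few lines. Here’s a minimal Python sketch (the function name is mine, not a standard one; it assumes two treatments and a block size of four, as in Figure 5-3):

```python
import random

def blocked_randomization(n_subjects, treatments=("D", "P"), block_multiple=2):
    """Build a treatment list in which every consecutive block of
    len(treatments) * block_multiple subjects is exactly balanced."""
    assignments = []
    while len(assignments) < n_subjects:
        block = list(treatments) * block_multiple  # e.g., ['D', 'P', 'D', 'P']
        random.shuffle(block)                      # shuffle within this block only
        assignments.extend(block)
    return assignments[:n_subjects]

random.seed(5)  # arbitrary seed, for a reproducible list
print(blocked_randomization(12))
```

Every block of four consecutive subjects gets exactly two Ds and two Ps, so the treatments stay evenly spread no matter when enrollment stops.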

Selecting the analyses to use

You should select the appropriate method for each of your study hypotheses based on the kind of data involved, the structure of the study, and the nature of the hypothesis. The rest of this book describes statistical methods to analyze the kinds of data you’re likely to encounter in clinical research. Changes in variables over time and differences between treatments in crossover studies are often analyzed by paired t tests and repeated-measures ANOVAs, and differences between groups of subjects in parallel studies are often analyzed by unpaired t tests and ANOVAs (see Chapter 12 for more on t tests and ANOVAs). Differences in the percentage of subjects responding to treatment or experiencing events are often compared with chi-square or Fisher Exact tests (see Chapters 13 and 14 for the scoop on these tests). The associations between two or more variables are usually analyzed by regression methods (get the lowdown on regression in Part IV). Survival times (and the times to the occurrence of other endpoint events) are analyzed by survival methods (turn to Part V for the specifics of survival analysis).

Defining analytical populations

remember.eps Analytical populations are precisely defined subsets of the enrolled subjects that are used for different kinds of statistical analysis. Most clinical trials include the following types of analytical populations:

check.png The safety population: This group usually consists of all subjects who received at least one dose of any study product (even a placebo) and had at least one subsequent safety-related visit or observation. All safety-related tabulations and analyses are done on the safety population.

check.png The intent-to-treat (ITT) population: This population usually consists of all subjects who received any study product. The ITT population is useful for assessing effectiveness — how well the product performs in the real world, where people don’t always take the product as recommended (because of laziness, inconvenience, unpleasant side effects, and so on).

check.png The per-protocol (PP) population: This group is usually defined as all subjects who complied with the rules of the study — those people who took the product as prescribed, made all test visits, and didn’t have any serious protocol violations. The PP population is useful for assessing efficacy — how well the product works in an ideal world where everyone takes it as prescribed.

Other special populations may be defined for special kinds of analysis. For example, if the study involves taking a special set of blood samples for pharmacokinetic (PK) calculations, the protocol usually defines a PK population consisting of all subjects who provided suitable PK samples.

Determining how many subjects to enroll

You should enroll enough subjects to provide sufficient statistical power (see Chapter 3) when testing the primary objective of the study. The specific way you calculate the required sample size depends on the statistical test that's used for the primary hypothesis. Each chapter of this book that describes hypothesis tests shows how to estimate the required sample size for that test. Also, you can use the formulas, tables, and charts in Chapter 26 and in the Cheat Sheet (at www.dummies.com/cheatsheet/biostatistics) to get quick sample-size estimates.

remember.eps You must also allow for some of the enrolled subjects dropping out or being unsuitable for analysis. If, for example, you need 64 analyzable subjects for sufficient power and you expect 15 percent attrition from the study (in other words, you expect only 85 percent of the enrolled subjects to have analyzable data), you need to enroll 64/0.85 ≈ 75.3 subjects, which you round up to 76.
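The arithmetic is easy to get wrong in the rounding (always round up, never down). A minimal Python sketch, with a function name of my own choosing:

```python
import math

def enrollment_needed(analyzable_n, attrition_rate):
    """Inflate the required analyzable sample size to allow for dropouts,
    rounding up because you can't enroll a fraction of a subject."""
    return math.ceil(analyzable_n / (1 - attrition_rate))

print(enrollment_needed(64, 0.15))  # 64 / 0.85 = 75.3, rounded up to 76
```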

Putting together the protocol

A protocol is a document that lays out exactly what you plan to do in a clinical study. Ideally, every study involving human subjects should have a protocol. The following sections list standard components and administrative information found in a protocol.

Standard elements

A formal drug trial protocol usually contains most of the following components:

check.png Title: A title conveys as much information about the trial as you can fit into one sentence, including the protocol ID, study name (if it has one), clinical phase, type and structure of trial, type of randomization and blinding, name of the product, treatment regimen, intended effect, and the population being studied (what medical condition, in what group of people). A title can be quite long — this one has all the preceding elements:

Protocol BCAM521-13-01 (ASPIRE-2) — a Phase-IIa, double-blind, placebo-controlled, randomized, parallel-group study of the safety and efficacy of three different doses of AM521, given intravenously, once per month for six months, for the relief of chronic pain, in adults with knee osteoarthritis.

check.png Background information: This section includes info about the disease (such as its prevalence and impact), known physiology (at the molecular level, if known), treatments currently available (if any), and information about this drug (its mechanism of action, the results of prior testing, and known and potential risks and benefits to subjects).

check.png Rationale: The rationale for the study states why it makes sense to do this study at this time, including a justification for the choice of doses, how the drug is administered (such as orally or intravenously), and the duration of therapy and follow-up.

check.png Aims, objectives, and hypotheses: I discuss these items in the earlier section Identifying aims, objectives, hypotheses, and variables.

check.png Detailed descriptions of all inclusion, exclusion, and withdrawal criteria: See the earlier section Deciding who will be in the study for more about these terms.

check.png Design of study: The study’s design defines its structure (check out the earlier section Choosing the structure of the study), the number of treatment groups, and the consecutive stages (screening, washout, treatment, follow-up, and so on). This section often includes a schematic diagram of the structure of the study.

check.png Product description: This description details each product that will be administered to the subjects, including the chemical composition (with the results of chemical analysis of the product, if available) and how to store, prepare, and administer the product.

check.png Blinding and randomization schemes: These schemes include descriptions of how and when the study will be unblinded (including the emergency unblinding of individual subjects, if necessary); see the earlier section Using randomization.

check.png Procedural descriptions: This section describes every procedure that will be performed at every visit, including administrative procedures (such as enrollment and informed consent) and diagnostic procedures (for example, physical exams and vital signs).

check.png Safety considerations: These factors include the known and potential side effects of the product and each test procedure (such as X-rays, MRI scans, and blood draws), including steps taken to minimize the risk to the subjects.

check.png Handling of adverse events: This section describes how adverse events will be recorded — description, severity, dates and times of onset and resolution, any medical treatment given for the event, and whether or not the investigator thinks the event was related to the study product. Reporting adverse events has become quite standardized over the years, so this section tends to be very similar for all studies.

check.png Definition of safety, efficacy, and other analytical populations: This section includes definitions of safety and efficacy variables and endpoints (variables or changes in variables that serve as indicators of safety or efficacy). See the earlier section Defining analytical populations.

check.png Planned enrollment and analyzable sample size: Justification for these numbers must also be provided.

check.png Proposed statistical analyses: Some protocols describe, in detail, every analysis for every objective; others have only a summary and refer to a separate Statistical Analysis Plan (SAP) document for details of the proposed analysis. This section should also include descriptions of the treatment of missing data, adjustments for multiple testing to control Type I errors (see Chapter 3), and whether any interim analyses are planned. If a separate SAP is used, it will usually contain a detailed description of all the calculations and analyses that will be carried out on the data, including the descriptive summaries of all data and the testing of all the hypotheses specified in the protocol. The SAP also usually contains mock-ups, or “shells” of all the tables, listings, and figures (referred to as TLFs) that will be generated from the data.

Administrative details

A protocol also has sections with more administrative information:

check.png Names of and contact info for the sponsor, medical expert, and primary investigator, plus the physicians, labs, and other major medical or technical groups involved

check.png A table of contents, similar to the kind you find in many books (including this one)

check.png A synopsis, which is a short (usually around two pages) summary of the main components of the protocol

check.png A list of abbreviations and terms appearing in the protocol

check.png A description of your policies for data handling, record-keeping, quality control, ethical considerations, access to source documents, and publication of results

check.png Financing and insurance agreements

check.png Descriptions of all amendments made to the original protocol

Carrying Out a Clinical Study

After you’ve designed your study and have described it in the protocol document, it’s time to set things in motion. The operational details will, of course, vary from one study to another, but a few aspects apply to all clinical studies. In any study involving human subjects, the most important consideration is protecting those subjects from harm, and an elaborate set of safeguards has evolved over the past century. And in any scientific investigation, the accurate collection of data is crucial to the success of the research.

Protecting your subjects

remember.eps In any research involving human subjects, two issues are of utmost importance:

check.png Safety: Minimizing the risk of physical harm to the subjects from the product being tested and from the procedures involved in the study

check.png Privacy/confidentiality: Ensuring that data collected during the study is not made public in a way that identifies a specific subject without the subject’s consent

The following sections describe some of the “infrastructure” that helps protect human subjects.

Surveying regulatory agencies

In the United States, several government organizations oversee human subjects’ protection:

check.png Commercial pharmaceutical research is governed by the Food and Drug Administration (FDA).

check.png Most academic biological research is sponsored by the National Institutes of Health (NIH) and is governed by the Office for Human Research Protections (OHRP).

Chapter 6 describes the ways investigators interact with these agencies during the course of clinical research.

technicalstuff.eps Other countries have similar agencies. There’s also an organization — the International Conference on Harmonization (ICH) — that works to establish a set of consistent standards that can be applied worldwide. The FDA and NIH have adopted many ICH standards (with some modifications).

Working with Institutional Review Boards

For all but the very simplest research involving human subjects, you need the approval of an IRB — an Institutional (or Independent) Review Board — before enrolling any subjects into your study. You have to submit an application along with the protocol and an ICF (see the next section) to an IRB with jurisdiction over your research.

Most medical centers and academic institutions — and some pharmaceutical companies — have their own IRBs with jurisdiction over research conducted at their institution. If you’re not affiliated with one of these centers or institutions (for example, if you’re a physician in private practice), you may need the services of a “free-standing” IRB. The sponsor of the research may suggest (or dictate) an IRB for the project.

Getting informed consent

An important part of protecting human subjects is making sure that they’re aware of the risks of a study before agreeing to participate in it. You must prepare an Informed Consent Form (ICF) describing, in simple language, the nature of the study, why it is being conducted, what is being tested, what procedures subjects will undergo, and what the risks and benefits are. Subjects must be told that they can refuse to participate and can withdraw at any time for any reason, without fear of retribution or the withholding of regular medical care. The IRB can usually provide ICF templates with examples of their recommended or required wording.

remember.eps Prior to performing any procedures on a potential subject (including screening tests), you must give the ICF document to the subject and give her time to read it and decide whether she wants to participate. The subject’s agreement must be signed and witnessed. The signed ICFs must be retained as part of the official documentation for the project, along with laboratory reports, ECG tracings, and records of all test products administered to the subjects and procedures performed on them. The sponsor, the regulatory agencies, the IRB, and other entities may call for these documents at any time.

Considering data safety monitoring boards and committees

For clinical trials of products that are likely to be of low risk, investigators are usually responsible for being on the lookout for signs of trouble (unexpected adverse events, abnormal laboratory tests, and so forth) during the course of the study. But for studies involving high-risk treatments (like cancer chemotherapy trials), a separate data safety monitoring board or committee (DSMB or DSMC) may be set up. A DSMB may be required by the sponsor, the investigator, the IRB, or a regulatory agency. A DSMB typically has about six members (usually expert clinicians in the relevant area of research and a statistician) who meet at regular intervals to review the safety data acquired up to that point. The committee is authorized to modify, suspend, or even terminate a study if it has serious concerns about the safety of the subjects.

Getting certified in human subjects protection and good clinical practice

As you've probably surmised from the preceding sections, clinical research is fraught with regulatory requirements (with severe penalties for noncompliance), and you shouldn't try to "wing it" and hope that everything goes well. You should ensure that you, along with any others who may be assisting you, are properly trained in matters relating to human subjects protection. Fortunately, such training is readily available. Most hospitals and medical centers provide yearly training (often as a half-day session), after which you receive a certification in human subjects protection. Most IRBs and funding agencies require proof of certification from all people who are involved in the research. If you don't have access to that training at your institution, you can get certified by taking an online tutorial offered by the NIH (grants.nih.gov/grants/policy/hs/training.htm).

You should also have one or more of the people who will be involved in the research take a course in “good clinical practice” (GCP). GCP certification is also available online (enter “GCP certification” in your favorite browser).

Collecting and validating data

If the case report form (CRF) has been carefully and logically designed, entering each subject’s data in the right place on the CRF should be straightforward. Then you need to get this data into a computer for analysis. You can enter your data directly into the statistics software you plan to use for the majority of the analysis (see Chapter 4 for some software options), or you can enter it into a general database program such as MS Access or a spreadsheet program like Excel. The structure of a computerized database usually reflects the structure of the CRF. If a study is simple enough that a single data sheet can hold all the data, then a single data file (called a table) or a single Excel worksheet will suffice. But for most studies, a more complicated database is required, consisting of a set of tables or Excel worksheets (one for each kind of data collection sheet in the CRF). If the design of the database is consistent with the structure of the CRF, entering the data from each CRF sheet into the corresponding data table shouldn’t be difficult.

remember.eps You must retain all the original source documents (lab reports, the examining physician’s notes, original questionnaire sheets, and so forth) in case questions about data accuracy arise later.

tip.eps Before you can analyze your data (see the next section), you must do one more crucially important task — check your data thoroughly for errors! And there will be errors — they can arise from transcribing data from the source documents onto the CRF or from entering the data from the CRFs into the computer. Consider some of the following error-checking techniques:

check.png Have one person read data from the source documents or CRFs while another looks at the data that’s in the computer. Ideally, this is done with all data for all subjects.

check.png Have the computer display the smallest and largest values of each variable. Better yet, have the computer display a sorted list of the values for each variable. Typing errors often produce very large or very small values.

check.png A more extreme approach, but one that’s sometimes done for crucially important studies, is to have two people enter all the data into separate copies of the database; then have the computer automatically compare every single data item between the two databases.
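The second technique — scanning each variable’s extremes — is easy to automate. Here’s a minimal Python sketch (the variable names and plausibility limits are illustrative, not standard values):

```python
def range_check(name, values, low, high):
    """Report a variable's extremes and flag any values outside a plausible range."""
    suspects = [v for v in sorted(values) if not low <= v <= high]
    print(f"{name}: min = {min(values)}, max = {max(values)}, "
          f"suspect values = {suspects or 'none'}")
    return suspects

# Hypothetical systolic blood pressures; 1200 is almost surely a typo for 120
systolic_bp = [118, 122, 135, 1200, 110, 128]
range_check("Systolic BP (mmHg)", systolic_bp, 70, 250)
```

Run a check like this on every numeric variable before analysis; typing errors usually announce themselves at the extremes.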

Chapter 7 has more details on describing, entering, and checking different types of data.

Analyzing Your Data

The remainder of this book explains the methods commonly used in biostatistics to summarize, graph, and analyze data. In the following sections, I describe some general situations that come up in all clinical research, regardless of what kind of analysis you use.

Dealing with missing data

Most clinical trials have incomplete data for one or more variables, which can be a real headache when analyzing your data. The statistical aspects of missing data are quite complicated, so you should consult a statistician if you have more than just occasional, isolated missing values. Here I describe some commonly used approaches to coping with missing data:

check.png Exclude a case from an analysis if any of the required variables for that analysis is missing. This approach can reduce the number of analyzable cases, sometimes quite severely (especially in multiple regression, where the whole case must be thrown out, even if only one of the variables in the regression is missing; see Chapter 19 for more information). And if the result is missing for a reason that’s related to treatment efficacy, excluding the case can bias your results.

check.png warning_bomb.eps Replace (impute) a missing value with the mean (or median) of all the available values for that variable. This approach is quite common, but it introduces several types of bias into your results, so it’s not a good technique to use.

check.png If one of a series of sequential measurements on a subject is missing (like the third of a series of weekly glucose values), use the previous value in the series. This technique is called Last Observation Carried Forward (LOCF) and is one of the most widely used strategies. LOCF usually produces “conservative” results, making it more difficult to prove efficacy. This approach is popular with regulators, who want to put the burden of proof on the drug.

warning_bomb.eps More complicated methods can also be used, such as estimating the missing value of a variable based on the relationship between that variable and other variables in the data set, or using an analytical method like mixed-model repeated measures (MMRM) analysis, which uses all available data and doesn’t reject a case just because one variable is missing. But these methods are far beyond the scope of this book, and you shouldn’t try them yourself.
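Of the simple approaches above, LOCF is mechanical enough to show in a few lines of code. This sketch uses hypothetical weekly glucose values, with `None` marking the missed visit:

```python
# Last Observation Carried Forward (LOCF): replace each missing value
# with the most recent observed value for that subject.
# The weekly glucose series below is a hypothetical example.

def locf(series):
    """Fill missing values (None) with the last observed value, if any."""
    filled, last = [], None
    for value in series:
        if value is None:
            value = last          # carry the previous observation forward
        filled.append(value)
        last = value
    return filled

weekly_glucose = [110, 105, None, 98]     # third weekly reading was missed
print(locf(weekly_glucose))               # [110, 105, 105, 98]
```

Note that a value missing at the very first visit has no prior observation to carry forward, so it stays missing; LOCF can't help there.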

Handling multiplicity

Every time you perform a statistical significance test, you run a chance of being fooled by random fluctuations into thinking that some real effect is present in your data when, in fact, none exists. This scenario is called a Type I error (see Chapter 3). When you say that you require p < 0.05 for significance, you’re testing at the 0.05 (or 5 percent) alpha level (see Chapter 3) or saying that you want to limit your Type I error rate to 5 percent. But that 5 percent error rate applies to each and every statistical test you run. The more analyses you perform on a data set, the more your overall alpha level increases: Perform two tests and your chance of at least one of them coming out falsely significant is about 10 percent; run 40 tests, and the overall alpha level jumps to 87 percent. This is referred to as the problem of multiplicity, or as Type I error inflation.
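The inflated error rates quoted above come from a simple formula: with n independent tests each run at alpha, the chance of at least one false positive is 1 − (1 − alpha)^n. A quick calculation confirms the numbers:

```python
# Familywise Type I error inflation across n independent tests,
# each run at the same per-test alpha level.

def overall_alpha(alpha, n_tests):
    """Chance of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests

print(overall_alpha(0.05, 2))    # about 0.0975, roughly 10 percent
print(overall_alpha(0.05, 40))   # about 0.87, roughly 87 percent
```

Real tests on the same data set are rarely perfectly independent, so this formula is an approximation, but it captures how quickly the overall alpha grows.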

Some statistical methods involving multiple comparisons (like post-hoc tests following an ANOVA for comparing several groups, as described in Chapter 12) incorporate a built-in adjustment to keep the overall alpha at only 5 percent across all comparisons. But when you’re testing different hypotheses, like comparing different variables at different time points between different groups, it’s up to you to decide what kind of alpha control strategy (if any) you want to implement. You have several choices, including the following:

check.png Don’t control for multiplicity and accept the likelihood that some of your “significant” findings will be falsely significant. This strategy is often used with hypotheses related to secondary and exploratory objectives; the protocol usually states that no final inferences will be made from these exploratory tests. Any “significant” results will be considered only “signals” of possible real effects and will have to be confirmed in subsequent studies before any final conclusions are drawn.

check.png Control the alpha level across only the most important hypotheses. If you have two co-primary objectives, you can control alpha across the tests of those two objectives.

You can control alpha to 5 percent (or to any level you want) across a set of n hypothesis tests in several ways; following are some popular ones:

The Bonferroni adjustment: Test each hypothesis at the 0.05/n alpha level. So to control overall alpha to 0.05 across two primary endpoints, you need p < 0.025 for significance when testing each one.

A hierarchical testing strategy: Rank your endpoints in descending order of importance. Test the most important one first, and if it gives p < 0.05, conclude that the effect is real. Then test the next most important one, again using p < 0.05 for significance. Continue until you get a nonsignificant result (p ≥ 0.05); then stop testing (or consider all further tests to be only exploratory and don’t draw any formal conclusions about them).

Controlling the false discovery rate (FDR): This approach has become popular in recent years to deal with large-scale multiplicity, which arises in areas like genomic testing and digital image analysis that may involve many thousands of tests (such as one per gene or one per pixel) instead of just a few. Instead of trying to avoid even a single false conclusion of significance (as the Bonferroni and other classic alpha control methods do), you simply want to control the proportion of tests that come out falsely positive, limiting that false discovery rate to some reasonable fraction of all the tests. These positive results can then be tested in a follow-up study.
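All three strategies are easy to express in code. The following sketch applies each one to hypothetical lists of p values; none of the numbers come from a real trial:

```python
# Sketches of three alpha-control strategies: Bonferroni, hierarchical
# (fixed-sequence) testing, and Benjamini-Hochberg FDR control.
# All p values used here are hypothetical.

def bonferroni_significant(p_values, alpha=0.05):
    """Each of the n tests must reach the alpha/n level."""
    cutoff = alpha / len(p_values)
    return [p < cutoff for p in p_values]

def hierarchical_significant(ordered_p, alpha=0.05):
    """Test endpoints in order of importance; stop formal testing at
    the first nonsignificant result."""
    results, still_testing = [], True
    for p in ordered_p:
        if still_testing and p < alpha:
            results.append(True)
        else:
            results.append(False)
            still_testing = False      # later endpoints are only exploratory
    return results

def benjamini_hochberg(p_values, fdr=0.05):
    """Reject the k smallest p values, where k is the largest rank
    with p_(k) <= (k/n) * fdr."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / n * fdr:
            k_max = rank
    rejected = [False] * n
    for i in order[:k_max]:
        rejected[i] = True
    return rejected

# Two co-primary endpoints under Bonferroni: each needs p < 0.025.
print(bonferroni_significant([0.02, 0.03]))    # [True, False]
```

With the Bonferroni example, the first endpoint (p = 0.02) clears the 0.025 cutoff but the second (p = 0.03) doesn't, even though both would have passed an unadjusted 0.05 test.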

Incorporating interim analyses

An interim analysis is one that’s carried out before the conclusion of a clinical trial, using only the data that has been obtained so far. Interim analyses can be blinded or unblinded and can be done for several reasons:

check.png An IRB may require an early look at the data to ensure that subjects aren’t being exposed to an unacceptable level of risk.

check.png You may want to examine data halfway through the trial to see whether the trial can be stopped early for one of the following reasons:

• The product is so effective that going to completion isn’t necessary to prove significance.

• The product is so ineffective that continuing the trial is futile.

check.png You may want to check some of the assumptions that went into the original design and sample-size calculations of the trial (like within-group variability, recruitment rates, base event rates, and so on) to see whether the total sample size should be adjusted upward or downward.

If the interim analysis could possibly lead to early stopping of the trial for proven efficacy, then the issue of multiplicity comes into play, and special methods must be used to control alpha across the interim and final analyses. These methods often involve some kind of alpha spending strategy. The concepts are subtle, and the calculations can be complicated, but here’s a very simple example that illustrates the basic concept. Suppose your original plan is to test the efficacy endpoint at the end of the trial at the 5 percent alpha level. If you want to design an interim analysis into this trial, you may use this two-part strategy:

1. Spend one-fifth of the available 5 percent alpha at the interim analysis.

The interim analysis p value must be < 0.01 to stop the trial early and claim efficacy.

2. Spend the remaining four-fifths of the 5 percent alpha at the end.

The end analysis p value must be < 0.04 to claim efficacy.

This strategy preserves the 5 percent overall alpha level while still giving the drug a chance to prove itself at an early point in the trial.
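The arithmetic of this simple split is easy to check. This sketch implements only the book's illustrative one-fifth/four-fifths split, not a formal alpha-spending function like O'Brien-Fleming or Pocock:

```python
# The simple alpha-spending split described above: spend one-fifth of a
# 0.05 total alpha at the interim look and the remainder at the end.
# (Formal alpha-spending methods are more sophisticated than this.)

def split_alpha(total_alpha=0.05, interim_fraction=1 / 5):
    """Return the (interim, final) significance cutoffs."""
    interim = total_alpha * interim_fraction
    final = total_alpha - interim
    return interim, final

interim_cutoff, final_cutoff = split_alpha()
# p < interim_cutoff (0.01) stops the trial early and claims efficacy;
# otherwise p < final_cutoff (0.04) is required at the end of the trial.
```

Because the two cutoffs add up to 0.05, the overall chance of a false efficacy claim stays near the planned 5 percent level.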
