Biostatistics For Dummies®

Table of Contents

Introduction

About This Book

Conventions Used in This Book

What You’re Not to Read

Foolish Assumptions

How This Book Is Organized

Part I: Beginning with Biostatistics Basics

Part II: Getting Down and Dirty with Data

Part III: Comparing Groups

Part IV: Looking for Relationships with Correlation and Regression

Part V: Analyzing Survival Data

Part VI: The Part of Tens

Icons Used in This Book

Where to Go from Here

Part I: Beginning with Biostatistics Basics

Chapter 1: Biostatistics 101

Brushing Up on Math and Stats Basics

Doing Calculations with the Greatest of Ease

Concentrating on Clinical Research

Drawing Conclusions from Your Data

Statistical estimation theory

Statistical decision theory

A Matter of Life and Death: Working with Survival Data

Figuring Out How Many Subjects You Need

Getting to Know Statistical Distributions

Chapter 2: Overcoming Mathophobia: Reading and Understanding Mathematical Expressions

Breaking Down the Basics of Mathematical Formulas

Displaying formulas in different ways

Checking out the building blocks of formulas

Focusing on Operations Found in Formulas

Basic mathematical operations

Powers, roots, and logarithms

Factorials and absolute values


Simple and complicated formulas


Counting on Collections of Numbers

One-dimensional arrays

Higher-dimensional arrays

Arrays in formulas

Sums and products of the elements of an array

Chapter 3: Getting Statistical: A Short Review of Basic Statistics

Taking a Chance on Probability

Thinking of probability as a number

Following a few basic rules

Comparing odds versus probability

Some Random Thoughts about Randomness

Picking Samples from Populations

Recognizing that sampling isn’t perfect

Digging into probability distributions

Introducing Statistical Inference

Statistical estimation theory

Statistical decision theory

Homing In on Hypothesis Testing

Getting the language down

Testing for significance

Understanding the meaning of “p value” as the result of a test

Examining Type I and Type II errors

Grasping the power of a test

Going Outside the Norm with Nonparametric Statistics

Chapter 4: Counting on Statistical Software

Desk Job: Personal Computer Software

Checking out commercial software

Focusing on free software

On the Go: Calculators and Mobile Devices

Scientific and programmable calculators

Mobile devices

Gone Surfin’: Web-Based Software

On Paper: Printed Calculators

Chapter 5: Conducting Clinical Research

Designing a Clinical Study

Identifying aims, objectives, hypotheses, and variables

Deciding who will be in the study

Choosing the structure of the study

Using randomization

Selecting the analyses to use

Defining analytical populations

Determining how many subjects to enroll

Putting together the protocol

Carrying Out a Clinical Study

Protecting your subjects

Collecting and validating data

Analyzing Your Data

Dealing with missing data

Handling multiplicity

Incorporating interim analyses

Chapter 6: Looking at Clinical Trials and Drug Development

Not Ready for Human Consumption: Doing Preclinical Studies

Testing on People during Clinical Trials to Check a Drug’s Safety and Efficacy

Phase I: Determining the maximum tolerated dose

Phase II: Finding out about the drug’s performance

Phase III: Proving that the drug works

Phase IV: Keeping an eye on the marketed drug

Holding Other Kinds of Clinical Trials

Pharmacokinetics and pharmacodynamics (PK/PD studies)

Bioequivalence studies

Thorough QT studies

Part II: Getting Down and Dirty with Data

Chapter 7: Getting Your Data into the Computer

Looking at Levels of Measurement

Classifying and Recording Different Kinds of Data

Dealing with free-text data

Assigning subject identification (ID) numbers

Organizing name and address data

Collecting categorical data

Recording numerical data

Entering date and time data

Checking Your Entered Data for Errors

Creating a File that Describes Your Data File

Chapter 8: Summarizing and Graphing Your Data

Summarizing and Graphing Categorical Data

Summarizing Numerical Data

Locating the center of your data

Describing the spread of your data

Showing the symmetry and shape of the distribution

Structuring Numerical Summaries into Descriptive Tables

Graphing Numerical Data

Showing the distribution with histograms

Summarizing grouped data with bars, boxes, and whiskers

Depicting the relationships between numerical variables with other graphs

Chapter 9: Aiming for Accuracy and Precision

Beginning with the Basics of Accuracy and Precision

Getting to know sample statistics and population parameters

Understanding accuracy and precision in terms of the sampling distribution

Thinking of measurement as a kind of sampling

Expressing errors in terms of accuracy and precision

Improving Accuracy and Precision

Enhancing sampling accuracy

Getting more accurate measurements

Improving sampling precision

Increasing the precision of your measurements

Calculating Standard Errors for Different Sample Statistics

A mean

A proportion

Event counts and rates

A regression coefficient

Chapter 10: Having Confidence in Your Results

Feeling Confident about Confidence Interval Basics

Defining confidence intervals

Looking at confidence levels

Taking sides with confidence intervals

Calculating Confidence Intervals

Before you begin: Formulas for confidence limits in large samples

The confidence interval around a mean

The confidence interval around a proportion

The confidence interval around an event count or rate

The confidence interval around a regression coefficient

Relating Confidence Intervals and Significance Testing

Chapter 11: Fuzzy In Equals Fuzzy Out: Pushing Imprecision through a Formula

Understanding the Concept of Error Propagation

Using Simple Error Propagation Formulas for Simple Expressions

Adding or subtracting a constant doesn’t change the SE

Multiplying (or dividing) by a constant multiplies (or divides) the SE by the same amount

For sums and differences: Add the squares of SEs together

For averages: The square root law takes over

For products and ratios: Squares of relative SEs are added together

For powers and roots: Multiply the relative SE by the power

Handling More Complicated Expressions

Using the simple rules consecutively

Checking out an online calculator

Simulating error propagation — easy, accurate, and versatile

Part III: Comparing Groups

Chapter 12: Comparing Average Values between Groups

Knowing That Different Situations Need Different Tests

Comparing the mean of a group of numbers to a hypothesized value

Comparing two groups of numbers

Comparing three or more groups of numbers

Analyzing data grouped on several different variables

Adjusting for a “nuisance variable” when comparing numbers

Comparing sets of matched numbers

Comparing within-group changes between groups

Trying the Tests Used for Comparing Averages

Surveying Student t tests

Assessing the ANOVA

Running Student t tests and ANOVAs from summary data

Running nonparametric tests

Estimating the Sample Size You Need for Comparing Averages

Simple formulas

Software and web pages

A sample-size nomogram

Chapter 13: Comparing Proportions and Analyzing Cross-Tabulations

Examining Two Variables with the Pearson Chi-Square Test

Understanding how the chi-square test works

Pointing out the pros and cons of the chi-square test

Modifying the chi-square test: The Yates continuity correction

Focusing on the Fisher Exact Test

Understanding how the Fisher Exact test works

Noting the pros and cons of the Fisher Exact test

Analyzing Ordinal Categorical Data with the Kendall Test

Studying Stratified Data with the Mantel-Haenszel Chi-Square Test

Chapter 14: Taking a Closer Look at Fourfold Tables

Focusing on the Fundamentals of Fourfold Tables

Choosing the Right Sampling Strategy

Producing Fourfold Tables in a Variety of Situations

Describing the association between two binary variables

Assessing risk factors

Evaluating diagnostic procedures

Investigating treatments

Looking at inter- and intra-rater reliability

Chapter 15: Analyzing Incidence and Prevalence Rates in Epidemiologic Data

Understanding Incidence and Prevalence

Prevalence: The fraction of a population with a particular condition

Incidence: Counting new cases

Understanding how incidence and prevalence are related

Analyzing Incidence Rates

Expressing the precision of an incidence rate

Comparing incidences with the rate ratio

Calculating confidence intervals for a rate ratio

Comparing two event rates

Comparing two event counts with identical exposure

Estimating the Required Sample Size

Chapter 16: Feeling Noninferior (Or Equivalent)

Understanding the Absence of an Effect

Defining the effect size: How different are the groups?

Defining an important effect size: How close is close enough?

Recognizing effects: Can you spot a difference if there really is one?

Proving Equivalence and Noninferiority

Using significance tests

Using confidence intervals

Some precautions about noninferiority testing

Part IV: Looking for Relationships with Correlation and Regression

Chapter 17: Introducing Correlation and Regression

Correlation: How Strongly Are Two Variables Associated?

Lining up the Pearson correlation coefficient

Analyzing correlation coefficients

Regression: What Equation Connects the Variables?

Understanding the purpose of regression analysis

Talking about terminology and mathematical notation

Classifying different kinds of regression

Chapter 18: Getting Straight Talk on Straight-Line Regression

Knowing When to Use Straight-Line Regression

Understanding the Basics of Straight-Line Regression

Running a Straight-Line Regression

Taking a few basic steps

Walking through an example

Interpreting the Output of Straight-Line Regression

Seeing what you told the program to do

Looking at residuals

Making your way through the regression table

Wrapping up with measures of goodness-of-fit

Scientific fortune-telling with the prediction formula

Recognizing What Can Go Wrong with Straight-Line Regression

Figuring Out the Sample Size You Need

Chapter 19: More of a Good Thing: Multiple Regression

Understanding the Basics of Multiple Regression

Defining a few important terms

Knowing when to use multiple regression

Being aware of how the calculations work

Running Multiple Regression Software

Preparing categorical variables

Recoding categorical variables as numerical

Creating scatter plots before you jump into your multiple regression

Taking a few steps with your software

Interpreting the Output of a Multiple Regression

Examining typical output from most programs

Checking out optional output available from some programs

Deciding whether your data is suitable for regression analysis

Determining how well the model fits the data

Watching Out for Special Situations that Arise in Multiple Regression

Synergy and anti-synergy

Collinearity and the mystery of the disappearing significance

Figuring How Many Subjects You Need

Chapter 20: A Yes-or-No Proposition: Logistic Regression

Using Logistic Regression

Understanding the Basics of Logistic Regression

Gathering and graphing your data

Fitting a function with an S shape to your data

Handling multiple predictors in your logistic model

Running a Logistic Regression with Software

Interpreting the Output of Logistic Regression

Seeing summary information about the variables

Assessing the adequacy of the model

Checking out the table of regression coefficients

Predicting probabilities with the fitted logistic formula

Making yes or no predictions

Heads Up: Knowing What Can Go Wrong with Logistic Regression

Don’t fit a logistic function to nonlogistic data

Watch out for collinearity and disappearing significance

Check for inadvertent reverse-coding of the outcome variable

Don’t misinterpret odds ratios for numerical predictors

Don’t misinterpret odds ratios for categorical predictors

Beware the complete separation problem

Figuring Out the Sample Size You Need for Logistic Regression

Chapter 21: Other Useful Kinds of Regression

Analyzing Counts and Rates with Poisson Regression

Introducing the generalized linear model

Running a Poisson regression

Interpreting the Poisson regression output

Discovering other things that Poisson regression can do

Anything Goes with Nonlinear Regression

Distinguishing nonlinear regression from other kinds

Checking out an example from drug research

Running a nonlinear regression

Interpreting the output

Using equivalent functions to fit the parameters you really want

Smoothing Nonparametric Data with LOWESS

Running LOWESS

Adjusting the amount of smoothing

Part V: Analyzing Survival Data

Chapter 22: Summarizing and Graphing Survival Data

Understanding the Basics of Survival Data

Knowing that survival times are intervals

Recognizing that survival times aren’t normally distributed

Considering censoring

Looking at the Life-Table Method

Making a life table

Interpreting a life table

Graphing hazard rates and survival probabilities from a life table

Digging Deeper with the Kaplan-Meier Method

Heeding a Few Guidelines for Life Tables and the Kaplan-Meier Method

Recording survival times the right way

Recording censoring information correctly

Interpreting those strange-looking survival curves

Doing Even More with Survival Data

Chapter 23: Comparing Survival Times

Comparing Survival between Two Groups with the Log-Rank Test

Understanding what the log-rank test is doing

Running the log-rank test on software

Looking at the calculations

Assessing the assumptions

Considering More Complicated Comparisons

Coming Up with the Sample Size Needed for Survival Comparisons

Chapter 24: Survival Regression

Knowing When to Use Survival Regression

Explaining the Concepts behind Survival Regression

The steps of Cox PH regression

Hazard ratios

Running a Survival Regression

Interpreting the Output of a Survival Regression

Testing the validity of the assumptions

Checking out the table of regression coefficients

Homing in on hazard ratios and their confidence intervals

Assessing goodness-of-fit and predictive ability of the model

Focusing on baseline survival and hazard functions

How Long Have I Got, Doc? Constructing Prognosis Curves

Running the proportional-hazards regression

Finding h

Estimating the Required Sample Size for a Survival Regression

Part VI: The Part of Tens

Chapter 25: Ten Distributions Worth Knowing

The Uniform Distribution

The Normal Distribution

The Log-Normal Distribution

The Binomial Distribution

The Poisson Distribution

The Exponential Distribution

The Weibull Distribution

The Student t Distribution

The Chi-Square Distribution

The Fisher F Distribution

Chapter 26: Ten Easy Ways to Estimate How Many Subjects You Need

Comparing Means between Two Groups

Comparing Means among Three, Four, or Five Groups

Comparing Paired Values

Comparing Proportions between Two Groups

Testing for a Significant Correlation

Comparing Survival between Two Groups

Scaling from 80 Percent to Some Other Power

Scaling from 0.05 to Some Other Alpha Level

Making Adjustments for Unequal Group Sizes

Allowing for Attrition

