1.1 Regression and Model Building
2.1 Simple Linear Regression Model
2.2 Least-Squares Estimation of the Parameters
2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
2.2.4 Alternate Form of the Model
2.3 Hypothesis Testing on the Slope and Intercept
2.3.2 Testing Significance of Regression
2.4 Interval Estimation in Simple Linear Regression
2.4.1 Confidence Intervals on β₀, β₁, and σ²
2.4.2 Interval Estimation of the Mean Response
2.5 Prediction of New Observations
2.6 Coefficient of Determination
2.7 A Service Industry Application of Regression
2.8 Using SAS® and R for Simple Linear Regression
2.9 Some Considerations in the Use of Regression
2.10 Regression Through the Origin
2.11 Estimation by Maximum Likelihood
2.12 Case Where the Regressor x is Random
2.12.1 x and y Jointly Distributed
2.12.2 x and y Jointly Normally Distributed: Correlation Model
3.1 Multiple Regression Models
3.2 Estimation of the Model Parameters
3.2.1 Least-Squares Estimation of the Regression Coefficients
3.2.2 Geometrical Interpretation of Least Squares
3.2.3 Properties of the Least-Squares Estimators
3.2.5 Inadequacy of Scatter Diagrams in Multiple Regression
3.2.6 Maximum-Likelihood Estimation
3.3 Hypothesis Testing in Multiple Linear Regression
3.3.1 Test for Significance of Regression
3.3.2 Tests on Individual Regression Coefficients and Subsets of Coefficients
3.3.3 Special Case of Orthogonal Columns in X
3.3.4 Testing the General Linear Hypothesis
3.4 Confidence Intervals in Multiple Regression
3.4.1 Confidence Intervals on the Regression Coefficients
3.4.2 CI Estimation of the Mean Response
3.4.3 Simultaneous Confidence Intervals on Regression Coefficients
3.5 Prediction of New Observations
3.6 A Multiple Regression Model for the Patient Satisfaction Data
3.7 Using SAS and R for Basic Multiple Linear Regression
3.8 Hidden Extrapolation in Multiple Regression
3.9 Standardized Regression Coefficients
3.11 Why Do Regression Coefficients Have the Wrong Sign?
4.2.2 Methods for Scaling Residuals
4.2.4 Partial Regression and Partial Residual Plots
4.2.5 Using Minitab®, SAS, and R for Residual Analysis
4.2.6 Other Residual Plotting and Analysis Methods
4.4 Detection and Treatment of Outliers
4.5 Lack of Fit of the Regression Model
4.5.1 Formal Test for Lack of Fit
4.5.2 Estimation of Pure Error from Near Neighbors
5. TRANSFORMATIONS AND WEIGHTING TO CORRECT MODEL INADEQUACIES
5.2 Variance-Stabilizing Transformations
5.3 Transformations to Linearize the Model
5.4 Analytical Methods for Selecting a Transformation
5.4.1 Transformations on y: The Box-Cox Method
5.4.2 Transformations on the Regressor Variables
5.5 Generalized and Weighted Least Squares
5.5.1 Generalized Least Squares
5.6 Regression Models with Random Effects
5.6.2 The General Situation for a Regression Model with a Single Random Effect
5.6.3 The Importance of the Mixed Model in Regression
6. DIAGNOSTICS FOR LEVERAGE AND INFLUENCE
6.1 Importance of Detecting Influential Observations
6.3 Measures of Influence: Cook's D
6.4 Measures of Influence: DFFITS and DFBETAS
6.5 A Measure of Model Performance
6.6 Detecting Groups of Influential Observations
6.7 Treatment of Influential Observations
7. POLYNOMIAL REGRESSION MODELS
7.2 Polynomial Models in One Variable
7.2.2 Piecewise Polynomial Fitting (Splines)
7.2.3 Polynomial and Trigonometric Terms
7.3.2 Locally Weighted Regression (Loess)
7.4 Polynomial Models in Two or More Variables
8.1 General Concept of Indicator Variables
8.2 Comments on the Use of Indicator Variables
8.2.1 Indicator Variables versus Regression on Allocated Codes
8.2.2 Indicator Variables as a Substitute for a Quantitative Regressor
8.3 Regression Approach to Analysis of Variance
9.2 Sources of Multicollinearity
9.3 Effects of Multicollinearity
9.4 Multicollinearity Diagnostics
9.4.1 Examination of the Correlation Matrix
9.4.2 Variance Inflation Factors
9.4.3 Eigensystem Analysis of X′X
9.4.5 SAS and R Code for Generating Multicollinearity Diagnostics
9.5 Methods for Dealing with Multicollinearity
9.5.1 Collecting Additional Data
9.5.4 Principal-Component Regression
9.5.5 Comparison and Evaluation of Biased Estimators
9.6 Using SAS to Perform Ridge and Principal-Component Regression
10. VARIABLE SELECTION AND MODEL BUILDING
10.1.2 Consequences of Model Misspecification
10.1.3 Criteria for Evaluating Subset Regression Models
10.2 Computational Techniques for Variable Selection
10.2.1 All Possible Regressions
10.2.2 Stepwise Regression Methods
10.3 Strategy for Variable Selection and Model Building
10.4 Case Study: Gorman and Toman Asphalt Data Using SAS
11. VALIDATION OF REGRESSION MODELS
11.2.1 Analysis of Model Coefficients and Predicted Values
11.2.2 Collecting Fresh Data—Confirmation Runs
11.3 Data from Planned Experiments
12. INTRODUCTION TO NONLINEAR REGRESSION
12.1 Linear and Nonlinear Regression Models
12.1.1 Linear Regression Models
12.1.2 Nonlinear Regression Models
12.2 Origins of Nonlinear Models
12.4 Transformation to a Linear Model
12.5 Parameter Estimation in a Nonlinear System
12.5.2 Other Parameter Estimation Methods
12.6 Statistical Inference in Nonlinear Regression
12.7 Examples of Nonlinear Regression Models
13.2 Logistic Regression Models
13.2.1 Models with a Binary Response Variable
13.2.2 Estimating the Parameters in a Logistic Regression Model
13.2.3 Interpretation of the Parameters in a Logistic Regression Model
13.2.4 Statistical Inference on Model Parameters
13.2.5 Diagnostic Checking in Logistic Regression
13.2.6 Other Models for Binary Response Data
13.2.7 More Than Two Categorical Outcomes
13.4 The Generalized Linear Model
13.4.1 Link Functions and Linear Predictors
13.4.2 Parameter Estimation and Inference in the GLM
13.4.3 Prediction and Estimation with the GLM
13.4.4 Residual Analysis in the GLM
13.4.5 Using R to Perform GLM Analysis
14. REGRESSION ANALYSIS OF TIME SERIES DATA
14.1 Introduction to Regression Models for Time Series Data
14.2 Detecting Autocorrelation: The Durbin-Watson Test
14.3 Estimating the Parameters in Time Series Regression Models
15. OTHER TOPICS IN THE USE OF REGRESSION ANALYSIS
15.1.1 Need for Robust Regression
15.1.3 Properties of Robust Estimators
15.2 Effect of Measurement Errors in the Regressors
15.2.1 Simple Linear Regression
15.3 Inverse Estimation—The Calibration Problem
15.4 Bootstrapping in Regression
15.4.1 Bootstrap Sampling in Regression
15.4.2 Bootstrap Confidence Intervals
15.5 Classification and Regression Trees (CART)
15.7 Designed Experiments for Regression
APPENDIX A. STATISTICAL TABLES
APPENDIX B. DATA SETS FOR EXERCISES
APPENDIX C. SUPPLEMENTAL TECHNICAL MATERIAL
C.1 Background on Basic Test Statistics
C.2 Background from the Theory of Linear Models
C.3 Important Results on SSR and SSRes
C.4 Gauss-Markov Theorem, Var(ε) = σ²I
C.5 Computational Aspects of Multiple Regression
C.6 Result on the Inverse of a Matrix
C.7 Development of the PRESS Statistic
C.9 Outlier Test Based on R-Student
C.10 Independence of Residuals and Fitted Values
C.11 Gauss-Markov Theorem, Var(ε) = V
C.12 Bias in MSRes When the Model Is Underspecified
C.13 Computation of Influence Diagnostics
C.14 Generalized Linear Models
APPENDIX D. INTRODUCTION TO SAS
D.2 Creating Permanent SAS Data Sets
D.3 Importing Data from an EXCEL File
D.6 Adding Variables to an Existing SAS Data Set
APPENDIX E. INTRODUCTION TO R TO PERFORM LINEAR REGRESSION ANALYSIS