CONTENTS

PREFACE

1. INTRODUCTION

1.1 Regression and Model Building

1.2 Data Collection

1.3 Uses of Regression

1.4 Role of the Computer

2. SIMPLE LINEAR REGRESSION

2.1 Simple Linear Regression Model

2.2 Least-Squares Estimation of the Parameters

2.2.1 Estimation of β0 and β1

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model

2.2.3 Estimation of σ2

2.2.4 Alternate Form of the Model

2.3 Hypothesis Testing on the Slope and Intercept

2.3.1 Use of t Tests

2.3.2 Testing Significance of Regression

2.3.3 Analysis of Variance

2.4 Interval Estimation in Simple Linear Regression

2.4.1 Confidence Intervals on β0, β1, and σ2

2.4.2 Interval Estimation of the Mean Response

2.5 Prediction of New Observations

2.6 Coefficient of Determination

2.7 A Service Industry Application of Regression

2.8 Using SAS® and R for Simple Linear Regression

2.9 Some Considerations in the Use of Regression

2.10 Regression Through the Origin

2.11 Estimation by Maximum Likelihood

2.12 Case Where the Regressor x is Random

2.12.1 x and y Jointly Distributed

2.12.2 x and y Jointly Normally Distributed: Correlation Model

Problems

3. MULTIPLE LINEAR REGRESSION

3.1 Multiple Regression Models

3.2 Estimation of the Model Parameters

3.2.1 Least-Squares Estimation of the Regression Coefficients

3.2.2 Geometrical Interpretation of Least Squares

3.2.3 Properties of the Least-Squares Estimators

3.2.4 Estimation of σ2

3.2.5 Inadequacy of Scatter Diagrams in Multiple Regression

3.2.6 Maximum-Likelihood Estimation

3.3 Hypothesis Testing in Multiple Linear Regression

3.3.1 Test for Significance of Regression

3.3.2 Tests on Individual Regression Coefficients and Subsets of Coefficients

3.3.3 Special Case of Orthogonal Columns in X

3.3.4 Testing the General Linear Hypothesis

3.4 Confidence Intervals in Multiple Regression

3.4.1 Confidence Intervals on the Regression Coefficients

3.4.2 CI Estimation of the Mean Response

3.4.3 Simultaneous Confidence Intervals on Regression Coefficients

3.5 Prediction of New Observations

3.6 A Multiple Regression Model for the Patient Satisfaction Data

3.7 Using SAS and R for Basic Multiple Linear Regression

3.8 Hidden Extrapolation in Multiple Regression

3.9 Standardized Regression Coefficients

3.10 Multicollinearity

3.11 Why Do Regression Coefficients Have the Wrong Sign?

Problems

4. MODEL ADEQUACY CHECKING

4.1 Introduction

4.2 Residual Analysis

4.2.1 Definition of Residuals

4.2.2 Methods for Scaling Residuals

4.2.3 Residual Plots

4.2.4 Partial Regression and Partial Residual Plots

4.2.5 Using Minitab®, SAS, and R for Residual Analysis

4.2.6 Other Residual Plotting and Analysis Methods

4.3 PRESS Statistic

4.4 Detection and Treatment of Outliers

4.5 Lack of Fit of the Regression Model

4.5.1 Formal Test for Lack of Fit

4.5.2 Estimation of Pure Error from Near Neighbors

Problems

5. TRANSFORMATIONS AND WEIGHTING TO CORRECT MODEL INADEQUACIES

5.1 Introduction

5.2 Variance-Stabilizing Transformations

5.3 Transformations to Linearize the Model

5.4 Analytical Methods for Selecting a Transformation

5.4.1 Transformations on y: The Box-Cox Method

5.4.2 Transformations on the Regressor Variables

5.5 Generalized and Weighted Least Squares

5.5.1 Generalized Least Squares

5.5.2 Weighted Least Squares

5.5.3 Some Practical Issues

5.6 Regression Models with Random Effects

5.6.1 Subsampling

5.6.2 The General Situation for a Regression Model with a Single Random Effect

5.6.3 The Importance of the Mixed Model in Regression

Problems

6. DIAGNOSTICS FOR LEVERAGE AND INFLUENCE

6.1 Importance of Detecting Influential Observations

6.2 Leverage

6.3 Measures of Influence: Cook's D

6.4 Measures of Influence: DFFITS and DFBETAS

6.5 A Measure of Model Performance

6.6 Detecting Groups of Influential Observations

6.7 Treatment of Influential Observations

Problems

7. POLYNOMIAL REGRESSION MODELS

7.1 Introduction

7.2 Polynomial Models in One Variable

7.2.1 Basic Principles

7.2.2 Piecewise Polynomial Fitting (Splines)

7.2.3 Polynomial and Trigonometric Terms

7.3 Nonparametric Regression

7.3.1 Kernel Regression

7.3.2 Locally Weighted Regression (Loess)

7.3.3 Final Cautions

7.4 Polynomial Models in Two or More Variables

7.5 Orthogonal Polynomials

Problems

8. INDICATOR VARIABLES

8.1 General Concept of Indicator Variables

8.2 Comments on the Use of Indicator Variables

8.2.1 Indicator Variables versus Regression on Allocated Codes

8.2.2 Indicator Variables as a Substitute for a Quantitative Regressor

8.3 Regression Approach to Analysis of Variance

Problems

9. MULTICOLLINEARITY

9.1 Introduction

9.2 Sources of Multicollinearity

9.3 Effects of Multicollinearity

9.4 Multicollinearity Diagnostics

9.4.1 Examination of the Correlation Matrix

9.4.2 Variance Inflation Factors

9.4.3 Eigensystem Analysis of X′X

9.4.4 Other Diagnostics

9.4.5 SAS and R Code for Generating Multicollinearity Diagnostics

9.5 Methods for Dealing with Multicollinearity

9.5.1 Collecting Additional Data

9.5.2 Model Respecification

9.5.3 Ridge Regression

9.5.4 Principal-Component Regression

9.5.5 Comparison and Evaluation of Biased Estimators

9.6 Using SAS to Perform Ridge and Principal-Component Regression

Problems

10. VARIABLE SELECTION AND MODEL BUILDING

10.1 Introduction

10.1.1 Model-Building Problem

10.1.2 Consequences of Model Misspecification

10.1.3 Criteria for Evaluating Subset Regression Models

10.2 Computational Techniques for Variable Selection

10.2.1 All Possible Regressions

10.2.2 Stepwise Regression Methods

10.3 Strategy for Variable Selection and Model Building

10.4 Case Study: Gorman and Toman Asphalt Data Using SAS

Problems

11. VALIDATION OF REGRESSION MODELS

11.1 Introduction

11.2 Validation Techniques

11.2.1 Analysis of Model Coefficients and Predicted Values

11.2.2 Collecting Fresh Data—Confirmation Runs

11.2.3 Data Splitting

11.3 Data from Planned Experiments

Problems

12. INTRODUCTION TO NONLINEAR REGRESSION

12.1 Linear and Nonlinear Regression Models

12.1.1 Linear Regression Models

12.1.2 Nonlinear Regression Models

12.2 Origins of Nonlinear Models

12.3 Nonlinear Least Squares

12.4 Transformation to a Linear Model

12.5 Parameter Estimation in a Nonlinear System

12.5.1 Linearization

12.5.2 Other Parameter Estimation Methods

12.5.3 Starting Values

12.6 Statistical Inference in Nonlinear Regression

12.7 Examples of Nonlinear Regression Models

12.8 Using SAS and R

Problems

13. GENERALIZED LINEAR MODELS

13.1 Introduction

13.2 Logistic Regression Models

13.2.1 Models with a Binary Response Variable

13.2.2 Estimating the Parameters in a Logistic Regression Model

13.2.3 Interpretation of the Parameters in a Logistic Regression Model

13.2.4 Statistical Inference on Model Parameters

13.2.5 Diagnostic Checking in Logistic Regression

13.2.6 Other Models for Binary Response Data

13.2.7 More Than Two Categorical Outcomes

13.3 Poisson Regression

13.4 The Generalized Linear Model

13.4.1 Link Functions and Linear Predictors

13.4.2 Parameter Estimation and Inference in the GLM

13.4.3 Prediction and Estimation with the GLM

13.4.4 Residual Analysis in the GLM

13.4.5 Using R to Perform GLM Analysis

13.4.6 Overdispersion

Problems

14. REGRESSION ANALYSIS OF TIME SERIES DATA

14.1 Introduction to Regression Models for Time Series Data

14.2 Detecting Autocorrelation: The Durbin-Watson Test

14.3 Estimating the Parameters in Time Series Regression Models

Problems

15. OTHER TOPICS IN THE USE OF REGRESSION ANALYSIS

15.1 Robust Regression

15.1.1 Need for Robust Regression

15.1.2 M-Estimators

15.1.3 Properties of Robust Estimators

15.2 Effect of Measurement Errors in the Regressors

15.2.1 Simple Linear Regression

15.2.2 The Berkson Model

15.3 Inverse Estimation—The Calibration Problem

15.4 Bootstrapping in Regression

15.4.1 Bootstrap Sampling in Regression

15.4.2 Bootstrap Confidence Intervals

15.5 Classification and Regression Trees (CART)

15.6 Neural Networks

15.7 Designed Experiments for Regression

Problems

APPENDIX A. STATISTICAL TABLES

APPENDIX B. DATA SETS FOR EXERCISES

APPENDIX C. SUPPLEMENTAL TECHNICAL MATERIAL

C.1 Background on Basic Test Statistics

C.2 Background from the Theory of Linear Models

C.3 Important Results on SSR and SSRes

C.4 Gauss-Markov Theorem, Var(ε) = σ2I

C.5 Computational Aspects of Multiple Regression

C.6 Result on the Inverse of a Matrix

C.7 Development of the PRESS Statistic

C.8 Development of S2(i)

C.9 Outlier Test Based on R-Student

C.10 Independence of Residuals and Fitted Values

C.11 Gauss-Markov Theorem, Var(ε) = V

C.12 Bias in MSRes When the Model Is Underspecified

C.13 Computation of Influence Diagnostics

C.14 Generalized Linear Models

APPENDIX D. INTRODUCTION TO SAS

D.1 Basic Data Entry

D.2 Creating Permanent SAS Data Sets

D.3 Importing Data from an EXCEL File

D.4 Output Command

D.5 Log File

D.6 Adding Variables to an Existing SAS Data Set

APPENDIX E. INTRODUCTION TO R TO PERFORM LINEAR REGRESSION ANALYSIS

E.1 Basic Background on R

E.2 Basic Data Entry

E.3 Brief Comments on Other Functionality in R

E.4 R Commander

REFERENCES

INDEX
