A
Adjusted multiple correlation coefficient, 285
AIC, 287, 290–291, 295, 308, 320, 327, 343
bias corrected, 287
Akaike Information Criteria, 287
Analysis of covariance, 12, 15
Analysis of variance, 12, 15, 67, 140
ANOVA, 140
multiple regression, 67
simple regression, 71
Assumption
constant variance, 86
homogeneity, 86
independent-errors, 87
Autocorrelation, 18, 87, 180, 197, 201
B
Backward elimination, 289
Bayes Information Criteria, 287
Best linear unbiased estimator, 61, 83
BIC, 287, 290–291, 295, 308, 320, 327
Binary
response data, 195
variable, 12
C
Centering, 239
Classification, 336
Coefficient of determination, 42, 61
Collinearity-influential observations, 246
Computer repair data, 26, 30–31, 35, 42
Concordance Index, 328
Condition number, 244, 276, 288
Confidence
interval, 38
interval for βj, 63
region, 37
Constant variance assumption, 86
Cook's Distance, 103
Corrected sums of squares and products matrix, 60
Correlated errors, 141
Correlation coefficient, 21, 24, 240
population, 36
Correlation matrix, 117, 241, 243, 250, 252
, 83
Cross-sectional data, 128, 214–215
Cumulative distribution function, 319
D
Data set
Anscombe's quartet, 24
bacteria deaths, 155
cigarette consumption, 1, 79–80, 312
college expense, 182
computer repair (expanded), 116
consumer expenditure, 198, 216
corn yields, 147
cost of education, 182
DJIA, 216
domestic immigration, 6
education expenditure, 141, 143, 185
Egyptian skulls, 6
equal educational opportunity, 223
exponential growth, 108
field goal kicking, 339
fuel consumption, 256
Hamilton, 95
hard disk prices, 177
heights of husbands and wives, 47–48, 78
homicide, 296
housing starts, 206
injury incidents, 161
labor force, 47
magazine advertising, 174
milk production, 3
new drugs, 343
New York rivers, 10, 6, 99, 101, 105
newspapers, 48
nonlinear, 24
PCB, 348
preemployment testing, 132
presidential election, 148–150, 176, 255, 312
quantitative sensory testing, 14
real estate, 309
right-to-work laws, 4
Scottish Hills Races, 111
Space Shuttle, 339
supervisor performance, 54–55, 63–64, 292
wind chill factor, 175
Data
collection, 11
re-expression, 13
transformation, 13
Degrees of freedom, 33, 58, 89
DFITS, 104
Discriminant analysis, 336
Distance
Cook's, 103
orthogonal, 29
perpendicular, 29
vertical, 29
Distribution
binomial, 342
chi-square, 323
gamma, 342
inverse Gaussian, 342
Draftsman's matrix, 94
Dummy variable, 121
Durbin-Watson statistic, 200
E
Eigenvalues, 243
Eigenvector, 243
Externally studentized residual, 90
F
F-test, 65
Fitted
regression equation, 37
Fitting
models to data, 14
Forecast interval, 38
Forward selection, 289
Full model, 65
Function
cumulative distribution, 319
intrinsically nonlinear, 13
linear, 13
linearizable, 13
link, 343
logistic, 195
G
Generalized linear models, xv, 341
Goodness-of-fit index, 42
H
Hadi's influence measure, 105
Heteroscedasticity, 86, 159, 179
Homogeneity (Homoscedasticity), 86, 159
I
Independent-errors assumption, 87
Indicator variable, 118, 121, 211
Influence measures, 103
Cook's Distance, 103, 106, 117
Welsch and Kuh (DFITS), 104, 106
Influential observations, 91, 98
Interaction effects, 125
Intercept, 57
Internally studentized residual, 90
Interpretation of regression coefficients, 58
Iterative process, 16
L
L-R plot, 107
Ladder of transformation, 171
Lagged variables, 215
Least squares, 14
estimates, 30
line, 30
properties, 60
Leverage-residual plot, 107
Leverage values, 89
Link function, 343
polytomous, 329
ordered response variable, xv, 18
Logistic
distribution, 319
regression, 12, 15, 18, 140, 317, 319
regression diagnostics, 323
response function, 318
Logit, 320
link function, 342
transformation, 319
M
Masking problem, 101
Matrix, 241
corrected sums of squares and products, 60
correlation, 241, 243, 250, 252
draftsman's, 94
hat, 89
projection, 89
Maximum likelihood, 14, 320, 342
Maximum likelihood method, 330, 335
Model
fitting, 14
full, 65
random walk, 219
reduced, 65
through the origin, 42
Multinomial logit model, 329
Multiple correlation coefficient, 61, 67, 70, 87, 285
Multiple regression, 15, 54, 61
ANOVA table, 67
assumptions, 60
Multiplicative effect, 125
Multivariate regression, 14–15
N
No-intercept model, 42, 183, 186
Nominal model, 335
Nonlinear regression, 15
Nonparametric statistics, 200
Normal
property of, 160
scores, 97
Normalizing transformations, 153
O
Odds ratio, 319
One-sample t-test, 44
Ordered response category, 334
Ordered response variable, xv, 18
Ordinal logistic regression, 334
Ordinal model, 335
Orthogonal
regression, 29
P
P-R plot, 107
P-value for
F-test, 66
t-test, 34, 63
Parameter estimation, 14
Parsimony, 69
Partial
regression coefficients, 53, 58
regression plot, 110
residual plot, 110
Plot
L-R, 107
P-R, 107
partial regression, 110
partial residual, 110
residual plus component, 95
sequence, 199
Poisson regression, xv, 18, 342
Potential-residual plot, 107, 117
Predicted value, 15
Prediction
errors, 219
interval, 38
Principal components, 14, 239, 243, 255, 257
Proportional odds model, 329, 334–335
Pure error, 184
R
Random walk model, 219
Reduced model, 65
Regression
assumptions, 86
definition, 1
elements of, 7
examples of, 3
line, 30
linear, 15
model through the origin, 42
nonlinear, 15
partial coefficients, 53
ridge, 268
robust, 115
simple, 13, 15, 17, 54, 63, 71, 74
sum of squares, 40
trivial models, 44
Relationship between simple and multiple regression coefficients, 76
Residual
adjusted, 105
internally studentized, 90
ordinary least squares, 30, 57
plots, 90
plus component plot, 95, 109–110
standardized, 89
Ridge
bias of estimators, 278
method, 14
parameter, 268
regression, 268
variance of estimators, 278
Robust regression, xv, 115, 341, 345
S
Sample
mean, 22
standard deviation, 44
variance, 44
Sampling distribution of
, 37
Scaling, 239
unit length, 240
Sequence plot, 199
Shrinkage estimators, 269
Simple regression, 15, 54, 63, 71, 74
ANOVA table, 71
Simultaneous confidence region, 37
Standard deviation, 33
Standard error of , 83
estimate, 33
Standard normal distribution, 90
Standardized
deviance residuals, 323
Pearson residuals, 323
residual, 89
variables, 240
Standardizing, 240
Stepwise method, 289
Stimulus-response relationships, 195
Sum of squared residuals, 58
Swamping problem, 101
T
Test
t-, 44
Durbin-Watson, 200
one-sample t, 44
runs, 200
two-sample t, 45
Time series data, 130, 199, 214
Total
sum of squares, 40
variance, 268
variance of OLS estimators, 269
variance of ridge estimators, 268
Transformation
ladder of, 171
to achieve linearity, 153
to achieve normality, 153
variance-stabilizing, 153
Trivial regression models, 44
Two-sample t-test, 45
U
Uniqueness of LS solution, 30, 57
V
Variable
dependent, 1
explanatory, 1
independent, 2
lagged, 215
predictor, 1
quantitative, 12
regressor, 2
response, 1
role of, 313
selection, 18
selection procedures, 281
standardized, 240
Variance-covariance matrix, 241, 257
Variance-stabilizing transformations, 153
Variance
inflation factor, 236, 278, 288
of ridge estimators, 278
sample, 44
W
Wald Test, 321
Web site
Case book, 2
DASL, 3
Electronic Dataset Service, 3
Weighted
leverages, 323
Welsch and Kuh measure, 104