Analysis of variance (ANOVA), 83–86
Autocorrelation, 158–166
first-order, 158–159
second-order, 159
treating, 162–165
Confidence interval, 57
for average fitted value, 61–62
for predicted value, 62
Correlation
examples, 25–27
lagged, 158–159
matrix, 24
negative, 16
positive, 16
strong, 1
zero, 16
Correlation coefficient
calculation of
by hand, 16–18
using Excel, 18–24
defined, 15
hypothesis testing, 28–40
test statistics, 29
Cross-sectional data, 159
Data cleaning, 76–78
Data sets, 5–14
Dependent variable, 3, 135–146
Dummy variable
as dependent variable, 135–146
examples, 131–134
Durbin-Watson test, 160–162, 165–166
Equal variance assumption, for multiple regression, 72
Excel
correlation coefficient calculation using, 18–24
correlation coefficient hypothesis testing, 31–32
model building using, 114–130
multiple regression calculation using, 73–82
simple regression analysis using, 47–51
First-order correlation, 158–159
F-test, for multiple regression model, 82–86
Goodness of fit, 86–92
Heteroscedasticity, 166–172
example, 168–171
treating, 172
Hyperplane regression, 72
Hyperspace regression, 72
Hypothesis testing
correlation coefficient, 28–40
on individual variables, automating, 99–102
Independence assumption, for multiple regression, 73
Independent variable, 3–4
Intercept shifters. See Dummy variable
Interdependence, 93
Lagged correlation, 158–159
Least squares regression, 43
Linearity assumption, for multiple regression, 72
Linear relationship, 1
negative, 2
positive, 2
Mean squared error (MSE), 61
Model building, 103–172
partial F-test, 104–114
qualitative data in multiple regression, including, 130–146
dummy variable, as dependent variable, 135–146
dummy variable examples, 131–134
more than two possible values, 134–135
regression model validity, testing
autocorrelation, 158–166
heteroscedasticity, 166–172
multicollinearity, 146–157
using Excel, 114–130
Multicollinearity, 93, 146–157
causes of, 152–154
high-, 149–152
no, 146–149
spotting, 154–156
treating, 156–157
Multiple regression, 4, 67–102
assumptions for, 72–73
calculation using Excel, 73–82
F-test for, 82–86
goodness of fit, 86–92
qualitative data in, including, 130–146
as several simple regression runs, 67–71
testing of significance, 92–102
Negative linear relationship, 2, 16
Nonmulticollinearity assumption, for multiple regression, 72
Normality assumption, for multiple regression, 72
Outlier, 13
Partial F-test, 104–114
Population regression model, 71
Positive linear relationship, 2, 16
Power rank equation, 80
Qualitative data in multiple regression, including, 130–146
dummy variable, as dependent variable, 135–146
dummy variable examples, 131–134
more than two possible values, 134–135
Regression
coefficients, 45
equations, normal, 44
least squares, 43
multiple. See Multiple regression
model validity, testing
autocorrelation, 158–166
heteroscedasticity, 166–172
multicollinearity, 146–157
Repeated-measures test, 94–99
Sample regression model, 71
Scatterplots, 4–5
Second-order correlation, 159
Simple regression, 4, 41–65, 68–71
calculation of, 46–63
using Excel, 47–51
using SPSS, 51–53
equation, 41
with error term, 42
for estimates, 42–43
for specific data points, 42
SPSS
backward regression in, 110–112
correlation coefficient calculation using, 19–20, 24–25
correlation coefficient hypothesis testing, 32–33
forward regression in, 108–110
simple regression analysis using, 51–53
stepwise regression in, 112–114
Straight line, equation for, 41
Strong correlation, 1
Sum of squared errors (SSE), 44
Sum of squares, 43
t-statistic, 57
Weighted least squares, 172
XY chart, 4–5
Zero correlation, 16