The regression models we have seen are linear models. However, at times there exist nonlinear relationships between variables. Some simple variable transformations can be used to create an apparently linear model from a nonlinear relationship. This allows us to use Excel and other linear regression programs to perform the calculations. We will demonstrate this in the following example.
On every new automobile sold in the United States, the fuel efficiency (as measured by miles per gallon [MPG] of gasoline) of the automobile is prominently displayed on the window sticker. The MPG is related to several factors, one of which is the weight of the automobile. Engineers at Colonel Motors, in an attempt to improve fuel efficiency, have been asked to study the impact of weight on MPG. They have decided that a regression model should be used to do this.
A sample of 12 new automobiles was selected, and the weight and MPG rating were recorded. Table 4.6 provides these data. A scatter diagram of the data in Figure 4.6A shows the weight and MPG. A linear regression line is drawn through the points. Excel was used to develop a simple linear regression equation to relate the MPG (Y) to the weight in thousands of pounds in the form
MPG | WEIGHT (1,000s LB.) | MPG | WEIGHT (1,000s LB.) |
---|---|---|---|
12 | 4.58 | 20 | 3.18 |
13 | 4.66 | 23 | 2.68 |
15 | 4.02 | 24 | 2.65 |
18 | 2.53 | 33 | 1.70 |
19 | 3.09 | 36 | 1.95 |
19 | 3.11 | 42 | 1.92 |
The Excel output is shown in Program 4.6. From this, we get the equation
or
The model is useful since the significance level for the F test is small and However, further examination of the graph in Figure 4.6A brings into question the use of a linear model. Perhaps a nonlinear relationship exists, and maybe the model should be modified to account for this. A quadratic model is illustrated in Figure 4.6B. This model would be of the form
The easiest way to develop this model is to define a new variable
This gives us the model
We can create another column in Excel and again run the regression tool. The output is shown in Program 4.7. The new equation is
The significance level for F is low (0.0002), so the model is useful, and The adjusted increased from 0.719 to 0.814, so this new variable definitely improved the model.
This model is good for prediction purposes. However, we should not try to interpret the coefficients of the variables due to the correlation between (weight) and (weight squared). Normally, we would interpret the coefficient for as the change in Y that results from a 1-unit change in while holding all other variables constant. Obviously, holding one variable constant while changing the other is impossible in this example, since If changes, then must change also. This is an example of a problem that exists when multicollinearity is present.
Other types of nonlinearities can be handled using a similar approach. A number of transformations exist that may help to develop a linear model from variables with nonlinear relationships.