4.8 Binary or Dummy Variables

All of the variables we have used in regression examples have been quantitative variables such as sales figures, payroll numbers, square footage, and age. These have all been easily measurable and have had numbers associated with them. There are many times when we believe a qualitative variable rather than a quantitative variable would be helpful in predicting the dependent variable Y. For example, regression may be used to find a relationship between annual income and certain characteristics of the employees. Years of experience at a particular job would be a quantitative variable. However, information regarding whether or not a person has a college degree might also be important. This would not be a measurable value or quantity, so a special variable called a dummy variable (or a binary variable or an indicator variable) would be used. A dummy variable is assigned a value of 1 if a particular condition is met (e.g., a person has a college degree) and a value of 0 otherwise.

Return to the Jenny Wilson Realty example. Jenny believes that a better model can be developed if the condition of the property is included. To incorporate the condition of the house into the model, Jenny looks at the information available (see Table 4.5) and sees that the three categories are good condition, excellent condition, and mint condition. Since these are not quantitative variables, she must use dummy variables. These are defined as

$\begin{array}{lcl} X_{3} & = & 1 \text{ if house is in excellent condition} \\ & = & 0 \text{ otherwise} \\ X_{4} & = & 1 \text{ if house is in mint condition} \\ & = & 0 \text{ otherwise} \end{array}$
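As a minimal sketch of this coding scheme, the Python function below maps a condition label to the pair $(X_{3}, X_{4})$. The function name `encode_condition` and the lowercase label strings are assumptions made for illustration; the Excel worksheet for the example simply stores the 0 and 1 values in columns.

```python
def encode_condition(condition):
    """Return the dummy pair (x3, x4) for a house condition label.

    Hypothetical helper: x3 flags excellent condition, x4 flags mint condition.
    """
    label = condition.strip().lower()
    if label not in {"good", "excellent", "mint"}:
        raise ValueError(f"unknown condition: {condition!r}")
    x3 = 1 if label == "excellent" else 0
    x4 = 1 if label == "mint" else 0
    return x3, x4


for label in ["Good", "Excellent", "Mint"]:
    print(label, encode_condition(label))
```

A house in good condition is coded as (0, 0), so it needs no column of its own.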

Notice there is no separate variable for “good” condition. If $X_{3}$ and $X_{4}$ are both 0, then the house cannot be in excellent or mint condition, so it must be in good condition. When using dummy variables, the number of variables must be 1 less than the number of categories. In this problem, there were three categories (good, excellent, and mint condition), so we must have two dummy variables. If we had mistakenly used too many variables and the number of dummy variables equaled the number of categories, then the mathematical computations could not be performed or would not give reliable values.
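One way to see why a third dummy variable causes trouble is to check that it carries no new information. The sketch below uses a handful of made-up condition labels (not the Jenny Wilson data) and shows that, with one dummy per category, the three dummy columns always add up to the column of 1s that represents the intercept. When one column is an exact combination of the others, least squares has no unique set of coefficients.

```python
# Made-up condition labels for five hypothetical houses.
conditions = ["good", "excellent", "mint", "good", "mint"]

# The intercept column is a 1 for every house.
intercept = [1] * len(conditions)

# One dummy column per category: one more than the rule allows.
good = [int(c == "good") for c in conditions]
excellent = [int(c == "excellent") for c in conditions]
mint = [int(c == "mint") for c in conditions]

# Each house is in exactly one category, so the dummies sum to 1 in every row.
row_sums = [g + e + m for g, e, m in zip(good, excellent, mint)]

# True means the three dummies together duplicate the intercept column.
redundant = row_sums == intercept
print(redundant)  # True
```

Dropping any one of the three columns (here, `good`) removes the redundancy, which is what defining only $X_{3}$ and $X_{4}$ accomplishes.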

These dummy variables will be used with the two previous variables ($X_{1}$, square footage, and $X_{2}$, age) to try to predict the selling prices of houses for Jenny Wilson. Programs 4.5A and 4.5B provide the Excel input and output for these new data and show how the dummy variables were coded. The significance level for the F test is 0.00017, so this model is statistically significant. The coefficient of determination $(r^{2})$ is 0.898, so this is a much better model than the previous one. The regression equation is

$\hat{Y} = 121{,}658 + 56.43 X_{1} - 3{,}962 X_{2} + 33{,}162 X_{3} + 47{,}369 X_{4}$

This indicates that a house in excellent condition $(X_{3} = 1, X_{4} = 0)$ would sell for about \$33,162 more than a house in good condition $(X_{3} = 0, X_{4} = 0)$. A house in mint condition $(X_{3} = 0, X_{4} = 1)$ would sell for about \$47,369 more than a house in good condition.
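To see how the coefficients translate into price differences, the fitted equation can be evaluated directly. The sketch below is illustrative only: the function name `predict_price` and the sample house (2,000 square feet, 10 years old) are assumptions, not values taken from the Jenny Wilson data. Only the coefficients come from the regression equation above.

```python
def predict_price(sqft, age, condition):
    """Evaluate the fitted regression equation for one house.

    condition is "good", "excellent", or "mint"; "good" is the baseline.
    """
    x3 = 1 if condition == "excellent" else 0  # excellent-condition dummy
    x4 = 1 if condition == "mint" else 0       # mint-condition dummy
    return 121_658 + 56.43 * sqft - 3_962 * age + 33_162 * x3 + 47_369 * x4


# Hypothetical house: same size and age, only the condition changes.
good = predict_price(2000, 10, "good")
excellent = predict_price(2000, 10, "excellent")
mint = predict_price(2000, 10, "mint")

print(f"excellent vs. good: {excellent - good:,.0f}")  # 33,162
print(f"mint vs. good:      {mint - good:,.0f}")       # 47,369
```

Because square footage and age are held fixed, the difference between any two predictions is just the difference in the dummy-variable coefficients, whatever size and age are chosen.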

Program 4.5A Input Screen for Jenny Wilson Realty Example with Dummy Variables in Excel 2016 (screenshot of the Jenny Wilson Realty data table with columns A: Sell Price, B: SF, C: Age, and D: $X_{3}$)

Program 4.5B Output Screen for Jenny Wilson Realty Example with Dummy Variables in Excel 2016 (screenshot of the Summary Output table for the regression with dummy variables)
