Multicollinearity in real estate data. D. Hamilton illustrated the multicollinearity problem with an example using the data shown in the accompanying table. The values of x1, x2, and y in the table represent appraised land value, appraised improvements value, and sale price, respectively, of a randomly selected residential property. (All measurements are in thousands of dollars.)
(a) Calculate the coefficient of correlation between y and x1. Is there evidence of a linear relationship between sale price and appraised land value?
(b) Calculate the coefficient of correlation between y and x2. Is there evidence of a linear relationship between sale price and appraised improvements?
(c) Based on the results in parts a and b, do you think the model E(y) = β0 + β1x1 + β2x2 will be useful for predicting sale price?
(d) Use a statistical software package to fit the model in part c, and conduct a test of model adequacy. In particular, note the value of R². Does the result agree with your answer to part c?
(e) Calculate the coefficient of correlation between x1 and x2. What does the result imply?
(f) Many researchers avoid the problems of multicollinearity by always omitting all but one of the “redundant” variables from the model. Would you recommend this strategy for this example? Explain. (Hamilton notes that, in this case, such a strategy “can amount to throwing out the baby with the bathwater.”)
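
The computations in parts a, b, d, and e could be carried out along the following lines. This is a minimal sketch in Python with NumPy, not the software package the exercise has in mind. The helper names pairwise_r and fit_two_predictor_model are hypothetical, and the arrays x1, x2, and y below are made-up placeholders, not the values from Hamilton's table; they should be replaced with the table data before drawing any conclusions.

```python
import numpy as np


def pairwise_r(a, b):
    """Pearson coefficient of correlation between two 1-D samples."""
    return float(np.corrcoef(a, b)[0, 1])


def fit_two_predictor_model(x1, x2, y):
    """Least-squares fit of E(y) = b0 + b1*x1 + b2*x2.

    Returns the coefficient vector (b0, b1, b2) and R^2.
    """
    # Design matrix: a column of ones for the intercept, then x1 and x2.
    X = np.column_stack([np.ones_like(x1), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)                # unexplained variation
    sst = float(((y - y.mean()) ** 2).sum())  # total variation about the mean
    return beta, 1.0 - sse / sst


# PLACEHOLDER values only (assumption): these are NOT the data from the
# exercise's table. Substitute the tabulated x1, x2, and y values here.
k = np.arange(15.0)
x1 = 20.0 + k
x2 = 100.0 - 1.5 * x1 + 2.0 * np.cos(k)
y = x1 + x2 + 0.1 * np.sin(3.0 * k)

r_y_x1 = pairwise_r(y, x1)    # part a
r_y_x2 = pairwise_r(y, x2)    # part b
r_x1_x2 = pairwise_r(x1, x2)  # part e
beta, r2 = fit_two_predictor_model(x1, x2, y)  # part d

print("r(y, x1)  =", round(r_y_x1, 4))
print("r(y, x2)  =", round(r_y_x2, 4))
print("r(x1, x2) =", round(r_x1_x2, 4))
print("beta      =", np.round(beta, 4))
print("R^2       =", round(r2, 4))
```

The sketch reports only the correlations, the estimated coefficients, and R². The test of model adequacy in part d additionally needs the global F statistic and its p-value, which a statistical package reports directly.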