1. Short Answer:
a. Agree or Disagree (and justify your answer): If the distribution of u in a population regression model is not normal, then the OLS estimators are not BLUE.
b. Agree or Disagree (and justify your answer): If you add an independent variable to a multiple regression model and the R-squared value rises, this indicates that adding the variable to the model was a good idea.
c. Consider the following population regression model explaining the yield per acre of a plot of land planted with corn (yield) as a function of the inches of rain falling on the plot in the first month of the growing season (rain) and the kilograms of a certain brand of fertilizer added to the plot (fertilizer).
yield = β0 + β1·rain + β2·fertilizer + β3·(fertilizer × rain) + u
If the value of β3 is positive, what does this tell us about the effectiveness of this brand of fertilizer?
d. A regression was estimated in which the dependent variable was the median home price in a neighborhood. The independent variables were crime, the crime rate in the neighborhood; nox, the amount of nitrogen oxides in the neighborhood's air (a measure of pollution); and rooms, the average number of rooms in the neighborhood's houses. The Stata output from this regression is shown below.
------------------------------------------------------------------------------
       price |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
       crime |   -199.701    35.0532    -5.70   0.000    -268.5701   -130.8319
         nox |  -1306.057   266.1392    -4.91   0.000    -1828.941   -783.1733
       rooms |   7933.184   407.8665    19.45   0.000     7131.849     8734.52
       _cons |  -19371.47   3250.938    -5.96   0.000    -25758.59   -12984.35
------------------------------------------------------------------------------
Agree or Disagree (and justify your answer): These results indicate that air pollution is a more important determinant of housing prices than crime is.
e. In a multiple regression model using 310 students to explain college grade point average, the following explanatory variables are initially included in the regression: high school GPA, ACT score, number of credits completed, mother's years of education, and father's years of education. The R-squared is .436. When the two parents' education variables are dropped, the R-squared becomes .381. Are the parents' education variables significant at the 5% level? (Hint: remember that the lectures and the book both explain how to conduct an F-test of joint significance using R-squared values from two regressions).
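The hint in part e refers to the R-squared form of the F statistic for joint exclusion restrictions, F = [(R²ur − R²r)/q] / [(1 − R²ur)/(n − k − 1)]. The following is a minimal sketch of that arithmetic in Python, assuming the unrestricted model has k = 5 slope coefficients plus an intercept and that q = 2 variables are dropped; the function name f_stat_from_r2 is a hypothetical helper, not something from the course materials.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n, k_unrestricted, q):
    """R-squared form of the F statistic for q exclusion restrictions.

    k_unrestricted counts slope coefficients only; the intercept is
    accounted for by the extra -1 in the denominator degrees of freedom.
    """
    df_denominator = n - k_unrestricted - 1
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / df_denominator
    return numerator / denominator, df_denominator


# Numbers taken from part e: 310 students, 5 regressors, 2 dropped.
f_value, df2 = f_stat_from_r2(0.436, 0.381, n=310, k_unrestricted=5, q=2)
print(f"F(2, {df2}) = {f_value:.2f}")
```

The resulting statistic is compared with the 5% critical value of the F distribution with 2 and 304 degrees of freedom, which is roughly 3.0.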
2. One of the measures used in studying health and obesity is the Body Mass Index (BMI). A BMI between 20 and 25 is considered healthy; a BMI over 30 classifies a person as obese. The average BMI in the United States rose from 25 to about 27.75 in the last 25 years. This is considered a large increase, and many social scientists have been trying to explain this trend. The results below come from a regression estimated using a national cross-section sample of adults. The dependent variable was BMI, and the sample size was over 42,000.
Independent Variable                                  | Sample Mean (Std. Dev.) | Coefficient (Std. Error)
------------------------------------------------------+-------------------------+-------------------------
Restaurants per capita in respondent's state          | 11.04 (2.27)            | .473 (.324)
Age in years                                          | 41.24 (16.68)           | .326 (.016)
Age squared                                           | --                      | -.003 (.0002)
Cigarette tax per pack in cents in respondent's state | 22.58 (9.77)            | .058 (.032)
Cigarette tax squared                                 | --                      | -.001 (.0004)
Household income in 10,000s of dollars                | 2.91 (2.39)             | -.393 (.168)
a. According to this regression, at about what age does BMI reach its highest level in American adults, other things equal?
b. The authors included the cigarette tax in the regression because of the observation that people who quit smoking tend to gain weight. According to these results, what would happen to the average adult BMI in a state if it raised its cigarette tax from 10 cents to 25 cents per pack?
c. Suppose that household income is positively correlated with the number of restaurants per capita. What would happen to the coefficient of restaurants per capita if household income were left out of the regression?
d. Test, with alpha=.05, the hypothesis that people with higher incomes tend to have a lower BMI. Specify the null and alternative hypotheses and the test statistic.
e. Test, with alpha=.05, the hypothesis that the number of restaurants per capita in a state has an effect on the BMI of the people in the state. Specify the null and alternative hypotheses and the test statistic.
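Parts a, b, d, and e all reduce to short calculations on the reported coefficients and standard errors. The sketch below shows that arithmetic in Python; the helper names are hypothetical, and because the table reports rounded coefficients (for example -.003 on age squared), the results are approximate.

```python
def quadratic_turning_point(b_linear, b_squared):
    """Value of x where b_linear*x + b_squared*x**2 stops rising (or falling)."""
    return -b_linear / (2.0 * b_squared)


def quadratic_change(b_linear, b_squared, x_from, x_to):
    """Predicted change in y when x moves from x_from to x_to, other things equal."""
    return b_linear * (x_to - x_from) + b_squared * (x_to**2 - x_from**2)


def t_ratio(coef, std_err):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err


# Part a: age at which predicted BMI peaks.
peak_age = quadratic_turning_point(0.326, -0.003)

# Part b: predicted BMI change when the tax rises from 10 to 25 cents.
bmi_change = quadratic_change(0.058, -0.001, 10, 25)

# Parts d and e: t statistics for income and restaurants per capita.
t_income = t_ratio(-0.393, 0.168)
t_restaurants = t_ratio(0.473, 0.324)

print(f"peak age      = {peak_age:.1f}")
print(f"BMI change    = {bmi_change:.3f}")
print(f"t (income)    = {t_income:.2f}")
print(f"t (restaurants) = {t_restaurants:.2f}")
```

With a sample of over 42,000, the t statistics can be compared with standard normal critical values: about -1.645 for the one-sided test in part d and ±1.96 for the two-sided test in part e.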
3. Use the data set NBASAL.DTA to answer this question. The data set contains information on NBA players, including their annual salary, personal characteristics, and statistical measures of their ability.
a. Estimate a multiple regression model with lwage as the dependent variable and with the explanatory variables coll, exper, and expersq, which is the square of years of experience. The variable coll measures years spent playing in college, and some of the players have coll=0, because they were drafted right out of high school. Interpret the coefficient on coll. Do you believe that playing more years in college really causes a player to earn less as a professional? Explain.
b. The variables points, rebounds, and assists are included in the data set as measures of player ability. Add these variables to the regression in part a, and test the null hypothesis that player ability does not affect player pay.
c. Interpret the coefficient on rebounds in the regression from part b.
d. Explain what happens to the coefficient on coll in terms of its magnitude and its statistical significance when points, rebounds, and assists are added to the regression. Why do you think this happened?