1. Consider the one-variable regression model Yi = β0 + β1Xi + ui, and suppose that it satisfies the assumptions in Key Concept 4.3 (from your textbook). Suppose that Yi is measured with error, so that the data are Y˜i = Yi + ωi, where ωi is the measurement error, which is i.i.d. and independent of Yi and Xi. Consider the population regression Y˜i = β0 + β1Xi + vi, where vi is the regression error using the mismeasured dependent variable Y˜i.
(a) Show that vi = ui + ωi.
(b) Show that the regression Y˜i = β0 + β1Xi + vi satisfies the assumptions in Key Concept 4.3.
(c) Are the OLS estimators consistent? Can confidence intervals be constructed in the usual way? If so, how do they differ from those that we would obtain if we had (measurement error free) observations on Yi? Explain.
Now assume that Yi is measured without error but Xi is measured with error. Let X˜i = Xi + ωi denote the observed regressor, where ωi denotes a generic measurement error. The population regression is now Yi = β0 + β1X˜i + vi. Note that this vi is not the same as the vi above.
(d) Express vi in terms of ui and ωi.
(e) Is there errors-in-variables bias if ωi is uncorrelated with Xi? If so, what is the direction of the bias? Explain. You may assume Cov(Xi, ui) = Cov(ui, ωi) = 0.
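A written answer to Problem 1 can be sanity-checked numerically. The sketch below is one possible check, not part of the exercise: it simulates the model with hypothetical values (β0 = 1, β1 = 2, and unit variances for Xi, ui, and ωi, all of which are assumptions chosen for illustration) and compares the OLS slope with clean data, with measurement error in Y, and with measurement error in X. Under these assumed variances, the standard errors-in-variables formula β1·Var(X)/(Var(X) + Var(ω)) gives a probability limit of 1 for the slope in the last case.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
n = 200_000
beta0, beta1 = 1.0, 2.0

x = rng.normal(0.0, 1.0, n)      # true regressor, Var(X) = 1
u = rng.normal(0.0, 1.0, n)      # regression error, independent of X
w = rng.normal(0.0, 1.0, n)      # measurement error, independent of X and u
y = beta0 + beta1 * x + u        # true dependent variable


def ols_slope(regressor, outcome):
    """OLS slope from a one-variable regression with an intercept."""
    rc = regressor - regressor.mean()
    return float(rc @ (outcome - outcome.mean()) / (rc @ rc))


slope_clean = ols_slope(x, y)        # no measurement error
slope_y_err = ols_slope(x, y + w)    # Y observed with error
slope_x_err = ols_slope(x + w, y)    # X observed with error

print(f"clean data:        {slope_clean:.3f}")
print(f"error in Y:        {slope_y_err:.3f}")
print(f"error in X:        {slope_x_err:.3f}")
```

Changing the scale of `w` shows how each estimate responds to the amount of measurement error, which is useful when explaining the answers to (c) and (e).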
2. Traffic crashes are the leading cause of death for Americans between the ages of 5 and 32. Through various spending policies, the federal government has encouraged states to institute mandatory seat belt laws to reduce the number of fatalities and serious injuries. In this exercise you will investigate how effective these laws are in increasing seat belt use and reducing fatalities. The data file Seatbelts contains a panel of data from 50 states plus the District of Columbia for the years 1983 through 1997. A detailed description is given in Seatbelts_Description.
a. Estimate the effect of seat belt use on fatalities by regressing FatalityRate on sb_useage, speed65, speed70, ba08, drinkage21, ln(income), and age. Does the estimated regression suggest that increased seat belt use reduces fatalities?
b. Do the results change when you add state fixed effects? Provide an intuitive explanation for why the results changed.
c. Do the results change when you add time fixed effects plus state fixed effects?
d. Which regression specification, (a), (b), or (c), is most reliable? Explain why.
e. Using the results in (c), discuss the size of the coefficient on sb_useage. Is it large? Small? How many lives would be saved if seat belt use increased from 52% to 90%?
f. There are two ways that mandatory seat belt laws are enforced: "Primary" enforcement means that a police officer can stop a car and ticket the driver if the officer observes an occupant not wearing a seat belt; "secondary" enforcement means that a police officer can write a ticket if an occupant is not wearing a seat belt, but must have another reason to stop the car. In the data set, primary is a binary variable for primary enforcement and secondary is a binary variable for secondary enforcement. Run a regression of sb_useage on primary, secondary, speed65, speed70, ba08, drinkage21, ln(income), and age, including fixed state and time effects in the regression. Does primary enforcement lead to more seat belt use? What about secondary enforcement?
g. In 2000, New Jersey changed from secondary enforcement to primary enforcement. Estimate the number of lives saved per year by making this change.
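The mechanics behind parts (a) through (c) can be illustrated on synthetic data before turning to the Seatbelts file. The sketch below is a minimal illustration under stated assumptions, not the exercise's solution: it builds a balanced panel with the same dimensions (51 units, 15 years), a single regressor, a hypothetical true slope of -0.5, and unobserved state and year effects that are correlated with the regressor by construction. It then compares pooled OLS with the two-way within estimator, which for a balanced panel gives the same slope as including a full set of state and year dummies. None of the numbers come from the Seatbelts data.

```python
import numpy as np

# Synthetic balanced panel; every value here is hypothetical.
rng = np.random.default_rng(1)
n_states, n_years = 51, 15
true_beta = -0.5

alpha = rng.normal(size=(n_states, 1))   # unobserved state effects
lam = rng.normal(size=(1, n_years))      # unobserved year effects

# The regressor is correlated with both effects by construction,
# so pooled OLS suffers from omitted variable bias.
x = 0.8 * alpha + 0.5 * lam + rng.normal(size=(n_states, n_years))
y = true_beta * x + alpha + lam + 0.3 * rng.normal(size=(n_states, n_years))


def slope(regressor, outcome):
    """OLS slope of outcome on regressor (with an intercept)."""
    rc = regressor - regressor.mean()
    oc = outcome - outcome.mean()
    return float((rc * oc).sum() / (rc * rc).sum())


def within(m):
    """Two-way within transformation for a balanced (state x year) panel:
    subtract state means and year means, then add back the grand mean."""
    return (m - m.mean(axis=1, keepdims=True)
              - m.mean(axis=0, keepdims=True)
              + m.mean())


pooled_slope = slope(x, y)                 # ignores state and year effects
fe_slope = slope(within(x), within(y))     # state and time fixed effects

print(f"pooled OLS slope:      {pooled_slope:.3f}")
print(f"two-way FE slope:      {fe_slope:.3f}")
```

With real data the same comparison would be run with the full list of regressors from part (a), and standard errors would typically be clustered by state; the sketch reports point estimates only.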