Question: This problem demonstrates how multiple regression models can be used to measure discrimination in labor markets. The data, taken from the 1991 Current Population Survey, contain information on wages, education, years of labor market experience, gender, and union status.
a. Calculate the mean hourly wage for men and women in this sample. What is the difference in mean wages? Is this evidence of discrimination? Why or why not?
b. Estimate a model relating the log of wage to gender. How do you interpret the coefficient on gender? How does it relate to your answer in part (a)? How do you interpret the other coefficients?
c. Estimate a model relating the log of wage to education, years of labor market experience, gender, and union status. What happens to the coefficient on gender compared to part (b)? Why? Does this suggest discrimination? Why or why not? Suggest other variables you might include in the model.
d. Suppose you could add a series of dummy variables indicating each person's occupation to the model estimated in part (c). What might happen to the coefficient on gender? Is this a legitimate variable to include in the analysis? Why or why not?
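
As a starting point for parts (a) and (b), the following is a minimal sketch of how a regression of log wage on a single gender dummy relates to a comparison of group means. It uses synthetic data generated in the code, not the 1991 CPS sample, and the names `female`, `log_wage`, and `ols_simple` are assumptions made for illustration. With only a dummy regressor, the OLS slope equals the difference in mean log wages between the two groups, and the intercept equals the mean log wage of the group coded 0.

```python
import math
import random
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic data for illustration only; NOT the 1991 CPS sample.
# female = 1 for women, 0 for men (hypothetical coding).
rng = random.Random(0)
female = [rng.randint(0, 1) for _ in range(500)]
log_wage = [2.4 - 0.2 * f + rng.gauss(0, 0.5) for f in female]

# Part (b): regress log wage on the gender dummy.
intercept, slope = ols_simple(female, log_wage)

# Part (a) analogue in logs: difference in group means.
mean_men = mean(lw for lw, f in zip(log_wage, female) if f == 0)
mean_women = mean(lw for lw, f in zip(log_wage, female) if f == 1)
gap = mean_women - mean_men

# Approximate proportional wage difference implied by the log coefficient.
pct_gap = math.exp(slope) - 1

print(f"slope on female:        {slope:.4f}")
print(f"difference in means:    {gap:.4f}")
print(f"implied percentage gap: {100 * pct_gap:.1f}%")
```

Because the dependent variable is in logs, the coefficient is read as an approximate proportional difference in wages rather than a difference in dollars, so it will not match the raw difference in mean hourly wages from part (a) exactly. Part (c) adds more regressors and would normally be estimated with a regression package rather than the hand-rolled helper above.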