1. Consider the following causal question:
• What is the effect of health insurance on health?
Answer the following:
(a) What is the outcome variable and what is the treatment?
(b) Define the counterfactual outcomes Y0i and Y1i.
(c) What plausible causal channel(s) run directly from the treatment to the outcome?
(d) What are possible sources of selection bias in the raw comparison of outcomes by treatment status? Which way would you expect the
bias to go and why?
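As a reminder for part (d), the raw comparison of average outcomes by treatment status can be split into a causal effect and a selection term. The sketch below uses the counterfactual notation from part (b); the treatment indicator D_i (equal to 1 if person i is insured) is notation assumed here, not given in the question.

```latex
% Observed outcome in terms of the counterfactual outcomes
% (D_i = 1 if person i has health insurance; notation assumed here)
Y_i = Y_{0i} + (Y_{1i} - Y_{0i}) D_i

% Raw difference in means = effect on the treated + selection bias
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference}}
  = \underbrace{E[Y_{1i} - Y_{0i} \mid D_i = 1]}_{\text{average effect on the treated}}
  + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
```

The sign of the last term depends on whether the insured would have been healthier or less healthy than the uninsured had they not been insured.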
2. For this question we will use a dataset from a randomized experiment conducted by Marianne Bertrand and Sendhil Mullainathan, who sent 4,870 fake resumes out to employers in response to job adverts in Boston and Chicago in 2001. The resumes differ in various attributes, including the names of the applicants, and different resumes were randomly allocated to job openings. Some of the names are distinctly white sounding and some distinctly black sounding. The researchers collecting these data wanted to learn whether black-sounding names obtain fewer callbacks for interviews than white-sounding names. Download the paper and read the introduction. Download the data set lakisha_aer.dta from Blackboard.
(a) Create two dummy variables:
generate female = (sex == "f")
generate black = (race == "b")
(b) Tabulate gender (female) by race (black) using the following command, which gives a cross-tabulation of female and black and displays the percentages of males and females in each race group.
tab female black, col
Tabulate computer skills (computerskills) by race, using the following command.
tab computerskills black, col
Do gender and computer skills look balanced across race groups?
(c) Do a similar tabulation for education (education) and the number of jobs previously held (ofjobs). These variables take on 5 and 7 different values,
respectively. Do education and the number of previous jobs look balanced across race groups?
(d) Use the summarize command to look at the mean and standard deviation of years of experience (yearsexp) separately
for black-sounding and white-sounding names (using the if qualifier). Does this variable look similar by race?
(e) What do you make of the overall results on resume characteristics? Why do we care about whether these variables look similar across
the race groups?
(f) The variable of interest in the data set is call, which indicates a callback for an interview. Do you find differences in callback
rates by race?
(g) What do you conclude from the results of the Bertrand and Mullainathan experiment?
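The steps in parts (a) to (f) can be sketched as a single Stata do-file. This is only one possible layout, not a model answer. It assumes that lakisha_aer.dta sits in the current working directory and that the variables are named as in the question (sex, race, computerskills, education, ofjobs, yearsexp, call). The t-test and the regression in part (f) are suggestions for making the comparison, not commands the question prescribes.

```stata
* Sketch only: assumes lakisha_aer.dta is in the working directory and
* contains the variables named in the question.
use "lakisha_aer.dta", clear

* (a) Dummy variables: the logical expression evaluates to 1 or 0
generate female = (sex == "f")
generate black  = (race == "b")

* (b)-(c) Column percentages of each characteristic within race group
tabulate female black, column
tabulate computerskills black, column
tabulate education black, column
tabulate ofjobs black, column

* (d) Mean and standard deviation of experience, separately by race group
summarize yearsexp if black == 1
summarize yearsexp if black == 0

* (f) Callback rates by race group, with a test of equal means (a suggestion)
tabulate call black, column
ttest call, by(black)

* The same comparison as a regression: the coefficient on black is the
* difference in callback rates between the two groups
regress call black, vce(robust)
```

Because call is a 0/1 variable, its mean within each group is the callback rate, so the difference in means and the coefficient on black measure the same gap.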