1. Consider the following equation relating dollars spent on soft drink advertising by a company to dollars of sales of its product in a quarter:
Sales = 2,326,000,000 + 1.25 advertising - 0.00000000000004 advertising^2
(429,000,000) (0.432) (0.0000000000000015)
n = 36, R^2 = .1988
(There are 13 zeroes after the decimal place and before the 4 in the coefficient on advertising^2. There are 14 zeroes after the decimal place and before the 1 in its standard error.)
Here sales is dollar sales of the product and advertising is dollars spent on advertising. Standard errors are in parentheses below the coefficients.
a. At what point does the effect of advertising on sales become negative?
b. Does it make sense to include the quadratic term in the model?
c. Does it make sense to spend money on advertising? At which level of advertising is the increase in sales from advertising less than the amount spent on advertising?
d. Define advbil as advertising measured in millions of dollars: advbil = advertising/1,000,000. Rewrite the estimated equation with advbil and advbil^2 as the independent variables.
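The arithmetic behind parts a, c, and d can be checked with a short script. This is only a sketch, assuming the coefficients are read exactly as printed in the equation; the variable names (`b1`, `b2`, `scale`, and so on) are illustrative and not part of the question.

```python
# Coefficients as printed in the estimated equation for question 1.
b1 = 1.25       # coefficient on advertising
b2 = -4e-14     # coefficient on advertising^2
se_b2 = 1.5e-15 # standard error of the advertising^2 coefficient

def marginal_effect(adv):
    """Change in predicted sales per extra dollar of advertising: b1 + 2*b2*adv."""
    return b1 + 2 * b2 * adv

# Advertising level at which the marginal effect falls to zero.
turning_point = -b1 / (2 * b2)

# Advertising level at which one more dollar of advertising
# adds exactly one dollar of predicted sales.
break_even = (1 - b1) / (2 * b2)

# t statistic for the quadratic term.
t_quadratic = b2 / se_b2

# Rescaling: if advbil = advertising / scale, the linear coefficient is
# multiplied by scale and the quadratic coefficient by scale squared.
scale = 1_000_000
b1_rescaled = b1 * scale
b2_rescaled = b2 * scale ** 2

print(f"turning point: {turning_point:.4e} dollars")
print(f"break-even:    {break_even:.4e} dollars")
print(f"t on quadratic term: {t_quadratic:.2f}")
print(f"rescaled coefficients: {b1_rescaled:.4g}, {b2_rescaled:.4g}")
```

The intercept and the R^2 are unaffected by the rescaling, which is why only the two slope coefficients are recomputed.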
2. The following three equations were estimated using 1,000 observations:
Prate = 80.34 + 6.17 mrate + .272 age - .00025 totemp
(3.20) (.56) (.049) (.00008)
R^2 = .110 R' = 0.091
Prate = 90.61 + 6.87 mrate + .314 age - 2.72 log(totemp)
(1.86) (.51) (.044) (.28)
R^2 = .156
Prate = 80.61 + 6.54 mrate + .295 age - .00025 totemp + .0000000041 totemp^2
(.86) (.52) (.040) (.00009) (.0000000010)
R^2 = .123
where Prate is the participation rate in a 401(k) plan, mrate is the firm's matching rate, age is the age of a worker, and totemp is the number of employees working for the firm. Standard errors are in parentheses.
Which of these three models do you prefer and why?
3. Consider the following equation in which salaries are predicted based on education, race, and gender.
log(salary) = 10.59 + .127 log(postsec) - 0.145 female - 0.069 black + 0.021 female*black
(1.22) (.053) (0.041) (0.025) (.0052)
n = 3,333, R^2 = .238
where salary is the dollar salary of a worker, postsec is the worker's number of years of post-high-school education, female is a dummy variable equal to 1 if the worker is female and 0 if male, and black is a dummy variable equal to 1 if the worker is black and 0 otherwise.
a.) Is there strong evidence that postsec should be included in the model?
b.) What is the approximate estimated percentage difference in expected salary between nonblack females and nonblack males holding postsec fixed? Also, what is the exact percentage difference in estimated salary between nonblack females and nonblack males holding postsec fixed?
c.) What is the approximate estimated percentage difference in salary between nonblack males and black males? Test the null hypothesis that there is no difference in their salaries against the alternative that there is a difference.
d.) What is the approximate estimated percentage difference in salary between black females and nonblack females? What would you need to do to test whether the difference is statistically significant?
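Parts b through d rest on the difference between the approximate and the exact percentage effect of a dummy variable when the dependent variable is in logs. A minimal sketch, assuming the coefficients are read as printed above; the helper names `approx_pct` and `exact_pct` are illustrative only.

```python
import math

def approx_pct(coef):
    """Approximate percentage difference implied by a log-level dummy coefficient."""
    return 100 * coef

def exact_pct(coef):
    """Exact percentage difference: 100 * (exp(coef) - 1)."""
    return 100 * (math.exp(coef) - 1)

# Coefficients and one standard error as printed in the question 3 equation.
b_female = -0.145
b_black = -0.069
se_black = 0.025
b_female_black = 0.021

# Nonblack females vs. nonblack males: only the female dummy differs.
female_approx = approx_pct(b_female)
female_exact = exact_pct(b_female)

# Black males vs. nonblack males: only the black dummy differs,
# so the usual t statistic on that coefficient applies.
t_black = b_black / se_black

# Black females vs. nonblack females: the black dummy and the
# interaction term both switch on, so the gap is a sum of two coefficients.
black_female_gap = b_black + b_female_black

print(f"female, approximate: {female_approx:.2f}%")
print(f"female, exact:       {female_exact:.2f}%")
print(f"t statistic on black: {t_black:.2f}")
print(f"black vs. nonblack females, approximate: {approx_pct(black_female_gap):.2f}%")
```

Because the last gap is a sum of two estimated coefficients, its standard error cannot be read off the printed output; it depends on the covariance between the two estimates.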
4. Consider the following model:
Rrate = 45.2 + 18.4(drug)
(11.4) (5.6)
n = 11,500, R^2 = .459
where Rrate is the remission rate from a type of lymphoma for cancer patients and drug is a dummy variable indicating if the patient tried a new experimental drug during chemotherapy.
Let nodrug be a dummy variable equal to one if the patient didn't try the new experimental drug during chemotherapy and equal to zero otherwise.
a) If nodrug is used in place of drug in the model above, what happens to the intercept in the estimated equation? What will be the coefficient on nodrug?
b) What will happen to R^2 if nodrug is used in place of drug in the model above?
c) Should nodrug and drug both be included as independent variables in the model? Explain.
5. Consider a linear model to explain the unemployment rate in each state:
unemployment_i = β_0 + β_1 income_i + β_2 education_i + β_3 age_i + u_i
E(u_i | income_i, education_i, age_i) = 0
Var(u_i | income_i, education_i, age_i) = σ^2 / income_i
where income_i is the per capita income of state i. Similarly, education_i and age_i are the average education and age levels for state i.
Write the transformed equation that has a homoskedastic error term.
6. Consider the following relationship between number of pieces of candy eaten per day and pounds gained per month:
Person Candy Pounds
1 3 6
2 5 7
3 1 2
4 8 10
5 3 5
6 3 6
7 5 7
8 1 2
9 8 10
10 3 5
a. Estimate the OLS regression of pounds (y) on candy (x). (Hint: the slope and intercept should be similar to those from a question on the first exam.)
b. Once this is done, perform the Breusch-Pagan test for heteroskedasticity to see whether the variance of the error term depends on the number of pieces of candy eaten.
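Hand calculations for both parts can be checked with a few lines of standard-library Python. This is a sketch under one assumption that the question does not state: it uses the LM form of the Breusch-Pagan test (n times the R^2 from regressing the squared OLS residuals on candy, compared with a chi-square distribution with 1 degree of freedom). A course that uses the F form would compute the statistic differently. The helper `simple_ols` is illustrative, not a library function.

```python
import math

# Data from the table in question 6.
candy  = [3, 5, 1, 8, 3, 3, 5, 1, 8, 3]
pounds = [6, 7, 2, 10, 5, 6, 7, 2, 10, 5]

def simple_ols(x, y):
    """One-regressor OLS; returns intercept, slope, residuals, and R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum(e ** 2 for e in residuals)
    return intercept, slope, residuals, 1 - ssr / sst

# Part a: regress pounds on candy.
intercept, slope, residuals, r_squared = simple_ols(candy, pounds)

# Part b: auxiliary regression of the squared residuals on candy.
squared_residuals = [e ** 2 for e in residuals]
_, aux_slope, _, aux_r_squared = simple_ols(candy, squared_residuals)

# LM statistic and its p-value. For 1 degree of freedom,
# P(chi-square > c) equals erfc(sqrt(c / 2)).
lm_stat = len(candy) * aux_r_squared
p_value = math.erfc(math.sqrt(lm_stat / 2))

print(f"pounds = {intercept:.4f} + {slope:.4f} candy, R^2 = {r_squared:.4f}")
print(f"auxiliary R^2 = {aux_r_squared:.4f}, LM = {lm_stat:.4f}, p-value = {p_value:.4f}")
```

With only one regressor in the auxiliary regression, the chi-square p-value has a closed form through `math.erfc`, which avoids any dependency beyond the standard library.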