Use the data in APPLE.RAW to answer this question.
i. Define a binary variable as ecobuy = 1 if ecolbs > 0 and ecobuy = 0 if ecolbs = 0. In other words, ecobuy indicates whether, at the prices given, a family would buy any ecologically friendly apples. What fraction of families claim they would buy ecolabeled apples?
ii. Estimate the linear probability model $ecobuy = \beta_0 + \beta_1\,ecoprc + \beta_2\,regprc + \beta_3\,faminc + \beta_4\,hhsize + \beta_5\,educ + \beta_6\,age + u$, and report the results in the usual form. Carefully interpret the coefficients on the price variables.
iii. Are the nonprice variables jointly significant in the LPM? (Use the usual F statistic, even though it is not valid when there is heteroskedasticity.) Which explanatory variable other than the price variables seems to have the most important effect on the decision to buy ecolabeled apples? Does this make sense to you?
iv. In the model from part (ii), replace faminc with log(faminc). Which model fits the data better, using faminc or log(faminc)? Interpret the coefficient on log(faminc).
v. In the estimation in part (iv), how many estimated probabilities are negative? How many are bigger than one? Should you be concerned?
vi. For the estimation in part (iv), compute the percent correctly predicted for each outcome, ecobuy = 0 and ecobuy = 1. Which outcome is best predicted by the model?
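
The bookkeeping in parts (i), (v), and (vi) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the toy numbers are made up and are not taken from APPLE.RAW, and the fitted probabilities are assumed to come from whatever regression routine you used for part (iv). The sketch also assumes the common convention of predicting ecobuy = 1 when the fitted probability is at least 0.5.

```python
def make_ecobuy(ecolbs):
    """Part (i): ecobuy = 1 if ecolbs > 0, and 0 if ecolbs = 0."""
    return [1 if x > 0 else 0 for x in ecolbs]


def count_out_of_range(fitted):
    """Part (v): count fitted probabilities below zero and above one."""
    negative = sum(1 for p in fitted if p < 0)
    above_one = sum(1 for p in fitted if p > 1)
    return negative, above_one


def percent_correct_by_outcome(y, fitted, threshold=0.5):
    """Part (vi): percent correctly predicted, separately for y = 0 and y = 1.

    Assumption: predict 1 when the fitted probability is >= threshold.
    """
    result = {}
    for outcome in (0, 1):
        # Predictions for the observations whose actual outcome is `outcome`.
        preds = [1 if p >= threshold else 0
                 for yi, p in zip(y, fitted) if yi == outcome]
        if preds:
            hits = sum(1 for pred in preds if pred == outcome)
            result[outcome] = 100.0 * hits / len(preds)
        else:
            result[outcome] = None  # no observations with this outcome
    return result


# Toy values for illustration only (NOT from APPLE.RAW).
ecolbs = [0.0, 2.0, 0.0, 1.5, 3.0, 0.0]
fitted = [0.30, 0.70, 0.60, 0.40, 1.05, -0.02]

ecobuy = make_ecobuy(ecolbs)
fraction_buying = sum(ecobuy) / len(ecobuy)
n_negative, n_above_one = count_out_of_range(fitted)
pcp = percent_correct_by_outcome(ecobuy, fitted)

print(fraction_buying, n_negative, n_above_one, pcp)
```

Reporting the two percentages separately matters because the overall percent correctly predicted can look respectable even when one of the two outcomes is predicted poorly.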