We want to understand the factors that determine students' performance in fifth-grade tests. We observe a sample of 420 school districts in California and the following variables:
- TESTSCR: average district score on the math and reading tests
- STR: average student-teacher ratio in the district
- AVGINC: average income in the district (measured in thousands of $)
- EL_PCT: % of students for whom English is a second language
- MEAL_PCT: % of students in the district eligible for reduced-price lunch
- COMP_STU: average number of computers per student
We estimate the following regression model by OLS in EViews:
Table 1
Dependent Variable: LOG(TESTSCR)
Included observations: 420

| Variable | Coefficient | Std. Error |
| --- | --- | --- |
| C | 6.47807 | 0.012361 |
| STR | -0.000764 | 0.000365 |
| LOG(AVGINC) | 0.018158 | 0.002708 |
| EL_PCT | -0.000431 | 0.000117 |
| EL_PCT^2 | 0.000002 | 0.000001 |
| MEAL_PCT | -0.000588 | 4.76E-05 |
| COMP_STU | 0.024809 | 0.010622 |
| R-squared | 0.802032 | |
| Sum squared resid | 0.070317 | |
1. Interpret the coefficient of LOG(AVGINC). Is its sign consistent with your expectations? Justify your answer.
2. Test the significance of the coefficient of MEAL_PCT at the 5% significance level against the alternative hypothesis that it is negative.
3. Based on the regression results in Table 1, is there evidence of a quadratic relationship between the dependent variable and EL_PCT? Justify your answer.
4. Test the overall significance of the regression.
5. Discuss the goodness of fit of the model in Table 1.
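The test statistics that questions 2 to 4 call for can be computed directly from the reported coefficients, standard errors, and R-squared. The following is a minimal sketch rather than a model answer. It assumes the usual homoskedastic OLS formulas, and it uses the standard normal distribution as an approximation to the t distribution with 413 degrees of freedom. The helper `t_stat` and all variable names are illustrative.

```python
from statistics import NormalDist

# Values copied from the first regression output (Table 1).
n = 420   # included observations
k = 6     # regressors excluding the constant


def t_stat(coef, se):
    """t-statistic for H0: coefficient = 0."""
    return coef / se


# Assumption: with n - k - 1 = 413 degrees of freedom, the standard
# normal is used as an approximation to the t distribution.
normal = NormalDist()
crit_left_5 = normal.inv_cdf(0.05)      # left-tail 5% critical value
crit_two_sided_5 = normal.inv_cdf(0.975)  # two-sided 5% critical value

# Question 2: H0: beta_MEAL_PCT = 0 against H1: beta_MEAL_PCT < 0.
t_meal = t_stat(-0.000588, 4.76e-05)
reject_meal = t_meal < crit_left_5

# Question 3: t-statistic on the squared term EL_PCT^2.
# The reported values are heavily rounded, so this ratio is imprecise.
t_el_sq = t_stat(0.000002, 0.000001)

# Question 4: overall F-statistic computed from R-squared.
r2 = 0.802032
f_overall = (r2 / k) / ((1 - r2) / (n - k - 1))

print(f"t(MEAL_PCT)  = {t_meal:.2f}, reject H0 at 5%: {reject_meal}")
print(f"t(EL_PCT^2)  = {t_el_sq:.2f}, two-sided critical value: {crit_two_sided_5:.2f}")
print(f"F(overall)   = {f_overall:.1f} with ({k}, {n - k - 1}) degrees of freedom")
```

The t-statistic for MEAL_PCT is about -12.35, far below the one-sided critical value of about -1.645. The statistic for EL_PCT^2 sits right at the two-sided threshold, so any conclusion there should acknowledge the rounding of the reported coefficient and standard error. The overall F-statistic still has to be compared with a tabulated critical value for (6, 413) degrees of freedom.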
You are provided the following regression output:
Dependent Variable: LOG(TESTSCR)
Included observations: 420

| Variable | Coefficient | Std. Error |
| --- | --- | --- |
| C | 6.387050 | 0.010417 |
| STR | -0.000306 | 0.000410 |
| LOG(AVGINC) | 0.043558 | 0.002089 |
| EL_PCT | -0.001161 | 0.000119 |
| EL_PCT^2 | 8.55E-06 | 1.91E-06 |
| R-squared | 0.725978 | |
| Sum squared resid | 0.097331 | |
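This second output omits MEAL_PCT and COMP_STU, so one natural use of it is as the restricted model in a joint F-test of those two coefficients. The text above does not state which question the output is meant for, so the sketch below is an assumption about its purpose. It uses the standard homoskedastic F-statistic in both its sum-of-squared-residuals form and its R-squared form. All variable names are illustrative.

```python
# Unrestricted model: first output (Table 1), 6 regressors plus constant.
# Restricted model: second output, which drops MEAL_PCT and COMP_STU.
n = 420          # included observations in both regressions
k_unres = 6      # regressors in the unrestricted model, excluding constant
q = 2            # number of restrictions (two coefficients set to zero)

ssr_unres = 0.070317
ssr_res = 0.097331
r2_unres = 0.802032
r2_res = 0.725978

df_denominator = n - k_unres - 1  # 413

# F-statistic from the sums of squared residuals.
f_from_ssr = ((ssr_res - ssr_unres) / q) / (ssr_unres / df_denominator)

# Equivalent form using R-squared; valid here because both regressions
# have the same dependent variable, LOG(TESTSCR).
f_from_r2 = ((r2_unres - r2_res) / q) / ((1 - r2_unres) / df_denominator)

print(f"F from SSR       = {f_from_ssr:.2f}")
print(f"F from R-squared = {f_from_r2:.2f}")
print(f"degrees of freedom: ({q}, {df_denominator})")
```

Both forms give roughly the same value, with small differences due to rounding in the reported figures. The statistic would then be compared with a tabulated critical value for (2, 413) degrees of freedom.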