Practice Questions for the Final Exam:
Theoretical Part-
1. Define a dummy variable and give two examples.
2. Analyze the three different types of data (cross-sectional, time series, panel data).
3. Define $R^2$ and $\bar{R}^2$ (adjusted $R^2$). What is their most important property? Show the relationship between them and explain how they differ. Analyze as fully as you can.
4. (a) Analyze fully the "homoscedasticity" assumption of the CLRM. How does it differ from heteroscedasticity? (b) Analyze fully the assumption of no specification bias and use an example to illustrate your intuition.
5. State the CLRM (Classical Linear Regression Model) assumptions.
6. State the Gauss-Markov Theorem and provide full definitions of the characteristics of a BLUE estimator (unbiasedness, linearity, efficiency).
7. Analyze the procedure of the Maximum Likelihood (ML) estimator for the following bivariate regression model: Yi = β1 + β2Xi + ui. Specify any advantages or disadvantages the ML estimator has over the OLS estimator. The log-likelihood function is given as:
$$\ln L = -\frac{n}{2}\ln\sigma^2 - \frac{n}{2}\ln(2\pi) - \frac{1}{2}\sum_{i=1}^{n}\frac{(Y_i - \beta_1 - \beta_2 X_i)^2}{\sigma^2}$$
8. You are given the following non-linear regression model:
$Y = \beta_1 X^{-\beta_2} e^{u}$, where Y is the dependent variable, X is the independent variable, e is the base of the natural logarithm, u is the error term, and β1 and β2 are the coefficients. Make the necessary transformation(s) so that the model can be estimated by OLS.
9. You are given the following two models:
(GPDI)^ = -1026.5 + 0.30GDP
se = (257.58) (0.04)
where GPDI and GDP are measured in billions of dollars.
(GPDI*)^ = 0.94GDP*
se = (0.115)
where GPDI* and GDP* are the standardized versions of the variables GPDI and GDP.
Interpret the coefficients of GDP and GDP* with economic reasoning.
10. Briefly analyze the different reasons for conducting hypothesis tests in a multiple regression model.
11. Discuss two types of specification errors that may arise in a classical linear regression model. Use an example for each case.
12. Choose the correct answer.
i. The α in the confidence interval given by $\Pr(\hat{\beta}_i - \delta \le \beta_i \le \hat{\beta}_i + \delta) = 1 - \alpha$ is known as:
a. Confidence coefficient
b. Level of confidence
c. Level of significance
d. Significance coefficient
ii. The 1 - α in the confidence interval given by $\Pr(\hat{\beta}_i - \delta \le \beta_i \le \hat{\beta}_i + \delta) = 1 - \alpha$ is known as:
a. Confidence coefficient
b. Level of confidence
c. Level of significance
d. Significance coefficient
iii. The standard error of an estimator is a measure of:
a. Population estimator
b. Precision of the estimator
c. Power of the estimator
d. Confidence interval of the estimator
iv. For a regression through the origin, the intercept is equal to
a. 1
b. 2
c. 0
d. -1
v. Which of the following statements is correct?
a. Multicollinearity arises when explanatory variables are highly correlated with each other.
b. Multicollinearity can be identified by examining the pattern of correlations among explanatory variables.
c. A high correlation between the dependent variable and a given independent variable is a sign of multicollinearity.
d. All of the above are correct
e. Only (a) and (b) are correct
f. Only (b) and (c) are correct
Empirical Part-
Problem 1 - The table contains the ACT scores and the GPA for eight college students. GPA is based on a 4-point scale and has been rounded to one digit after the decimal.
Student | GPA | ACT
1       | 2.8 | 21
2       | 3.4 | 24
3       | 3.0 | 26
4       | 3.5 | 27
5       | 3.6 | 29
6       | 3.0 | 25
7       | 2.7 | 25
8       | 3.7 | 30
Obtain the estimates $\hat{\alpha}_0$ and $\hat{\alpha}_1$ in the linear regression model: $\widehat{GPA}_i = \hat{\alpha}_0 + \hat{\alpha}_1 ACT_i$
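For checking a hand calculation, the estimates can be reproduced with the textbook bivariate OLS formulas: the slope is the sum of cross-deviations divided by the sum of squared deviations of ACT, and the intercept is the mean of GPA minus the slope times the mean of ACT. The following is a minimal Python sketch that uses only the eight observations from the table; the function and variable names are illustrative and not part of the problem statement.

```python
# Bivariate OLS by hand for the eight (ACT, GPA) pairs given in the table.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]


def ols_bivariate(x, y):
    """Return (intercept, slope) of the least-squares line y = a0 + a1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


alpha0_hat, alpha1_hat = ols_bivariate(act, gpa)
print(f"alpha0_hat = {alpha0_hat:.4f}, alpha1_hat = {alpha1_hat:.4f}")
```

The slope is read as the predicted change in GPA for a one-point increase in the ACT score.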
Problem 2 -
(I) The following equation is part of a nutrition-based efficiency wage model. Total calories (Cals) were regressed on the number of meals given to guests at ceremonies (MG), the number of meals given to employees (ME), and the number of meals given to guests on other occasions (MO):
Cals = β0 + β1 MG + β2 ME + β3 MO + e (Model A)
The expected signs of the coefficients are β1 > 0, β2 > 0, β3 > 0.
You ran the regression in EViews using OLS for the period 1960-1999 and obtained the following output:
Dependent Variable: CALS
Method: Least Squares
Included observations: 40 after adjustments
Variable | Coefficient | Std. Error | t-Statistic
C        | 27.59394    |            | 17.41539
MG       | 0.607160    | 0.157120   |
ME       | 0.092188    |            | 2.311452
MO       | 0.244860    | 0.011095   |

R-squared          |           | Mean dependent var    |
Adjusted R-squared | 0.989590  | S.D. dependent var    | 19.53879
S.E. of regression |           | Akaike info criterion |
Sum squared resid  | 143.0726  | Schwarz criterion     |
Log likelihood     | -82.24700 | Hannan-Quinn criter.  |
F-statistic        |           | Durbin-Watson stat    | 0.897776

a. Some of the standard errors and t-statistics of the coefficients are missing from the output. Calculate them (be careful about the signs). Remember that the null hypothesis is H0: βi = 0. Test at the α = 5% significance level; the critical t-value is given as 2.03. Do you reject or fail to reject the null hypothesis?
b. Interpret the effects of the MG and ME variables with economic reasoning.
c. The value of $R^2$ is missing from the output. Calculate it.
d. The standard error of the regression is not visible in the above output. Calculate it by using the relevant formula: $\hat{\sigma}^2 = \sum \hat{u}_i^2 / (n - k)$, where n is the number of observations and k is the number of estimated parameters. The standard error of the regression is $\hat{\sigma}$.
e. The F-statistic for overall significance is missing. Calculate it and test it at the 5% significance level. Fcritical(5%, 3, 36) is given as 2.87. Note that we are jointly testing all the coefficients, excluding the intercept.
f. Calculate the 95% confidence intervals for the coefficients of MG and MO.
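The arithmetic behind parts (a) and (c) to (f) can be sketched in a few lines of Python. This is only an illustrative check, and it rests on one assumption about the damaged output: the surviving entries 17.41539 (for C) and 2.311452 (for ME) are read as t-statistics, while 0.157120 (for MG) and 0.011095 (for MO) are read as standard errors. It uses the standard identities t = coefficient / standard error, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k), and F = [R²/(k - 1)] / [(1 - R²)/(n - k)]. All variable names are hypothetical.

```python
import math

n, k = 40, 4            # observations; estimated parameters incl. intercept
t_crit = 2.03           # critical t-value given in part (a)
ssr = 143.0726          # "Sum squared resid" from the output
adj_r2 = 0.989590       # "Adjusted R-squared" from the output

coef = {"C": 27.59394, "MG": 0.607160, "ME": 0.092188, "MO": 0.244860}
# ASSUMPTION: which surviving entry belongs to which column.
se = {"MG": 0.157120, "MO": 0.011095}       # read as standard errors
t_stat = {"C": 17.41539, "ME": 2.311452}    # read as t-statistics

# (a) Fill in whichever of se / t is missing, using t = coef / se.
for name, b in coef.items():
    if name in t_stat:
        se[name] = b / t_stat[name]
    else:
        t_stat[name] = b / se[name]
reject = {name: abs(t) > t_crit for name, t in t_stat.items()}

# (c) Recover R-squared from the adjusted R-squared.
r2 = 1 - (1 - adj_r2) * (n - k) / (n - 1)

# (d) Residual variance and standard error of the regression.
sigma2_hat = ssr / (n - k)
se_regression = math.sqrt(sigma2_hat)

# (e) F-statistic for overall significance (k - 1 slope restrictions).
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))

# (f) 95% confidence intervals: coef +/- t_crit * se.
ci = {name: (coef[name] - t_crit * se[name], coef[name] + t_crit * se[name])
      for name in ("MG", "MO")}

for name in coef:
    print(f"{name}: se = {se[name]:.6f}, t = {t_stat[name]:.4f}, "
          f"reject H0: {reject[name]}")
print(f"R2 = {r2:.6f}, S.E. of regression = {se_regression:.6f}, F = {f_stat:.2f}")
print("95% CI:", {name: tuple(round(v, 4) for v in iv) for name, iv in ci.items()})
```

If the column assignment assumed above is wrong, only the dictionaries `se` and `t_stat` need to be changed; the remaining steps do not depend on it.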
(II) You decide to re-specify Model A by dropping the variable MO. Your model becomes:
Cals = β'0 + β'1 MG + β'2 ME + ε (Model B)
You run the model and obtain the following output:
Dependent Variable: CALS
Method: Least Squares
Included observations: 40 after adjustments
Variable | Coefficient | Std. Error | t-Statistic
C        | 35.45670    |            |
MG       | 2.567990    |            |
ME       | 0.892676    |            |

R-squared          | 0.860390  | Mean dependent var    |
Adjusted R-squared | 0.852844  | S.D. dependent var    | 19.53879
S.E. of regression | 7.495264  | Akaike info criterion |
Sum squared resid  | 2078.622  | Schwarz criterion     |
Log likelihood     | -135.7692 | Hannan-Quinn criter.  |
F-statistic        | 114.0122  | Durbin-Watson stat    | 0.678483

a. It is not clear which of Model A and Model B performs better. Which model is preferable in terms of model building? Use the information criteria to decide.
b. The output is problematic: the standard errors and the t-statistics are missing. Use the Wald test to test the restriction that turns Model A into Model B. The null hypothesis is H0: β3 = 0. The critical F-value is given as 4.11 and the critical $\chi^2_1$ value is given as 3.841 (both at the 5% significance level). What can you conclude regarding the dropped variable MO?
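Both parts can be checked numerically from the log likelihoods and residual sums of squares reported for the two models. The sketch below assumes the per-observation form of the information criteria (AIC = -2l/n + 2k/n, Schwarz = -2l/n + k ln(n)/n, Hannan-Quinn = -2l/n + 2k ln(ln n)/n), which is an assumption about how the software scales them, and it computes the Wald statistic in its F form from the restricted (Model B) and unrestricted (Model A) residual sums of squares, with the chi-square form taken as the number of restrictions times F. Names such as `info_criteria` and `wald_f` are illustrative.

```python
import math

n = 40
models = {
    "A": {"loglik": -82.24700, "k": 4, "ssr": 143.0726},   # unrestricted
    "B": {"loglik": -135.7692, "k": 3, "ssr": 2078.622},   # restricted (MO dropped)
}


def info_criteria(loglik, k, n):
    """Per-observation AIC, Schwarz, and Hannan-Quinn criteria (assumed scaling)."""
    base = -2 * loglik / n
    return {
        "AIC": base + 2 * k / n,
        "SC": base + k * math.log(n) / n,
        "HQ": base + 2 * k * math.log(math.log(n)) / n,
    }


# (a) Smaller values of each criterion indicate the preferred model.
ic = {name: info_criteria(m["loglik"], m["k"], n) for name, m in models.items()}
for name, values in ic.items():
    print(name, {crit: round(v, 4) for crit, v in values.items()})

# (b) Wald test of H0: beta3 = 0 (one restriction), F form.
m_restrictions = 1
ssr_u = models["A"]["ssr"]
ssr_r = models["B"]["ssr"]
df_u = n - models["A"]["k"]
wald_f = ((ssr_r - ssr_u) / m_restrictions) / (ssr_u / df_u)
wald_chi2 = m_restrictions * wald_f   # large-sample chi-square form

print(f"Wald F = {wald_f:.2f} (critical value 4.11)")
print(f"Wald chi-square = {wald_chi2:.2f} (critical value 3.841)")
```

A statistic above its critical value leads to rejecting H0: β3 = 0, which would mean that MO should not have been dropped.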
Problem 3 - $\mathit{lnip}_{t+1} = \beta_0 + \beta_1\,\mathit{lnol}_t + \beta_2\,\mathit{lnrsr}_t + \beta_3\,\mathit{empl}_t + u_t$
The above linear regression model states that industrial productivity (lnip) is positively affected by stock market returns (lnrsr) and the employment ratio (empl) and negatively affected by oil prices (lnol). We are examining US market growth during the period 1960-1999.
The personal computer was introduced in 1980, and we want to test whether there was any structural change after 1980:
H0: There was no structural break (or change) after 1980
H1: There was a structural break (or change) after 1980
You obtain the following three regression outputs when carrying out the Chow test.
Dependent Variable: LNIP
Method: Least Squares
Sample: 1960 1999
Included observations: 40
Variable | Coefficient | Std. Error | t-Statistic | Prob.
C        | 27.59394    | 1.584458   | 17.41539    | 0.0000
LNOL     | -0.607160   | 0.157120   | -3.864300   | 0.0004
LNRSR    | 0.092188    | 0.039883   | 2.311452    | 0.0266
EMPL     | 0.244860    | 0.011095   | 22.06862    | 0.0000

R-squared          | 0.990391  | Mean dependent var    | 50.56725
Adjusted R-squared | 0.989590  | S.D. dependent var    | 19.53879
S.E. of regression | 1.993549  | Akaike info criterion | 4.312350
Sum squared resid  | 143.0726  | Schwarz criterion     | 4.481238
Log likelihood     | -82.24700 | Hannan-Quinn criter.  | 4.373414
F-statistic        | 1236.776  | Durbin-Watson stat    | 0.897776
Prob(F-statistic)  | 0.000000  |                       |

Dependent Variable: LNIP
Method: Least Squares
Sample: 1960 1979
Included observations: 20
Variable | Coefficient | Std. Error | t-Statistic | Prob.
C        | 27.59882    | 2.433883   | 11.33942    | 0.0000
LNOL     | -0.899693   | 0.297873   | -3.020394   | 0.0081
LNRSR    | 0.181932    | 0.098121   | 1.854171    | 0.0822
EMPL     | 0.265328    | 0.058970   | 4.499342    | 0.0004

R-squared          | 0.913357  | Mean dependent var    | 34.28700
Adjusted R-squared | 0.897112  | S.D. dependent var    | 6.199594
S.E. of regression | 1.988596  | Akaike info criterion | 4.389592
Sum squared resid  | 63.27225  | Schwarz criterion     | 4.588738
Log likelihood     | -39.89592 | Hannan-Quinn criter.  | 4.428467
F-statistic        | 56.22198  | Durbin-Watson stat    | 1.116410
Prob(F-statistic)  | 0.000000  |                       |

Dependent Variable: LNIP
Method: Least Squares
Sample: 1980 1999
Included observations: 20
Variable | Coefficient | Std. Error | t-Statistic | Prob.
C        | 16.18376    | 3.874379   | 4.177124    | 0.0007
LNOL     | -0.345689   | 0.136746   | -2.527964   | 0.0224
LNRSR    | 0.151866    | 0.046860   | 3.240840    | 0.0051
EMPL     | 0.272712    | 0.008611   | 31.67100    | 0.0000

R-squared          | 0.993168  | Mean dependent var    | 66.84750
Adjusted R-squared | 0.991887  | S.D. dependent var    | 13.68186
S.E. of regression | 1.232329  | Akaike info criterion | 3.432546
Sum squared resid  | 24.29817  | Schwarz criterion     | 3.631692
Log likelihood     | -30.32546 | Hannan-Quinn criter.  | 3.471421
F-statistic        | 775.3400  | Durbin-Watson stat    | 1.665783
Prob(F-statistic)  | 0.000000  |                       |

a. Explain the procedure of the Chow test, i.e., the steps you have to take in order to draw conclusions about structural stability.
b. Calculate the Chow F-statistic. According to the F-statistic you just calculated, do you reject or fail to reject the null hypothesis? What does that mean for your data? You are given that Fcritical(0.05, 4, 32) = 2.69.
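The Chow F-statistic compares the residual sum of squares of the pooled regression (restricted, 1960-1999) with the sum of the residual sums of squares of the two sub-period regressions (unrestricted), using k restrictions in the numerator and n1 + n2 - 2k degrees of freedom in the denominator. A minimal Python sketch of that calculation, with the inputs taken from the three outputs above and with an illustrative function name, might look like this:

```python
# Residual sums of squares read from the three regression outputs.
ssr_pooled = 143.0726   # 1960-1999 (restricted: same coefficients throughout)
ssr_1 = 63.27225        # 1960-1979
ssr_2 = 24.29817        # 1980-1999
n1, n2 = 20, 20         # observations in each sub-period
k = 4                   # parameters per regression (intercept + 3 slopes)
f_crit = 2.69           # Fcritical(0.05, 4, 32) given in part (b)


def chow_f(ssr_r, ssr_a, ssr_b, n_a, n_b, k):
    """Chow F-statistic for a single known break point."""
    ssr_ur = ssr_a + ssr_b              # unrestricted RSS
    df_den = n_a + n_b - 2 * k          # denominator degrees of freedom
    return ((ssr_r - ssr_ur) / k) / (ssr_ur / df_den)


f_stat = chow_f(ssr_pooled, ssr_1, ssr_2, n1, n2, k)
reject_h0 = f_stat > f_crit

print(f"Chow F({k}, {n1 + n2 - 2 * k}) = {f_stat:.4f}")
print("Reject H0 of no structural break:", reject_h0)
```

The decision rule is the usual one: if the computed F exceeds the critical value, the null hypothesis of no structural break is rejected.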