Assignment:
Choose the dependent variable (the response variable to be "explained") and the independent variable (the predictor or explanatory variable) as you judge appropriate.
1 Are the variables cross-sectional data or time-series data?
2 How do you imagine the data were collected?
3 Is the sample size sufficient to yield a good estimate? If not, do you think more data could easily be obtained, given the nature of the problem?
4 State your a priori hypothesis about the sign of the slope. Is it reasonable to suppose a cause and effect relationship?
5 Make a scatter plot of Y against X. Discuss what it tells you.
6 Use Excel's Add Trendline feature to fit a linear regression to the scatter plot. Is a linear model credible?
7 Interpret the slope. Does the intercept have meaning, given the range of the data?
8 Use Excel, MegaStat, or MINITAB to fit the regression model, including residuals and standardized residuals.
9 (a) Does the 95 percent confidence interval for the slope include zero? If so, what does it mean? If not, what does it mean? (b) Do a two-tailed t test for zero slope at α = .05. State the hypotheses, degrees of freedom, and critical value for your test. (c) Interpret the p-value for the slope. (d) Which approach do you prefer, the t test or the p-value? Why? (e) Did the sample support your hypothesis about the sign of the slope?
10 (a) Based on the R² and ANOVA table for your model, how would you assess the fit? (b) Interpret the p-value for the F statistic. (c) Would you say that your model's fit is good enough to be of practical value?
11 Study the table of residuals. Identify as outliers any standardized residuals that exceed 3 in absolute value and as unusual any that exceed 2 in absolute value. Can you suggest any reasons for these unusual residuals?
12 (a) Make a histogram (or normal probability plot) of the residuals and discuss its appearance. (b) Do you see evidence that your regression may violate the assumption of normal errors?
13 Inspect the residual plot to check for heteroscedasticity and report your conclusions.
14 Is an autocorrelation test appropriate for your data? If so, perform one or more tests of the residuals (eyeball inspection of residual plot against observation order, runs test, and/or Durbin-Watson test).
15 Use MegaStat or MINITAB to generate 95 percent confidence and prediction intervals for various X-values.
16 Use MegaStat or MINITAB to identify observations with high leverage.
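For reference, the quantities requested in questions 7, 9, 14, and 16 can be written out for a simple regression of Y on X. This is a summary sketch, not the output format of any particular package. The symbols (b_0, b_1, s_e, s_{b_1}, h_i, e_i, DW) are notation assumed here for illustration, and Excel, MegaStat, and MINITAB may label or scale some of these quantities differently.

```latex
% Least-squares slope and intercept
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}

% Residuals, standard error of the regression, standard error of the slope
e_i = y_i - (b_0 + b_1 x_i),
\qquad
s_e = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - 2}},
\qquad
s_{b_1} = \frac{s_e}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}

% Two-tailed test of H_0: \beta_1 = 0 versus H_1: \beta_1 \neq 0, with n - 2 degrees of freedom
t_{\text{calc}} = \frac{b_1}{s_{b_1}},
\qquad
\text{reject } H_0 \text{ if } |t_{\text{calc}}| > t_{\alpha/2,\, n-2}

% Confidence interval for the slope
b_1 \pm t_{\alpha/2,\, n-2} \, s_{b_1}

% Leverage of observation i in simple regression
h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}

% Durbin-Watson statistic (meaningful only when observations have a natural order)
DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}
```

The 95 percent confidence interval in question 9(a) excludes zero exactly when the two-tailed t test in 9(b) rejects the zero-slope hypothesis at α = .05, so the two approaches always agree.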
DATA SET B  Employees and Revenue, Large Automotive Companies in 1999 (n = 24)  CarFirms
Company              Employees   Revenue
BMW                      119.9      35.9
DaimlerChrysler          441.5     154.6
Dana                      86.4      12.8
Denso                     72.4      13.8
Fiat                     220.5      51.0
Ford Motor               345.2     144.4
Fuji Heavy                19.9      10.6
General Motors           594.0     161.3
Honda Motor              112.2      48.7
Isuzu Motors              28.5      12.7
Johnson Controls          89.0      12.6
Lear                      65.5       9.1
MAN                       64.1      13.8
Mazda Motor               31.9      16.1
Mitsubishi Motors         26.7      27.5
Nissan Motor             131.3      51.5
Peugeot                  156.5      37.5
Renault                  138.3      41.4
Robert Bosch             189.5      28.6
Suzuki Motor              13.9      11.4
Toyota Motor             183.9      99.7
TRW                       78.0      11.9
Volkswagen               297.9      76.3
Volvo                     70.3      26.8
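As a cross-check on the Excel, MegaStat, or MINITAB output, the core calculations for questions 8, 9, 10, 11, and 16 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a substitute for the packages named in the assignment. It assumes Revenue is the response and Employees is the predictor (the assignment leaves that choice to you), it takes the critical value 2.074 from a standard t table for 22 degrees of freedom, it scales each residual by s_e·sqrt(1 − h_i) (packages differ on this), and it uses 4/n as a rule-of-thumb leverage cutoff. All variable names are hypothetical.

```python
import math

# (company, employees, revenue) exactly as listed in Data Set B
car_firms = [
    ("BMW", 119.9, 35.9), ("DaimlerChrysler", 441.5, 154.6),
    ("Dana", 86.4, 12.8), ("Denso", 72.4, 13.8),
    ("Fiat", 220.5, 51.0), ("Ford Motor", 345.2, 144.4),
    ("Fuji Heavy", 19.9, 10.6), ("General Motors", 594.0, 161.3),
    ("Honda Motor", 112.2, 48.7), ("Isuzu Motors", 28.5, 12.7),
    ("Johnson Controls", 89.0, 12.6), ("Lear", 65.5, 9.1),
    ("MAN", 64.1, 13.8), ("Mazda Motor", 31.9, 16.1),
    ("Mitsubishi Motors", 26.7, 27.5), ("Nissan Motor", 131.3, 51.5),
    ("Peugeot", 156.5, 37.5), ("Renault", 138.3, 41.4),
    ("Robert Bosch", 189.5, 28.6), ("Suzuki Motor", 13.9, 11.4),
    ("Toyota Motor", 183.9, 99.7), ("TRW", 78.0, 11.9),
    ("Volkswagen", 297.9, 76.3), ("Volvo", 70.3, 26.8),
]

names = [row[0] for row in car_firms]
x = [row[1] for row in car_firms]  # assumption: Employees is the predictor X
y = [row[2] for row in car_firms]  # assumption: Revenue is the response Y
n = len(x)

# Sums of squares and cross-products about the means
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates and residuals
slope = sxy / sxx
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)

# Fit and inference for the slope (d.f. = n - 2 = 22)
r_squared = 1 - sse / syy
std_err = math.sqrt(sse / (n - 2))
se_slope = std_err / math.sqrt(sxx)
t_stat = slope / se_slope
T_CRIT = 2.074  # assumption: two-tailed critical t for alpha = .05, 22 d.f., from a t table
ci_low = slope - T_CRIT * se_slope
ci_high = slope + T_CRIT * se_slope

# Leverage and standardized residuals (assumed scaling: s_e * sqrt(1 - h_i))
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
std_resid = [e / (std_err * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, R-squared = {r_squared:.4f}")
print(f"t = {t_stat:.3f}, 95% CI for slope = [{ci_low:.4f}, {ci_high:.4f}]")
for name, h, r in zip(names, leverage, std_resid):
    flags = []
    if abs(r) > 3:
        flags.append("outlier")
    elif abs(r) > 2:
        flags.append("unusual")
    if h > 4 / n:  # assumption: rule-of-thumb cutoff for high leverage
        flags.append("high leverage")
    if flags:
        print(f"{name}: std. residual = {r:.2f}, leverage = {h:.3f} ({', '.join(flags)})")
```

To treat Employees as the response instead, swap the two list comprehensions that define x and y; the rest of the sketch is unchanged. Small differences from package output are expected in the standardized residuals, since the scaling convention varies.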