1. A regression analysis relating test scores (Y) to training hours (X) produced the following fitted equation: Ŷ = 13.2 + 1.3X.
(a) What is the fitted value of the response variable corresponding to X = 8?
(b) What is the residual corresponding to the data point with X = 6 and Y = 18?
(c) If X increases by 4 units, how does Ŷ change?
(d) Consider the data point in part (b). An additional test score is to be obtained from a new observation at X = 6. Would the test score for the new observation necessarily be 18? Explain.
(e) The error sum of squares (SSE) for this model was found to be 14. If there were n = 20 observations, provide the best estimate for σ².
(f) Rewrite the regression equation in terms of X*, where X* is training time measured in minutes. Show that your answer makes sense, i.e., gives the same prediction as the original equation (an example is sufficient).
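The quantities asked about in Problem 1 can be sketched in a few lines of Python. This is only an illustration: the coefficients and data values below are hypothetical and are deliberately not the ones in the problem, and the function names are made up for this sketch. It assumes the usual simple linear regression conventions: residual = observed minus fitted, σ² estimated by SSE/(n − 2), and a change of units from hours to minutes (X* = 60X) dividing the slope by 60.

```python
# Hypothetical fitted line (NOT the one in Problem 1): Y-hat = B0 + B1 * X,
# with X measured in hours. All numbers here are illustrative assumptions.
B0, B1 = 10.0, 3.0

def fitted(x_hours):
    """Fitted value Y-hat at X = x_hours."""
    return B0 + B1 * x_hours

def residual(x_hours, y_observed):
    """Residual = observed response minus fitted value."""
    return y_observed - fitted(x_hours)

def mse(sse, n):
    """Estimate of sigma^2: SSE divided by n - 2 degrees of freedom
    (two parameters, intercept and slope, are estimated)."""
    return sse / (n - 2)

def fitted_minutes(x_minutes):
    """Same line rewritten for X* in minutes (X* = 60 X): slope is B1 / 60."""
    return B0 + (B1 / 60.0) * x_minutes

print(fitted(2.0))            # fitted value at 2 hours
print(residual(2.0, 18.5))    # residual for a hypothetical point (2, 18.5)
print(mse(24.0, 14))          # hypothetical SSE = 24 with n = 14
print(fitted_minutes(120.0))  # should agree with fitted(2.0)
```

A change of b1 × 4 in the fitted value for a 4-unit increase in X follows directly from `fitted(x + 4) - fitted(x)`, and comparing `fitted(2.0)` with `fitted_minutes(120.0)` is the kind of sanity check part (f) asks for.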
2. Explain the difference between the following two equations:
3. Consider Figure 1.3 in KNNL (your primary textbook). If only the data over years 8-15 were considered, a reasonable linear fit could be obtained. This model, however, would profoundly over-predict the steroid level when X = 25. Use this result in explaining what is meant by "scope of the model".
4. For this problem, use the grade point average data described in KNNL (the full data set is on the CD that accompanies the text, file CH01PR19.dat).
(a) Plot the data using PROC GPLOT in SAS. Include a smoothed function in the plot. Make sure to include the smoothing number in the title of the plot. Is the relationship approximately linear?
(b) Plot the data using PROC GPLOT, but now include the linear regression line on the plot.
(c) Using SAS, run a linear regression to predict GPA based on the ACT score. Give the regression equation.
(d) Based on your answer in (c), predict the GPA of a student who scored 20 on the ACT.
(e) Based on your answer to (c), find e1, e2, and e3 (the residuals for the first three observations).
(f) Find X̄ and Ȳ. Using your answer to (c), what is the predicted GPA for a student whose ACT score is equal to X̄?
(g) Find SSE and MSE for this model.
(h) What is the estimate of σ from this analysis? (Recall our model is: Y = β₀ + β₁X + ε, where σ is the standard deviation of ε.)
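As a cross-check on the SAS output for parts (c) through (h), the same quantities can be computed by hand from the least-squares formulas. The Python sketch below is not part of the assignment and does not use the GPA data: the five (x, y) pairs are made up for illustration, and the function name `simple_linear_regression` is hypothetical. It assumes the standard formulas b1 = SSXY / SSXX and b0 = Ȳ − b1 X̄.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit of y on x, returning the pieces asked for in
    Problem 4: coefficients, residuals, SSE, MSE and the estimate of sigma."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ssxx = sum((xi - xbar) ** 2 for xi in x)
    ssxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = ssxy / ssxx                 # slope
    b0 = ybar - b1 * xbar            # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    mse = sse / (n - 2)              # n - 2 degrees of freedom
    return {
        "xbar": xbar, "ybar": ybar, "b0": b0, "b1": b1,
        "residuals": residuals, "sse": sse, "mse": mse,
        "s": math.sqrt(mse),         # estimate of sigma
    }

# Made-up illustrative data, NOT the CH01PR19 data set.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fit = simple_linear_regression(x, y)
print(fit["b0"], fit["b1"])
print(fit["residuals"][:3])          # e1, e2, e3
print(fit["sse"], fit["mse"], fit["s"])
# The fitted line always passes through (xbar, ybar):
print(fit["b0"] + fit["b1"] * fit["xbar"], fit["ybar"])
```

The last line illustrates the point behind part (f): the prediction at X̄ equals Ȳ for any least-squares line with an intercept.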
5. For this problem use the plastic hardness data described in KNNL Problem #1.22.
(a) Plot the data using PROC GPLOT. Include a linear regression line on the plot. Is the relationship approximately linear?
(b) Using SAS, run a linear regression to predict hardness from time. Give the estimated regression equation.
6. For each of the following questions use the summary information to find the least-squares regression equation to predict Y from X.
(a) SSXX = 218. SSYY = 47. SSXY = -145.
(b)4,133 2,212,388