The Quantitative Environmental Learning Project looked at "Characteristics of Selected Streams Along The West Side of The Sacramento Valley," including annual rainfall (in inches), drainage area (in square miles), and annual runoff (in inches per unit area).
a. Use software to produce fitted line plots for regressing runoff on rainfall and runoff on drainage; which of the two plots shows more scatter about the regression line?
b. Report the value of s (typical prediction error size) for regression of runoff on rainfall.
c. Use software to produce a 95% prediction interval for runoff when rainfall equals 30 (the approximate mean rainfall value).
d. In some circumstances, the prediction interval's margin of error is close to 2s, so the interval width is approximately 4s. Explain why, in this situation, the interval is substantially wider than 4s.
e. Report the value of s (typical prediction error size) for regression of runoff on drainage.
f. Use software to produce a 95% prediction interval for runoff when drainage equals 140 (the approximate mean drainage value).
g. Report the width of your prediction interval in part (f) and verify that it is wider than 4s.
h. Report correlations for the regression of runoff on rainfall and for the regression of runoff on drainage.
i. Is the correlation higher for the larger or for the smaller value of typical prediction error size s? Explain why this is the case in terms of how tightly clustered or loosely scattered the scatterplot points are.
j. Both sample sizes are the same, but one of the P-values for testing for a relationship between runoff and the explanatory variable (rainfall or drainage) is much smaller. Does it correspond to the regression with the higher or lower value of correlation r?
k. If you wanted to predict annual runoff for another creek, would it be more helpful to know the annual rainfall or the drainage area at that location?
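
For readers without a statistics package at hand, the quantities requested in parts (b) through (h) can be computed directly from the standard simple-regression formulas. The following is a minimal sketch, not the software output the exercise asks for. The function names `fit_line` and `prediction_interval` are hypothetical, the data values are made up for illustration and are not the Sacramento Valley measurements, and the multiplier `t_star` is assumed to be looked up in a t table for n - 2 degrees of freedom (2.447 for 6 degrees of freedom at 95% confidence).

```python
import math

def fit_line(x, y):
    """Least-squares fit of y on x; returns the summary values used below."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                # slope
    b0 = y_bar - b1 * x_bar       # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # typical prediction error size
    r = sxy / math.sqrt(sxx * syy)  # correlation
    return {"n": n, "x_bar": x_bar, "sxx": sxx,
            "b0": b0, "b1": b1, "s": s, "r": r}

def prediction_interval(fit, x_new, t_star):
    """Prediction interval for one new response at x_new.

    t_star is the t multiplier for n - 2 degrees of freedom,
    supplied by the caller (for example, from a t table).
    """
    center = fit["b0"] + fit["b1"] * x_new
    se_pred = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x_new - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return center - t_star * se_pred, center + t_star * se_pred

# Made-up illustration data (NOT the streams data from the exercise).
rainfall = [18, 22, 25, 30, 33, 36, 40, 44]
runoff = [3, 5, 6, 10, 12, 13, 18, 20]

fit = fit_line(rainfall, runoff)
low, high = prediction_interval(fit, x_new=30, t_star=2.447)  # df = 8 - 2 = 6

print("s =", fit["s"])
print("r =", fit["r"])
print("95% PI at rainfall 30:", (low, high))
print("interval width:", high - low, "versus 4s:", 4 * fit["s"])
```

The interval width in this sketch is 2 × t_star × s × sqrt(1 + 1/n + (x_new − x̄)²/Sxx), which is the quantity to compare against the rough 4s approximation.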