Growth of pine trees. The Department of Biology at Kenyon College conducted an experiment to study the growth of pine trees. In April 1990, volunteers planted 1000 white pine (Pinus strobus) seedlings at the Brown Family Environmental Center. The seedlings were planted in two grids, distinguished by 10- and 15-foot spacings between the seedlings. Table 28.12 (page 28-67) shows the first 10 rows of a subset of the data collected by students at Kenyon College.14
(a) Use tree height at the time of planting (Hgt90) and the indicator variable for fertilizer (Fert) to fit a multiple regression model for predicting Hgt97. Specify the estimated regression model and the regression standard error. Are you happy with the fit of this model? Comment on the value of R² and the plot of the residuals against the predicted values.
(b) Construct a correlation matrix with Hgt90, Hgt96, Diam96, Grow96, Hgt97, Diam97, Spread97, and Needles97. Which variable is most strongly correlated with the response variable of interest (Hgt97)? Does this make sense to you?
(c) Add tree height in September 1996 (Hgt96) to the model in part (a). Does this model do a better job of predicting tree height in 1997? Explain.
(d) What happened to the individual t statistic for Hgt90 when Hgt96 was added to the model? Explain why this change occurred.
(e) Fit a multiple regression model for predicting Hgt97 based on the explanatory variables Diam97, Hgt96, and Fert. Summarize the results of the individual t tests. Does this model provide a better fit than the previous models? Explain by comparing the values of R² and s for each model.
(f) Does the parameter estimate for the variable indicating whether a tree was fertilized have the sign you expected? Explain. (Experiments can produce surprising results!)
(g) Do you think that the model in part (e) should be used for predicting growth in other pine seedlings? Think carefully about the conditions for inference.
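The quantities requested in parts (a), (c), and (e) (estimated coefficients, R², the regression standard error s, and the individual t statistics) can be computed as in the rough Python sketch below. It is only an illustration under stated assumptions: the helper names solve and fit_ols are hypothetical, and the numbers are simulated placeholders that merely borrow the variable names Hgt90, Fert, Hgt96, and Hgt97. They are not the Kenyon College measurements from Table 28.12, so the printed values say nothing about the actual answers. In practice most statistical software reports all of these quantities directly.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ols(columns, y):
    """Least-squares fit of y on the given explanatory columns plus an intercept.

    Hypothetical helper: returns coefficients (intercept first), R^2,
    the regression standard error s, t statistics, fitted values, residuals.
    """
    n, k = len(y), len(columns) + 1  # k counts the intercept
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    coef = solve(XtX, Xty)

    fitted = [sum(c * x for c, x in zip(coef, row)) for row in X]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sse = sum(e * e for e in resid)
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    s = math.sqrt(sse / (n - k))  # regression standard error

    # Standard error of coefficient j is s * sqrt(j-th diagonal entry of (X'X)^-1).
    t_stats = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se_j = s * math.sqrt(solve(XtX, unit)[j])
        t_stats.append(coef[j] / se_j)

    return {"coef": coef, "r2": 1.0 - sse / sst, "s": s,
            "t": t_stats, "fitted": fitted, "resid": resid}


# Simulated placeholder data (NOT the data in Table 28.12).
random.seed(1)
n_trees = 200
hgt90 = [random.gauss(20, 4) for _ in range(n_trees)]
fert = [float(i % 2) for i in range(n_trees)]
hgt96 = [h + 15 * f + random.gauss(250, 40) for h, f in zip(hgt90, fert)]
hgt97 = [h + random.gauss(60, 15) for h in hgt96]

model_a = fit_ols([hgt90, fert], hgt97)         # as in part (a)
model_c = fit_ols([hgt90, fert, hgt96], hgt97)  # as in part (c)

for label, model in (("(a)", model_a), ("(c)", model_c)):
    print(label, "R^2 =", round(model["r2"], 3), " s =", round(model["s"], 2))
    print("    t statistics:", [round(t, 2) for t in model["t"]])
```

Plotting model["resid"] against model["fitted"] gives the residual plot asked for in part (a). Comparing the t statistic for Hgt90 in the two fits shows the kind of change part (d) refers to, although with simulated data the size of that change has no bearing on the real experiment.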