Refer to Exercise 7 in Chapter 2 for the placekicking data collected by Berry and Wood (2004). Using Distance, Weather, Wind15, Temperature, Grass, Pressure, and Ice as explanatory variables and Good as the response, perform model selection in the following ways:
(a) Compare which variables are selected under forward selection, backward elimination, and stepwise selection using BIC.
(b) Perform BMA using BIC. Which variables seem important and which seem clearly unimportant?
(c) Estimate the regression parameters for the stepwise analyses. Compare these estimates to those of BMA. How are they different?
(d) Explain why the regression parameter estimate for Grass using BMA is somewhat closer to 0 than it is using stepwise.
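As background for parts (b) and (d), the following minimal Python sketch shows one common way BIC is used in BMA: posterior model probabilities are approximated as proportional to exp(-BIC/2), assuming equal prior probability for every model. The function name `bma_weights`, the three BIC values, and the Grass coefficients are all hypothetical placeholders chosen for illustration; they are not results from the placekicking data.

```python
import math

def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values,
    assuming equal prior probability for every candidate model."""
    best = min(bics)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical BIC values for three candidate models (illustration only).
bics = [780.0, 781.0, 784.0]

# Hypothetical Grass coefficient estimates in each model;
# 0.0 marks a model that does not include Grass.
grass_coefs = [-0.40, 0.0, -0.35]

weights = bma_weights(bics)

# Model-averaged estimate: weight each model's estimate by its probability.
bma_grass = sum(w * b for w, b in zip(weights, grass_coefs))

print([round(w, 3) for w in weights])
print(round(bma_grass, 3))
```

In this sketch the averaged coefficient is a weighted sum over all candidate models, so it depends on how much posterior probability falls on models that include the variable.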
Exercise 7
Exercise 17 of Chapter 1 examined data from Berry and Wood (2004) to determine whether an "icing the kicker" strategy implemented by the opposing team would reduce the probability of success for a field goal. Additional data collected for this investigation are included in the placekick.BW.csv file. Below are descriptions of the variables available in this file:
• Game Num: Identifies the year and game
• Kicker: Last name of kicker
• Good: Response variable ("Y" = success, "N" = failure)
• Distance: Length in yards of the field goal
• Weather: Levels of "Clouds", "Inside", "SnowRain", and "Sun"
• Wind15: 1 if wind speed is ≥15 miles per hour and the placekick is outdoors, 0 otherwise.
• Temperature: Levels of "Nice" (40°F
• Grass: 1 if kicking on a grass field, 0 otherwise
• Pressure: "Y" if attempt is in the last 3 minutes of a game and a successful field goal causes a lead change, "N" otherwise
• Ice: 1 if Pressure = "Y" and a time-out is called prior to the attempt, 0 otherwise
Notice that these variables are similar, but not all identical, to those given for the placekicking data described in Section 2.2.1 (e.g., information was collected on field goals only, so there is no PAT variable). Using this new data set, complete the following:
(a) When using a formula argument value of Good ~ Distance in glm(), how do you know if R is modeling the probability of success or failure? Explain.
(b) Estimate the model from part (a), and plot it using the curve() function.
(c) Add to the plot in part (b) the logit(π̂) = 5.8121 - 0.1150 × distance model estimated in Section 2.2.1. Notice that the models are quite similar. Why is this desirable?
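To see what the Section 2.2.1 model in part (c) implies on the probability scale, the linear predictor can be passed through the inverse logit. The Python sketch below does this for a few distances; the function name `prob_success` and the chosen distances are illustrative assumptions, while the two coefficients are the ones quoted in part (c).

```python
import math

def prob_success(distance, intercept=5.8121, slope=-0.1150):
    """Estimated probability of a successful placekick at a given
    distance (yards) under logit(pi) = intercept + slope * distance."""
    linear_predictor = intercept + slope * distance
    # Inverse logit: exp(x) / (1 + exp(x)), written in a stable form.
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Evaluate the fitted curve at a few illustrative distances.
for d in (20, 35, 50):
    print(d, round(prob_success(d), 4))
```

Evaluating the same function on a fine grid of distances gives the curve that part (c) asks to overlay on the plot from part (b).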
Exercise 17
Before a placekicker attempts a field goal in a pressure situation, an opposing team may call a time-out. The purpose of this time-out is to give the kicker more time to think about the attempt in the hopes that this extra time will cause him to become more nervous and lower his probability of success. This strategy is often referred to as "icing the kicker." Berry and Wood (2004) collected data from the 2002 and 2003 National Football League seasons to investigate whether or not the strategy actually lowers the probability of success when implemented. Table 1.7 contains the results from the 31-40 yard field goals during these seasons under pressure situations (attempt is in the last 3 minutes of a game and a successful field goal causes a lead change). Complete the following:
(a) Calculate the Wald and Agresti-Caffo confidence intervals for the difference in probabilities of success conditioning on the strategy. Interpret the intervals.
(b) Perform a score test, Pearson chi-square test, and LRT to test for the equality of the success probabilities.
(c) Estimate the relative risk and calculate the corresponding confidence interval for it. Interpret the results.
(d) Estimate the odds ratio and calculate the corresponding confidence interval for it. Interpret the results.
(e) Is there sufficient evidence to conclude that icing the kicker is a good strategy to follow? Explain.
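The interval calculations in parts (a), (c), and (d) can be sketched as follows. This is a minimal Python illustration using the usual large-sample formulas: the Wald interval for a difference of proportions, the Agresti-Caffo adjustment (one success and one failure added to each group), and Wald intervals on the log scale for the relative risk and odds ratio. The function names are made up for this sketch, and the counts are hypothetical; they are not the Table 1.7 values.

```python
import math

Z = 1.959964  # approximate 0.975 standard normal quantile (95% intervals)

def wald_diff(w1, n1, w2, n2, z=Z):
    """Wald interval for pi1 - pi2 given successes w and trials n."""
    p1, p2 = w1 / n1, w2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se

def agresti_caffo_diff(w1, n1, w2, n2, z=Z):
    """Agresti-Caffo interval: add one success and one failure
    to each group, then apply the Wald formula."""
    return wald_diff(w1 + 1, n1 + 2, w2 + 1, n2 + 2, z)

def relative_risk(w1, n1, w2, n2, z=Z):
    """Estimate and Wald interval for pi1 / pi2 (computed on the log scale).
    Assumes both success counts are nonzero."""
    est = (w1 / n1) / (w2 / n2)
    se_log = math.sqrt(1 / w1 - 1 / n1 + 1 / w2 - 1 / n2)
    return est, est * math.exp(-z * se_log), est * math.exp(z * se_log)

def odds_ratio(w1, n1, w2, n2, z=Z):
    """Estimate and Wald interval for the odds ratio (computed on the
    log scale). Assumes all four cell counts are nonzero."""
    f1, f2 = n1 - w1, n2 - w2
    est = (w1 * f2) / (f1 * w2)
    se_log = math.sqrt(1 / w1 + 1 / f1 + 1 / w2 + 1 / f2)
    return est, est * math.exp(-z * se_log), est * math.exp(z * se_log)

# Hypothetical counts (NOT the Table 1.7 data):
# group 1 = no time-out, group 2 = time-out called.
w1, n1 = 18, 20
w2, n2 = 12, 20

wald = wald_diff(w1, n1, w2, n2)
ac = agresti_caffo_diff(w1, n1, w2, n2)
rr = relative_risk(w1, n1, w2, n2)
or_ = odds_ratio(w1, n1, w2, n2)

print("Wald:", wald)
print("Agresti-Caffo:", ac)
print("Relative risk (estimate, lower, upper):", rr)
print("Odds ratio (estimate, lower, upper):", or_)
```

With small counts such as those in a single distance range, the Wald and Agresti-Caffo intervals can differ noticeably, which is one reason part (a) asks for both.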