1. (45 points) Use the "Female Bears Data." Data from n = 19 female bears of varying ages are used to develop an equation for estimating Y = female bear's weight from X = female bear's neck circumference.

a. Fit a simple linear regression model with Y = female bear's weight and X = female bear's neck circumference. Click the "Storage" button in the Minitab Regression Dialog and select each of the items in the left-hand list (i.e., Fits, Residuals, Standardized residuals, Deleted residuals, Leverages, Cook's distance, DFITS). Write down the estimated regression equation and the MSE for this model.

b. Which bear number has the highest leverage and what is that leverage? [Leverages are in the column labeled "HI1"]

c. Is the leverage in the previous part higher than the threshold 3(p/n)?

d. Use the estimated regression equation from part (a) to calculate the fitted value for bear #6. [You can check your answer with the one Minitab provides in the column labeled "FITS1".]

e. Use your answer from the previous part together with the actual weight of bear #6 to calculate the residual for this bear. [You can check your answer with the one Minitab provides in the column labeled "RESI1".]

f. What is the leverage for bear #6?

g. Use the residual from part (e), the MSE from part (a), and the leverage from part (f) to calculate the internally studentized residual for bear #6. [You can check your answer with the one Minitab provides in the column labeled "SRES1" - remember Minitab calls these "Standardized residuals."]

h. Delete bear #6 from the dataset as follows: select Data > Subset Worksheet, click "Specify which rows to exclude," click "Row numbers," and type "6" into the adjoining box. Then refit the simple linear regression model with Y = female bear's weight and X = female bear's neck circumference. Write down the estimated regression equation and the MSE for this model.

i. Use the residual from part (e), the MSE from part (h), and the leverage from part (f) to calculate the externally studentized residual for bear #6. [You can check your answer with the one Minitab provides in the column labeled "TRES1" in the original worksheet - remember Minitab calls these simply "Deleted residuals."]

j. Use the estimated regression equation from part (h) to calculate the predicted value for bear #6 (i.e., based on the model fit to the subset worksheet excluding bear #6). [Note: the answer won't make a whole lot of sense, but don't worry about this since we're simply going to use this predicted value for part (k).]

k. Use the fitted value from part (d), the predicted value from part (j), the MSE from part (h), and the leverage from part (f) to calculate the DFFITS for bear #6. [You can check your answer with the one Minitab provides in the column labeled "DFIT1" in the original worksheet.]

l. Is the absolute value of DFFITS in the previous part higher than the threshold given in the online notes?

m. Use the residual from part (e), the MSE from part (a), and the leverage from part (f) to calculate the Cook's distance for bear #6. [You can check your answer with the one Minitab provides in the column labeled "COOK1" in the original worksheet.]

n. Is the Cook's distance from the previous part higher than the upper threshold given in the notes, 1?

o. Briefly summarize your findings with respect to bear #6. You might want to consider graphical evidence too!
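The hand calculations in parts (d) through (n) can be cross-checked with a short script. The following is only a sketch in plain Python: the function names and the toy numbers are made up for illustration (they are NOT the Female Bears Data), and the formulas are the usual textbook definitions for a simple linear regression with p = 2 parameters, which is assumed to match what the notes use.

```python
from math import sqrt

def fit_slr(x, y):
    """Least-squares fit of y = b0 + b1*x. Returns (b0, b1, mse, sxx, xbar)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse / (n - 2), sxx, xbar

def case_diagnostics(x, y, i):
    """Influence measures for the case at 0-based index i (bear #6 is i = 5)."""
    n, p = len(x), 2                                   # p counts the intercept too
    b0, b1, mse, sxx, xbar = fit_slr(x, y)
    fit = b0 + b1 * x[i]                               # part (d)
    resid = y[i] - fit                                 # part (e)
    lev = 1 / n + (x[i] - xbar) ** 2 / sxx             # part (f)
    # Refit with case i removed, as in part (h).
    b0_del, b1_del, mse_del, _, _ = fit_slr(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    pred_del = b0_del + b1_del * x[i]                  # part (j)
    return {
        "fit": fit,
        "residual": resid,
        "leverage": lev,
        "high_leverage": lev > 3 * p / n,              # rule of thumb 3(p/n)
        "internal": resid / sqrt(mse * (1 - lev)),     # part (g)
        "external": resid / sqrt(mse_del * (1 - lev)), # part (i)
        "dffits": (fit - pred_del) / sqrt(mse_del * lev),          # part (k)
        "cooks_d": resid ** 2 / (p * mse) * lev / (1 - lev) ** 2,  # part (m)
    }

# Toy illustration only; replace with the worksheet columns.
x = [10.0, 12.0, 15.0, 16.0, 20.0, 30.0]
y = [30.0, 45.0, 60.0, 70.0, 95.0, 120.0]
for name, value in case_diagnostics(x, y, 5).items():
    print(name, value)
```

Note that the function takes a 0-based index, whereas Minitab row numbers start at 1. The values it prints should agree with the Minitab columns FITS1, RESI1, HI1, SRES1, TRES1, DFIT1, and COOK1 when the actual data are supplied.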

2. (27 points) Use the "College GPA Data." Data from n = 40 college students are used to develop an equation for estimating Y = grade point average (GPA) from X1 = verbal score on a college entrance exam (percentile) and X2 = math score on a college entrance exam (percentile).

a. Fit a "full quadratic" multiple linear regression model with Y, X1, X2, X1², X2², and X1X2. [In Minitab: Select Y as the Response, X1 and X2 as the Continuous predictors, click "Model," select both X1 and X2 together in the Predictors box and click the Add buttons next to "Interactions through order 2" and "Terms through order 2."] Also click the "Storage" button in the Minitab Regression Dialog and select Deleted residuals, Leverages, and Cook's distance. Write down the estimated regression equation.

b. Which student has the largest absolute externally studentized residual and what is that externally studentized residual?

c. Is the externally studentized residual from the previous part greater in absolute value than 3? What do we call such points?

d. Which student has the highest leverage and what is that leverage?

e. Is the leverage from the previous part higher than the threshold 3(p/n)?

f. What is it about the student identified in part (d) that gives him/her such a high leverage? (Hint: compare this student's exam scores with other students' scores.)

g. Which student has the highest Cook's distance and what is that Cook's distance?

h. Is the Cook's distance from the previous part higher than the upper threshold given in the notes, 1?

i. Investigate whether removing any of the observations identified in the previous parts dramatically alters the model results.
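For the leverage threshold in part (e), p counts every regression parameter in the full quadratic model, including the intercept. The sketch below (plain Python; the helper names are hypothetical) shows one way to lay out a design-matrix row and count p, assuming the model contains exactly the terms listed in part (a).

```python
def quadratic_row(x1, x2):
    """Design-matrix row for the full quadratic model, intercept first."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2, x1 * x2]

def flag_high_leverage(leverages, p):
    """Return 1-based case numbers whose leverage exceeds 3(p/n)."""
    n = len(leverages)
    cutoff = 3 * p / n
    return [k + 1 for k, h in enumerate(leverages) if h > cutoff]

n = 40                               # sample size stated in the problem
p = len(quadratic_row(0.0, 0.0))     # intercept + 5 terms
leverage_cutoff = 3 * p / n          # threshold used in part (e)
print(p, leverage_cutoff)
```

Passing the stored Minitab leverage column to flag_high_leverage lists any students above the cutoff; whether those cases actually matter is the question addressed in parts (g) through (i).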

3. (4+4+8+6+6=28 points) Use the "Brand Preference Data." Here, n = 16 observations are used to develop an equation for estimating Y = Degree of brand liking from X1 = Moisture content of the product and X2 = Sweetness of the product. The results were obtained from an experiment based on a completely randomized design (the data are coded).

a. Obtain the studentized deleted residuals and identify any outlying Y observations using the Bonferroni outlier test procedure with α = 0.10. State the decision rule and your conclusion. (In Minitab: Use "Storage" and check "Deleted residuals" under "Stat > Regression > Regression > Fit Regression Model ..." to get studentized deleted residuals).

b. Use the leverage values to determine whether any of the observations are outlying with regard to their X-values according to the rule of thumb 3(p/n).
(In Minitab, use "Storage" and check "Leverages" under "Stat > Regression > Regression > Fit Regression Model ..." to get leverage values).

c. Management wishes to estimate the mean degree of brand liking for moisture content X1 = 10 and sweetness X2 = 3. Construct a scatter plot of X2 against X1 and determine visually whether this prediction involves an extrapolation beyond the range of the data. Also, use equation (10.29) of the textbook to determine whether an extrapolation is involved. Do your conclusions from the two methods agree?

d. The largest absolute studentized deleted residual is for case 14 (see part (a)). Obtain the DFFITS and Cook's distance values for this case to assess the influence of this case. What do you conclude from each of the above values?

e. Calculate the average absolute percent difference in the fitted values with and without case 14. What does this measure indicate about the influence of case 14?
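Parts (c) and (e) can also be checked numerically. The sketch below is in plain Python and rests on two assumptions: that equation (10.29) is the usual leverage of a new point, h_new = x_new' (X'X)^-1 x_new, compared against the leverages of the observed cases, and that the percent difference in part (e) is taken relative to the fitted values from the full data set. All function names and the small design shown are made up for illustration; they are NOT the Brand Preference Data.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]

def hat_value(rows, x_new):
    """x_new' (X'X)^-1 x_new, where each row already starts with the 1 for the intercept."""
    p = len(x_new)
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    z = solve(xtx, x_new)
    return sum(u * v for u, v in zip(x_new, z))

def avg_abs_pct_diff(fits_full, fits_deleted):
    """Mean of |(fit without the case - full-data fit) / full-data fit|, in percent."""
    n = len(fits_full)
    return 100 * sum(abs((d - f) / f) for f, d in zip(fits_full, fits_deleted)) / n

# Made-up balanced design; rows are [1, X1, X2].
rows = [[1.0, x1, x2] for x1 in (1.0, 2.0, 3.0, 4.0) for x2 in (1.0, 2.0)]
h_data = [hat_value(rows, r) for r in rows]
h_new = hat_value(rows, [1.0, 2.5, 1.5])       # hypothetical new point
print("extrapolation?", h_new > max(h_data))   # beyond the observed leverages

# Hypothetical fitted values with and without a deleted case.
print(avg_abs_pct_diff([100.0, 200.0], [110.0, 190.0]))
```

A new point whose leverage exceeds every observed leverage lies outside the region covered by the data, even when each coordinate is individually within range, which is why the numerical check can disagree with a quick look at the scatter plot.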
