Refer to the CDI data set in Appendix C.2 and Project 19.53. The metropolitan areas identified in Project 19.53 are to be considered in a study of the effects of region (factor A: variable 17) and percent below poverty level (factor B: variable 13) on crime rate (variable 10 ÷ variable 5), with percent of population 65 or older (variable 7) as a concomitant variable. For purposes of this analysis of covariance study, percent below poverty level is to be classified into two categories: less than 8.0 percent, and 8.0 percent or more.
a. Obtain the residuals for covariance model (22.26).
b. For each treatment, plot the residuals against the fitted values. Also prepare a normal probability plot of the residuals and calculate the coefficient of correlation between the ordered residuals and their expected values under normality. What do you conclude from your analysis?
c. State the generalized regression model to be employed for testing whether or not the treatment regression lines have the same slope. Conduct this test using α = .001. State the alternatives, decision rule, and conclusion. What is the P-value of the test?
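As a rough illustration of the test in part (c), the sketch below computes the general linear test statistic for equal slopes. It assumes the covariance model gives each region-by-poverty treatment its own intercept, so the full model fits a separate line per treatment and the reduced model fits separate intercepts with one common slope. The function name `equal_slopes_f` and the `groups` layout are hypothetical, chosen only for this example; the CDI data themselves are not reproduced here.

```python
def equal_slopes_f(groups):
    """General linear test of equal slopes across treatments.

    groups: dict mapping a treatment label to a list of (x, y) pairs,
            where x is the concomitant variable and y the response.
            (Hypothetical layout, assumed for this sketch.)
    Returns (F*, numerator df, denominator df).
    Requires at least two distinct x values per treatment and n > 2r.
    """
    sxx_tot = sxy_tot = syy_tot = 0.0
    sse_full = 0.0
    n = 0
    for pairs in groups.values():
        m = len(pairs)
        xbar = sum(x for x, _ in pairs) / m
        ybar = sum(y for _, y in pairs) / m
        sxx = sum((x - xbar) ** 2 for x, _ in pairs)
        sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
        syy = sum((y - ybar) ** 2 for _, y in pairs)
        # Full model: each treatment has its own intercept and slope.
        sse_full += syy - sxy ** 2 / sxx
        sxx_tot += sxx
        sxy_tot += sxy
        syy_tot += syy
        n += m
    r = len(groups)
    # Reduced model: separate intercepts, one pooled within-treatment slope.
    sse_reduced = syy_tot - sxy_tot ** 2 / sxx_tot
    df_num = r - 1
    df_den = n - 2 * r
    f_star = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_star, df_num, df_den
```

The decision rule compares F* with the F(1 − α; r − 1, n − 2r) percentile. The P-value needs an F-distribution tail probability, which this sketch does not compute.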
Project 19.53
Refer to the CDI data set in Appendix C.2. The following metropolitan areas are to be considered in a study of the effects of region (factor A: variable 17) and percent below poverty level (factor B: variable 13) on the crime rate (variable 10 ÷ variable 5):
For purposes of this ANOVA study, percent of population below poverty level is to be classified into two categories: less than 8 percent, 8 percent or more.
a. Assemble the required data and obtain the fitted values for ANOVA model (19.23).
b. Obtain the residuals.
c. Prepare aligned residual dot plots for the treatments. What departures from ANOVA model (19.23) can be studied from these plots? What are your findings?
d. Prepare a normal probability plot of the residuals. Also obtain the coefficient of correlation between the ordered residuals and their expected values under normality. Does the normality assumption appear to be reasonable here?
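The correlation asked for in part (d) can be sketched as follows. This is a minimal illustration, not a prescribed procedure: it assumes the expected values under normality are approximated by the standard normal percentiles z[(k − .375)/(n + .25)] for rank k, and it omits the √MSE scale factor because a positive rescaling does not change a correlation coefficient. The function name `normal_scores_correlation` is hypothetical.

```python
import math
from statistics import NormalDist


def normal_scores_correlation(residuals):
    """Correlation between ordered residuals and approximate
    expected values under normality (assumed percentile formula)."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()
    # Approximate expected value for the k-th smallest residual, k = 1..n.
    scores = [std_normal.inv_cdf((k - 0.375) / (n + 0.25))
              for k in range(1, n + 1)]
    mean_r = sum(ordered) / n
    mean_s = sum(scores) / n
    sxy = sum((r - mean_r) * (s - mean_s) for r, s in zip(ordered, scores))
    sxx = sum((r - mean_r) ** 2 for r in ordered)
    syy = sum((s - mean_s) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * syy)
```

A value close to 1 is consistent with normality. The judgment is made by comparing the coefficient with a tabled critical value for the given n, which this sketch does not supply.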
Appendix C.2
This data set provides selected county demographic information (CDI) for 440 of the most populous counties in the United States. Each line of the data set has an identification number with a county name and state abbreviation and provides information on 14 variables for a single county. Counties with missing data were deleted from the data set. The information generally pertains to the years 1990 and 1992. The 17 variables are: