Assignment:
The National Transportation Safety Board collects data by state (including the District of Columbia) on traffic fatalities. Part of this data is shown in the following table, along with potentially related factors including population, number of licensed drivers, number of registered vehicles, and total number of vehicle miles driven. (The complete data are available in the file traffic.xls.) You have been asked to develop a model to help explain the factors that underlie traffic fatalities.
| State | Traffic fatalities | Population (thousands) | Licensed drivers (thousands) | Registered vehicles (thousands) | Vehicle miles traveled (millions) |
|---|---|---|---|---|---|
| AL | 1083 | 4219 | 3043 | 3422 | 48,956 |
| AK | 85 | 606 | 443 | 508 | 4,150 |
| AZ | 903 | 4075 | 2654 | 2980 | 38,774 |
| AR | 610 | 2453 | 1770 | 1560 | 24,948 |
| CA | 4226 | 31431 | 20359 | 23518 | 271,943 |
| CO | 585 | 3656 | 2620 | 3144 | 33,705 |
| CT | 310 | 3275 | 2205 | 2638 | 27,138 |
| DE | 112 | 706 | 512 | 568 | 7,025 |
| DC | 69 | 570 | 366 | 270 | 3,448 |
| FL | 2687 | 13953 | 10885 | 10132 | 121,989 |
a. Build a linear model to predict traffic fatalities based on all four potential explanatory variables as they are measured in the table. Evaluate the model in terms of overall goodness-of-fit. Evaluate the results for each regression parameter: Are the signs appropriate? Are the values different from zero?
b. Can you improve the model in (a) by removing one or more of the explanatory variables from the regression? If so, compare the advantages and disadvantages of the resulting regression with the one in (a).
c. Can you improve the models in (a) or (b) by transforming one or more of the explanatory variables? If so, compare the advantages and disadvantages of the resulting regression with the ones in (a) and (b).
Provide a complete, step-by-step solution to the question, showing all calculations and the formulas used.
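
As a starting point for part (a), the following is a minimal sketch of an ordinary least squares fit in Python with NumPy. It is not a full solution. It uses only the ten states listed in the table, not the complete traffic.xls file, so its coefficients will differ from those of the full data set. The helper name `fit_ols` and the choice of vehicle miles as the single predictor in the reduced model are illustrative assumptions, not part of the assignment.

```python
import numpy as np

# Ten rows copied from the table in the assignment (the full data are in traffic.xls).
# Columns: fatalities, population, licensed drivers, registered vehicles, vehicle miles
rows = {
    "AL": (1083, 4219, 3043, 3422, 48956),
    "AK": (85, 606, 443, 508, 4150),
    "AZ": (903, 4075, 2654, 2980, 38774),
    "AR": (610, 2453, 1770, 1560, 24948),
    "CA": (4226, 31431, 20359, 23518, 271943),
    "CO": (585, 3656, 2620, 3144, 33705),
    "CT": (310, 3275, 2205, 2638, 27138),
    "DE": (112, 706, 512, 568, 7025),
    "DC": (69, 570, 366, 270, 3448),
    "FL": (2687, 13953, 10885, 10132, 121989),
}
data = np.array(list(rows.values()), dtype=float)
y = data[:, 0]
# Design matrix: a column of ones for the intercept, then the four predictors.
X = np.column_stack([np.ones(len(y)), data[:, 1:]])


def fit_ols(X, y):
    """Hypothetical helper: least squares fit of y = X b + e.

    Returns coefficients, R^2, adjusted R^2 and t-statistics.
    """
    n, k = X.shape                      # k includes the intercept
    Q, R = np.linalg.qr(X)              # QR is more stable than inverting X'X
    R_inv = np.linalg.inv(R)
    beta = R_inv @ (Q.T @ y)            # b = (X'X)^-1 X'y
    resid = y - X @ beta
    sse = resid @ resid                 # sum of squared errors
    sst = ((y - y.mean()) ** 2).sum()   # total sum of squares
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    sigma2 = sse / (n - k)              # residual variance estimate
    # diag((X'X)^-1) equals the row sums of squares of R^-1
    std_err = np.sqrt(sigma2 * (R_inv ** 2).sum(axis=1))
    t_stats = beta / std_err            # compare with t critical value, n - k d.o.f.
    return beta, r2, adj_r2, t_stats


# Part (a): all four explanatory variables.
beta, r2, adj_r2, t_stats = fit_ols(X, y)
names = ["intercept", "population", "drivers", "vehicles", "miles"]
for name, b, t in zip(names, beta, t_stats):
    print(f"{name:>10}: coef = {b: .5f}, t = {t: .3f}")
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")

# Part (b), illustrative only: keep the intercept and vehicle miles.
beta_b, r2_b, adj_r2_b, t_b = fit_ols(X[:, [0, 4]], y)
print(f"reduced model: R^2 = {r2_b:.4f}, adjusted R^2 = {adj_r2_b:.4f}")
```

The four predictors all scale with the size of a state, so they are likely to be strongly collinear, which is why adjusted R^2 and the individual t-statistics are reported alongside R^2. For part (c), the same helper could be applied after transforming the columns, for example by converting counts to per-capita rates or taking logarithms.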