Data set:
CarType | mpg | disp | hp | drat | wt
Mazda RX4 | 21 | 160 | 110 | 3.9 | 2.62
Mazda RX4 Wag | 21 | 160 | 110 | 3.9 | 2.875
Datsun 710 | 22.8 | 108 | 93 | 3.85 | 2.32
Hornet 4 Drive | 21.4 | 258 | 110 | 3.08 | 3.215
Hornet Sportabout | 18.7 | 360 | 175 | 3.15 | 3.44
Valiant | 18.1 | 225 | 105 | 2.76 | 3.46
Duster 360 | 14.3 | 360 | 245 | 3.21 | 3.57
Merc 240D | 24.4 | 146.7 | 62 | 3.69 | 3.19
Merc 230 | 22.8 | 140.8 | 95 | 3.92 | 3.15
Merc 280 | 19.2 | 167.6 | 123 | 3.92 | 3.44
Merc 280C | 17.8 | 167.6 | 123 | 3.92 | 3.44
Merc 450SE | 16.4 | 275.8 | 180 | 3.07 | 4.07
Merc 450SL | 17.3 | 275.8 | 180 | 3.07 | 3.73
Merc 450SLC | 15.2 | 275.8 | 180 | 3.07 | 3.78
Cadillac Fleetwood | 10.4 | 472 | 205 | 2.93 | 5.25
Lincoln Continental | 10.4 | 460 | 215 | 3 | 5.424
Chrysler Imperial | 14.7 | 440 | 230 | 3.23 | 5.345
Fiat 128 | 32.4 | 78.7 | 66 | 4.08 | 2.2
Honda Civic | 30.4 | 75.7 | 52 | 4.93 | 1.615
Toyota Corolla | 33.9 | 71.1 | 65 | 4.22 | 1.835
Toyota Corona | 21.5 | 120.1 | 97 | 3.7 | 2.465
Dodge Challenger | 15.5 | 318 | 150 | 2.76 | 3.52
AMC Javelin | 15.2 | 304 | 150 | 3.15 | 3.435
Camaro Z28 | 13.3 | 350 | 245 | 3.73 | 3.84
Pontiac Firebird | 19.2 | 400 | 175 | 3.08 | 3.845
Fiat X1-9 | 27.3 | 79 | 66 | 4.08 | 1.935
Porsche 914-2 | 26 | 120.3 | 91 | 4.43 | 2.14
Lotus Europa | 30.4 | 95.1 | 113 | 3.77 | 1.513
Ford Pantera L | 15.8 | 351 | 264 | 4.22 | 3.17
Ferrari Dino | 19.7 | 145 | 175 | 3.62 | 2.77
Maserati Bora | 15 | 301 | 335 | 3.54 | 3.57
Volvo 142E | 21.4 | 121 | 109 | 4.11 | 2.78

1) Create scatter plots of mpg (Y var) against disp, hp, drat and wt (X var). Which variable looks to have the best fit? Does your opinion change after log transforming the X variables?
2) What are the correlation coefficients of the X variables (non-transformed) against the Y? Do they agree with your opinion about the best fit?
3) Is the correlation coefficient significant for mpg against hp at the 0.05 level? Show your work.
4) Regress mpg against each variable individually. Report the regression equations for each.
5) Which regression has the best fit? Why?
6) Transform the variables however you like. Does this improve fit?
7) Interpret b1 (the slope coefficient) in the wt equation.
8) I'm looking at a car that has a weight (wt) of 7.1. How many miles per gallon (mpg) do you predict I'll get? Are you worried about making this prediction? Why or why not?
9) If the regression statistics section didn't print, how could you find R squared using the ANOVA table in the Excel output?
10) Without using the p-value or t-stat, how could you determine if the regression coefficient is significant?
11) Are there any issues with the regression assumptions? Provide evidence for why or why not.
12) Do any unusual observations exist? Provide evidence for why or why not.
13) Which variable would you recommend I use for the best prediction of mpg? Why? Take all pieces of regression into account.
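
For anyone who wants to cross-check the Excel output, here is a minimal Python sketch of the calculations behind questions 2, 3, 4, 8 and 9. It is not the workflow the assignment asks for, only one way to verify the numbers, and it uses the standard library alone. The names ROWS, COLS, column, pearson_r and simple_ols are made up for this illustration. It assumes the usual formulas: Pearson's r, the test statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, least-squares estimates for b0 and b1, and R squared computed as SS regression divided by SS total, which are the two sums of squares shown in an ANOVA table.

```python
import math

# (mpg, disp, hp, drat, wt) for each of the 32 cars, in the order of the data set.
ROWS = [
    (21, 160, 110, 3.9, 2.62), (21, 160, 110, 3.9, 2.875),
    (22.8, 108, 93, 3.85, 2.32), (21.4, 258, 110, 3.08, 3.215),
    (18.7, 360, 175, 3.15, 3.44), (18.1, 225, 105, 2.76, 3.46),
    (14.3, 360, 245, 3.21, 3.57), (24.4, 146.7, 62, 3.69, 3.19),
    (22.8, 140.8, 95, 3.92, 3.15), (19.2, 167.6, 123, 3.92, 3.44),
    (17.8, 167.6, 123, 3.92, 3.44), (16.4, 275.8, 180, 3.07, 4.07),
    (17.3, 275.8, 180, 3.07, 3.73), (15.2, 275.8, 180, 3.07, 3.78),
    (10.4, 472, 205, 2.93, 5.25), (10.4, 460, 215, 3, 5.424),
    (14.7, 440, 230, 3.23, 5.345), (32.4, 78.7, 66, 4.08, 2.2),
    (30.4, 75.7, 52, 4.93, 1.615), (33.9, 71.1, 65, 4.22, 1.835),
    (21.5, 120.1, 97, 3.7, 2.465), (15.5, 318, 150, 2.76, 3.52),
    (15.2, 304, 150, 3.15, 3.435), (13.3, 350, 245, 3.73, 3.84),
    (19.2, 400, 175, 3.08, 3.845), (27.3, 79, 66, 4.08, 1.935),
    (26, 120.3, 91, 4.43, 2.14), (30.4, 95.1, 113, 3.77, 1.513),
    (15.8, 351, 264, 4.22, 3.17), (19.7, 145, 175, 3.62, 2.77),
    (15, 301, 335, 3.54, 3.57), (21.4, 121, 109, 4.11, 2.78),
]
COLS = {"mpg": 0, "disp": 1, "hp": 2, "drat": 3, "wt": 4}


def column(name):
    """Return one variable as a list, in row order."""
    return [row[COLS[name]] for row in ROWS]


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, r_squared)."""
    mx, my = mean(x), mean(y)
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    b0 = my - b1 * mx
    ss_total = sum((b - my) ** 2 for b in y)
    ss_regression = sum((b0 + b1 * a - my) ** 2 for a in x)
    return b0, b1, ss_regression / ss_total  # R^2 = SSR / SST


mpg = column("mpg")
n = len(mpg)
results = {}
for name in ("disp", "hp", "drat", "wt"):
    x = column(name)
    r = pearson_r(x, mpg)
    t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)  # df = n - 2
    b0, b1, r2 = simple_ols(x, mpg)
    results[name] = {"r": r, "t": t_stat, "b0": b0, "b1": b1, "r2": r2}
    print(f"{name:>4}: r = {r:.4f}, t = {t_stat:.3f}, "
          f"mpg = {b0:.4f} + ({b1:.4f}) * {name}, R^2 = {r2:.4f}")

# Question 8: prediction at wt = 7.1, which lies outside the observed wt range.
wt = column("wt")
pred_wt_7_1 = results["wt"]["b0"] + results["wt"]["b1"] * 7.1
print(f"observed wt range: {min(wt)} to {max(wt)}; predicted mpg at 7.1: {pred_wt_7_1:.2f}")
```

For question 3, compare the absolute value of the printed t for hp with the two-tailed critical value at the 0.05 level for 30 degrees of freedom, which is about 2.042 in a standard t table. The last printed line also shows the observed wt range, which is worth keeping in mind when judging the prediction in question 8.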