The modern Olympic Games are a modified revival of the Greek Olympian Games, brought about largely through the efforts of the French sportsman and educator Baron Pierre de Coubertin. The Games are an international athletic competition that has been held at a different site every four years since its inauguration in 1896, with interruptions during the world wars.
The gold medal performances in the long jump, high jump, and discus throw are given below (in inches). Year is coded so that 1900 is zero (for example, 1896 is -4 and 2000 is 100).
Year (+1900) | Long Jump (in) | High Jump (in) | Discus Throw (in)
-4 | 249.75 | 71.25 | 1147.5
0 | 282.875 | 74.8 | 1418.9
4 | 289 | 71 | 1546.5
8 | 294.5 | 75 | 1610
12 | 299.25 | 76 | 1780
20 | 281.5 | 76.25 | 1759.25
24 | 293.125 | 78 | 1817.125
28 | 304.75 | 76.375 | 1863
32 | 300.75 | 77.625 | 1948.875
36 | 317.3125 | 79.9375 | 1987.375
48 | 308 | 78 | 2078
52 | 298 | 80.32 | 2166.85
56 | 308.25 | 83.25 | 2218.5
60 | 319.75 | 85 | 2330
64 | 317.75 | 85.75 | 2401.5
68 | 350.5 | 88.25 | 2550.5
72 | 324.5 | 87.75 | 2535
76 | 328.5 | 88.5 | 2657.4
80 | 336.25 | 92.75 | 2624
84 | 336.25 | 92.5 | 2622
88 | 343.25 | 93.5 | 2709.25
92 | 342.5 | 92 | 2563.75
96 | 336.4 | 94.1 | 2732.3
100 | 336.6 | 92.5 | 2728.3
1. We have seen how residuals are used to assess the fit of a regression line to data, but they have another important role. How are residuals used in the definition of least-squares regression?
2. Carry out an exploratory data analysis (EDA) using year as the explanatory variable and EITHER high jump OR discus throw as the response variable. In your EDA, include summary statistics (including r and r^2), a scatterplot with the least-squares regression line (LSRL), a normal probability plot, a residual plot, and the equation of the LSRL.
3. Are there any points that may be outliers? Influential?
4. How would you characterize the relationship between the two variables you chose?
5. Using the LSRL, what would be the predicted winning distances for the missing "war" years (1916, 1940, and 1944)?
6. Calculate the residuals for the following years: 1908, 1932, and 1984. Indicate whether each point lies above or below the LSRL.
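
The numerical parts of questions 2, 5, and 6 can be checked with a short script. The following is a minimal sketch in plain Python, assuming high jump is chosen as the response variable. The names fit_lsrl, predict, residuals, and war_predictions are illustrative, not part of the assignment. It relies on the standard fact that the least-squares line is the one that minimizes the sum of squared residuals, where a residual is the observed value minus the predicted value.

```python
# Year coded as years since 1900; high jump gold medal heights in inches,
# copied from the table in this worksheet.
years = [-4, 0, 4, 8, 12, 20, 24, 28, 32, 36, 48, 52,
         56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96, 100]
high_jump = [71.25, 74.8, 71, 75, 76, 76.25, 78, 76.375, 77.625, 79.9375,
             78, 80.32, 83.25, 85, 85.75, 88.25, 87.75, 88.5, 92.75, 92.5,
             93.5, 92, 94.1, 92.5]


def fit_lsrl(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


slope, intercept, r = fit_lsrl(years, high_jump)


def predict(year_code):
    """Predicted winning height (inches) for a coded year (calendar year minus 1900)."""
    return intercept + slope * year_code


# Residual = observed - predicted; a positive residual means the point lies above the line.
residuals = {x: y - predict(x) for x, y in zip(years, high_jump)}

# Predictions for the Games that were not held (question 5).
war_predictions = {1900 + code: predict(code) for code in (16, 40, 44)}

print(f"LSRL: height = {intercept:.3f} + {slope:.4f} * year")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
for year, height in war_predictions.items():
    print(f"{year}: predicted {height:.2f} in")
for code in (8, 32, 84):  # 1908, 1932, 1984 (question 6)
    side = "above" if residuals[code] > 0 else "below"
    print(f"{1900 + code}: residual {residuals[code]:.3f} in ({side} the line)")
```

To analyze the discus throw instead, replace the high_jump list with the discus column from the table. The plots requested in question 2 (scatterplot, residual plot, normal probability plot) are not produced by this sketch and still need a calculator or plotting tool.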