Regression and correlation. Pick two columns whose correlation coefficient is greater than 0.6 or less than -0.6. If more than one pair qualifies, pick the pair with the highest absolute correlation.
a. Draw the scatter diagram of Y against X, and explain anything notable that you observe.
b. Compute the correlation coefficient (ρ or r). What do you find? Make sure to explain thoroughly what you mean.
c. Obtain a and b of the regression equation defined as Y = a + b X, and the Coefficient of Determination (r²), from the Excel regression output. What can you tell? What is the relationship between r² and ρ?
d. Compute the statistics from part c (a, b and r²) step by step using ΣXiYi, ΣXi, ΣYi, ΣXi², ΣYi² from Excel, and compare them with the results in part c.
e. Draw the fitted regression line on the scatter diagram, obtain the residuals and plot them on the scatter diagram too. Explain your findings.
f. Write a paragraph or so on any observations you may have on the data, the regression estimates or the regression residuals.
g. Calculate the y values for at least five other x values that do not appear in our data. Include that information in your report above and comment on whether you believe the calculated y values seem realistic and consistent with the other information you have calculated in each of the parts above.
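
As a cross-check on the hand calculation in part d, the same sums can be computed outside Excel. The sketch below is only an illustration in Python, not part of the required Excel work. It uses the standard computational formulas for the slope, intercept and correlation coefficient; the function name regression_from_sums and the five (x, y) pairs are made up for the example and are not taken from the assignment data.

```python
import math

def regression_from_sums(xs, ys):
    """Return (a, b, r) for Y = a + b X using only the five sums from part d."""
    n = len(xs)
    sum_x = sum(xs)                                # ΣXi
    sum_y = sum(ys)                                # ΣYi
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # ΣXiYi
    sum_x2 = sum(x * x for x in xs)                # ΣXi²
    sum_y2 = sum(y * y for y in ys)                # ΣYi²

    sxy = n * sum_xy - sum_x * sum_y   # n times the co-variation of X and Y
    sxx = n * sum_x2 - sum_x ** 2      # n times the variation of X
    syy = n * sum_y2 - sum_y ** 2      # n times the variation of Y

    b = sxy / sxx                      # slope
    a = (sum_y - b * sum_x) / n        # intercept
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient
    return a, b, r

# Hypothetical illustrative data (NOT the assignment's columns).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = regression_from_sums(xs, ys)
r_squared = r ** 2                     # coefficient of determination (part c)

# Residuals (part e): observed y minus fitted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Predictions for x values not in the data (part g).
new_xs = [1.5, 2.5, 3.5, 6, 7]
predictions = [a + b * x for x in new_xs]

print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}, r^2 = {r_squared:.4f}")
print("residuals:", [round(e, 4) for e in residuals])
print("predictions:", [round(p, 4) for p in predictions])
```

For simple linear regression with an intercept, the residuals should sum to zero (up to rounding), which makes a quick sanity check on the fitted a and b. Note that two of the hypothetical new x values (6 and 7) lie outside the range of the sample data, so those predictions are extrapolations and deserve extra caution in part g.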