Question: a. Use the Minitab macro RNLTSMUL with Nsamp = 600 to approximate the exact LTS(.25) solution for the following data set. (Note: It may take an hour or more for the macro to run on some computers if a RAM disk is not set up for Minitab.)
The Y-values were generated randomly from the regressor values, but one value was changed considerably to represent an error. Can you determine the bad data point? Could this determination have been made using an OLS analysis? Are there any other points that appear to be questionable (from the LTS squared residuals) so that trimming more than one observation seems desirable, or would it be better to bound the influence of those data points? Would it be preferable to use a sequential trimming approach for LTS, even if we didn't know that there was exactly one bad data point?
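For readers without Minitab, the idea behind part a can be sketched in Python. This is only a rough illustration under stated assumptions: LTS(.25) is taken to mean that the largest 25% of squared residuals are dropped from the objective, and the approximation scores exact fits to random elemental subsets, which is one common approach and not necessarily what RNLTSMUL does internally. The function name approx_lts and the data are hypothetical; the data are synthetic and are not the exercise's data set.

```python
import numpy as np

def approx_lts(X, y, trim=0.25, nsamp=600, seed=0):
    """Approximate LTS by scoring exact fits to random elemental subsets."""
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])   # design matrix with intercept
    p = A.shape[1]
    h = n - int(trim * n)                  # number of squared residuals kept (assumed definition)
    best_obj, best_beta = np.inf, None
    for _ in range(nsamp):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(A[idx], y[idx])   # exact fit through p points
        except np.linalg.LinAlgError:
            continue                                 # singular subset, skip it
        sq = (y - A @ beta) ** 2
        obj = np.sort(sq)[:h].sum()        # sum of the h smallest squared residuals
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta, (y - A @ best_beta) ** 2

# Synthetic illustration only (NOT the exercise's data): one Y-value is corrupted.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 20)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=20)
bad = 7
y[bad] += 15.0

beta, sq_res = approx_lts(x, y)
suspect = int(np.argmax(sq_res))
print("largest LTS squared residual at index", suspect)
```

Sorting sq_res and looking for a gap between the largest value and the rest is one informal way to judge whether more than one observation looks questionable.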
b. Use OLS and continue trimming each point whose standardized residual is outside (-2, 2). After you have trimmed four points, compare those points with the ones that were trimmed by the LTS(.25) estimator. Comment. Does it make sense to continue trimming points in this manner?
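The trimming loop in part b might be sketched as follows. Two assumptions are made here that the exercise does not state: the standardized residual is taken as e_i / (s * sqrt(1 - h_ii)), and only the single most extreme point is removed before each refit. The function names and the synthetic data are hypothetical and are not the exercise's data set.

```python
import numpy as np

def standardized_residuals(A, y):
    """OLS residuals scaled by s * sqrt(1 - h_ii)."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, p = A.shape
    s2 = resid @ resid / (n - p)           # residual mean square
    Q, _ = np.linalg.qr(A)
    lev = (Q ** 2).sum(axis=1)             # diagonal of the hat matrix
    return resid / np.sqrt(s2 * (1.0 - lev))

def sequential_trim(X, y, cutoff=2.0, max_trim=4):
    """Refit OLS repeatedly, dropping the most extreme point beyond the cutoff."""
    A = np.column_stack([np.ones(len(y)), X])
    keep = np.arange(len(y))
    trimmed = []
    while len(trimmed) < max_trim:
        r = standardized_residuals(A[keep], y[keep])
        worst = int(np.argmax(np.abs(r)))
        if abs(r[worst]) <= cutoff:
            break                          # nothing left outside (-cutoff, cutoff)
        trimmed.append(int(keep[worst]))   # record the original index
        keep = np.delete(keep, worst)
    return trimmed

# Synthetic illustration only (NOT the exercise's data).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 20)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=20)
bad = 7
y[bad] += 15.0

trimmed = sequential_trim(x, y)
print("trimmed indices, in order:", trimmed)
```

Because s is re-estimated after every deletion, roughly 5% of well-behaved points will fall outside (-2, 2) at each step, which bears on the last question in part b.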
c. What is the value of R^2 with and without the single bad data point? What does the large difference between the two numbers suggest?
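The comparison in part c can be sketched with the usual definition R^2 = 1 - SS_res / SS_tot for an OLS fit with an intercept. The helper r_squared and the synthetic data are hypothetical illustrations, not the exercise's data set, so the numbers printed will differ from the exercise's answer.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit with intercept: 1 - SS_res / SS_tot."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - A @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic illustration only (NOT the exercise's data).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 20)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=20)
bad = 7
y[bad] += 15.0

r2_with = r_squared(x, y)
r2_without = r_squared(np.delete(x, bad), np.delete(y, bad))
print("R^2 with the bad point:   ", round(r2_with, 3))
print("R^2 without the bad point:", round(r2_without, 3))
```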