Using the Loans data set, demonstrate that it is bad practice to include interest with the other predictors, as follows:
Based on your work in the previous exercises, what is the lesson we should learn?
For Exercises 13-16, use the Loans data set to demonstrate that different sortings may lead to different numbers of clusters. Do not include interest as an input to the clustering algorithms.
Exercise 13
Generate four different sortings of the Loans_training data set. Together with the original order from the No Interest model you generated earlier, this makes five different sortings.
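One way to produce the sortings is to shuffle copies of the records with fixed seeds, so each ordering can be reproduced later. The sketch below is only an illustration under assumptions: the `rows` list is a small placeholder standing in for the Loans_training records, and the helper name `make_sortings` and the field names are hypothetical.

```python
import random

def make_sortings(rows, n_extra=4, seed=0):
    """Return the original order plus n_extra seeded shuffles of rows."""
    sortings = [list(rows)]  # sorting 0 keeps the original order
    for i in range(n_extra):
        shuffled = list(rows)
        # A separate seeded generator per sorting keeps each one reproducible
        random.Random(seed + i).shuffle(shuffled)
        sortings.append(shuffled)
    return sortings

# Placeholder records (hypothetical fields), not the real Loans_training data
rows = [{"id": i, "amount": 1000 * i} for i in range(10)]
sortings = make_sortings(rows)
```

Each element of `sortings` contains the same records in a different order, so any difference in clustering results can be attributed to ordering alone.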
Exercise 14
Run BIRCH on each of the five different sortings. Report the value of k and the MS for each.
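As a rough sketch of this step, the following uses scikit-learn's `Birch` on each ordering and records the number of clusters found. It rests on several assumptions: the data are synthetic Gaussian blobs rather than Loans_training, the `threshold` value is arbitrary, MS is taken to mean the mean silhouette, and scikit-learn is a stand-in for whatever software the exercises intend.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the predictors (assumption): three 2-D blobs
centers = np.array([[0.0, 0.0], [4.0, 4.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.8, size=(100, 2)) for c in centers])

# Original order plus four random permutations of the row indices
orders = [np.arange(len(X))] + [rng.permutation(len(X)) for _ in range(4)]

results = []
for order in orders:
    Xs = X[order]
    # n_clusters=None skips the global clustering step, so k is data-driven
    labels = Birch(threshold=1.0, n_clusters=None).fit_predict(Xs)
    k = len(np.unique(labels))
    # The silhouette is only defined when 2 <= k <= n_samples - 1
    ms = silhouette_score(Xs, labels) if 1 < k < len(Xs) else float("nan")
    results.append((k, ms))

for i, (k, ms) in enumerate(results):
    print(f"sorting {i}: k={k}, MS={ms:.3f}")
```

Because BIRCH builds its clustering feature tree incrementally, the records seen first shape the tree, which is why k may differ from one sorting to the next.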
Exercise 15
Calculate model cost for each of the five different sortings. Which model has the highest profitability or the lowest cost?
Exercise 16
Briefly profile the clusters for the winning model from the previous exercise.