Problem
I. A data mining routine has been applied to a transaction dataset and has classified 88 records as fraudulent (30 correctly so) and 952 as non-fraudulent (920 correctly so). Construct the confusion matrix and calculate the overall error rate.
II. Suppose that the routine in Problem I has an adjustable cutoff (threshold) that lets you alter the proportion of records classified as fraudulent. Describe how moving the cutoff up or down would affect:
i. the classification error rate for records that are truly fraudulent
ii. the classification error rate for records that are truly non-fraudulent
III. The table in "data.csv" contains a small set of validation results for a classification model, with both the actual class values and the predicted propensities.
i. Calculate error rates, sensitivity, and specificity using cutoffs of 0.25, 0.5, and 0.75.
ii. Create a decile-wise lift chart in R.
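
The arithmetic behind Problems I and III.i can be sketched as follows. This is a minimal illustration in Python, not a reference solution. It assumes "fraudulent" is the positive class and that a record is classified positive when its propensity is greater than or equal to the cutoff. The names Confusion, from_predicted_counts, and at_cutoff are hypothetical helpers, and the toy_actual / toy_propensity values are made up for demonstration; they are not the contents of "data.csv".

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 confusion matrix counts, with the positive class = fraudulent."""
    tp: int  # actual positive, predicted positive
    fp: int  # actual negative, predicted positive
    tn: int  # actual negative, predicted negative
    fn: int  # actual positive, predicted negative

    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def error_rate(self) -> float:
        # Share of all records that were misclassified.
        return (self.fp + self.fn) / self.total()

    def sensitivity(self) -> float:
        # Share of actual positives that were caught.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were correctly left alone.
        return self.tn / (self.tn + self.fp)


def from_predicted_counts(pred_pos, pos_correct, pred_neg, neg_correct):
    """Build the matrix from 'N classified as X, M correctly so' statements."""
    return Confusion(
        tp=pos_correct,
        fp=pred_pos - pos_correct,
        tn=neg_correct,
        fn=pred_neg - neg_correct,
    )


def at_cutoff(actual, propensity, cutoff):
    """Classify as positive when propensity >= cutoff (assumed convention)."""
    tp = fp = tn = fn = 0
    for y, p in zip(actual, propensity):
        predicted_positive = p >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


# Problem I: 88 classified fraudulent (30 correct), 952 non-fraudulent (920 correct).
problem_one = from_predicted_counts(88, 30, 952, 920)
print(problem_one, problem_one.error_rate())

# Problem III.i pattern on made-up data (NOT the values in data.csv).
toy_actual = [1, 1, 0, 1, 0, 0]
toy_propensity = [0.9, 0.6, 0.55, 0.3, 0.2, 0.1]
for cutoff in (0.25, 0.5, 0.75):
    cm = at_cutoff(toy_actual, toy_propensity, cutoff)
    print(cutoff, cm.error_rate(), cm.sensitivity(), cm.specificity())
```

For Problem I this yields 58 false positives and 32 false negatives, so the overall error rate is 90/1040, or about 8.65%. On the toy data, raising the cutoff lowers sensitivity and raises specificity, which is the trade-off Problem II asks you to describe. The helpers do not guard against a class with zero records, so a real data set with an empty class would need an extra check.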