You are the owner of a spam filtering service:
a) Currently, your server receives 2,000 spam messages and only 500 good messages per hour. The filter classifies 95% of the spam correctly and misclassifies the other 5%. It classifies 99% of the good messages correctly and misclassifies the other 1%. Tabulate the results. How many false positives and how many false negatives do you expect to see each hour? Calculate the precision, sensitivity, and specificity of the spam filter.
b) You receive word that a criminal gang is planning to double the amount of spam that it sends to your server. There will now be 4,000 spam messages per hour instead of 2,000. The number of good messages stays unchanged at 500. In your marketing literature, you have quoted the precision, recall, sensitivity, and specificity of your spam filter. You assume that the misclassification rates will stay the same (5% for spam, 1% for good messages). Should you put in a call to the technical writing department warning them that the literature will need to be revised? Why or why not? And if so, what will need to be revised?
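One way to check the arithmetic for both parts is a short Python sketch. It assumes that "positive" means "classified as spam", which the exercise does not state explicitly, and it works with expected counts rather than simulating individual messages. The function name filter_metrics and its parameter names are made up for this illustration.

```python
def filter_metrics(n_spam, n_good, spam_hit_rate=0.95, good_pass_rate=0.99):
    """Expected hourly confusion matrix and metrics for the filter.

    Assumption: spam is the positive class, so a false positive is a
    good message flagged as spam and a false negative is a spam
    message that gets through.
    """
    tp = n_spam * spam_hit_rate   # spam correctly flagged
    fn = n_spam - tp              # spam that slips through
    tn = n_good * good_pass_rate  # good messages correctly passed
    fp = n_good - tn              # good messages wrongly flagged
    return {
        "tp": tp, "fn": fn, "tn": tn, "fp": fp,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Part a: 2,000 spam per hour; part b: 4,000 spam per hour.
    for label, n_spam in (("a", 2000), ("b", 4000)):
        m = filter_metrics(n_spam, 500)
        print(f"Part {label}: FP={m['fp']:.0f}, FN={m['fn']:.0f}, "
              f"precision={m['precision']:.4f}, "
              f"sensitivity={m['sensitivity']:.4f}, "
              f"specificity={m['specificity']:.4f}")
```

Comparing the two printed rows shows which metrics depend only on the per-class error rates and which also depend on the mix of spam and good messages, which is the point of part b.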