In this homework, your goal is to devise an algorithm that detects spam email. This homework is quite long, but I believe you will find it addictive.
"The emails are from back when Enron got sued for doctoring financial statements to manipulate the stock price. Each email has been marked spam if it is a spam email and ham if it is a real, useful email."
"You may notice that this is different from all the exercises you have done so far. There are no numbers at all; I just gave you a bunch of emails. The algorithms we learned can only deal with numbers, so you will need to extract features from these emails. One example would be counting how many times keywords like viagra, cheap, etc. appear in the email. You can think of a bunch of other variables that may help you detect spam email. You will need to process the raw data into a bunch of numbers yourself."
Attachment:- Spam Detection.rar
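
To make the feature-extraction step concrete, here is a minimal sketch in Python of turning one raw email into a list of numbers. It is only one possible starting point, not the expected solution. The keywords viagra and cheap come from the assignment text; the function name extract_features, the two extra features (word count and number of exclamation marks), and the sample text are assumptions made up for illustration.

```python
import re

# Keywords named in the assignment; extend this list with your own ideas.
KEYWORDS = ["viagra", "cheap"]


def extract_features(text):
    """Turn one raw email (a string) into a list of numbers."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())

    # One feature per keyword: how many times it appears in the email.
    features = [words.count(keyword) for keyword in KEYWORDS]

    # Extra illustrative features (assumptions, not from the assignment).
    features.append(len(words))       # total number of words
    features.append(text.count("!"))  # number of exclamation marks
    return features


# Made-up sample text, not taken from the attached data.
sample = "Cheap viagra!!! Buy cheap pills now"
print(extract_features(sample))  # [1, 2, 6, 3]
```

Running a function like this over every email in the attachment would give one row of numbers per email, which is the kind of input the algorithms from class can work with. How the files in the archive are laid out is not described here, so loading them is left out of the sketch.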