Google Weka and find the homepage for Weka.
When you install it, you may need to change your classpath to reference the Weka jar file (weka-src.jar contains only the source code): weka-3-6\weka.jar
In the installation folder, you will find the icon to run Weka. Once you have it running, feel free to play around with the tool.
Part 2: Understand ARFF Format
Weka uses ARFF (Attribute-Relation File Format) for its data files.
Read through this document and familiarize yourself with the format. In addition, read chapters 1 and 2 of the text. Make sure you understand how sparse data is handled. You are expected to know this format well by next Tuesday!
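As a small illustration of how sparse data is written, the sketch below encodes one instance in ARFF sparse form. The helper name to_sparse_row is hypothetical and is not part of Weka. It assumes the values are already formatted as ARFF tokens and that "0" is the value to leave out.

```python
def to_sparse_row(values):
    """Encode one instance in ARFF sparse form: {index value, ...}.

    values: list of already-formatted ARFF tokens (hypothetical input shape).
    Attribute indexes start at 0, and entries equal to "0" are left out.
    """
    pairs = [f"{i} {v}" for i, v in enumerate(values) if v != "0"]
    return "{" + ", ".join(pairs) + "}"


dense = ["0", "X", "0", "Y", "?"]
print(",".join(dense))       # dense form of the instance
print(to_sparse_row(dense))  # sparse form of the same instance
```

Note that an omitted value in a sparse instance means 0, not "missing": an unknown value must still be written explicitly as ?. For nominal attributes, Weka treats the first declared label as the zero value, so check how your labels are ordered before relying on sparse output.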
Part 3: Problem
• Create an ARFF file with the following data types:
o flags, unit_id, and names must be nominal
o timestamps (ts) must be date
o users and other ids must be numeric
o comments must be strings
• Create an ARFF file that contains sparse data (eliminate the timestamps).
• Test that these files can be loaded into Weka. You can load them via the Explorer or Tools/ArffViewer.
ARFF converter programs are available online, and Weka itself can read CSV files and save them as ARFF files, but neither will create the files using the required data types. They can be used to get you started, though.
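One way to get the required data types is to generate the ARFF text yourself. The following is a minimal sketch, not a finished solution: the function names, the relation name "programs", and the column-to-type sets are assumptions chosen to match the rules above. It treats NULL and empty cells as missing (?), and it assumes the date pattern M/d/yyyy H:mm (Java SimpleDateFormat syntax) fits the timestamps in the dataset.

```python
import csv
import io

# Assumed column-to-type mapping, following the rules in the problem statement.
NOMINAL = {"unit_id", "name", "active_flag"}
DATE = {"created_ts", "deactivated_ts"}
STRING = {"comments"}
DATE_FORMAT = "M/d/yyyy H:mm"  # assumed pattern for values like 2/2/2001 22:00


def quote(value):
    """Wrap a value in single quotes if it contains spaces or ARFF special characters."""
    if any(ch in value for ch in " ,{}%'\""):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return value


def is_missing(value):
    """Treat empty cells and the literal NULL as missing values."""
    return value == "" or value == "NULL"


def csv_to_arff(csv_text, relation="programs"):
    """Convert CSV text with a header row into dense ARFF text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames
    rows = list(reader)
    lines = ["@relation " + relation, ""]
    for col in columns:
        if col in NOMINAL:
            # Distinct non-missing labels, in the order they first appear.
            labels = dict.fromkeys(r[col] for r in rows if not is_missing(r[col]))
            spec = "{" + ",".join(quote(v) for v in labels) + "}"
        elif col in DATE:
            spec = 'date "' + DATE_FORMAT + '"'
        elif col in STRING:
            spec = "string"
        else:
            spec = "numeric"  # ids and user columns
        lines.append("@attribute " + col + " " + spec)
    lines += ["", "@data"]
    for r in rows:
        cells = ["?" if is_missing(r[c]) else quote(r[c]) for c in columns]
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


SAMPLE = """id,unit_id,name,created_ts,created_user,deactivated_ts,deactivated_user,active_flag,comments
1,ACC,Minor ACC,2/2/2001 22:00,21,NULL,NULL,1,
11,MTH,BS Applied Mathematics,2/2/2001 12:00,9,8/13/2004 6:34,9,0,dropped
"""

print(csv_to_arff(SAMPLE))
```

The sketch produces the dense file only. Always load the result into Weka to confirm that each attribute has the type you expect.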
Dataset 2
id,unit_id,name,created_ts,created_user,deactivated_ts,deactivated_user,active_flag,comments
1,ACC,Minor ACC,2/2/2001 22:00,21,NULL,NULL,1,
2,ACC,BA ACC-CMA,8/13/2001 6:34,21,NULL,NULL,1,
3,ACC,BS ACC-CMA,2/2/2001 7:15,21,2/2/2001 17:30,9,0,
8,MTH,BS Actuarial Science,10/12/2001 20:15,9,NULL,NULL,1,
11,MTH,BS Applied Mathematics,2/2/2001 12:00,9,8/13/2004 6:34,9,0,dropped
16,BIO,BA Biology,3/12/2001 19:34,21,NULL,NULL,1,
17,BIO,BS Biology,2/7/2001 12:00,13,2/21/2001 12:45,9,0,renamed
30,CSC,BA Computer Science,2/21/2001 12:00,21,NULL,NULL,1,
31,CSC,BS Computer Science,2/2/2001 8:43,9,NULL,NULL,1,