SVM Model Development Using SAS Enterprise Miner
1. Select from the data sets available (or ones designated by your instructor or other available sources). Provide a thorough description of the data set, including the number of cases, a description of the inputs, the target variable, a description of the variables that could be used to develop predictive models, etc. (NOTE: Predictive models generally turn out better when they are developed on larger data sets that offer many cases and many candidate inputs to select from. Part of your grade for this assignment will be based on the robustness of the data set used.)
2. Explore the data by searching for anticipated relationships, unanticipated trends, and anomalies to gain a deeper understanding and generate ideas. Use the SEMMA Explore option to examine the data set you have created and look for interesting anomalies or relationships.
3. Cleanse and modify the data by removing errors, imputing missing values (as appropriate), transforming the variable distributions as necessary, and creating and selecting appropriate variables. Use the appropriate SEMMA options to cleanse the dataset as necessary. Investigate and discuss any "feature engineering" done for the data set.
4. Develop predictive models using the appropriate predictive modeling technique. Develop complete prediction models. There should be at least two models developed, compared and explained. The imbalanced target variable must be addressed and accounted for using one or more of the methods outlined in earlier lessons.
5. Using appropriate accuracy measures, assess the resultant models. Provide a complete assessment of the different models created using the SAS Enterprise Miner assessment options. Explain clearly any insights or conclusions from the accuracy measures.
6. Conclusions and takeaways. Provide clear and concise conclusions about the project to include lessons learned and any suggested improvements for future development. Suggest future enhancements for the analysis.
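As a language-neutral illustration of steps 4 and 5, the sketch below shows one way to rebalance a skewed binary target and why plain accuracy can mislead on such data. It is written in Python, not SAS Enterprise Miner, and is not a substitute for the tool's own options. Random undersampling is assumed here as one possible balancing method; the lessons may prescribe others. The function names `undersample` and `binary_metrics`, the column name `target`, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def undersample(rows, target_key, seed=0):
    """Randomly drop rows from the larger class until both classes are the same size."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[target_key], []).append(row)
    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def binary_metrics(actual, predicted, positive=1):
    """Confusion-matrix counts and derived measures for a binary target."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision, "recall": recall, "f1": f1,
    }


# Toy data (hypothetical): 80% of cases are class 0, 20% are class 1.
rows = [{"id": i, "target": 1 if i % 5 == 0 else 0} for i in range(1000)]
balanced = undersample(rows, "target")
print(Counter(r["target"] for r in balanced))  # 200 cases of each class

# A "model" that always predicts the majority class scores 0.8 accuracy
# on the original data while identifying none of the positive cases.
actual = [r["target"] for r in rows]
always_zero = [0] * len(actual)
baseline = binary_metrics(actual, always_zero)
print(baseline["accuracy"], baseline["recall"])
```

The baseline at the end is the reason step 5 asks for appropriate accuracy measures: on a target that is at least 75% one outcome, recall, precision, and F1 for the minority class usually say more about a model than overall accuracy does.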
Note: Submitted report must be either in MS Word or PDF format and titled: "Assignment2_LastName". Only one document will be allowed to be submitted.
Content (note that the document must have clearly marked sections for at least the items listed below)
1) Title page (1 page limit): course number and term, assignment number and project title, student name and contact information, instructor's name. Format it so it looks pleasant and presentable. Follow formatting guidelines above.
2) Introduction. Provide a brief outline of the dataset(s) you are using for this assignment. You may use the same classification data set you used for assignment 1, if it meets the following criteria: 1) sufficient number of cases (2000 or more); 2) reasonable number of possible input features (12-15 or more); 3) binary target variable; 4) heavily skewed target variable (at least 75% one outcome). NOTE: Dealing with a variety of different data sets provides you with more experience in cleaning and preparing data sets for model development. SVM models are best used with data sets with binary targets.
Briefly explain the content of the data to include a description of the variables in the data sets, the number of cases, etc. Include a screenshot of the data (not all cases need be shown, but be sure all relevant variables are visible). Provide a clear description of the purpose of the model being developed.
3) Data cleansing and/or preparation. Explain what was done and why it was necessary.
4) Predictive models developed. Clearly present, compare, and explain the models.
5) Results. Include appropriate results for the models. Interpret the results for meaning.
6) Conclusions and takeaways. Provide clear and concise conclusions about the project to include lessons learned and any suggested improvements for future development.
7) References (1 page limit): List, in APA format, all references used in preparing this report. It is strongly recommended to use outside knowledge in setting up the analysis or discussing the results where possible.
8) Appendix (6 page limit): Include any appropriate workbooks, screenshots (figures, tables, diagrams) used in this assignment. Make sure all tables or figures or diagrams are easily readable and visually presentable.
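The four data set criteria listed under the Introduction item (2000 or more cases, 12 or more input features, a binary target, and at least 75% of cases in one outcome) can be checked mechanically before any modeling begins. The following is a minimal sketch in Python rather than SAS Enterprise Miner. It assumes the data has already been loaded as a list of dictionaries with one key per column; the function name `check_dataset`, the column names, and the sample data are hypothetical.

```python
from collections import Counter


def check_dataset(rows, target_key, min_cases=2000, min_inputs=12,
                  min_majority_share=0.75):
    """Return pass/fail flags for the four data set criteria."""
    targets = [row[target_key] for row in rows]
    counts = Counter(targets)
    # Every column except the target counts as a possible input.
    n_inputs = len(rows[0]) - 1 if rows else 0
    majority_share = max(counts.values()) / len(targets) if targets else 0.0
    return {
        "enough_cases": len(rows) >= min_cases,
        "enough_inputs": n_inputs >= min_inputs,
        "binary_target": len(counts) == 2,
        "skewed_target": majority_share >= min_majority_share,
    }


# Hypothetical data set: 2,500 cases, 12 inputs, 80% of targets equal to 0.
rows = [
    {**{f"x{j}": i * j for j in range(12)}, "target": 1 if i % 5 == 0 else 0}
    for i in range(2500)
]
report = check_dataset(rows, "target")
print(report)
```

Any flag that comes back false indicates that a different data set should be chosen, or that the shortfall should be explained in the Introduction.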
Attachment: Assignment Files.rar