Review the classification algorithms provided below and compile a thorough summary and comparison of the algorithms. The summary should serve as a resource for algorithm selection and should include items like:
- type of classification: binary, multiclass
- type of model (whichever applies): linear, nonlinear, probabilistic, generative, discriminative, eager, lazy, etc.
- high bias, high variance?
- main application domains
- general characteristics
- advantages
- disadvantages
- training/classification speed
- how much parameter tuning is needed
- for description, prediction, or both (interpretability)
- what attribute types can be handled; mixed?
- type of data preprocessing needed (e.g., data type, scaling, transformation)
- ability to handle noise, irrelevant features
- ability to handle high-dimensional data
- things to watch out for
and so on. I expect that you will reach outside the class resources to find more information; however, any resource used should be referenced appropriately.
You are free to format your summary according to what you think best summarizes the algorithms and provides the best road-map for algorithm selection: it may be a list, a table, a mind-map, or a road-map (for example, the scikit-learn algorithm cheat sheet or the Microsoft Azure ML cheat sheet).
Classifications & Algorithms:
- Decision Trees (ID3, CART, CHAID)
- Nearest Neighbor (k-NN)
- Bayesian Classifiers
- Artificial Neural Networks
- Support Vector Machines
- Ensemble Methods
- Class Imbalance and Multiclass Problems