Decision Tree Learning
(a) Describe the main steps in the basic decision tree learning algorithm. The table below contains a sample S of ten examples. Each example is described using two Boolean attributes A and B. Each example is labelled (classified) by the target Boolean function.
(b) What is the entropy of these examples with respect to the given classification?
[Note: you must show how you got your answer using the standard formula.] This table gives approximate values of entropy for frequencies of positive examples in a two-class sample.
(c) What is the information gain of attribute A on sample S above?
(d) What is the information gain of attribute B on sample S above?
(e) Which would be chosen as the "best" attribute by a decision tree learner using the information gain splitting criterion? Why?
(f) Describe a method for overfitting-avoidance in decision tree learning.
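For reference, the quantities asked for in parts (b) to (e) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names `entropy` and `information_gain` are hypothetical, and the four-example toy data is invented for illustration and does not reproduce the table for sample S.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy in bits of a list of class labels: -sum(p_i * log2(p_i))."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting
    the examples on the attribute values."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    remainder = sum(
        (len(group) / len(labels)) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder


# Hypothetical toy data (NOT the sample S from the question).
toy_labels = [True, True, False, False]
toy_perfect = [True, True, False, False]   # attribute that separates the classes
toy_useless = [True, False, True, False]   # attribute that tells us nothing

print(information_gain(toy_perfect, toy_labels))  # 1.0
print(information_gain(toy_useless, toy_labels))  # 0.0
```

A learner using the information gain criterion would split on whichever attribute gives the larger value; on the toy data that is the first attribute.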