Assignment
Answer the following questions:
1. Consider the data set shown in the table below:
Instance | A | B | C | Class
1 | 0 | 0 | 0 | +
2 | 0 | 0 | 1 | -
3 | 0 | 1 | 1 | -
4 | 0 | 1 | 1 | -
5 | 0 | 0 | 1 | +
6 | 1 | 0 | 1 | +
7 | 1 | 0 | 1 | -
8 | 1 | 0 | 1 | -
9 | 1 | 1 | 1 | +
10 | 1 | 0 | 1 | +
(a) Estimate the conditional probabilities for
P(A|+), P(B|+), P(C|+), P(A|-), P(B|-), P(C|-)
(b) Use the estimates of the conditional probabilities from part (a) to predict the class label for a test sample (A = 0, B = 1, C = 0) using the naive Bayes approach.
(c) Estimate the conditional probabilities using the m-estimate approach, with p = 1/2 and m = 4.
(d) Repeat part (b) using the conditional probabilities given in part (c).
(e) Compare the two methods for estimating probabilities. Which method is better and why?
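One way to check hand calculations for question 1 is a short script. The sketch below is only an illustration, not a required part of the answer: it assumes the rows are transcribed correctly from the table above, and the function names (`cond_prob`, `naive_bayes_scores`, `predict`) are hypothetical helpers introduced here. It uses exact fractions so the results can be compared directly with hand-computed values. With `m=0` the estimate is the plain relative frequency; with `m=4` and `p=1/2` it is the m-estimate (n_c + m*p) / (n + m).

```python
from fractions import Fraction

# Rows of the question 1 table: (A, B, C, class label).
ROWS = [
    (0, 0, 0, "+"), (0, 0, 1, "-"), (0, 1, 1, "-"), (0, 1, 1, "-"), (0, 0, 1, "+"),
    (1, 0, 1, "+"), (1, 0, 1, "-"), (1, 0, 1, "-"), (1, 1, 1, "+"), (1, 0, 1, "+"),
]


def cond_prob(rows, attr, value, label, m=0, p=Fraction(1, 2)):
    """Estimate P(attribute == value | class == label).

    attr is the column index (0 = A, 1 = B, 2 = C).
    m = 0 gives the relative frequency n_c / n;
    m > 0 gives the m-estimate (n_c + m * p) / (n + m).
    """
    in_class = [r for r in rows if r[3] == label]
    n = len(in_class)
    n_c = sum(1 for r in in_class if r[attr] == value)
    return (Fraction(n_c) + m * p) / (n + m)


def naive_bayes_scores(rows, sample, m=0, p=Fraction(1, 2)):
    """Return P(class) * product of P(x_i | class) for each class (unnormalized)."""
    scores = {}
    for label in ("+", "-"):
        score = Fraction(sum(1 for r in rows if r[3] == label), len(rows))  # prior
        for attr, value in enumerate(sample):
            score *= cond_prob(rows, attr, value, label, m, p)
        scores[label] = score
    return scores


def predict(scores):
    """Pick the class with the larger unnormalized posterior."""
    return max(scores, key=scores.get)


sample = (0, 1, 0)  # test sample from part (b): A = 0, B = 1, C = 0
plain = naive_bayes_scores(ROWS, sample)           # parts (a) and (b)
smoothed = naive_bayes_scores(ROWS, sample, m=4)   # parts (c) and (d)
print("relative frequency:", plain, "->", predict(plain))
print("m-estimate:        ", smoothed, "->", predict(smoothed))
```

Comparing the two printed score dictionaries is a useful starting point for part (e): look at what happens to a class score when one of its conditional probabilities is estimated as exactly zero.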
2. Consider the data set shown in the table below:
Instance | A | B | C | Class
1 | 0 | 0 | 1 | -
2 | 1 | 0 | 1 | +
3 | 0 | 1 | 0 | -
4 | 1 | 0 | 0 | -
5 | 1 | 0 | 1 | +
6 | 0 | 0 | 1 | +
7 | 1 | 1 | 0 | -
8 | 0 | 0 | 0 | -
9 | 0 | 1 | 0 | +
10 | 1 | 1 | 1 | +
(a) Estimate the conditional probabilities for
P(A = 1|+), P(B = 1|+), P(C = 1|+), P(A = 1|-), P(B = 1|-), P(C = 1|-)
using the same approach as in the previous problem.
(b) Use the conditional probabilities in part (a) to predict the class label for a test sample (A = 1, B = 1, C = ) using the naive Bayes approach.
(c) Compare P(A=1), P(B=1), and P(A=1, B=1). State the relationship between A and B.
(d) Repeat the analysis in part (c) using P(A=1), P(B=0), and P(A=1,B=0).
(e) Compare P(A=1,B=1|Class=+) against P(A=1|Class=+) and P(B=1|Class = +). Are the variables conditionally independent given the class?
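Parts (c) through (e) of question 2 all come down to comparing a joint probability with the product of two marginals. The sketch below shows one possible way to run that comparison; it assumes the rows are transcribed correctly from the table above, and `prob` and `product_rule_holds` are hypothetical helper names introduced for this illustration. Note that the check covers a single pair of values at a time, which is what the question asks for, rather than every value combination.

```python
from fractions import Fraction

# Rows of the question 2 table: (A, B, C, class label).
ROWS = [
    (0, 0, 1, "-"), (1, 0, 1, "+"), (0, 1, 0, "-"), (1, 0, 0, "-"), (1, 0, 1, "+"),
    (0, 0, 1, "+"), (1, 1, 0, "-"), (0, 0, 0, "-"), (0, 1, 0, "+"), (1, 1, 1, "+"),
]


def prob(event, given=None):
    """Empirical P(event | given) over ROWS, as an exact fraction."""
    pool = [r for r in ROWS if given is None or given(r)]
    return Fraction(sum(1 for r in pool if event(r)), len(pool))


def product_rule_holds(a_val, b_val, given=None):
    """Check whether P(A=a, B=b | given) equals P(A=a | given) * P(B=b | given)."""
    p_a = prob(lambda r: r[0] == a_val, given)
    p_b = prob(lambda r: r[1] == b_val, given)
    p_ab = prob(lambda r: r[0] == a_val and r[1] == b_val, given)
    return p_ab == p_a * p_b


def is_positive(row):
    return row[3] == "+"


print("part (c):", product_rule_holds(1, 1))               # P(A=1), P(B=1), P(A=1, B=1)
print("part (d):", product_rule_holds(1, 0))               # P(A=1), P(B=0), P(A=1, B=0)
print("part (e):", product_rule_holds(1, 1, is_positive))  # same check, given Class = +
```

A `True` result means the product rule holds for that pair of values; the written answer should still state the three probabilities and explain what the comparison implies.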
3. For each of the Boolean functions given below, state whether the problem is linearly separable.
(a) A AND B AND C
(b) NOT A AND B
(c) (A OR B) AND (A OR C)
(d) (A XOR B) AND (A OR B)
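A Boolean function is linearly separable if some weights w and bias b satisfy w . x + b > 0 exactly on the inputs where the function is true. The sketch below is a brute-force check of that definition over small integer weights. It is an illustration under stated assumptions: `is_linearly_separable` is a hypothetical helper, "NOT A AND B" is read as (NOT A) AND B, and the search range (weights from -3 to 3) is a choice made here. A `True` result is conclusive because it comes with a separating set of weights; a `False` result only says no separator exists in the searched range, so it should be backed by an argument in the written answer.

```python
from itertools import product


def is_linearly_separable(f, n_inputs=3, max_weight=3):
    """Search integer weights w and bias b with (w . x + b > 0) == f(x) for all inputs."""
    points = list(product((0, 1), repeat=n_inputs))
    weights = range(-max_weight, max_weight + 1)
    biases = range(-max_weight * n_inputs, max_weight * n_inputs + 1)
    for w in product(weights, repeat=n_inputs):
        for b in biases:
            if all(
                (sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == bool(f(*x))
                for x in points
            ):
                return True
    return False


# The four functions from question 3, written over inputs A, B, C in {0, 1}.
FUNCTIONS = {
    "A AND B AND C": lambda a, b, c: a and b and c,
    "NOT A AND B": lambda a, b, c: (not a) and b,  # assumed reading: (NOT A) AND B
    "(A OR B) AND (A OR C)": lambda a, b, c: (a or b) and (a or c),
    "(A XOR B) AND (A OR B)": lambda a, b, c: (a ^ b) and (a or b),
}

results = {name: is_linearly_separable(f) for name, f in FUNCTIONS.items()}
for name, separable in results.items():
    print(f"{name}: {separable}")
```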
4. Following is a data set that contains two attributes, X and Y, and two class labels, "+" and "-". Each attribute can take three different values: 0, 1, or 2. The concept for the "+" class is Y = 1 and the concept for the "-" class is X = 0 ∨ X = 2.
(a) Build a decision tree on the data set. Does the tree capture the "+" and "-" concepts?
(b) What are the accuracy, precision, recall, and F1-measure of the decision tree? (Note that precision, recall, and F1-measure are defined with respect to the "+" class.)
(c) Build a new decision tree with the following cost function:
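\[
C(i, j) =
\begin{cases}
0, & \text{if } i = j; \\
1, & \text{if } i = +,\ j = -; \\
\dfrac{\text{Number of } - \text{ instances}}{\text{Number of } + \text{ instances}}, & \text{if } i = -,\ j = +.
\end{cases}
\]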
(Hint: only the leaves of the old decision tree need to be changed.) Does the decision tree capture the "+" concept?
(d) What are the accuracy, precision, recall, and F1-measure of the new decision tree?
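The metrics in parts (b) and (d) follow directly from the confusion-matrix counts for the "+" class, and the hint in part (c) amounts to relabeling each leaf with the class that gives the lower total cost. The sketch below illustrates both steps. The function names are hypothetical, and every count and cost in the usage lines is made up for illustration because the data set itself is not reproduced here. Which entry of C(i, j) is the false-negative cost and which is the false-positive cost depends on the convention for i and j, so the function takes the two costs as explicit arguments instead of assuming one.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 with respect to the "+" class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def min_cost_leaf_label(n_pos, n_neg, cost_false_negative, cost_false_positive):
    """Label a leaf with the class that has the lower total misclassification cost.

    Labeling the leaf "-" misclassifies its n_pos positive instances;
    labeling it "+" misclassifies its n_neg negative instances.
    Ties go to "-".
    """
    cost_if_negative = n_pos * cost_false_negative
    cost_if_positive = n_neg * cost_false_positive
    return "+" if cost_if_positive < cost_if_negative else "-"


# Hypothetical counts, not taken from the assignment's data set.
print(classification_metrics(tp=8, fp=2, fn=4, tn=86))

# Hypothetical leaf with 10 "+" and 40 "-" instances.
print(min_cost_leaf_label(10, 40, cost_false_negative=1, cost_false_positive=1))
print(min_cost_leaf_label(10, 40, cost_false_negative=5, cost_false_positive=1))
```

With equal costs the leaf keeps its majority label; once missing a "+" instance costs enough, the same leaf flips to "+", which is why only the leaves of the old tree need to change.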
5. Given the Bayesian network shown in the figure below, compute the following probabilities:
(a) P(B = good, F = empty, G = empty, S= yes).
(b) P(B = bad, F = empty, G = not empty, S = no).
(c) Given that the battery is bad, compute the probability that the car will start.
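In a Bayesian network, the joint probability factors into one term per node, each conditioned on that node's parents, and a conditional query such as part (c) is answered by summing out the variables that are not observed. The sketch below illustrates the mechanics only. The figure is not reproduced here, so both the structure (B and F as independent root nodes, each a parent of G and of S) and every number in the tables are placeholders that must be replaced with the values from the actual figure; `joint` and `p_start_given_battery` are hypothetical helper names.

```python
# Placeholder structure and numbers: replace with the values from the figure.
P_B = {"good": 0.9, "bad": 0.1}          # P(B)
P_F = {"not empty": 0.8, "empty": 0.2}   # P(F)
P_G_EMPTY = {                            # P(G = empty | B, F)
    ("good", "not empty"): 0.1, ("good", "empty"): 0.8,
    ("bad", "not empty"): 0.2, ("bad", "empty"): 0.9,
}
P_S_YES = {                              # P(S = yes | B, F)
    ("good", "not empty"): 0.9, ("good", "empty"): 0.2,
    ("bad", "not empty"): 0.1, ("bad", "empty"): 0.0,
}


def joint(b, f, g, s):
    """P(B=b, F=f, G=g, S=s) = P(b) * P(f) * P(g | b, f) * P(s | b, f)."""
    p_g = P_G_EMPTY[(b, f)] if g == "empty" else 1 - P_G_EMPTY[(b, f)]
    p_s = P_S_YES[(b, f)] if s == "yes" else 1 - P_S_YES[(b, f)]
    return P_B[b] * P_F[f] * p_g * p_s


def p_start_given_battery(b):
    """P(S = yes | B = b), summing out F (G drops out because it is not a parent of S)."""
    return sum(P_F[f] * P_S_YES[(b, f)] for f in P_F)


print(joint("good", "empty", "empty", "yes"))     # same form as part (a)
print(joint("bad", "empty", "not empty", "no"))   # same form as part (b)
print(p_start_given_battery("bad"))               # same form as part (c)
```

If the figure shows a different set of edges, the factorization in `joint` changes accordingly: each node still contributes one factor, conditioned on whatever parents the figure gives it.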
Format your assignment according to the following formatting requirements:
1. The answer should be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides.
2. The response should also include a cover page containing the title of the assignment, the student's name, the course title, and the date. The cover page is not included in the required page length.
3. Also include a reference page. Citations and references should follow APA format. The reference page is not included in the required page length.