Q1 Bayesian Networks: Metastatic cancer is a possible cause of brain tumors and is also an explanation for increased total serum calcium. In turn, either of these could explain a patient falling into a coma. Severe headache is also associated with brain tumors. A BN representation of this metastatic cancer example is shown below (Figure 1). All nodes are Boolean. Given that a patient has a severe headache, has a brain tumor, is not in a coma, and does not have increased serum calcium, determine the probability that the patient has metastatic cancer.
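As a starting point, the query can be set up symbolically before any numbers are plugged in. The sketch below assumes the network structure implied by the description (M -> S, M -> B, S -> C, B -> C, B -> H, where M = metastatic cancer, S = increased serum calcium, B = brain tumor, C = coma, H = severe headache); the node letters are shorthand introduced here, and the conditional probability values must be read from Figure 1.

```latex
% Factorization implied by the assumed structure
P(M,S,B,C,H) = P(M)\,P(S \mid M)\,P(B \mid M)\,P(C \mid S,B)\,P(H \mid B)

% Query: the factors P(\neg c \mid \neg s, b) and P(h \mid b) do not depend on M,
% so they cancel between numerator and denominator.
P(m \mid h, b, \neg c, \neg s)
  = \frac{P(m)\,P(b \mid m)\,P(\neg s \mid m)}
         {P(m)\,P(b \mid m)\,P(\neg s \mid m)
          + P(\neg m)\,P(b \mid \neg m)\,P(\neg s \mid \neg m)}
```

Under this structure, C and H are conditionally independent of M once S and B are observed, which is why only the priors and the two child tables of M remain.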
Q2 Given the following classification rule on weather data, prune it so that it does not overfit. The goal is to obtain a good rule whose support is at least 3 and whose accuracy is 50% or more. The current rule has a support of 1 and an accuracy of 100%. Show your work.
Outlook=sunny and temp=cool and humidity=normal and windy=false ==> Play = Yes
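One way to check candidate prunings is to count support and accuracy directly. The sketch below assumes the weather data is the standard 14-instance nominal weather dataset (it is not reproduced in this assignment) and takes support to mean the number of instances covered by the rule's conditions; `WEATHER`, `FIELDS`, and `evaluate` are illustrative names, not part of the assignment.

```python
# ASSUMPTION: the standard 14-instance nominal weather dataset.
# Columns: outlook, temp, humidity, windy, play
WEATHER = [
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]
FIELDS = ("outlook", "temp", "humidity", "windy")


def evaluate(rule, target="yes", data=WEATHER):
    """Return (support, accuracy) for a rule given as {attribute: value}.

    Support is assumed to be the number of instances matching every
    condition; accuracy is the fraction of those whose class is `target`.
    """
    covered = [
        row for row in data
        if all(row[FIELDS.index(attr)] == value for attr, value in rule.items())
    ]
    support = len(covered)
    correct = sum(1 for row in covered if row[-1] == target)
    accuracy = correct / support if support else 0.0
    return support, accuracy


full_rule = {"outlook": "sunny", "temp": "cool",
             "humidity": "normal", "windy": False}
print("full rule:", evaluate(full_rule))

# Candidate prunings: drop one condition at a time and re-evaluate.
for attr in full_rule:
    pruned = {a: v for a, v in full_rule.items() if a != attr}
    print("without", attr, "->", evaluate(pruned))
```

On the assumed dataset the full rule evaluates to a support of 1 and an accuracy of 100%, matching the figures stated in the question; the loop can be repeated on a pruned rule to drop further conditions.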
What to submit? Submit a PDF file with your answers via Blackboard. Your output should look like this:
Name Course HW#
Q1 Work and results for Q1
Q2 Work and results for Q2
Q1. Given the following data, show how to discretize age using (i) equal-width binning (4 bins), (ii) equal-frequency binning (4 bins), and (iii) entropy-based discretization. Salary is the outcome class.
Age | Experience | Education | Salary
45 | 20 | MS | High
65 | 40 | BS | Medium
25 | 5 | HS | Low
35 | 10 | BS | High
27 | 5 | BS | High
22 | 0 | BS | Low
30 | 3 | MS | Medium
66 | 40 | MS | Medium
50 | 25 | BS | Medium
37 | 15 | BS | High
33 | 10 | MS | Medium
40 | 15 | MS | High
23 | 5 | HS | Low
24 | 2 | ES | Medium
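The three discretization schemes can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed solution: the function names are hypothetical, the equal-frequency version puts any leftover values in the earliest bins (other conventions exist), and the entropy-based function finds only the single best binary cut point, whereas a full entropy-based discretization would typically apply it recursively with a stopping criterion.

```python
import math
from collections import Counter

# Age and Salary columns copied from the table above, in row order.
ages = [45, 65, 25, 35, 27, 22, 30, 66, 50, 37, 33, 40, 23, 24]
salary = ["High", "Medium", "Low", "High", "High", "Low", "Medium",
          "Medium", "Medium", "High", "Medium", "High", "Low", "Medium"]


def equal_width_edges(values, k):
    """Return k+1 bin edges that split [min, max] into k equal intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]


def equal_frequency_bins(values, k):
    """Split the sorted values into k bins of (nearly) equal size.

    ASSUMPTION: when len(values) is not divisible by k, the extra
    values go to the first bins.
    """
    ordered = sorted(values)
    n = len(ordered)
    bins, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        bins.append(ordered[start:start + size])
        start += size
    return bins


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_entropy_split(values, labels):
    """Return (threshold, weighted_entropy) of the best single cut point.

    Candidate thresholds are midpoints between adjacent distinct values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best


print(equal_width_edges(ages, 4))
print(equal_frequency_bins(ages, 4))
print(best_entropy_split(ages, salary))
```

The equal-width edges depend only on the minimum and maximum age, while the entropy-based cut uses Salary as the class to decide where a boundary is most informative.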
Q2. Transform salary into binary variables using the standard method, the error-correcting code method, and nested dichotomies.
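The following sketch illustrates one possible reading of the three transformations for the classes Low, Medium, and High. It assumes that "the standard method" means one indicator variable per class (one-vs-rest), that the error-correcting code is the exhaustive code of length 2^(k-1) - 1, and that the nested dichotomy first separates Low from {Medium, High} and then Medium from High; all function names are hypothetical, and the course may intend different conventions.

```python
CLASSES = ["Low", "Medium", "High"]


def one_vs_rest(label):
    """ASSUMED 'standard method': one 0/1 indicator per class value."""
    return {f"is_{c}": int(label == c) for c in CLASSES}


def exhaustive_code(k):
    """Exhaustive output code for k classes: k rows of 2^(k-1) - 1 bits.

    Row 0 is all ones; row i alternates runs of 2^(k-1-i) zeros and ones.
    """
    length = 2 ** (k - 1) - 1
    rows = [[1] * length]
    for i in range(1, k):
        run = 2 ** (k - 1 - i)
        rows.append([(j // run) % 2 for j in range(length)])
    return rows


def min_hamming_distance(rows):
    """Smallest number of differing bits between any two code words."""
    return min(
        sum(a != b for a, b in zip(rows[i], rows[j]))
        for i in range(len(rows))
        for j in range(i + 1, len(rows))
    )


def nested_dichotomy(label):
    """ASSUMED tree: {Low} vs {Medium, High}, then {Medium} vs {High}.

    The second variable is undefined (None) when the first is 0.
    """
    above_low = int(label != "Low")
    is_high = int(label == "High") if above_low else None
    return (above_low, is_high)


code = dict(zip(CLASSES, exhaustive_code(len(CLASSES))))
for label in CLASSES:
    print(label, one_vs_rest(label), code[label], nested_dichotomy(label))
print("minimum Hamming distance:", min_hamming_distance(list(code.values())))
```

With only three classes the exhaustive code is short, so the minimum Hamming distance printed at the end is worth checking when discussing how many bit errors the code can actually tolerate.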