Question 1: Bayesian Networks: Metastatic cancer is a possible cause of brain tumors and is also an explanation for increased total serum calcium. In turn, either of these could explain a patient falling into a coma. Severe headache is also associated with brain tumors. A BN representation of this metastatic cancer example is shown below (Figure 1). All the nodes are Boolean. Given that a patient has a severe headache, has a brain tumor, is not in a coma, and does not have symptoms of increased serum calcium, determine the probability that the patient has metastatic cancer.
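One way to organize the computation is to write the joint distribution as the product of the conditional probability tables implied by the description (M causes S and B, S and B cause C, B causes H) and normalize over M. The sketch below assumes that structure. Every number in it is a hypothetical placeholder, not a value from Figure 1, and the function name `posterior_metastatic` is illustrative.

```python
def posterior_metastatic(p_m, p_s, p_b, p_c, p_h, s, b, c, h):
    """Return P(M=True | S=s, B=b, C=c, H=h) by normalizing the joint over M.

    p_m        : P(M=True)
    p_s[m]     : P(S=True | M=m)
    p_b[m]     : P(B=True | M=m)
    p_c[(s,b)] : P(C=True | S=s, B=b)
    p_h[b]     : P(H=True | B=b)
    """
    def pr(p_true, value):
        # Probability of the observed Boolean value given P(value=True).
        return p_true if value else 1.0 - p_true

    joint = {}
    for m in (True, False):
        joint[m] = (pr(p_m, m) * pr(p_s[m], s) * pr(p_b[m], b)
                    * pr(p_c[(s, b)], c) * pr(p_h[b], h))
    return joint[True] / (joint[True] + joint[False])


# HYPOTHETICAL placeholder tables; replace them with the values from Figure 1.
p_m = 0.2
p_s = {True: 0.8, False: 0.2}
p_b = {True: 0.2, False: 0.05}
p_c = {(True, True): 0.8, (True, False): 0.8,
       (False, True): 0.8, (False, False): 0.05}
p_h = {True: 0.8, False: 0.6}

# Evidence from the question: headache, brain tumor, no coma, no increased calcium.
result = posterior_metastatic(p_m, p_s, p_b, p_c, p_h,
                              s=False, b=True, c=False, h=True)
print(result)
```

Because S and B are both observed, the factors for C and H are identical for both values of M and cancel in the normalization, so only P(M), P(S | M), and P(B | M) affect the result.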
Question 2: Given the following classification rule on the weather data, prune it so that it no longer overfits. The goal is to obtain a good rule whose support is at least 3 and whose accuracy is 50% or more. The current rule has a support of 1 and an accuracy of 100%.
Show your work.
Outlook=sunny and temp=cool and humidity=normal and windy=false ==> Play = Yes
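To check a candidate rule while pruning, its support and accuracy can be computed mechanically. The sketch below assumes the weather data is the standard 14-instance nominal weather (play) dataset, which is not reproduced in this assignment. It also assumes that support means the number of instances matching all of the rule's conditions and that accuracy means the fraction of those instances with the predicted class. The names `WEATHER` and `rule_stats` are illustrative.

```python
# ASSUMED data: the standard 14-instance nominal weather dataset.
# Substitute the table used in the course if it differs.
COLUMNS = ("outlook", "temp", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("overcast", "mild", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
WEATHER = [dict(zip(COLUMNS, row)) for row in ROWS]


def rule_stats(conditions, target, data):
    """Return (support, accuracy) for the rule `conditions ==> target`.

    support  = number of instances matching every condition (assumed definition)
    accuracy = fraction of those instances whose class equals the target value
    """
    covered = [r for r in data if all(r[a] == v for a, v in conditions.items())]
    if not covered:
        return 0, 0.0
    attr, value = target
    correct = sum(1 for r in covered if r[attr] == value)
    return len(covered), correct / len(covered)


rule = {"outlook": "sunny", "temp": "cool", "humidity": "normal", "windy": "false"}
target = ("play", "yes")
print("full rule:", rule_stats(rule, target, WEATHER))

# One pruning step: drop each condition in turn and inspect the resulting rule.
for attr in rule:
    pruned = {a: v for a, v in rule.items() if a != attr}
    print("without", attr, "->", rule_stats(pruned, target, WEATHER))
```

Repeating the drop-one-condition step on the best candidate until the support and accuracy thresholds are met gives one simple greedy pruning procedure; other search orders are equally valid.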
What to submit? Submit a PDF file with your answers via Blackboard. Your output should look like this:
Question 3: Given the following data, show ways to discretize age based on (i) Equal-width binning (4 bins) (ii) Equal-frequency binning (4 bins) (iii) Entropy-based discretization. Salary is the outcome class.
Age | Experience | Education | Salary
----|------------|-----------|-------
45  | 20         | MS        | High
65  | 40         | BS        | Medium
25  | 5          | HS        | Low
35  | 10         | BS        | High
27  | 5          | BS        | High
22  | 0          | BS        | Low
30  | 3          | MS        | Medium
66  | 40         | MS        | Medium
50  | 25         | BS        | Medium
37  | 15         | BS        | High
33  | 10         | MS        | Medium
40  | 15         | MS        | High
23  | 5          | HS        | Low
24  | 2          | BS        | Medium
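The three discretization schemes can be sketched in a few lines each. The block below is a minimal illustration, not a reference solution. It assumes the seventh age value is 30, puts the leftover instances into the earlier bins for equal-frequency binning, and shows only the first entropy-based cut (a full entropy-based discretization would recurse on each side with a stopping rule). All function names are illustrative.

```python
import math
from collections import Counter

# (age, salary) pairs as read from the table; the seventh age is taken to be 30.
DATA = [(45, "High"), (65, "Medium"), (25, "Low"), (35, "High"), (27, "High"),
        (22, "Low"), (30, "Medium"), (66, "Medium"), (50, "Medium"), (37, "High"),
        (33, "Medium"), (40, "High"), (23, "Low"), (24, "Medium")]
AGES = [age for age, _ in DATA]


def equal_width_bins(values, k):
    """Split [min, max] into k equal-width intervals; return (edges, bin index per value)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    edges = [lo + i * width for i in range(k + 1)]
    # The maximum value would fall into bin k, so clamp it into the last bin.
    index = [min(int((v - lo) / width), k - 1) for v in values]
    return edges, index


def equal_frequency_bins(values, k):
    """Sort the values and cut them into k groups whose sizes differ by at most one."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), k)
    bins, start = [], 0
    for i in range(k):
        stop = start + size + (1 if i < extra else 0)  # earlier bins absorb the remainder
        bins.append(ordered[start:stop])
        start = stop
    return bins


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_entropy_cut(pairs):
    """Return (cut point, weighted entropy) of the single split with the lowest class entropy."""
    ordered = sorted(pairs)
    best = None
    for i in range(1, len(ordered)):
        if ordered[i - 1][0] == ordered[i][0]:
            continue  # no cut point between equal values
        left = [label for _, label in ordered[:i]]
        right = [label for _, label in ordered[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ordered)
        cut = (ordered[i - 1][0] + ordered[i][0]) / 2  # midpoint between neighbours
        if best is None or score < best[1]:
            best = (cut, score)
    return best


edges, index = equal_width_bins(AGES, 4)
print("equal-width edges:", edges)
print("equal-frequency bins:", equal_frequency_bins(AGES, 4))
print("first entropy-based cut:", best_entropy_cut(DATA))
```

With 14 instances and 4 bins, equal-frequency binning cannot give every bin the same count, so any answer should state how the remainder is distributed.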
Question 4: Transform salary into binary variables using the standard method, the error-correcting code method, and nested dichotomies.
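The three encodings can be illustrated on the salary values Low, Medium, and High. The sketch below makes several assumptions that should be checked against the course material: it reads "the standard method" as one binary indicator per class value (one-vs-rest), it uses the exhaustive 3-bit code for the error-correcting method with an arbitrary assignment of code words to classes, and it picks one of several possible nested-dichotomy trees. All names are illustrative.

```python
CLASSES = ["Low", "Medium", "High"]


def one_vs_rest(label):
    """One binary indicator per class value (assumed reading of 'the standard method')."""
    return {f"is_{c}": int(label == c) for c in CLASSES}


# Exhaustive error-correcting code for 3 classes: 2**(3 - 1) - 1 = 3 bits per class.
# Which class receives which code word is an arbitrary choice made for this sketch.
ECOC = {"Low": (1, 1, 1), "Medium": (0, 0, 1), "High": (0, 1, 0)}


def hamming(a, b):
    """Number of positions in which two code words differ."""
    return sum(x != y for x, y in zip(a, b))


def nested_dichotomy(label):
    """One possible tree: {Low} vs {Medium, High}, then {Medium} vs {High}.

    The second variable is None when the first split already decides the class.
    """
    first = int(label != "Low")  # 0 = Low, 1 = Medium or High
    second = None if label == "Low" else int(label == "High")
    return first, second


for salary in CLASSES:
    print(salary, one_vs_rest(salary), ECOC[salary], nested_dichotomy(salary))

min_distance = min(hamming(ECOC[a], ECOC[b])
                   for a in CLASSES for b in CLASSES if a != b)
print("minimum Hamming distance between code words:", min_distance)
```

The minimum Hamming distance printed at the end indicates how many bit errors the code can tolerate, which is worth commenting on when comparing the error-correcting code with the one-vs-rest encoding for only three classes.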