What is the maximum width of each transaction in the binarized data?


Assignment

1. Consider the traffic accident data set shown in the table below.

Traffic accident data set.

Weather Condition   Driver's Condition   Traffic Violation        Seat Belt   Crash Severity
Good                Alcohol-impaired     Exceed speed limit       No          Major
Bad                 Sober                None                     Yes         Minor
Good                Sober                Disobey stop sign        No          Minor
Bad                 Alcohol-impaired     Exceed speed limit       Yes         Major
Bad                 Alcohol-impaired     Disobey traffic signal   No          Major
Bad                 Alcohol-impaired     Disobey stop sign        Yes         Minor
Bad                 Alcohol-impaired     None                     Yes         Major
Good                Sober                Disobey traffic signal   Yes         Minor
Good                Alcohol-impaired     None                     No          Minor
Bad                 Sober                None                     Yes         Major
Good                Alcohol-impaired     Exceed speed limit       Yes         Major
Bad                 Sober                Disobey stop sign        Yes         Minor

a. Show a binarized version of the data set.

b. What is the maximum width of each transaction in the binarized data?

c. Assuming that the support threshold is 30%, how many candidate and frequent itemsets will be generated?
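One way to approach parts a and b is sketched below in Python. It assumes the flattened table is read column by column (12 records, 5 attributes) and that each attribute-value pair becomes one binary item; the item names such as "Seat Belt=Yes" and the helper binarize are illustrative choices, not part of the assignment. The sketch also counts single items against the 30% threshold; counting larger candidate itemsets for part c depends on the candidate-generation strategy and is not covered here.

```python
from collections import Counter

ATTRIBUTES = ["Weather", "Driver", "Violation", "Seat Belt", "Severity"]

# Records transcribed from the traffic accident table (assumed column-major order).
ROWS = [
    ("Good", "Alcohol-impaired", "Exceed speed limit", "No", "Major"),
    ("Bad", "Sober", "None", "Yes", "Minor"),
    ("Good", "Sober", "Disobey stop sign", "No", "Minor"),
    ("Bad", "Alcohol-impaired", "Exceed speed limit", "Yes", "Major"),
    ("Bad", "Alcohol-impaired", "Disobey traffic signal", "No", "Major"),
    ("Bad", "Alcohol-impaired", "Disobey stop sign", "Yes", "Minor"),
    ("Bad", "Alcohol-impaired", "None", "Yes", "Major"),
    ("Good", "Sober", "Disobey traffic signal", "Yes", "Minor"),
    ("Good", "Alcohol-impaired", "None", "No", "Minor"),
    ("Bad", "Sober", "None", "Yes", "Major"),
    ("Good", "Alcohol-impaired", "Exceed speed limit", "Yes", "Major"),
    ("Bad", "Sober", "Disobey stop sign", "Yes", "Minor"),
]


def binarize(rows, attributes):
    """Turn each record into a transaction of 'Attribute=Value' items."""
    return [frozenset(f"{a}={v}" for a, v in zip(attributes, row)) for row in rows]


transactions = binarize(ROWS, ATTRIBUTES)

# Every distinct attribute-value pair is one column of the binarized table.
items = sorted(set().union(*transactions))

# Width of a transaction = number of items present in it.
max_width = max(len(t) for t in transactions)

# Single items whose support reaches the 30% threshold.
item_counts = Counter(item for t in transactions for item in t)
min_count = 0.30 * len(transactions)
frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)

print(len(items), "binary items; maximum transaction width:", max_width)
print("frequent 1-itemsets:", frequent_items)
```

Because each record takes exactly one value per attribute, every transaction in this encoding has the same width.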

2. Consider the data set shown in the table below. The first attribute is continuous, while the remaining two attributes are asymmetric binary. A rule is considered to be strong if its support exceeds 15% and its confidence exceeds 60%. The data given in the table below supports the following two strong rules:

(i) {(1 ≤ A ≤ 2),B = 1} → {C = 1}
(ii) {(5 ≤ A ≤ 8),B = 1} → {C = 1}

A     B    C
1     1    1
2     1    1
3     1    0
4     1    0
5     1    1
6     0    1
7     0    0
8     1    1
9     0    0
10    0    0
11    0    0
12    0    1

a. Compute the support and confidence for both rules.

S ({(1 ≤ A ≤ 2), B = 1} → {C = 1}) =
C ({(1 ≤ A ≤ 2), B = 1} → {C = 1}) =
S ({(5 ≤ A ≤ 8), B = 1} → {C = 1}) =
C ({(5 ≤ A ≤ 8), B = 1} → {C = 1}) =
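The support and confidence of rules (i) and (ii) can be checked with a short script. The sketch below assumes the A, B, C columns are read in order from the table for this question, that support is the fraction of all 12 records satisfying both sides of the rule, and that confidence is that count divided by the number of records satisfying the left-hand side. The function name rule_stats is hypothetical.

```python
# Columns transcribed from the table for question 2.
A = list(range(1, 13))
B = [1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
C = [1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1]


def rule_stats(lo, hi):
    """Support and confidence of {lo <= A <= hi, B = 1} -> {C = 1}."""
    # C values of the records that satisfy the antecedent.
    antecedent = [c for a, b, c in zip(A, B, C) if lo <= a <= hi and b == 1]
    both = sum(antecedent)  # records that also have C = 1
    support = both / len(A)
    confidence = both / len(antecedent) if antecedent else 0.0
    return support, confidence


for label, (lo, hi) in {"(i)": (1, 2), "(ii)": (5, 8)}.items():
    s, c = rule_stats(lo, hi)
    print(f"rule {label}: support = {s:.3f}, confidence = {c:.3f}")
```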

3. Consider the data set shown in the table below. Suppose we are interested in extracting the following association rule:

{α1 ≤ Age ≤ α2, Play Piano = Yes} → {Enjoy Classical Music = Yes}

Age   Play Piano   Enjoy Classical Music
9     Yes          Yes
11    Yes          Yes
14    Yes          No
17    Yes          No
19    Yes          Yes
21    No           No
25    No           No
29    Yes          No
33    Yes          No
39    Yes          Yes
41    No           Yes
47    No           Yes

To handle the continuous attribute, we apply the equal-frequency approach with 3, 4, and 6 intervals. Categorical attributes are handled by introducing as many new asymmetric binary attributes as the number of categorical values. Assume that the support threshold is 10% and the confidence threshold is 70%.

(a) Suppose we discretize the Age attribute into 3 equal-frequency intervals. Find a pair of values for α1 and α2 that satisfy the minimum support and minimum confidence requirements.

(b) Repeat part (a) by discretizing the Age attribute into 4 equal-frequency intervals. Compare the extracted rules against the ones you had obtained in part (a).

(c) Repeat part (a) by discretizing the Age attribute into 6 equal-frequency intervals. Compare the extracted rules against the ones you had obtained in part (a).
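The equal-frequency discretization used in parts (a) to (c) can be sketched as follows. This is a minimal illustration under stated assumptions: the three columns are read in order from the table for this question, each interval is reported by the smallest and largest observed age in its group (other boundary conventions, such as midpoints between groups, are also common), and the names equal_frequency_intervals and rule_stats are hypothetical.

```python
# Columns transcribed from the table for question 3.
AGE = [9, 11, 14, 17, 19, 21, 25, 29, 33, 39, 41, 47]
PIANO = ["Yes", "Yes", "Yes", "Yes", "Yes", "No",
         "No", "Yes", "Yes", "Yes", "No", "No"]
MUSIC = ["Yes", "Yes", "No", "No", "Yes", "No",
         "No", "No", "No", "Yes", "Yes", "Yes"]


def equal_frequency_intervals(values, k):
    """Split the sorted values into k groups of equal size.

    Assumes len(values) is divisible by k; each interval is given as
    (smallest value in group, largest value in group).
    """
    ordered = sorted(values)
    size = len(ordered) // k
    return [(ordered[i], ordered[i + size - 1])
            for i in range(0, len(ordered), size)]


def rule_stats(lo, hi):
    """Support and confidence of
    {lo <= Age <= hi, Play Piano = Yes} -> {Enjoy Classical Music = Yes}."""
    matches = [m for a, p, m in zip(AGE, PIANO, MUSIC)
               if lo <= a <= hi and p == "Yes"]
    both = matches.count("Yes")
    support = both / len(AGE)
    confidence = both / len(matches) if matches else 0.0
    return support, confidence


for k in (3, 4, 6):
    print(f"{k} intervals:")
    for lo, hi in equal_frequency_intervals(AGE, k):
        s, c = rule_stats(lo, hi)
        print(f"  {lo} <= Age <= {hi}: support = {s:.3f}, confidence = {c:.3f}")
```

The loop scores single intervals only. Adjacent intervals can be merged by calling rule_stats with the lower bound of the first interval and the upper bound of the last.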

4. For each of the sequences w = <e_1, e_2, ..., e_last> below, determine whether it is a subsequence of the following data sequence:
<{A,B}{C,D}{A,B}{C,D}{A,B}{C,D}>

subject to the following timing constraints:
mingap = 0 (interval between last event in e_i and first event in e_(i+1) is > 0)
maxgap = 2 (interval between first event in e_i and last event in e_(i+1) is ≤ 2)
maxspan = 6 (interval between first event in e_1 and last event in e_last is ≤ 6)
ws = 1 (time between first and last events in e_i is ≤ 1)

a. w = <{A}{B}{C}{D}>
b. w = <{A}{B,C,D}{A}>
c. w = <{A}{B,C,D}{A}>
d. w = <{B,C}{A,D}{B,C}>
e. w = <{A,B,C,D}{A,B,C,D}>
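The four timing constraints can be checked by brute force, as in the sketch below. It is not a definitive implementation: it assumes the six elements of the data sequence occur at timestamps 1 through 6 (the question does not state timestamps), and it treats each element of w as matched by a time window (l, u) of width at most ws whose merged events contain that element. The names windows and contains are hypothetical.

```python
from itertools import product

# Data sequence with ASSUMED timestamps 1..6 (not given in the question).
DATA = [(1, {"A", "B"}), (2, {"C", "D"}), (3, {"A", "B"}),
        (4, {"C", "D"}), (5, {"A", "B"}), (6, {"C", "D"})]


def windows(data, element, ws):
    """All (l, u) windows with u - l <= ws whose merged events cover `element`."""
    found = []
    for i, (t_l, _) in enumerate(data):
        merged = set()
        for t_u, events in data[i:]:
            if t_u - t_l > ws:
                break
            merged |= events
            if element <= merged:
                found.append((t_l, t_u))
                break  # smallest window for this start time is enough
    return found


def contains(data, w, mingap, maxgap, maxspan, ws):
    """True if some choice of windows for w satisfies every timing constraint."""
    options = [windows(data, set(e), ws) for e in w]
    for choice in product(*options):
        gaps_ok = all(
            nxt[0] - cur[1] > mingap      # first of e_(i+1) after last of e_i
            and nxt[1] - cur[0] <= maxgap  # last of e_(i+1) close to first of e_i
            for cur, nxt in zip(choice, choice[1:])
        )
        if gaps_ok and choice[-1][1] - choice[0][0] <= maxspan:
            return True
    return False


W_A = [{"A"}, {"B"}, {"C"}, {"D"}]
W_B = [{"A"}, {"B", "C", "D"}, {"A"}]
for name, w in (("a", W_A), ("b", W_B)):
    print(name, contains(DATA, w, mingap=0, maxgap=2, maxspan=6, ws=1))
```

The search is exhaustive over candidate windows, which is fine for a six-element sequence but would need pruning for longer data.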

5. Draw all candidate subgraphs obtained from joining the pair of graphs shown in the figure below. Assume the edge-growing method is used to expand the subgraphs.

1025_Graphs.jpg

Format your assignment according to the following formatting requirements:

1. The answer should be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides.

2. The response should also include a cover page containing the title of the assignment, the student's name, the course title, and the date. The cover page is not included in the required page length.

3. Also include a reference page. Citations and references should follow APA format. The reference page is not included in the required page length.
