Data Mining Assignment
Question 1
Classify the following attributes by number of values (binary, discrete or continuous) and applicable operations (nominal, ordinal, interval, ratio). Explain your choice.
a) Number of patients in a hospital
b) ISBN numbers for books
c) Ability to pass light, recorded as one of the following values: opaque, translucent, transparent
d) Military rank
e) Distance from the center of campus
Question 2
Which visualization techniques would you use for data that has:
a) a large number of samples and a large number of features
b) a small number of samples and a large number of features
c) a large number of samples and a small number of features
d) a small number of samples and a small number of features
Explain your choices.
Question 3
Using the techniques discussed in class, map the following categorical values to numerical representations.
a) Days of the week
b) Car Types: Sports, Luxury, Family
c) Letters of the Alphabet
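As a starting point, the sketch below illustrates three common encodings (integer codes, one-hot vectors, and a cyclic sine/cosine mapping). These are assumptions about what might apply here and may differ from the techniques covered in class. The function names are hypothetical and the code uses only the Python standard library.

```python
import math

# Hypothetical helper names for illustration; not taken from the course material.

def ordinal_encode(categories):
    """Map each category to its position in the list (0, 1, 2, ...).
    This implies an order, so it suits values that have a natural ranking."""
    return {c: i for i, c in enumerate(categories)}

def one_hot_encode(categories):
    """Map each category to a 0/1 vector with a single 1.
    This implies no order, so it suits purely nominal values."""
    n = len(categories)
    return {c: [1 if i == j else 0 for j in range(n)]
            for i, c in enumerate(categories)}

def cyclic_encode(categories):
    """Map each category to a point on the unit circle, so that the last
    value is as close to the first as any other pair of neighbours."""
    n = len(categories)
    return {c: (math.sin(2 * math.pi * i / n), math.cos(2 * math.pi * i / n))
            for i, c in enumerate(categories)}

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
car_types = ["Sports", "Luxury", "Family"]

day_codes = ordinal_encode(days)
car_vectors = one_hot_encode(car_types)
day_points = cyclic_encode(days)

print(day_codes["Wed"])
print(car_vectors["Luxury"])
```

Which encoding is appropriate depends on whether the attribute is ordered, unordered, or cyclic; justifying that choice for each item is part of the question.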
Question 4
Discuss the differences and similarities between the fields of statistics and data mining.
Question 5
Using Weka, visualize two datasets of your choice (excluding the Iris dataset) from the UCI Machine Learning Repository and discuss the results (including screenshots). Hint: you might want to try a dataset with numerical feature values and a discrete (as opposed to continuous) class attribute.