Question 1: Clustering is widely recognized as an important data mining task with broad applications. Give one example application for each of the following cases:
a) An application that uses clustering as its main data mining function.
b) An application that uses clustering as a preprocessing tool to prepare data for other data mining tasks.
Question 2: Why is outlier mining important? Briefly explain the approach behind each of the following: statistical-based outlier detection, distance-based outlier detection, density-based local outlier detection, and deviation-based outlier detection.
Question 3: Describe the differences between the k-means and k-medoids algorithms.
Question 4: Describe the efficiency of the k-medoids algorithm on large data sets.
Question 5: Explain the different types of dimensions in a spatial data cube.
Question 6: How can a data cube be constructed for multimedia data analysis?
Question 7: How can the similarity between documents be measured?
Question 8: Describe how to mine spatial databases.