1) Association rule mining frequently generates a large number of rules. Explain effective methods that can be used to reduce the number of rules generated while still preserving most of the interesting rules.
2) What is meant by associative classification? Why is associative classification able to achieve higher classification accuracy than a classical decision tree method? Describe how associative classification can be used for text document classification.
3) Sketch a privacy-preserving clustering method so that a data owner would be able to ask a third party to mine the data for quality clustering without worrying about the potential inappropriate disclosure of certain private or sensitive information stored in the data.
4) The concept of micro-clustering has been popular for online maintenance of clustering information for data streams. By exploring the power of micro-clustering, design an effective density-based clustering method for clustering evolving data streams.
5) TF-IDF has been used as an effective measure in document classification.
a) Give one example to show that TF-IDF may not always be a good measure in document classification.
b) Explain another measure that may overcome this difficulty.