Q1. Describe two common methods for assessing classifier accuracy based on randomly sampled partitions of the given data. Are there general methods for improving classifier accuracy?
Q2. Illustrate any four fields in which clustering methods are used. Describe the fundamental requirements of cluster analysis.
Q3. Why is outlier mining significant? Briefly explain the different approaches behind statistical-based outlier detection and distance-based outlier detection.
Q4. Explain why decision tree induction is popular. Describe over-fitting of an induced tree and two approaches to avoid over-fitting, using appropriate examples or diagrams.
Q5. How can you employ the Web as a data source for your data warehouse? What kinds of information can you obtain from the Web?
Q6. Describe how data mining is employed in the banking industry.
Q7. Name the main stages of a data mining operation. Pick two of these stages and explain the types of activities each involves.
Q8. Describe data granularity and how it applies to a data warehouse.