Problem
1. What is the curse of dimensionality? How does it affect distance and similarity measures?
2. What is clustering? Describe an example algorithm that performs clustering. How can we know whether it produced decent clusters on our data set?
3. How can we estimate the right number of clusters to use with a given data set?
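
As a starting point for question 1, the effect of dimensionality on distances can be checked empirically. The sketch below is only an illustration under stated assumptions: points are drawn uniformly from the unit hypercube, distances are Euclidean, and the helper name `relative_contrast` and its parameters are hypothetical. It measures how far apart the nearest and farthest neighbours of a random query point are, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances
    from one random query point to n_points random points.

    Assumption: all coordinates are uniform on [0, 1).
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for a few dimensionalities to compare them.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

Under these assumptions the contrast should shrink as the dimension grows, meaning the nearest and farthest points become almost equally distant. That is one way to see why distance-based similarity becomes less informative in high dimensions.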