Run your program with different numbers of clusters (K = 2, ..., 15) and build a plot that shows how the value of the RSS (residual sum of squares) function at the last iteration depends on K. What is the optimal number of clusters K for the given data set? Did you get any empty clusters? If so, what is a possible solution to this problem? Present the output of your program in the report and explain your results.
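As a starting point, the experiment could be sketched roughly as follows. This is a minimal illustration, not a reference solution: it assumes plain Lloyd-style K-means with squared Euclidean distance, points given as tuples of numbers, and random initial centroids drawn from the data. The names `sq_dist`, `kmeans` and `rss_curve` are hypothetical, and moving an empty cluster's centroid to the point farthest from its current centroid is only one of several possible remedies.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means sketch. Returns (centroids, labels, rss, reseeded)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k points sampled from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    reseeded = 0  # how many times an empty cluster had to be repaired
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
            else:
                # Empty cluster (assumed remedy): move its centroid to the
                # point that is farthest from its own current centroid.
                far = max(
                    range(len(points)),
                    key=lambda i: sq_dist(points[i], centroids[labels[i]]),
                )
                centroids[j] = list(points[far])
                reseeded += 1
    # RSS at the last iteration: squared distances to assigned centroids.
    rss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, rss, reseeded


def rss_curve(points, ks=range(2, 16), seed=0):
    """Map each K to the final RSS of one K-means run."""
    return {k: kmeans(points, k, seed=seed)[2] for k in ks}
```

The dictionary returned by `rss_curve` can be passed to any plotting tool (for example, K on the horizontal axis and RSS on the vertical axis). Because RSS generally keeps decreasing as K grows, the smallest RSS alone does not identify the optimal K; a common heuristic is to look for the "elbow" where the curve flattens. A single run per K depends on the random initialization, so repeating each run with several seeds and keeping the lowest RSS may give a smoother curve.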