(i) Set the number of data points to 20 and the number of clusters to 2.
Check the Show History box.
Click on Initialize.
Click on Start.
Click on Step.
Observe which points change cluster membership as you continue clicking Step.
Which operation of the k-means algorithm is performed when you click each of these buttons?
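As a reference while you click through the applet, here is a minimal sketch of the two operations that standard k-means (Lloyd's algorithm) alternates between. The function names assign and update, the sample points, and the choice to leave the center of an empty cluster where it is are assumptions made for illustration; how the applet's buttons map onto these operations is for you to work out.

```python
import math

def assign(points, centers):
    """Assignment step: label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]

def update(points, labels, centers):
    """Update step: move each center to the mean of the points assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            # Assumption: a center with no points stays where it is.
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centers

# Hypothetical data: four points and two starting centers.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
centers = [(0.0, 0.0), (1.0, 0.0)]

labels = assign(points, centers)           # which cluster each point belongs to
centers = update(points, labels, centers)  # where the centers move to
print(labels, centers)
```

Repeating the two calls until the labels stop changing gives the full algorithm.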
(ii) Try to rearrange the points and initial cluster center positions such that the clustering result depends on the initialization of cluster centers. That is:
(a) Start and run the clustering with some initial cluster center positions. Note where the final cluster centers end up.
(b) Change the initial cluster center positions, then click on Start and run the clustering again. Try to find initial positions for which the clustering solution differs from the one in (a).
You can move the points by dragging them with the mouse pointer.
You can add points by clicking in empty space.
Reset restarts the clustering but does not change the distribution of points.
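The kind of situation asked for in (ii) can be sketched outside the applet as well. The example below is only an illustration under stated assumptions: the function names kmeans and sse, the four points at the corners of a wide rectangle, and both sets of initial centers are made up for this sketch and are not taken from the applet.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain k-means: alternate assignment and update until the centers stop moving."""
    labels = []
    for _ in range(max_iter):
        # Assignment: nearest center for every point.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update: each center moves to the mean of its points.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # assumption: empty clusters keep their center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def sse(points, labels, centers):
    """Sum of squared distances from each point to its own cluster center."""
    return sum(math.dist(p, centers[label]) ** 2 for p, label in zip(points, labels))

# Hypothetical data: corners of a 4 x 1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Initialization A: centers start near the left and right edges.
labels_a, centers_a = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
# Initialization B: centers start on the bottom and top edges.
labels_b, centers_b = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print("A:", labels_a, centers_a, sse(points, labels_a, centers_a))
print("B:", labels_b, centers_b, sse(points, labels_b, centers_b))
```

Both runs stop at a stable configuration, but A groups the points left/right while B groups them bottom/top with a much larger sum of squared distances. You can try to reproduce a similar layout in the applet.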
(iii) Try to arrange the points so that the algorithm takes more than 5 iterations to finish. You can add and move points as well as add and move cluster centers.
Observe the history and how the algorithm converges.
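One way to look at convergence numerically is to record the sum of squared distances (SSE) after every iteration; for k-means this value never increases from one iteration to the next. The sketch below assumes a hypothetical helper kmeans_with_history and made-up data (ten points on a line, both centers starting at the left end); it is not how the applet stores its history.

```python
import math

def nearest(p, centers):
    """Index of the center closest to point p (ties go to the lower index)."""
    return min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))

def mean(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(members) for c in zip(*members))

def kmeans_with_history(points, centers, max_iter=100):
    """Run k-means and record the SSE after each update step."""
    history = []
    for _ in range(max_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == k]
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(mean(members) if members else old)
        sse = sum(math.dist(p, new_centers[label]) ** 2
                  for p, label in zip(points, labels))
        history.append(sse)
        if new_centers == centers:  # nothing moved: converged
            break
        centers = new_centers
    return centers, history

# Hypothetical data: points 0..9 on a line, both centers starting on the left.
points = [(float(x), 0.0) for x in range(10)]
centers, history = kmeans_with_history(points, [(0.0, 0.0), (1.0, 0.0)])

for i, value in enumerate(history, start=1):
    print(f"iteration {i}: SSE = {value:.2f}")
```

Starting the centers close together and far from where they end up tends to lengthen the run, which may help with the arrangement asked for in (iii).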
What are the main drawbacks of k-means?
Which other clustering methods do you know that do not have these problems?