I've run into a situation where I don't understand how to choose the number of clusters. The within-cluster sum of squares (WithinSS) increases suddenly after 6. How should I interpret this graph? Background: I applied PCA to the data set and kept 4 PCs. The clustering is done on the PCs. I'm using the k-means clustering algorithm in R.
1 Answer
"The within-cluster sum of squares is a measure of the variability of the observations within each cluster. In general, a cluster that has a small sum of squares is more compact than a cluster that has a large sum of squares. Clusters that have higher values exhibit greater variability of the observations within the cluster."
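To connect that definition to the plot in the question, here is a minimal sketch of how the total within-cluster sum of squares can be computed over a range of k with base R's `kmeans`. The data is synthetic, a stand-in for the 4 PC scores, which the question does not show, and `pcs`, `k_values` and `wss` are illustrative names only. Since the best achievable within-cluster sum of squares cannot go up as k grows, a sudden increase like the one after k = 6 usually suggests that a single random start ended in a poor local optimum. Setting `nstart` reruns the algorithm from several random starts and keeps the best result.

```r
set.seed(42)

# Synthetic stand-in for the 4 principal-component scores (assumption:
# the real PC scores from the question are not available here).
centers <- matrix(c(0, 0, 0, 0,
                    5, 5, 0, 0,
                    0, 5, 5, 0), ncol = 4, byrow = TRUE)
pcs <- do.call(rbind, lapply(1:3, function(i) {
  matrix(rnorm(50 * 4, sd = 0.7), ncol = 4) +
    matrix(centers[i, ], nrow = 50, ncol = 4, byrow = TRUE)
}))

k_values <- 1:10

# Total within-cluster sum of squares for each k.
# nstart = 25 reruns k-means from 25 random starts and keeps the best fit,
# which makes a spurious increase in WSS much less likely.
wss <- sapply(k_values, function(k) {
  kmeans(pcs, centers = k, nstart = 25, iter.max = 50)$tot.withinss
})

# Elbow plot: look for the k after which the curve flattens out.
plot(k_values, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

If the curve still jumps upward with a larger `nstart`, it may be worth checking that every k was fitted on exactly the same data and scaling.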
Source here
There are many methods you can use to identify the right number of clusters; here is a great topic on that.
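As one example of such a method, the sketch below uses the average silhouette width, which is not necessarily the approach the linked topic recommends. It assumes the `cluster` package is installed (it ships as a recommended package with standard R distributions) and again uses synthetic data in place of the real PC scores; `pcs`, `avg_sil` and `best_k` are illustrative names.

```r
library(cluster)  # provides silhouette(); assumed to be installed

set.seed(42)

# Synthetic stand-in for the 4 principal-component scores (assumption).
centers <- matrix(c(0, 0, 0, 0,
                    5, 5, 0, 0,
                    0, 5, 5, 0), ncol = 4, byrow = TRUE)
pcs <- do.call(rbind, lapply(1:3, function(i) {
  matrix(rnorm(50 * 4, sd = 0.7), ncol = 4) +
    matrix(centers[i, ], nrow = 50, ncol = 4, byrow = TRUE)
}))

k_values <- 2:10          # the silhouette is undefined for k = 1
d <- dist(pcs)            # pairwise Euclidean distances, computed once

# Average silhouette width for each k: values near 1 indicate compact,
# well-separated clusters; values near 0 indicate overlapping clusters.
avg_sil <- sapply(k_values, function(k) {
  fit <- kmeans(pcs, centers = k, nstart = 25, iter.max = 50)
  sil <- silhouette(fit$cluster, d)
  mean(sil[, "sil_width"])
})

# Candidate k: the one with the largest average silhouette width.
best_k <- k_values[which.max(avg_sil)]

plot(k_values, avg_sil, type = "b",
     xlab = "Number of clusters k",
     ylab = "Average silhouette width")
```

Unlike the within-cluster sum of squares, the silhouette width does not automatically improve as k grows, so its maximum can be read off directly instead of looking for an elbow.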

kjetil b halvorsen

Terru_theTerror