I have a data set with several columns, which I reduced to two components using PCA. I then applied the k-means algorithm to the reduced data.
Now, how can I verify that my data clustered well into each group? Or how do I determine the misclassification rate?
For instance, in R, if I compare the cluster vector (say k$cluster) against the labels the data had before clustering, can I simply build a confusion matrix from the two and assume that 1 in the cluster vector is equivalent to 1 in my labels?
col3 col2 Col1 labels
123 2.32 2.50 0
124 2.81 3.10 1
125 2.72 3.09 2
126 2.92 3.03 3
127 2.32 2.95 4
Please note that this is hypothetical data; my actual data set is much larger than this.
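To make the question concrete, here is a minimal sketch of the workflow I am describing. It is only an illustration under assumptions: the built-in iris data set stands in for my real data, the number of clusters (3) and nstart = 25 are arbitrary choices, and the names X, labels, scores and tab are made up for this example. The last step is the comparison I am unsure about, since I do not know whether the cluster ids line up with the label values.

```r
# Hypothetical stand-in data: iris replaces the real data set.
set.seed(42)                       # k-means starts from random centers
X      <- iris[, 1:4]              # numeric feature columns
labels <- iris$Species             # known labels, not used for clustering

# Reduce the features to the first two principal components.
pca    <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]

# Cluster the two-component scores (3 clusters assumed here).
k <- kmeans(scores, centers = 3, nstart = 25)

# Cross-tabulate the known labels against the cluster ids.
tab <- table(label = labels, cluster = k$cluster)
print(tab)
```

Each row of tab is a known label and each column is a cluster id from k$cluster.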