I have a dataset of socio-demographic features of a population, expressed as percentages of each municipality's total population (e.g. 12% freelancers, 5% unemployed, etc.). Each observation is a municipality of the city. My goal is to classify each municipality politically as (roughly) left or right. I compared K-Means with hierarchical clustering using Ward's method, and found that the latter performs better: it misclassifies only 2% of the points, while K-Means misclassifies 6%.
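For concreteness, a comparison of this kind can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic (the Dirichlet parameters and group sizes are made up, not taken from the real dataset), scikit-learn is assumed as the library, and `misclassification_rate` is a hypothetical helper that scores a two-cluster solution against known left/right labels while ignoring the arbitrary numbering of the clusters.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: each row is a municipality, each
# column a population share (rows sum to 1). The Dirichlet parameters and
# group sizes are invented purely for illustration.
left = rng.dirichlet([12, 5, 30, 53], size=60)
right = rng.dirichlet([20, 3, 40, 37], size=40)
X = np.vstack([left, right])
y_true = np.array([0] * 60 + [1] * 40)  # 0 = left, 1 = right


def misclassification_rate(y_true, y_pred):
    """Two-cluster error rate, ignoring which cluster is called 0 or 1."""
    err = np.mean(y_true != y_pred)
    return float(min(err, 1.0 - err))


# Both methods are asked for exactly two clusters.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
ward_labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(X)

kmeans_err = misclassification_rate(y_true, kmeans_labels)
ward_err = misclassification_rate(y_true, ward_labels)
print(f"K-Means error: {kmeans_err:.1%}, Ward error: {ward_err:.1%}")
```

The error rates printed here depend entirely on the synthetic data and say nothing about the 2% and 6% figures above; the point is only to show how the two clusterings are scored against the same reference labels.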
My question is: from a theoretical point of view, how should I interpret this result? Why would one method perform better than the other in such a situation?