
I am a beginner. I have a dataset of 1700 samples with 4 features, and I have to perform hierarchical clustering (the agglomerative version). I need to decide whether to scale the data and perform PCA. If I scale the data (I tried almost all of the "scalers") and then perform PCA, the clustering seems to be incorrect, whereas if I perform PCA without scaling the data, it seems correct. Why is this? Or am I probably doing something wrong?

Sandra
  • It's better if you share how you assessed the "correctness" of your clustering, because the task is unsupervised. – gunes May 24 '19 at 10:09
  • Can you please give more information about your data? The two plots shown appear qualitatively very similar: there are three clusters and a handful of outliers. As the axes shown are largely on the same level too, I suspect that the original data might not be on vastly different scales either... – usεr11852 May 24 '19 at 11:48
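
One way to check the point raised in the comments, whether the features are on very different scales, is to compare each feature's share of the total variance before and after standardization. PCA on unscaled data works from the covariance matrix, so a feature with a much larger variance tends to dominate the leading components. The sketch below is only an illustration under stated assumptions: the data are synthetic (not the dataset from the question), the scales in `hypothetical_scales` are made up, and the helper names `variance_shares` and `standardize` are hypothetical.

```python
import random
import statistics


def variance_shares(columns):
    """Return each feature's share of the total variance (shares sum to 1)."""
    variances = [statistics.pvariance(col) for col in columns]
    total = sum(variances)
    return [v / total for v in variances]


def standardize(columns):
    """Z-score each feature: subtract its mean, divide by its standard deviation."""
    scaled = []
    for col in columns:
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)
        scaled.append([(x - mean) / std for x in col])
    return scaled


# Synthetic stand-in data (NOT the asker's dataset): 1700 samples, 4 features,
# with the last feature deliberately on a much larger scale than the others.
rng = random.Random(0)
hypothetical_scales = [1.0, 2.0, 0.5, 100.0]
columns = [[rng.gauss(0.0, s) for _ in range(1700)] for s in hypothetical_scales]

raw_shares = variance_shares(columns)
scaled_shares = variance_shares(standardize(columns))

print("raw:   ", [round(s, 3) for s in raw_shares])
print("scaled:", [round(s, 3) for s in scaled_shares])
```

If the raw shares on the real data are already roughly equal, scaling should change the PCA projection very little, which would be consistent with the two plots looking similar. If one feature dominates, the scaled and unscaled projections can differ substantially, and which one is "correct" depends on whether that feature's units are meaningful for the clustering.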

0 Answers