
I want to apply dimensionality reduction to data whose features are on different scales (for example, height in meters and weight in kg). I usually use PCoA with Euclidean or Bray-Curtis distance, but I'm afraid it will give more weight to features with larger scales. Is there a good distance metric for this kind of data? Should I z-score each feature first? Or should I use PCA (which is based on correlation rather than distance) or some other algorithm instead?

Thanks!

  • Before you use PCA, you should scale your data first. https://stats.stackexchange.com/questions/69157/why-do-we-need-to-normalize-data-before-principal-component-analysis-pca – thomaskolasa May 12 '20 at 21:53
  • Thanks. It seems that I can run PCA on the correlation matrix instead of the covariance matrix, which is equivalent to scaling each feature first. – Jes biol May 13 '20 at 05:48
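
To illustrate the point made in the comments, here is a minimal sketch using NumPy only. The height and weight values are synthetic and made up purely for illustration, and the helper `leading_axis` is a hypothetical name, not part of any library. It checks that the covariance matrix of z-scored features equals the correlation matrix of the raw features, and compares the first principal axis with and without scaling.

```python
import numpy as np

# Synthetic data (an assumption for illustration): height in meters, weight in kg.
rng = np.random.default_rng(0)
n = 200
height_m = rng.normal(1.7, 0.1, n)
weight_kg = 70 + 40 * (height_m - 1.7) + rng.normal(0, 8, n)
X = np.column_stack([height_m, weight_kg])

# Z-score each feature: subtract the mean, divide by the sample standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

cov_raw = np.cov(X, rowvar=False)    # covariance of the raw features
cov_z = np.cov(Z, rowvar=False)      # covariance of the z-scored features
corr = np.corrcoef(X, rowvar=False)  # correlation of the raw features

# Covariance of z-scored data should match the correlation matrix of the raw data.
same = np.allclose(cov_z, corr)


def leading_axis(matrix):
    """Absolute loadings of the eigenvector with the largest eigenvalue."""
    _, vecs = np.linalg.eigh(matrix)  # eigh returns eigenvalues in ascending order
    return np.abs(vecs[:, -1])


pc1_raw = leading_axis(cov_raw)   # first axis of covariance-based PCA
pc1_corr = leading_axis(corr)     # first axis of correlation-based PCA

print("cov(Z) equals corr(X):", same)
print("PC1 loadings, covariance PCA:", pc1_raw)
print("PC1 loadings, correlation PCA:", pc1_corr)
```

In this toy setup the covariance-based first axis is dominated by weight simply because its numeric variance is much larger, while the correlation-based axis loads on both features. The same z-scoring step can be applied before computing Euclidean distances for PCoA.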
