My goal is to carry out a hierarchical cluster analysis using the principal components that explain most of the variance.

None of my variables is normally distributed, so I think I should transform them. (I am following a paper that worked with similar data, in which the authors applied square-root and log transformations to the variables with non-normal distributions.)

However, I am unsure whether it is OK to transform the variables and then scale them before the PCA. Could you confirm whether this is correct?

Also, to choose the best transformation, should I try both and then re-check the normality of the transformed variables, or is there a better method?

Thank you in advance.

  • PCA has little to do with Normality and certainly doesn't require or assume it. For an example of how nonlinear transformations might enhance a subsequent PCA, see https://stats.stackexchange.com/a/259223/919. – whuber Feb 18 '21 at 19:49
  • Thank you for answering, but my question is not whether it is necessary but rather whether it is correct to both normalize (log/square-root transformation) and scale my variables. – Catarina Toscano Feb 19 '21 at 11:18
  • "Correct" doesn't seem like an appropriate property; but if so, we would welcome your clarification of what you might mean by this. – whuber Feb 19 '21 at 14:43

1 Answer

I found in another post that it is OK, and even advisable for certain types of data: "Why log-transforming the data before performing principal component analysis?"
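For anyone who wants to see the order of operations, here is a minimal sketch of transform, then scale, then PCA, then hierarchical clustering on the retained components. Everything specific in it is my own assumption for illustration and not taken from the paper I mentioned: the simulated log-normal data, the choice of a log transform for every variable, the 80% variance threshold, Ward linkage, and the cut into two clusters. It uses NumPy and SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical data: 60 observations of 3 positive, right-skewed variables
# on very different scales, drawn from two groups.
group_a = rng.lognormal(mean=[0.0, 2.0, 5.0], sigma=0.4, size=(30, 3))
group_b = rng.lognormal(mean=[1.5, 3.5, 6.5], sigma=0.4, size=(30, 3))
X = np.vstack([group_a, group_b])

# 1. Transform first: the log compresses the long right tails.
#    (Assumes strictly positive values; a square root is an alternative.)
X_log = np.log(X)

# 2. Then scale: centre each column and divide by its standard deviation,
#    so that no variable dominates the PCA because of its units.
X_std = (X_log - X_log.mean(axis=0)) / X_log.std(axis=0, ddof=1)

# 3. PCA via the SVD of the standardized data.
U, s, Vt = np.linalg.svd(X_std, full_matrices=False)
explained = s**2 / np.sum(s**2)   # proportion of variance per component
scores = U * s                    # principal component scores

# Keep the smallest number of components reaching 80% of the variance
# (an arbitrary threshold chosen for this sketch).
k = int(np.searchsorted(np.cumsum(explained), 0.80) + 1)

# 4. Hierarchical clustering (Ward linkage) on the retained scores,
#    cut into at most two clusters.
Z = linkage(scores[:, :k], method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")

print("explained variance ratio:", np.round(explained, 3))
print("components kept:", k)
print("cluster sizes:", np.bincount(labels)[1:])
```

Scaling after the transformation matters here because the transformation changes each variable's spread; scaling first and transforming afterwards would leave the variables with unequal variances again.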