
This is a follow-up question from the post: PCA on correlation or covariance?

The accepted answer quotes:

You tend to use the covariance matrix when the variable scales are similar and the correlation matrix when variables are on different scales.
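For reference, a sketch of how the two matrices relate, using notation not in the original post: $S$ is the sample covariance matrix of the $n$ variables and $R$ is their correlation matrix.

$$R = D^{-1/2} \, S \, D^{-1/2}, \qquad D = \operatorname{diag}(S_{11}, \ldots, S_{nn})$$

So PCA on $R$ is the same as PCA on the covariance matrix of the variables after each has been rescaled to unit variance. The two approaches differ only in whether the variances $S_{ii}$ are allowed to influence the components.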

I have a data set consisting of the coefficients of a linear model, say $c_1, c_2, \ldots, c_n$. They were obtained by regression and are dimensionless. I computed the correlations among the coefficients and found that the largest is 0.88 and the smallest is -0.79.

I also made a boxplot, which showed that the variances differ across coefficients: some are noticeably larger than others.

Which would be more appropriate here: PCA on the covariance matrix or PCA on the correlation matrix?
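To make the choice concrete, here is a minimal sketch that runs both variants on synthetic data. The table `X`, the `scales` vector, and the helper `pca_eig` are hypothetical stand-ins chosen for illustration; they are not the coefficients from this question.

```python
import numpy as np

def pca_eig(matrix):
    """Return eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(matrix)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Hypothetical stand-in for the coefficient table: rows are observations,
# columns are coefficients c_1..c_4 (synthetic, NOT the data from the question).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))          # shared factor that induces correlation
scales = np.array([0.2, 0.5, 1.0, 4.0])     # assumed spread of each coefficient
X = (latent + rng.normal(size=(500, 4))) * scales

cov = np.cov(X, rowvar=False)
corr = np.corrcoef(X, rowvar=False)

cov_vals, cov_vecs = pca_eig(cov)
corr_vals, corr_vecs = pca_eig(corr)

# PCA on the correlation matrix equals PCA on the covariance of z-scored columns.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
z_cov = np.cov(Z, rowvar=False)

print("share of variance, covariance PCA: ", cov_vals / cov_vals.sum())
print("share of variance, correlation PCA:", corr_vals / corr_vals.sum())
print("first loading vector, covariance PCA: ", cov_vecs[:, 0])
print("first loading vector, correlation PCA:", corr_vecs[:, 0])
```

In this synthetic setup the first covariance-based component is dominated by the column with the largest variance, while the correlation-based component weights the columns more evenly. Whether that domination is desirable depends on whether the differences in variance are considered meaningful for the analysis.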

cgo
  • Are your scales similar? If yes, covariance and correlation will be equivalent. If not, correlation will be equivalent to scaled covariance. Always use correlation to be on the safe side. – Digio Sep 06 '18 at 12:21
  • Most of them are $|c| \leq 1$. But there are some values like 3 or 4. But nothing like 200. So it is hard to judge because these are dimensionless constants. – cgo Sep 06 '18 at 12:47
