
Conceptually, aren't the eigenvalues of a correlation matrix and the singular values of the associated scaled data matrix supposed to be the same? The illustration below suggests otherwise. What am I missing?

> M
     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    2    7   12
[3,]    3    8   21
[4,]    4    9   14
[5,]    5   10   34
> M.scale = scale(M)
> M.cor.eigen = eigen(cor(M))
> M.prcomp = prcomp(M.scale)
> M.svd = svd(M.scale)
> M.cor.eigen$values
[1] 2.729542e+00 2.704577e-01 1.198779e-16
> M.prcomp$sdev ^ 2
[1] 2.729542e+00 2.704577e-01 5.960165e-34
> M.svd$d
[1] 3.304265e+00 1.040111e+00 1.953076e-16
amoeba
Singular values of `M` are the square roots of the eigenvalues of `M'M`. The correlation matrix for the columns of `M` [is](http://stats.stackexchange.com/a/22520/3277) `Z'Z/(n-1)`, where `Z` is `M` after standardizing its columns. Therefore, to see the equivalence you have to square the singular values in your last output and divide them by `5-1`. This is what Marc is saying in his answer. – ttnphns Nov 06 '14 at 09:53

1 Answer


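I believe this is because, for both cor() and prcomp(), "variances are computed with the usual divisor N-1" (see also the comments on the question), whereas svd() applied to the scaled data uses no divisor at all. If you run svd() on the correlation matrix itself (which is symmetric and positive semi-definite, so its singular values and eigenvalues coincide), the results are the same: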

svd(cor(M))$d
#[1] 2.729542e+00 2.704577e-01 1.317513e-16

Or, as someone commented earlier, dividing the squared singular values by (N-1) will also work:

M.svd$d^2 / (nrow(M)-1)
#[1] 2.729542e+00 2.704577e-01 9.536263e-33
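
The reason the divisor appears can be written out in a couple of lines. This is only a sketch of the algebra; the notation is assumed here rather than taken from the post: Z is the scaled matrix (M.scale), n its number of rows, Z = U D V^T its SVD with singular values d_i, and λ_i the eigenvalues of the correlation matrix R.

```latex
\[
R \;=\; \frac{Z^\top Z}{n-1}
  \;=\; \frac{(U D V^\top)^\top (U D V^\top)}{n-1}
  \;=\; V \,\frac{D^2}{n-1}\, V^\top
\qquad\Longrightarrow\qquad
\lambda_i \;=\; \frac{d_i^2}{n-1}.
\]
```

With the numbers from the question, 3.304265^2 / 4 ≈ 2.729542 and 1.040111^2 / 4 ≈ 0.270458, which are the first two eigenvalues reported by eigen(cor(M)).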

A more direct comparison to SVD would be to use princomp(), which "uses divisor N for the covariance matrix":

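set.seed(1)  # added here for reproducibility; not in the original snippet
M <- matrix(rnorm(200, mean=3, sd=5), 20, 10)
M.scale <- scale(M)
# princomp() divides by N, so multiplying its variances by N undoes the divisor
x1 <- princomp(M.scale)$sdev^2 * nrow(M)
# squared singular values of the scaled data matrix
x2 <- svd(M.scale)$d^2
plot(x1, x2)
abline(0,1)  # identity line for comparison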

[Figure: scatter plot of x1 against x2 with the line y = x]
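
Returning to the 5 x 3 matrix from the question, the three routes can be compared side by side. The following is a minimal sketch, assuming M is rebuilt from the values printed in the question; the names Z, n, ev, sv2 and pc are introduced here for illustration only.

```r
# Rebuild the matrix printed in the question (assumed from the console output)
M <- cbind(1:5, 6:10, c(11, 12, 21, 14, 34))

Z <- scale(M)   # centre each column and scale it to unit variance (divisor N-1)
n <- nrow(M)

ev  <- eigen(cor(M))$values   # eigenvalues of the correlation matrix
sv2 <- svd(Z)$d^2 / (n - 1)   # squared singular values, rescaled by N-1
pc  <- prcomp(Z)$sdev^2       # prcomp() already uses the divisor N-1

# Compare the three vectors up to floating-point error
print(all.equal(ev, sv2))
print(all.equal(ev, pc))
```

The third value in each vector is numerically zero because the second column of M equals the first plus 5, so the scaled matrix has rank 2; the tiny differences between 1e-16 and 1e-33 seen in the question are rounding noise.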

Marc in the box