The effect of a non-positive-definite covariance matrix (in the $p>n$ case) on PCA

Question

Gene data typically have many more dimensions than samples. This leads to a covariance matrix that is not positive definite. In R, when I try to use princomp, which performs an eigendecomposition of the covariance matrix, it complains that the sample size should be larger than the number of dimensions, whereas prcomp works fine since it performs an SVD. That much I understand.

There has been a lot of research on estimating the covariance and inverse covariance matrices for $p\gg n$ problems. I am trying to figure out exactly what effect the eigendecomposition of a non-positive-definite matrix has on the resulting "principal components", and how that can affect:

Clustering
Classification

I think that [this answer](http://stats.stackexchange.com/a/147983/28500) makes it clear that you get the same principal components however you approach it (except perhaps for differences relating to numerical approximations). If that answer isn't sufficient, please edit your question to specify your yet-unanswered question. — EdM, Jun 15 '16 at 18:43
@EdM Thank you for referring me to the other answer, which was really helpful. Please correct me if I am wrong: what I have understood is that it is just a computational limitation that keeps us from doing the eigendecomposition when p >> n. The only thing I am still confused about is that if we do an eigendecomposition of a non-positive-definite matrix, we will get negative eigenvalues. What effect will negative eigenvalues have on the principal components? Or does it not make any difference? — Mustafa Arif, Jun 15 '16 at 20:00
In the p > n case, the covariance matrix is positive semi-definite. There will be eigenvalues with 0 values, but none with negative values. (Consider the symmetry in the construction of the covariance matrix.) The positive eigenvalues will be the same whether the analysis is thought of as on the covariance matrix or on the Gram matrix, as noted in the answer to which I provided the link. — EdM, Jun 15 '16 at 21:27
Plus to EdM's link, two more similar questions: http://stats.stackexchange.com/q/219064/3277; http://stats.stackexchange.com/q/46475/3277. — ttnphns, Jun 15 '16 at 21:43
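
As a numerical illustration of EdM's comment, here is a minimal, dependency-free Python sketch. It rests on the identity $v^\top S v = \lVert X_c v\rVert^2/(n-1) \ge 0$ for the sample covariance $S$ of centered data $X_c$, which is why no eigenvalue can be negative. The toy data (3 samples, 5 features) and all helper names (`center`, `matmul`, `top_eigenvalue`, and so on) are made up for illustration and are not from the original post. Power iteration stands in for a full eigensolver only to avoid external libraries, so it recovers just the largest eigenvalue.

```python
# Toy data: n = 3 samples (rows), p = 5 features (columns), so p > n.
# The numbers are invented purely for illustration.
X = [
    [6.0, 4.0, 3.0, 3.0, 1.0],
    [6.0, 4.0, 1.0, 5.0, 1.0],
    [3.0, 1.0, 2.0, 4.0, 1.0],
]

def center(rows):
    """Subtract each column mean so every feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def matmul(a, b):
    cols = transpose(b)
    return [[dot(row, col) for col in cols] for row in a]

def top_eigenvalue(m, iters=200):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    v = [1.0 + 0.1 * i for i in range(len(m))]  # arbitrary start vector
    for _ in range(iters):
        w = matvec(m, v)
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return dot(v, matvec(m, v))  # Rayleigh quotient of the unit vector v

n = len(X)
Xc = center(X)
scale = 1.0 / (n - 1)

# p x p sample covariance matrix and n x n (scaled) Gram matrix.
cov = [[scale * x for x in row] for row in matmul(transpose(Xc), Xc)]
gram = [[scale * x for x in row] for row in matmul(Xc, transpose(Xc))]

# The traces (sums of all eigenvalues) agree, and so do the top eigenvalues.
trace_cov = sum(cov[i][i] for i in range(len(cov)))
trace_gram = sum(gram[i][i] for i in range(len(gram)))
top_cov = top_eigenvalue(cov)
top_gram = top_eigenvalue(gram)

# Quadratic forms v' S v for a few probe vectors: zero along null
# directions, positive otherwise, never negative.
probes = [
    [1.0, -1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [1.0, 2.0, -3.0, 0.5, 4.0],
]
quad_forms = [dot(v, matvec(cov, v)) for v in probes]

print("trace:", trace_cov, trace_gram)
print("top eigenvalue:", top_cov, top_gram)
print("quadratic forms:", quad_forms)
```

For this particular toy data the nonzero eigenvalues work out by hand to 6 and 2, so both routes should report a largest eigenvalue of about 6, and the first two probe vectors lie in the null space of the covariance matrix. In practice one would use a library eigensolver or SVD rather than power iteration.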
