I'm working on an implementation of PCA that works on very large data sets.
Based on my understanding of the algorithm, the first step is to do an SVD of the input $m \times n$ matrix $X$. This SVD looks like $X = W\Sigma V^\top$. The "interesting" output $Y$ of this process -- from Wikipedia, "The PCA transformation that preserves dimensionality (that is, gives the same number of principal components as original variables)" -- is given by the following equation:

$$Y^\top = X^\top W = V\Sigma^\top W^\top W = V\Sigma^\top$$
Based on my reading, if I can compute the $W$ component of the SVD, then I can compute $Y$ as:

$$Y = (X^\top W)^\top = W^\top X$$
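To convince myself the algebra holds, here's a quick NumPy check on a toy matrix (the names `Y_wiki` and `Y_mine` are just mine for the two formulas):

```python
import numpy as np

# Toy example: m variables (rows), n samples (columns), with m < n
m, n = 5, 8
rng = np.random.default_rng(0)
X = rng.standard_normal((m, n))

# Full SVD: X = W @ S @ Vt, with W (m x m), S (m x n), Vt (n x n)
W, s, Vt = np.linalg.svd(X, full_matrices=True)
S = np.zeros((m, n))
S[:len(s), :len(s)] = np.diag(s)

Y_wiki = (X.T @ W).T   # transpose of Wikipedia's Y^T = X^T W
Y_mine = W.T @ X       # the shortcut I'm proposing

print(np.allclose(Y_wiki, Y_mine))   # True: same matrix
print(np.allclose(Y_mine, S @ Vt))   # True: Y = Sigma V^T
```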
The upshot here is that I'd only have to compute $W$. In terms of computational and memory complexity, this approach is significantly more efficient: the only matrix I'd have to hold beyond the initial data set is $W$, which is $m \times m$ and (at least in my case) much smaller than $V$, which would be $n \times n$.
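For concreteness, this is roughly the implementation I have in mind -- just a sketch on toy data, with a hypothetical helper name and no mean-centering, not the actual large-scale code:

```python
import numpy as np

def pca_scores_via_gram(X):
    """Return Y = W^T X, computing W from the m x m matrix X X^T.

    Illustrative sketch only (hypothetical name, no mean-centering,
    assumes m << n and that X fits in memory).
    """
    # X @ X.T is only m x m; its eigenvectors are the left singular
    # vectors W of X, so the n x n matrix V is never formed.
    gram = X @ X.T
    eigvals, W = np.linalg.eigh(gram)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]     # reorder to descending
    W = W[:, order]
    return W.T @ X

# Sanity check against a direct (reduced) SVD on a small example
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 200))
W_svd, _, _ = np.linalg.svd(X, full_matrices=False)
Y_svd = W_svd.T @ X
Y_gram = pca_scores_via_gram(X)

# Singular vectors are only defined up to sign, so compare magnitudes
print(np.allclose(np.abs(Y_gram), np.abs(Y_svd)))   # True
```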
Is there some reason why this derivation won't work that I'm not seeing?