PCA performs a linear transformation on a data set to obtain a new data set, this time with eigenvector basis and eigenvalue loadings. If Z is our data set, P our linear transformation, and Y our new data set with eigenvector basis, the transformation is:

P*Z = Y .

But quite often Z is not our original data set. If the variables have different units, we must first apply some sort of standardization/normalization transformation so that they are comparable. If K is that transformation matrix, then we obtain the new, standardized data set with:

K*X = Z
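As a rough illustration of what K could look like, here is a minimal sketch assuming that variables are stored as the rows of X, that each variable has already been centered, and that "standardization" means dividing by the sample standard deviation. Under those assumptions K is simply a diagonal matrix. The data, sizes, and scale factors are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 variables (rows) x 200 trials (columns),
# deliberately on very different scales.
X = rng.normal(size=(3, 200)) * np.array([[1.0], [50.0], [0.01]])

# Centering is a shift, not a matrix product, so it is done separately
# and is not part of K.
X = X - X.mean(axis=1, keepdims=True)

# K rescales each variable (row) to unit sample standard deviation.
K = np.diag(1.0 / X.std(axis=1, ddof=1))
Z = K @ X

# Per-variable standard deviations of the standardized data.
print(Z.std(axis=1, ddof=1))
```

A change of units on a single variable fits the same pattern: K would be the identity with one diagonal entry replaced by the conversion factor.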

My question is: what are the loadings in terms of our original data set X? In more practical terms, if b1 is the first measurement type of Z, PC1 might look like:

b1 5.0
b2 2.7
b3 3.0
...
bn 10.0

Where the values of b1->bn are given by P. But what is PC1 in terms of the original data set X, with measurements a1->an?


a1 ?
a2 ?
a3 ?
...
an ?

These should come from some new linear transformation P_new. I would assume we can use basic matrix algebra to figure this out, but I want to make sure I'm 100% correct here:

if K*X = Z,

and P_old*Z = Y, we're searching for:

P_new*X = Y

So we should be able to solve

P_new*X = Y

for P_new, i.e. P_new = Y*X^-1 ?

Is this correct? What if X is not square (and it most certainly is not, there are almost always more trials than variables in any good data set)? Will computational solutions make sense?

I've searched through 23/34 pages on PCA, and was not able to find an answer to this question. If it's out there, my apologies, I tried.

tkg
  • You might want first to read this thread http://stats.stackexchange.com/q/62677/3277. – ttnphns Oct 09 '13 at 16:36
  • I've read the thread. I'm not asking if performing some pre-PCA data-massaging step is theoretically correct, I'm asking how a preprocessing step/transformation affects the final eigenvalue. – tkg Oct 09 '13 at 17:35
  • In a linear way, which, however, is unpredictable beforehand. – ttnphns Oct 09 '13 at 17:54
  • I'm asking after the fact. Was my question unclear? Here is a simple example. Say I change the units on one of the variables in my data set; this could be seen as a transformation from data set X to data set Z. Then I perform PCA on data set Z, giving eigenvalues and eigenvectors in terms of data set Z. What are these eigenvalues and eigenvectors in terms of data set X? – tkg Oct 09 '13 at 18:41
  • If you read that thread attentively you would find that the eigen-set (the -values and the -vectors) will correspond to Z and in no way to X. There is no relation between X and the eigen-set of Z apart from via Z. – ttnphns Oct 09 '13 at 20:55
  • Hmm, I don't think that's correct. If K is the linear transformation taking X to Z, and P is the linear transformation taking Z to Y (PCA finds P), then in linear algebra terms we write K*X = Z (K could be normalization or standardization) and P*Z = Y. My question was: what is M, the transformation that takes X to Y, written M*X = Y? M is found by simple linear algebra! Someone on another forum helped me. Y = P*Z = P*(K*X) = (P*K)*X (matrix multiplication is associative). Hence Y = M*X = (P*K)*X, which implies M*X*X^-1 = P*K*X*X^-1, so M = P*K! – tkg Oct 10 '13 at 17:35
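The relation M = P*K from the comment above can be checked numerically. The following is only a sketch under stated assumptions: variables are the rows of X, X is centered, K is a diagonal standardization matrix, and P holds the eigenvectors of the covariance of Z as its rows. All data and sizes are invented. It also looks at the non-square case from the question: X has no ordinary inverse, but when X has full row rank, Y times the Moore-Penrose pseudo-inverse of X is expected to give back the same M.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 4 variables (rows) x 500 trials (columns).
X = rng.normal(size=(4, 500)) * np.array([[1.0], [20.0], [0.1], [5.0]])
X = X - X.mean(axis=1, keepdims=True)  # center each variable

# Standardization written as a matrix product K @ X = Z.
K = np.diag(1.0 / X.std(axis=1, ddof=1))
Z = K @ X

# PCA on Z: eigen-decompose the covariance matrix of Z.
C = (Z @ Z.T) / (Z.shape[1] - 1)
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]      # reorder so PC1 comes first
P = eigvecs[:, order].T                # rows of P are the principal axes

Y = P @ Z      # scores computed from the standardized data
M = P @ K      # candidate transformation acting directly on X

# Associativity: (P K) X equals P (K X).
print(np.allclose(M @ X, Y))

# X is 4 x 500, so X^-1 does not exist; the pseudo-inverse stands in for it.
M_from_pinv = Y @ np.linalg.pinv(X)
print(np.allclose(M_from_pinv, M))
```

Note that the rows of M are generally not orthonormal, since K rescales each column of P differently, so they express the components of Z as weights on the original variables but are not themselves principal axes of X.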

0 Answers