
In many receptor-modeling studies, after running the PCA, the authors often "rescale" their varimax-rotated PC scores (which are standardized, with mean zero and standard deviation 1) into so-called absolute principal component scores (APCS) before performing the multiple linear regression (MLR), so that the source contribution from each factor can be estimated in terms of your independent variable.

This is done by:

  1. calculate the z-score for absolute zero concentrations (i.e. take a vector of all zeroes, subtract the sample mean and divide by the sample standard deviation);
  2. calculate the rotated PC scores for each component for this z-scored absolute zero from step 1;
  3. subtract the "zero" PC score (from 2) from the true scores.
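
In symbols, the three steps can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from any particular paper: $x_{ij}$ is the value of variable $j$ in sample $i$, $\bar{x}_j$ and $s_j$ are the sample mean and standard deviation of variable $j$, and $W$ is the matrix of score coefficients that maps standardized data to rotated PC scores.

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad (z_0)_j = \frac{0 - \bar{x}_j}{s_j} = -\frac{\bar{x}_j}{s_j}$$

$$\mathrm{score}_{ik} = \sum_j z_{ij} W_{jk}, \qquad (\mathrm{score}_0)_k = \sum_j (z_0)_j W_{jk}$$

$$\mathrm{APCS}_{ik} = \mathrm{score}_{ik} - (\mathrm{score}_0)_k = \sum_j \frac{x_{ij}}{s_j} W_{jk}$$

The last equality follows because $z_{ij} - (z_0)_j = x_{ij}/s_j$. Under this notation, the APCS are what you get by applying the same score coefficients to data that have been divided by their standard deviations but not centered.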

I tried to follow the procedure but still got negative values... Can someone take a look and see where I went wrong? Here is the code, using the iris data set (most of the code originated from a previous discussion):

irisX <- iris[, 1:4]
ncomp <- 2
pca_iris        <- prcomp(irisX, center = TRUE, scale. = TRUE)
rawLoadings     <- pca_iris$rotation[, 1:ncomp] %*% diag(pca_iris$sdev, ncomp, ncomp)
rotatedLoadings <- varimax(rawLoadings)$loadings
invLoadings     <- t(pracma::pinv(rotatedLoadings))  # transposed pseudo-inverse of the rotated loadings
scores          <- scale(irisX) %*% invLoadings      # my scores from rotated loadings, which are standardized

# want to use APCS to do MLR instead of these scores
# step 1: create an artificial sample with zero concentrations for all variables and z-score it
z0i             <- matrix(-colMeans(irisX) / apply(irisX, 2, sd), nrow = 1)
# step 2: find its rotated PC score by multiplying by the same transposed pseudo-inverse of the rotated loadings
scores0         <- as.numeric(z0i %*% invLoadings)   # my absolute zero PC scores (supposedly...)
# step 3: now calculate my new "APCS" by subtracting the zero scores from every row
scores0         <- matrix(rep(scores0, each = nrow(scores)), nrow = nrow(scores))
ACPS            <- scores - scores0

This results in

> head(ACPS)
         [,1]       [,2]
[1,] 4.274291  -9.339044
[2,] 3.980231  -8.167430
[3,] 3.937934  -8.548838
[4,] 3.886160  -8.284854
[5,] 4.262470  -9.527271
[6,] 4.724111 -10.296421

and one can see that there are still negative values in the ACPS data. Why?

amoeba
sor
  • **You are doing everything correctly.** The sign of PCA components is arbitrary, see https://stats.stackexchange.com/questions/88880. The same remains true for the ACPS components. What you have in your `ACPS` matrix is that the first column is all positive but the second column is all negative. You can simply flip the sign of the second component if you want to have all positive values in your matrix. (CC to @ttnphns) – amoeba Dec 11 '17 at 11:20
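
A minimal sketch of the sign flip suggested in the comment above, using base R only. Two things here are assumptions that are not in the question's code: the transposed pseudo-inverse is computed as `L %*% solve(crossprod(L))`, which should match `t(pracma::pinv(L))` when the loadings have full column rank, and the names `flip`, `signs`, `APCS_flipped` and `loadings_flipped` are hypothetical, chosen for illustration.

```r
# Recompute the rotated loadings and scores as in the question (base R only).
irisX <- iris[, 1:4]
ncomp <- 2

pca_iris        <- prcomp(irisX, center = TRUE, scale. = TRUE)
rawLoadings     <- pca_iris$rotation[, 1:ncomp] %*% diag(pca_iris$sdev[1:ncomp], ncomp)
rotatedLoadings <- unclass(varimax(rawLoadings)$loadings)

# Assumption: for full-column-rank loadings L, t(pinv(L)) equals L (L'L)^{-1}.
invLoadings <- rotatedLoadings %*% solve(crossprod(rotatedLoadings))

# Standardized rotated scores and the scores of the z-scored "absolute zero" sample.
scores  <- scale(irisX) %*% invLoadings
z0      <- matrix(-colMeans(irisX) / apply(irisX, 2, sd), nrow = 1)
scores0 <- as.numeric(z0 %*% invLoadings)

# APCS: subtract the zero score of each component from every row.
APCS <- sweep(scores, 2, scores0, "-")

# Flip the sign of every component whose APCS values are all non-positive.
flip  <- apply(APCS, 2, function(col) all(col <= 0))
signs <- ifelse(flip, -1, 1)

APCS_flipped     <- sweep(APCS, 2, signs, "*")
# Flip the matching loadings as well, so scores and loadings stay consistent.
loadings_flipped <- sweep(rotatedLoadings, 2, signs, "*")

head(APCS_flipped)
```

Flipping the loadings together with the scores keeps the interpretation of each component unchanged; only its orientation is reversed. A column that contains both positive and negative values is left as-is by this sketch.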
