When using principal components as predictors in linear regression, PC1 comes out not significant

Question

I have a dataset with 37 independent variables and a dependent variable. In order to take care of multicollinearity among the independent variables, I conducted a PCA on them. My first principal component explains 53% of the variance in the data.

However, when I run a linear regression using the top 10 PCs (which together explain ~90% of the variance) as independent variables:

lm(dependent ~ PC1 + PC2 + ...)

the regression coefficient for PC1 turns out to be statistically insignificant, while the other 9 PCs are all highly significant. I would have expected PC1 to be highly significant.

Am I conceptually missing something?

No, you are not missing anything. There is no guarantee that the scores for your first PC are significantly associated with another variable (e.g. your dependent variable in this case). — usεr11852, Oct 07 '15 at 22:03
@amoeba, I used the first 10 PCs in my regression model as together they explain about 90% of the variance in the data. PC2:PC9 came out to be highly significant. I am confused because I was expecting PC1 to be significant. — States.the.Obvious, Oct 07 '15 at 22:06
Thanks for the clarification. Please see my answer in this thread http://stats.stackexchange.com/questions/141864/ (and many linked answers there if you want to go further). Perhaps this question can even be closed as a duplicate of that one. Let me know if that discussion does not fully address your concerns. — amoeba, Oct 07 '15 at 22:12
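
To illustrate the point made in the comments, here is a minimal simulated sketch in R. It is not the asker's data: the variables z, x1, x2, x3, the sample size, and the coefficient 2 are all hypothetical choices made for illustration. PCA ranks components only by how much variance they explain among the predictors, so the response here is deliberately built from PC2 alone, even though PC1 carries most of the predictor variance.

```r
set.seed(1)
n <- 500

# Hypothetical predictors: three noisy copies of one shared factor z,
# so they are strongly collinear and PC1 dominates the variance.
z <- rnorm(n, sd = 3)
X <- cbind(x1 = z + rnorm(n), x2 = z + rnorm(n), x3 = z + rnorm(n))

pca <- prcomp(X, center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # share of variance per PC

# Assumption for illustration: the response depends only on PC2,
# a low-variance direction, plus noise. PC1 plays no role by construction.
scores <- as.data.frame(pca$x)
scores$y <- 2 * scores$PC2 + rnorm(n)

fit <- lm(y ~ PC1 + PC2 + PC3, data = scores)
coefs <- summary(fit)$coefficients

print(round(var_explained, 3))
print(coefs)
```

In a setup like this, PC1 accounts for the bulk of the variance in the predictors, yet its true coefficient is zero, so it will usually come out insignificant, while PC2 is strongly significant. The PCA step never looks at the dependent variable, so variance explained in the predictors says nothing about predictive value.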
