I have a dataset with 37 independent variables and one dependent variable. To deal with multicollinearity among the independent variables, I ran a PCA on them. The first principal component explains 53% of the variance in the independent variables.
However, when I run a linear regression using the top 10 PCs (which together explain ~90% of the variance) as independent variables:
lm(dependent ~ PC1 + PC2 + ...)
the regression coefficient for PC1 turns out not to be statistically significant, while the other 9 PCs are all highly significant. I would have expected PC1 to be highly significant.
Am I conceptually missing something?
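For reference, here is a minimal sketch of the workflow described above. It is not my actual code or data: the predictors and the response are simulated placeholders, the names `X`, `pcs`, and `fit` are hypothetical, and I am assuming `prcomp` with centring and scaling for the PCA step.

```r
# Simulated stand-in for the real data (assumption): 37 correlated predictors
set.seed(1)
n <- 200
p <- 37
latent <- rnorm(n)                                       # shared factor that induces collinearity
X <- sapply(seq_len(p), function(j) latent + rnorm(n))   # n x p predictor matrix
colnames(X) <- paste0("x", seq_len(p))
dependent <- rnorm(n)                                    # placeholder response, not the real one

# PCA on the centred and scaled predictors
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Proportion of predictor variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cumsum(var_explained)[1:10]

# Regress the response on the scores of the first 10 components
pcs <- data.frame(dependent = dependent, pca$x[, 1:10])  # columns PC1 ... PC10
fit <- lm(dependent ~ ., data = pcs)
summary(fit)$coefficients                                # estimate, SE, t value, p-value per PC
```

Note that `var_explained` is computed from the predictors alone; the dependent variable plays no part in the PCA step.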