I'm trying to calculate change in scores on a depression questionnaire - a very simple problem. However, what I care about is not the change in raw score, but rather the change in principal component scores for each subject. My pipeline is as follows:
- Conduct a PCA using the pre-treatment scores for each subject
- Calculate pre-treatment scores for each subject for PC1 through PC4
- Use the loadings for PC1-4 calculated in the first step to calculate post-treatment scores for each subject for PC1-4
- Compute the difference between pre- and post-treatment scores for each subject
However, the pre-treatment PC scores are centered and scaled by construction, whereas the post-treatment scores are not, because they are calculated by applying the loadings from the pre-treatment data to the actual post-treatment data. Is this kosher?
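To make the pipeline concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration: the data are simulated (50 subjects, 9 items), the variable names are hypothetical, and I am assuming the items are standardized using the pre-treatment means and standard deviations before projecting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in data (assumption): 50 subjects x 9 questionnaire items.
n_subjects, n_items, n_components = 50, 9, 4
pre = rng.normal(loc=2.0, scale=1.0, size=(n_subjects, n_items))
post = pre - 0.5 + rng.normal(scale=0.5, size=(n_subjects, n_items))

# Step 1: PCA on the pre-treatment data, centered and scaled
# with pre-treatment statistics.
pre_mean = pre.mean(axis=0)
pre_sd = pre.std(axis=0, ddof=1)
pre_z = (pre - pre_mean) / pre_sd
_, _, vt = np.linalg.svd(pre_z, full_matrices=False)
loadings = vt[:n_components].T  # shape: (items, components)

# Step 2: pre-treatment scores on PC1-PC4.
pre_scores = pre_z @ loadings

# Step 3: post-treatment scores, reusing the pre-treatment
# centering, scaling, and loadings.
post_z = (post - pre_mean) / pre_sd
post_scores = post_z @ loadings

# Step 4: per-subject change on each component.
change = post_scores - pre_scores

# Pre-treatment scores have mean zero; post-treatment scores generally do not.
print("mean pre scores: ", pre_scores.mean(axis=0))
print("mean post scores:", post_scores.mean(axis=0))
```

In this sketch the pre-treatment centering cancels in the difference, so `change` equals the standardized raw change `(post - pre) / pre_sd` projected onto the pre-treatment loadings.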
A follow-up question: is there a better way to calculate change in principal component scores between time points? Could I calculate the loadings using all data (pre- and post-treatment pooled together) and then calculate pre- and post-treatment scores for PC1-4 from those? Intuitively that seems wrong.
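For comparison, the pooled alternative could be sketched as follows. Again this is only an illustration under assumptions: simulated data, hypothetical variable names, and centering and scaling done with the pooled means and standard deviations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-in data (assumption): 50 subjects x 9 questionnaire items.
n_subjects, n_items, n_components = 50, 9, 4
pre = rng.normal(loc=2.0, scale=1.0, size=(n_subjects, n_items))
post = pre - 0.5 + rng.normal(scale=0.5, size=(n_subjects, n_items))

# Stack both time points so each subject contributes two rows.
pooled = np.vstack([pre, post])
pooled_mean = pooled.mean(axis=0)
pooled_sd = pooled.std(axis=0, ddof=1)
pooled_z = (pooled - pooled_mean) / pooled_sd

# PCA on the pooled, standardized data.
_, _, vt = np.linalg.svd(pooled_z, full_matrices=False)
loadings = vt[:n_components].T  # shape: (items, components)

# Split the scores back into pre- and post-treatment rows.
pre_scores = pooled_z[:n_subjects] @ loadings
post_scores = pooled_z[n_subjects:] @ loadings

# Per-subject change on each pooled component.
change = post_scores - pre_scores
```

Here the scores are centered over the pooled rows rather than within either time point, and the two rows per subject are treated as if they were independent observations when the loadings are estimated.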
Any suggestions would be much appreciated!