
I have read a lot about how to do PCA with a train/test split (see "PCA and the train/test split").

I understand that we should fit PCA on the training set and then apply the same transformation to the test set. However, when it comes to logistic PCA, I have no idea how to do this:

Logistic PCA treats the binary data as Bernoulli with probability p and uses ALS to optimize the parameters U and V (see "Logistic PCA").

My question is: how can I apply the same transformation to the test set? If I apply the same log(p/(1-p)) transformation directly to the binary test data, the result is infinite (-inf for 0 and +inf for 1), so I cannot use V to project the test points onto the PCs.

1 Answer


When I read the paper "A Generalized Linear Model for Principal Component Analysis of Binary Data", I realized that the paper itself gives the solution. To compute the score matrix U for the test data, we apply the same transformation as in training, and that transformation is the matrix V we already fitted. So we keep V fixed and update only U for the test data. Note that this is still done by maximizing the log-likelihood.
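As a rough illustration of "fix V, update U", here is a minimal NumPy sketch. It is not the paper's ALS update: it swaps in plain gradient ascent on the Bernoulli log-likelihood, which is concave in U when V is held fixed. The function names (`fit_test_scores`, `bernoulli_loglik`), the optional bias vector `mu`, and the step-size choice are my own assumptions for the example, and V is assumed to have shape (n_features, n_components).

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def bernoulli_loglik(X, U, V, mu):
    # Natural parameters theta = mu + U V^T; log-likelihood is
    # sum(x * theta - log(1 + exp(theta))).
    theta = mu + U @ V.T
    return np.sum(X * theta - np.logaddexp(0.0, theta))


def fit_test_scores(X_test, V, mu=None, n_iter=500):
    """Estimate scores U for new binary rows while keeping V (and mu) fixed.

    Hypothetical helper: plain gradient ascent, not the paper's ALS update.
    """
    n, d = X_test.shape
    k = V.shape[1]
    mu = np.zeros(d) if mu is None else mu
    U = np.zeros((n, k))
    # The log-likelihood curvature in U is bounded by 0.25 * ||V||_2^2,
    # so this step size gives a non-decreasing log-likelihood.
    step = 4.0 / np.linalg.norm(V, 2) ** 2
    for _ in range(n_iter):
        P = sigmoid(mu + U @ V.T)      # current fitted probabilities
        U += step * (X_test - P) @ V   # gradient of log-likelihood w.r.t. U
    return U
```

Calling `fit_test_scores(X_test, V_train)` returns one row of scores per test observation, which plays the role of the projected PCs. If a test row is perfectly separable under V, the scores can keep growing with more iterations, so a fixed iteration count or a small penalty on U may be needed in practice.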