I have extracted image features from medical images using a convolutional neural network, and I am combining them with clinical features such as age and gender. There are 2048 extracted features, and I am using PCA to reduce them to 7 components. These 7 components are combined with the clinical data, and together they make up the dataset used to train a random forest classifier.
The training dataset X undergoes PCA as follows:
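import numpy as np
(n, d) = X.shape
mu = np.mean(X, axis=0)  # per-feature mean of the training set
X = X - mu  # center the data (broadcasting replaces np.tile)
(l, M) = np.linalg.eigh(np.dot(X.T, X))  # eigh suits symmetric matrices; eigenvalues come back ascending
M = M[:, np.argsort(l)[::-1]]  # reorder eigenvectors by decreasing eigenvalue
X = np.dot(X, M[:, 0:7])  # project onto the top 7 principal components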
Now X is used as the dataset to train the random forest. However, from what I have read, I cannot simply repeat this process on my testing dataset Y, that is, center Y with its own mean and project it onto eigenvectors computed from Y itself. I am not sure why not, and I am not sure how to test my model if I can't reduce the extracted image features of the testing dataset down to 7.
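To make the alternative concrete, here is a minimal sketch of what I understand the suggestion to be: compute the mean and the principal directions from the training set only, then reuse those same quantities to project the test set. The helper names `fit_pca` and `apply_pca`, the random stand-in data, and the feature count of 20 (instead of 2048) are my own assumptions for illustration, not part of my actual pipeline.

```python
import numpy as np

def fit_pca(X_train, k):
    """Return the training mean and the top-k principal directions (hypothetical helper)."""
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    evals, evecs = np.linalg.eigh(Xc.T @ Xc)
    order = np.argsort(evals)[::-1][:k]  # indices of the k largest eigenvalues
    return mu, evecs[:, order]

def apply_pca(X, mu, W):
    """Center with the training mean and project onto the training directions."""
    return (X - mu) @ W

# Random stand-in data; the real inputs would be the CNN features
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 20))
Y_test = rng.normal(size=(30, 20))

mu, W = fit_pca(X_train, k=7)
X_train_red = apply_pca(X_train, mu, W)  # used to train the classifier
Y_test_red = apply_pca(Y_test, mu, W)    # same mu and W, nothing refit on Y
```

If this is right, the test features end up in the same 7-dimensional coordinate system the classifier was trained on, whereas running PCA separately on Y would produce different axes. I would appreciate confirmation that this is the intended procedure.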