I have a high-dimensional dataset ($n \times p$: $30 \times 100$) which I want to use as a test dataset to build a two-group classifier (LDA or QDA). I've read that you can use PCA to reduce the dimension of your dataset and select the most important features, but I'm a bit confused about what exactly you use as the input to build the classifier. I'm familiar with PCA via SVD and what it means.
Consider the following situation:
- I do an SVD of my dataset.
- I look at the scores of the first couple of principal components.
- When I label my scores by the group they come from, I see that the 3rd PC best separates my 2 groups (although it only explains 7% of the total variance).
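To make the steps above concrete, here is a minimal sketch of what I mean. It runs on synthetic data standing in for my real dataset, so the random matrix `X`, the labels `y`, the artificial group shift, and the helper `t_stat` (a Welch-type t statistic used only to rank the components by group separation) are all assumptions for illustration, not my actual data or method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: n = 30 samples, p = 100 features.
n, p = 30, 100
y = np.repeat([0, 1], n // 2)      # hypothetical group labels (15 per group)
X = rng.normal(size=(n, p))
X[y == 1, :5] += 1.0               # assumed group shift in a few features

# Center and scale each column, then take the SVD of the standardized data.
mean = X.mean(axis=0)
scale = X.std(axis=0, ddof=1)
Z = (X - mean) / scale
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

scores = U * s                     # PC scores, one column per component
loadings = Vt.T                    # PC loadings, one column per component
explained = s**2 / np.sum(s**2)    # fraction of total variance per component


def t_stat(a, b):
    """Welch-type two-sample t statistic (hypothetical helper)."""
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se


# Compare the two groups on the scores of the first few components.
for j in range(5):
    t = t_stat(scores[y == 0, j], scores[y == 1, j])
    print(f"PC{j + 1}: explained variance = {explained[j]:.3f}, t = {t:.2f}")
```

In my real data, it is the 3rd component that shows the largest separation in this kind of comparison.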
What do I do next?
- I transform the 3rd PC back to the original parameter space (scores * loadings * scale + mean) and build my classifier on that.
- I look at the loadings of the 3rd PC, try to decide which parameters in my original parameter space are important, and build a classifier using only these.
- ...
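As a rough sketch of what the first two options would give me as classifier input (again on synthetic stand-in data; the number of retained features `m = 10` and the names `X_opt1`, `X_opt2`, and `top` are arbitrary choices made for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: n = 30 samples, p = 100 features.
n, p = 30, 100
y = np.repeat([0, 1], n // 2)      # hypothetical group labels
X = rng.normal(size=(n, p))
X[y == 1, :5] += 1.0               # assumed group shift in a few features

# Standardize and decompose, as in the PCA-via-SVD setup.
mean = X.mean(axis=0)
scale = X.std(axis=0, ddof=1)
Z = (X - mean) / scale
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * s
loadings = Vt.T

pc = 2                             # zero-based index of the 3rd PC

# Option 1: back-project only the 3rd PC to the original parameter space,
# i.e. scores * loadings * scale + mean restricted to a single component.
rank_one_part = np.outer(scores[:, pc], loadings[:, pc]) * scale
X_opt1 = rank_one_part + mean      # n x p matrix

# Option 2: keep the m original parameters with the largest absolute
# loading on the 3rd PC (m = 10 is an arbitrary illustrative choice).
m = 10
top = np.argsort(np.abs(loadings[:, pc]))[::-1][:m]
X_opt2 = X[:, top]                 # n x m matrix of original measurements

print("Option 1 input shape:", X_opt1.shape)
print("Option 2 input shape:", X_opt2.shape)
print("Rank of option 1 before adding the mean:",
      np.linalg.matrix_rank(rank_one_part))
```

Either matrix could then be passed, together with `y`, to an LDA implementation such as scikit-learn's `LinearDiscriminantAnalysis`. One thing the sketch makes visible is that the option 1 matrix is built from a single score vector, so before the mean is added it has rank one: all 100 columns are rescaled copies of the same 3rd-PC score.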
Option 2 seems the most sensible to me, but I'm not entirely sure. Also, if I see that only the 3rd PC is important for explaining the variance between my two groups, can I ignore the first two PCs in my further analysis?