Say I have 20 predictor variables (X's) and 1 response variable (Y), and I'm trying to build a supervised model y = f(x). Is it advisable, or at least "OK", to first run PCA on all of the predictor variables and, if say 3 PCs account for ~70% of the overall variance, use those 3 PCs as the new predictor variables for the supervised learning?
Does this violate any rule, given that the PCs are just linear combinations of all the original predictor variables and are uncorrelated with one another?
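
To make the question concrete, here is a minimal sketch of the procedure I have in mind, using NumPy only. The data are synthetic stand-ins, and the names (`X_std`, `scores`, `n_components`) as well as the choice of ordinary least squares as the downstream model are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption): 200 rows, 20 correlated predictors.
n_samples, n_features, n_components = 200, 20, 3
latent = rng.normal(size=(n_samples, 3))
loadings = rng.normal(size=(3, n_features))
X = latent @ loadings + 0.5 * rng.normal(size=(n_samples, n_features))
y = latent[:, 0] - 2.0 * latent[:, 1] + 0.1 * rng.normal(size=n_samples)

# Centre and scale each predictor so that PCA is not dominated by units.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(X_std, full_matrices=False)
explained_ratio = s**2 / np.sum(s**2)

# Keep the first 3 PCs; each score column is a linear combination
# of all 20 original predictors.
scores = X_std @ Vt[:n_components].T

# Use the PC scores as the new predictors (here: OLS with an intercept).
design = np.column_stack([np.ones(n_samples), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coef

# The retained PC scores are mutually uncorrelated (up to rounding).
score_corr = np.corrcoef(scores, rowvar=False)

print("variance explained by 3 PCs:", explained_ratio[:n_components].sum())
print("max |off-diagonal correlation|:",
      np.abs(score_corr - np.eye(n_components)).max())
```

Note that in this sketch the components are chosen from the X's alone; Y plays no part until the regression step.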
Paul.