Is it okay to use AdaBoost for feature selection (selecting a subset of dimensions $S$ from a high-dimensional feature vector $V$)? I divided the samples into four non-overlapping sets: $A$ (training 1), $B$ (validation), $C$ (training 2), $D$ (testing). There are two possible procedures for AdaBoost feature selection:
Procedure 1:
i. Run AdaBoost, training on $A$ and validating on $B$, to determine a good subset of feature dimensions $S$ from the high-dimensional feature vector $V$.
ii. Using only the low-dimensional features $S$ (a subset of $V$), train an SVM classifier on $C$ and evaluate it on $D$ (sketched below).
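Here is roughly what I mean, as a minimal sketch assuming scikit-learn; the synthetic data, the candidate sizes for $S$, and using AdaBoost's `feature_importances_` as the selection criterion are illustrative choices, not a fixed recipe:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

# Stand-in data: 500-dimensional V, split into A, B, C, D.
X, y = make_classification(n_samples=400, n_features=500,
                           n_informative=20, random_state=0)
X_A, y_A = X[:100],    y[:100]     # training 1 (for AdaBoost)
X_B, y_B = X[100:200], y[100:200]  # validation
X_C, y_C = X[200:300], y[200:300]  # training 2 (for the SVM)
X_D, y_D = X[300:],    y[300:]     # testing

# Step i: rank the dimensions of V by AdaBoost (decision stumps by
# default), trained on A; use B to choose how many top features to keep.
booster = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_A, y_A)
ranking = np.argsort(booster.feature_importances_)[::-1]

best_k, best_acc = None, -np.inf
for k in (10, 20, 50, 100):  # candidate sizes for S (illustrative)
    idx = ranking[:k]
    acc = SVC().fit(X_A[:, idx], y_A).score(X_B[:, idx], y_B)
    if acc > best_acc:
        best_k, best_acc = k, acc
S = ranking[:best_k]

# Step ii: train the final SVM on C with only the selected features S,
# then evaluate once on the held-out test set D.
svm = SVC().fit(X_C[:, S], y_C)
print("test accuracy on D:", svm.score(X_D[:, S], y_D))
```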
Procedure 2:
i. Run AdaBoost, training on $A$ and validating on $B$, to determine a good subset of feature dimensions $S$ from the high-dimensional feature vector $V$.
ii. Using only the low-dimensional features $S$ (a subset of $V$), train an SVM classifier on $A$ and evaluate it on $D$ (see below).
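Continuing the sketch above, Procedure 2 differs only in step ii:

```python
# Procedure 2, step ii: reuse A (the set AdaBoost was trained on)
# for the final SVM, instead of the fresh set C.
svm = SVC().fit(X_A[:, S], y_A)
print("test accuracy on D:", svm.score(X_D[:, S], y_D))
```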
Procedure 1 sounds more rigorous, but in practice it doesn't work: the selected subset $S$ is correlated with the training set $A$, so if you train on the different set $C$, $S$ is no longer good. It just behaves like a random subset of $V$.
So, is Procedure 2 appropriate? Is AdaBoost suitable for this task? Are there better ways to discard bad features?