I have a microarray expression dataset (46 samples, thousands of attributes) and I want to perform feature selection first and then, based on this subset of features (no more than 4 or 5, given my small number of samples), build a classifier model. Because 46 samples leave so few options, I'd like to ask for advice from those of you who have already faced this type of problem.
In your experience, which selection strategies worked best (filter vs. wrapper methods? any in particular)?
Do you use cross-validation in feature selection? With such a small dataset I can't split it and use one part for feature selection and another for building the classifier.
I've been using Weka so far. Given this situation and what the software offers, can I do the feature selection separately (Select attributes panel), then remove the "useless" attributes from the ARFF file (Preprocess panel), and build the classifier afterwards (Classify panel)? Or should I use a meta-classifier (e.g. AttributeSelectedClassifier)? I'm not sure when this last option is recommended. Would it avoid overfitting better than the former approach?
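For context on the overfitting part of my question: my understanding is that selecting features on the full dataset and only then cross-validating the classifier gives optimistic accuracy estimates (selection bias), whereas redoing the selection inside each training fold, as AttributeSelectedClassifier does, avoids this. Here is a minimal stdlib-Python sketch of that effect on pure noise data, where honest accuracy should sit near 50%. The mean-difference filter and nearest-centroid classifier are just illustrative stand-ins, not Weka's actual methods:

```python
# Selection bias demo: features chosen on ALL samples leak test information
# into cross-validation. Data is pure noise, so honest accuracy is ~0.5.
import random

random.seed(0)
N, P, K, FOLDS = 46, 2000, 5, 5           # samples, features, kept features, folds

X = [[random.gauss(0, 1) for _ in range(P)] for _ in range(N)]
y = [i % 2 for i in range(N)]             # arbitrary binary class labels

def top_features(idx):
    """Rank features by |difference of class means| over samples in idx."""
    scores = []
    for j in range(P):
        c0 = [X[i][j] for i in idx if y[i] == 0]
        c1 = [X[i][j] for i in idx if y[i] == 1]
        scores.append(abs(sum(c0) / len(c0) - sum(c1) / len(c1)))
    return sorted(range(P), key=lambda j: -scores[j])[:K]

def centroid_acc(train, test, feats):
    """Nearest-class-centroid classifier on the chosen features."""
    cent = {}
    for c in (0, 1):
        rows = [i for i in train if y[i] == c]
        cent[c] = [sum(X[i][j] for i in rows) / len(rows) for j in feats]
    correct = 0
    for i in test:
        d = {c: sum((X[i][j] - cent[c][f]) ** 2 for f, j in enumerate(feats))
             for c in (0, 1)}
        correct += (min(d, key=d.get) == y[i])
    return correct / len(test)

folds = [list(range(f, N, FOLDS)) for f in range(FOLDS)]
global_feats = top_features(range(N))     # selection sees ALL samples: leaky
leaky = honest = 0.0
for test in folds:
    train = [i for i in range(N) if i not in test]
    leaky  += centroid_acc(train, test, global_feats)
    honest += centroid_acc(train, test, top_features(train))  # per-fold selection
print(f"selection outside CV: {leaky / FOLDS:.2f}")
print(f"selection inside  CV: {honest / FOLDS:.2f}")
```

The outside-CV estimate comes out far above chance even though the features are noise, which is why I suspect the meta-classifier route is the safer one; I'd still appreciate confirmation from people who have run this in Weka.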
Sorry if the post isn't precise enough, but any experience with this kind of problem, or any suggested pipeline (in Weka, if possible), would be appreciated.
Thanks a lot!