What do you think about applying machine learning techniques, such as Random Forests or penalized regression (with an L1 or L2 penalty, or a combination of the two, as in the elastic net), in small-sample clinical studies when the objective is to isolate interesting predictors in a classification context? This is not a question about model selection, nor am I asking how to find optimal estimates of variable effect or importance. I don't plan to do strong inference; I just want to use multivariate modeling, thereby avoiding testing each predictor against the outcome of interest one at a time, while taking the predictors' interrelationships into account.
I was just wondering whether such an approach has already been applied in this particularly extreme case, say 20-30 subjects with data on 10-15 categorical or continuous variables. It is not exactly the $n\ll p$ case, and I think the problem here relates to the number of classes we try to explain (which are often not well balanced) and the (very) small $n$. I am aware of the huge literature on this topic in the context of bioinformatics, but I couldn't find any reference related to biomedical studies with psychometrically measured phenotypes (e.g., phenotypes assessed through neuropsychological questionnaires).
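To make the setting concrete, here is a minimal sketch of the kind of analysis I have in mind, using scikit-learn on simulated data with the sample sizes described above (n = 25 subjects, p = 12 predictors, binary outcome). The data, the choice of `l1_ratio = 0.5`, and all other tuning values are purely illustrative assumptions, not recommendations:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Simulated stand-in for the real clinical data (hypothetical).
rng = np.random.default_rng(0)
n, p = 25, 12
X = rng.normal(size=(n, p))
# Only the first two variables carry signal in this toy setup.
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Elastic-net-penalized logistic regression: shrinks coefficients,
# setting some exactly to zero, which gives a crude predictor screen.
enet = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, C=1.0, max_iter=5000)
enet.fit(X, y)

# Random forest: impurity-based variable importances.
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X, y)

# With n this small, leave-one-out CV is about the only option
# for an honest estimate of classification accuracy.
loo_acc = cross_val_score(rf, X, y, cv=LeaveOneOut()).mean()

print("nonzero elastic-net coefficients:", np.flatnonzero(enet.coef_[0]))
print("RF importances (descending):", np.argsort(rf.feature_importances_)[::-1])
print("LOO accuracy:", round(loo_acc, 2))
```

My worry, of course, is how stable either the selected coefficients or the importance ranking can be at these sample sizes, which is exactly why I am asking for references.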
Any hints or pointers to relevant papers?
Update
I am open to any other solutions for analyzing this kind of data, e.g. the C4.5 algorithm or its derivatives, association rule methods, or any data mining technique for supervised or semi-supervised classification.