I am trying to do feature selection on gene expression data with 22215 features. I followed the tutorial here.
I first applied a filter method (t-test) and kept the 100 features with the best p-values. Then I tried to apply sequential feature selection to those 100 features with an SVM classifier. However, when I run
[fs1, history] = sequentialfs(@SVM_class_fun, reducedL, yS1, 'cv', c);
it always returns only the first feature; that is, every entry of fs1 except the first one is 0. If I try to force it to give me 10 features with
[fs1, history] = sequentialfs(@SVM_class_fun, reducedL, yS1, 'cv', c, 'nfeatures', 10);
it gives me the first 10 features from the filter method, i.e. the ones with the lowest p-values.
So this means sequentialfs is not helping in this case.
Here is my SVM_class_fun:
function err = SVM_class_fun(xTrain, yTrain, xTest, yTest)
model = svmtrain(xTrain, yTrain, 'Kernel_Function', 'rbf', 'boxconstraint', 10);
err = sum(svmclassify(model, xTest) ~= yTest);
end
Note that I have only 12 samples, so my data matrix is 12x22215. Might this be the issue?
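To make this easier to reproduce, here is a minimal sketch of a setup like the one described above, with a check of the criterion values that sequentialfs records at each step. It is not my exact code: the random stand-in data `X` and `y`, the 3-fold partition, and the anonymous `critFun` are assumptions for illustration. It also swaps `svmtrain`/`svmclassify` for `fitcsvm`/`predict`, since as far as I know the older functions have been removed from recent MATLAB releases.

```matlab
% Hypothetical stand-in data: 12 samples x 22215 features, two classes of 6
rng(0);
X = randn(12, 22215);
y = [ones(6, 1); 2 * ones(6, 1)];

% Filter step: two-sample t-test per feature (ttest2 works column-wise)
[~, p] = ttest2(X(y == 1, :), X(y == 2, :), 'Vartype', 'unequal');
[~, idx] = sort(p, 'ascend');
reduced = X(:, idx(1:100));      % keep the 100 features with the smallest p-values

% Stratified 3-fold partition (assumed; the original c may differ)
c = cvpartition(y, 'KFold', 3);

% Criterion: number of misclassified test samples, using fitcsvm instead of svmtrain
critFun = @(xTrain, yTrain, xTest, yTest) ...
    sum(predict(fitcsvm(xTrain, yTrain, 'KernelFunction', 'rbf', ...
                        'BoxConstraint', 10), xTest) ~= yTest);

% Wrapper step, printing the criterion at every iteration
opts = statset('Display', 'iter');
[fs1, history] = sequentialfs(critFun, reduced, y, 'cv', c, 'options', opts);

disp(history.Crit);              % criterion value after each added feature
disp(find(fs1));                 % column indices (within reduced) that were selected
```

If `history.Crit` is already at or near zero after the first feature, there is nothing left to improve, which might be why the search stops there with only 12 samples.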
Can anyone provide some insights?