I'm performing text classification experiments with scikit-learn on a small dataset (100 labelled texts written by patients and controls). I tested two supervised machine learning methods: SVM (kernel=RBF) and L2-regularized logistic regression.
The feature selection method is always the same: SelectKBest (k=200) with mutual information. I experimented with two sets of features: one including character n-grams and one without them. Feature extraction, selection and classification are done in a pipeline, so that all steps are cross-validated together.
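For concreteness, here is a minimal sketch of the kind of pipeline described above. The corpus, k value, and vectorizer settings are stand-ins for illustration (the real setup uses k=200 on 100 texts); the point is that selection sits inside the Pipeline, so it is refit on each training fold.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical stand-in corpus; replace with the real patient/control texts.
texts = (
    ["patient reports pain and fatigue every day %d" % i for i in range(10)]
    + ["control subject feels fine and healthy today %d" % i for i in range(10)]
)
labels = np.array([1] * 10 + [0] * 10)

pipe = Pipeline([
    # Word n-grams here; use analyzer="char_wb" (or combine both
    # vectorizers with a FeatureUnion) for the character n-gram set.
    ("vect", TfidfVectorizer(ngram_range=(1, 2))),
    # k=20 so this toy example runs; the actual experiment uses k=200.
    ("select", SelectKBest(mutual_info_classif, k=20)),
    # Swap in LogisticRegression(penalty="l2") to compare the two models.
    ("clf", SVC(kernel="rbf")),
])

# Because selection happens inside the pipeline, each CV fold selects
# features from its own training split only, avoiding leakage.
scores = cross_val_score(pipe, texts, labels, cv=5)
print(scores.mean())
```

Swapping the `clf` step (and the vectorizer's `analyzer`) reproduces the four conditions being compared.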
My results showed that:
the SVM classifier performs better than the logistic regression on the feature set with the character n-grams.
the SVM classifier performs worse than the logistic regression on the feature set without the character n-grams.
I'm a beginner in supervised machine learning and I'm struggling with all the different possibilities. I can't find an explanation for these results and can't wrap my head around it. Any suggestions?