I need to build a classification model on a dataset of 697 observations, of which only 18 belong to the group of interest. As usual, I split the data into a training and a test set stratified by the positive class.
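For reference, the split step looks roughly like this (the toy `X` and `y` below are placeholders standing in for my real features and labels):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for my real dataset: 697 rows, 18 positives.
rng = np.random.default_rng(0)
X = rng.normal(size=(697, 5))
y = np.zeros(697, dtype=int)
y[:18] = 1

# Stratifying on y keeps the ~2.6% positive rate in both partitions,
# so the test set is guaranteed to contain some positives.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
print(y_train.sum(), y_test.sum())  # positives in each partition
```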
I tried 10-fold CV with SMOTE on the training data to select the best model, but on average none of the candidates did better than chance on the CV folds. Now I'm wondering what the best approach to this problem is, and I've considered a few options:
Using the bootstrap instead of CV; however, I've read that I might need a large number of repetitions, and given the size of my data I worry that my resamples will be too similar;
Ignoring any form of resampling and just fitting a complex model on the whole training data;
Reframing the problem, perhaps as anomaly detection using a one-class SVM.
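To make the last option concrete, this is roughly what I have in mind: fit the detector on majority-class rows only and treat the rare class as the anomaly to be flagged (all data and parameter values below are placeholders):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Placeholder training data mimicking my class balance.
rng = np.random.default_rng(2)
X_train = rng.normal(size=(522, 5))
y_train = np.zeros(522, dtype=int)
y_train[:14] = 1

# Fit on the majority class only; nu is a rough upper bound on the
# fraction of training points flagged as outliers.
oc = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale")
oc.fit(X_train[y_train == 0])

# predict() returns +1 for inliers and -1 for flagged anomalies.
pred = oc.predict(X_train)
print((pred == -1).sum(), "rows flagged as anomalous")
```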
Are any of these alternatives valid, or is there a more established approach to this situation?