After reading several posts about unbalanced datasets, I'm still not sure how they apply to my specific problem, so I'm posting a new question.
I have a dataset with around 20K rows and 40 features. I'm trying to do binary classification, but the minority class makes up only 7% of the instances. I have read about using different sampling methods to deal with this problem. Among those, I tried SMOTE using the "unbalanced" R package, but I have doubts about whether it is appropriate for my data. Of the 40 features, only one is numeric (age); all the others are binary (yes/no for given diseases). As far as I know, SMOTE is meant for continuous data, since it relies on the Euclidean distance between neighbors.
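To make my concern concrete, here is a minimal Python sketch of the interpolation step as I understand it from the general description of SMOTE. It is not the code of the "unbalanced" package; the function name `smote_like_sample` and the two example rows are made up for illustration.

```python
import random

def smote_like_sample(x, neighbor, rng):
    """Create one synthetic point on the segment between x and a neighbor.

    Hypothetical helper: it mimics the interpolation idea behind SMOTE,
    not the implementation of any particular package.
    """
    gap = rng.random()  # random fraction in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]

rng = random.Random(0)

# Made-up minority rows: [age, disease_a, disease_b, disease_c]
x = [63, 1, 0, 1]
neighbor = [58, 0, 0, 1]

synthetic = smote_like_sample(x, neighbor, rng)
print(synthetic)

# Binary columns where the two rows disagree end up as fractions,
# which are not valid yes/no values.
non_binary = [v for v in synthetic[1:] if v not in (0, 1)]
print(non_binary)
```

In this sketch the interpolated age still looks plausible, but disease_a, where the two rows disagree, becomes a value between 0 and 1. That is exactly what I'm unsure how to interpret for yes/no features.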
Does anyone know whether it is correct to apply this technique to a dataset with mostly binary features? If it is not, how should I handle this problem?
Thank you so much in advance.