- I have an imbalanced dataset where the target class is less than 1% of the sample.
- I apply oversampling or undersampling, e.g. with imbalanced-learn: https://github.com/scikit-learn-contrib/imbalanced-learn
- I train a random forest on the resampled data.
- I adjust the predicted probabilities back to the original class distribution by multiplying the predicted odds by the ratio of the original class odds to the resampled class odds, as explained here: https://yiminwu.wordpress.com/2013/12/03/how-to-undo-oversampling-explained/

Is step 4 (the probability adjustment) always the same, regardless of which oversampling or undersampling method is used?
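
To make the adjustment in step 4 concrete, here is a minimal sketch of what I understand it to be. The function name `undo_resampling` and the parameters `original_rate` and `resampled_rate` (the fraction of target-class rows before and after resampling) are my own labels, not taken from the linked post or from imbalanced-learn. The sketch assumes that resampling changes only the class proportions, which is exactly the assumption I am unsure about for the different resampling methods.

```python
import numpy as np

def undo_resampling(p_resampled, original_rate, resampled_rate):
    """Map probabilities predicted on resampled data back to the original class prior.

    Hypothetical helper: assumes resampling only changed the class proportions.
    """
    p = np.asarray(p_resampled, dtype=float)
    # Odds of the target class before and after resampling.
    original_odds = original_rate / (1.0 - original_rate)
    resampled_odds = resampled_rate / (1.0 - resampled_rate)
    # Scale the model's odds by the ratio of prior odds.
    ratio = original_odds / resampled_odds
    # Equivalent to odds / (1 + odds), written so that p == 1 does not divide by zero.
    return p * ratio / (p * ratio + (1.0 - p))
```

For example, if the target class is 1% of the original sample and 50% of the resampled one, a predicted probability of 0.5 on the resampled scale maps back to 0.01, i.e. the original base rate.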