  1. I have an imbalanced dataset where the target class is <1% of the sample.
  2. I apply oversampling or undersampling, e.g. with https://github.com/scikit-learn-contrib/imbalanced-learn.
  3. I run a random forest on the resampled data.
  4. I adjust the predicted probabilities back to the original sample by multiplying the predicted odds by the ratio of the original class odds to the resampled class odds, as explained here: https://yiminwu.wordpress.com/2013/12/03/how-to-undo-oversampling-explained/
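
For concreteness, here is a minimal sketch of the adjustment in step 4 as I understand it. The function name `undo_resampling` and its arguments are hypothetical, not part of imbalanced-learn or scikit-learn, and the sketch assumes the resampling changed only the class proportions (as with plain random over- or undersampling).

```python
import numpy as np

def undo_resampling(p_resampled, original_rate, resampled_rate):
    """Map probabilities from a model fit on resampled data back to the
    original class prior (hypothetical helper, for illustration only)."""
    p = np.asarray(p_resampled, dtype=float)
    # Ratio of prior odds: original class odds / resampled class odds.
    odds_ratio = (original_rate / (1 - original_rate)) / (
        resampled_rate / (1 - resampled_rate)
    )
    # Scaling the predicted odds p / (1 - p) by odds_ratio and converting
    # back to a probability gives p*r / (p*r + 1 - p); this form avoids
    # dividing by zero when p == 1.
    return p * odds_ratio / (p * odds_ratio + (1 - p))

# Example: target class is 1% of the original data, resampled to 50/50.
# In practice p_resampled would be something like
# RandomForestClassifier.predict_proba(X)[:, 1].
adjusted = undo_resampling([0.0, 0.5, 0.9, 1.0], original_rate=0.01, resampled_rate=0.5)
```

With a 50/50 resample, a predicted probability of 0.5 maps back to the original base rate of 0.01, which is a quick sanity check on the formula.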

Is step 4 always the same regardless of the type of oversampling or undersampling employed?

kjetil b halvorsen
simon

0 Answers