1

Should I balance the data set for a survival random forest? By subsampling I will lose information in the data set. However, I would do that for a random forest used for classification. Should it also be done in the case of survival analysis? I am not sure whether there is a conceptual difference.

pikachu

1 Answer

3

Don't balance in either case. See "Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?"
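As a rough illustration for the survival case (a sketch only, not taken from the linked thread): assume "balancing" means subsampling censored observations until they match the number of observed events. The toy data and the `kaplan_meier` helper below are made up for this example. Dropping censored rows shrinks the risk set at each event time while the event counts stay the same, so every Kaplan-Meier factor can only get smaller and the estimated survival is pushed downward.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t).

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    """
    surv = 1.0
    # Distinct times at which at least one event was observed, up to t.
    event_times = sorted({u for u, e in zip(times, events) if e == 1 and u <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        observed = sum(1 for x, e in zip(times, events) if x == u and e == 1)
        surv *= 1.0 - observed / at_risk
    return surv


# Hypothetical toy data: 10 subjects, 3 observed events, 7 censored.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

rows = list(zip(times, events))
event_rows = [r for r in rows if r[1] == 1]
censored_rows = [r for r in rows if r[1] == 0]

# "Balance" by keeping only as many censored rows as there are events.
# The particular rows kept are an arbitrary, deterministic choice.
kept_censored = censored_rows[::3][:len(event_rows)]
balanced = sorted(event_rows + kept_censored)
bal_times = [x for x, _ in balanced]
bal_events = [e for _, e in balanced]

full_s = kaplan_meier(times, events, 10)
bal_s = kaplan_meier(bal_times, bal_events, 10)

print(f"S(10) on the full data:     {full_s:.3f}")
print(f"S(10) on the balanced data: {bal_s:.3f}")
```

The balanced estimate comes out lower than the full-data one, even though nothing about the underlying subjects changed. Any model fitted to the subsampled data would inherit that distortion.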

(Converted from a comment. For my rationale, see here. On short answers, see here. Better and longer answers are always welcome.)

Stephan Kolassa