
My colleague introduced me to the idea that we can do residual analysis for a random forest classifier that outputs probabilities instead of class labels in a binary classification problem.

This claim surprised me. I was under the assumption that residual analysis can only be done for regression problems. I cannot find relevant literature to rebut my co-worker's argument. Has anyone come across this idea before, or does anyone have a proof of why this claim might be wrong? In addition, would a residual plot have any meaning in this setting?

Dee
  • Does this answer your question? [Determine accuracy of model which estimates probability of event](https://stats.stackexchange.com/questions/20534/determine-accuracy-of-model-which-estimates-probability-of-event) The Brier score described there is the equivalent of mean squared error in regression. – EdM Jun 12 '20 at 17:39
  • Yeah, I can sort of see how this relates to my problem. I am working with imbalanced classes here, so this might not completely solve my issue. Thanks for the link. – Dee Jun 12 '20 at 17:58
  • Nothing in your question suggests that imbalanced classes are an important component of carrying out residual analysis. Can you edit your question to clarify how the imbalanced classes in your problem require a special kind of residual analysis that is not addressed in the other post? In particular, nothing on the other page appears to depend on whether or not the classes are balanced, so the suggestion that imbalanced classes make the duplicate post irrelevant seems like a red herring. – Sycorax Jun 15 '20 at 12:59
  • https://stats.stackexchange.com/questions/163221/whats-the-measure-to-assess-the-binary-classification-accuracy-for-imbalanced-d I went over this Stack Exchange post. I am convinced that the Brier score might actually be a better option for me. Thanks! – Dee Jun 15 '20 at 13:33
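
To make the Brier score suggestion from the comments concrete: for a binary label y in {0, 1} and a predicted probability p, a raw residual can be taken as y - p, and the Brier score is the mean of the squared residuals, which is why it plays the role of mean squared error. Below is a minimal sketch in plain Python. The labels and probabilities are made up for illustration (they are not from the question), and the helper names `residuals` and `brier_score` are hypothetical, not part of any library.

```python
from statistics import mean


def residuals(y_true, p_hat):
    """Raw residuals y - p for binary labels (0/1) and predicted probabilities."""
    return [y - p for y, p in zip(y_true, p_hat)]


def brier_score(y_true, p_hat):
    """Mean of the squared residuals, the probabilistic analogue of MSE."""
    return mean(r * r for r in residuals(y_true, p_hat))


# Hypothetical data: true classes and the predicted probability of class 1,
# e.g. as a random forest's probability output might look.
y_true = [0, 0, 1, 1]
p_hat = [0.1, 0.4, 0.8, 0.5]

res = residuals(y_true, p_hat)
score = brier_score(y_true, p_hat)
print(res)
print(score)
```

Each residual lies in [-1, 1], is negative for true class 0 and positive for true class 1, so a plot of raw residuals will show two bands rather than the scatter familiar from regression. That is an assumption worth keeping in mind before reading such a plot the way one reads a regression residual plot.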

0 Answers