
In a binary classification problem, we usually use the ROC curve to choose the decision threshold.

(By decision threshold I mean: if a unit's predicted probability of class 1 is greater than the threshold, we classify the unit as class 1.)
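To make the idea concrete, here is a minimal sketch of applying a decision threshold to predicted probabilities (the probability values are made up for illustration):

```python
import numpy as np

# Hypothetical predicted probabilities of class 1 for five units
probs = np.array([0.2, 0.55, 0.7, 0.4, 0.9])

threshold = 0.5  # the decision threshold
labels = (probs > threshold).astype(int)  # 1 if P(class 1) exceeds the threshold
print(labels)  # [0 1 1 0 1]
```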

But I am not sure why we do not use accuracy ((TP+TN)/(TP+TN+FP+FN)) with cross-validation instead. As far as I know, we avoid accuracy as a performance metric because it is misleading on problems with unbalanced classes. However, in my opinion, that is only a problem when we tune parameters. That is, I cannot understand why we do not use accuracy for selecting the decision threshold.
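To illustrate the two approaches being compared, here is a hedged sketch (using scikit-learn, with a synthetic imbalanced dataset and a held-out split standing in for cross-validation) that picks a threshold from the ROC curve via Youden's J statistic and, separately, by maximizing accuracy over a grid of candidate thresholds:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: roughly 90% class 0, 10% class 1
X, y = make_classification(n_samples=2000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = model.predict_proba(X_te)[:, 1]  # predicted P(class 1)

# ROC-based choice: maximize Youden's J = TPR - FPR along the ROC curve
fpr, tpr, thresholds = roc_curve(y_te, probs)
t_roc = thresholds[np.argmax(tpr - fpr)]

# Accuracy-based choice: scan candidate thresholds, keep the most accurate
candidates = np.linspace(0.01, 0.99, 99)
accs = [((probs > t).astype(int) == y_te).mean() for t in candidates]
t_acc = candidates[np.argmax(accs)]

print(f"ROC (Youden's J) threshold:    {t_roc:.2f}")
print(f"Accuracy-maximizing threshold: {t_acc:.2f}")
```

On imbalanced data like this, the accuracy-maximizing threshold tends to sit higher than the ROC-based one, because predicting the majority class more often inflates accuracy; that difference is exactly what the question is probing.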

Is there any special reason, apart from what I have just mentioned?

— QWEQWE
    Of possible interest: https://stats.stackexchange.com/questions/357466/are-unbalanced-datasets-problematic-and-how-does-oversampling-purport-to-he https://www.fharrell.com/post/class-damage/ https://www.fharrell.com/post/classification/ https://stats.stackexchange.com/a/359936/247274 https://stats.stackexchange.com/questions/464636/proper-scoring-rule-when-there-is-a-decision-to-make-e-g-spam-vs-ham-email https://twitter.com/f2harrell/status/1062424969366462473?lang=en – Dave Jun 16 '21 at 11:32

0 Answers