As the title suggests, I am running a Random Forest classifier in Scala. To evaluate this classifier (I am handling highly imbalanced classes), I used the BinaryClassificationEvaluator class. The area under the PR curve is > 0.5, but when I print the confusion matrix, my recall and precision both appear to be 0 (I have 0 TP predictions).

Is this mathematically possible?

Sycorax
Toutsos
  • The confusion matrix just looks at one threshold, but the PR curve looks at all thresholds. Suppose your classifier gives predictions 0.48 to all negative and 0.49 to all positives. Using the classification rule "If prediction > 0.5, positive else negative", you'll have 0 TP predictions. Something similar is happening here. – Sycorax Apr 17 '19 at 18:07
  • I completely understand what you mean. That was my first thought as well, but I wasn't so sure. I will wait a little longer and accept an answer after I investigate a bit further. – Toutsos Apr 17 '19 at 18:59
  • Related: [Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?](https://stats.stackexchange.com/q/357466/1352), as well as [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) - everything said there about accuracy applies equally to TPR, FPR etc. – Stephan Kolassa Apr 17 '19 at 20:27
  • @Sycorax: do you want to post your comment(s) as an answer? [Better to have a short answer than no answer at all.](https://stats.meta.stackexchange.com/a/5326/) Anyone who has a better answer can post it. – Stephan Kolassa Apr 17 '19 at 20:28
  • @StephanKolassa Fair point. – Sycorax Apr 17 '19 at 21:49

1 Answer

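The confusion matrix is computed at a single threshold, but the PR curve summarizes performance across all thresholds. Suppose your classifier assigns a score of 0.48 to every negative and 0.49 to every positive. Under the classification rule "if prediction > 0.5, positive, else negative", nothing is predicted positive, so you get 0 TP predictions. Yet the scores rank every positive above every negative, so any threshold between 0.48 and 0.49 separates the classes perfectly and the PR curve looks excellent. Something similar is happening here.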
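
To make the 0.48 / 0.49 example concrete, here is a minimal sketch in plain Python (not the asker's Scala code). The class sizes (95 negatives, 5 positives), the helper names `confusion_counts` and `average_precision`, and the alternative threshold 0.485 are all assumptions made up for illustration. The area is computed as a simple step-wise sum (average precision), which is one common way to summarize a PR curve and may not match exactly what any particular library reports.

```python
# Hypothetical data mirroring the example: negatives score 0.48, positives 0.49.
labels = [0] * 95 + [1] * 5
scores = [0.48] * 95 + [0.49] * 5


def confusion_counts(labels, scores, threshold):
    """Return (TP, FP, FN, TN) for the rule 'score > threshold => positive'."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s > threshold and y == 1)
    fp = sum(1 for y, s in pairs if s > threshold and y == 0)
    fn = sum(1 for y, s in pairs if s <= threshold and y == 1)
    tn = sum(1 for y, s in pairs if s <= threshold and y == 0)
    return tp, fp, fn, tn


def average_precision(labels, scores):
    """Step-wise area under the PR curve, using every distinct score as a cut-off."""
    n_pos = sum(labels)
    area, prev_recall = 0.0, 0.0
    for t in sorted(set(scores), reverse=True):
        # 'score >= t' so that each distinct score value acts as its own threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        precision = tp / (tp + fp)
        recall = tp / n_pos
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


tp_at_half = confusion_counts(labels, scores, 0.5)[0]     # single fixed threshold
tp_at_lower = confusion_counts(labels, scores, 0.485)[0]  # threshold between the scores
ap = average_precision(labels, scores)                    # sweeps all thresholds
print(tp_at_half, tp_at_lower, ap)
```

With these made-up scores, the confusion matrix at 0.5 contains no predicted positives at all, while a threshold placed between the two score values recovers every positive. The practical takeaway is to inspect the predicted scores and pick a threshold suited to the problem, rather than relying on the default cut-off.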

Sycorax