
I’ve read that precision-recall (PR) curves are preferred over ROC curves when a dataset is imbalanced, because they focus more on the model’s performance in correctly identifying the minority/positive class.

At what point (rule of thumb?) does it make more sense to primarily use PR to evaluate a classifier instead of AUC-ROC score? I imagine if the dataset has 40% positive class, AUC is still appropriate? But what about at 30% or 20% positive class? What level is considered “imbalanced” where PR is preferred?

Insu Q
    "Unbalanced" datasets are not a problem: [Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?](https://stats.stackexchange.com/q/357466/1352) However, precision and recall are: [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) (everything said about accuracy at that thread also applies to precision and recall). – Stephan Kolassa May 11 '20 at 04:42
  • @StephanKolassa so what’s the rule of thumb? I read the links and most of the examples were 1% positive class and 99% negative class. Are you suggesting that’s the answer? – Insu Q May 11 '20 at 12:31
  • No. Per my question and my answer to the accuracy question, there is no problem with unbalanced data, unless you use inappropriate quality measures like accuracy. Use an appropriate *probabilistic* model, and "unbalance" will naturally be expressed as low probabilities. – Stephan Kolassa May 11 '20 at 14:20
  • @StephanKolassa I might not have asked my question correctly. I know there’s no problem with unbalanced data. A lot of real-world data is unbalanced. My question is, is there a point in that level of unbalance where using PR curves makes more sense than using AUC? If you have too few positive examples in a dataset, the AUC can appear to be high and when you look at the PR curve, it’s obvious there’s room for improvement. When your dataset has 49% positives and 51% negatives, technically it’s unbalanced but AUC is fine to use. When it’s 5% positives, you probably want to look at a PR curve. – Insu Q May 11 '20 at 14:30
  • I advocate not using precision/recall at all. See the links above for my argument. [This may be helpful for context.](https://stats.meta.stackexchange.com/q/5000/1352) – Stephan Kolassa May 11 '20 at 14:43

2 Answers


I agree with the comments. That said, I have used AUC-ROC for binary classification with a class imbalance of 5% positive / 95% negative, and was still able to get a pretty good model.
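To see concretely how the two metrics can disagree on the same scores, here is a small self-contained sketch (toy data I made up, with NumPy-only implementations of both metrics; nothing here comes from the answer above):

```python
import numpy as np

def roc_auc(y, s):
    """P(score of a random positive > score of a random negative)."""
    pos, neg = s[y == 1], s[y == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def average_precision(y, s):
    """Area under the PR curve: mean precision at the rank of each positive."""
    order = np.argsort(-s)
    ys = y[order]
    hits = np.cumsum(ys)                      # positives found so far
    ranks = np.arange(1, len(y) + 1)
    return (hits[ys == 1] / ranks[ys == 1]).mean()

# Toy data: 5% positives, scores only moderately informative.
rng = np.random.default_rng(0)
n, prevalence = 2000, 0.05
y = (rng.random(n) < prevalence).astype(int)
s = rng.normal(loc=y.astype(float), scale=1.0)  # positives shifted up by 1

auc = roc_auc(y, s)
ap = average_precision(y, s)
# ROC AUC looks respectable (about 0.76 in expectation for this setup),
# while average precision sits well below it, because the PR curve's
# baseline is the 5% prevalence rather than 0.5.
```

The same numbers come out of `sklearn.metrics.roc_auc_score` and `sklearn.metrics.average_precision_score`; the hand-rolled versions are just to make the definitions explicit.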

Stochastic
  • The concordance probability (AUROC) is not used for _classification_ (forced choice) but rather for assessing the pure predictive discrimination of a continuous _prediction_. And as you said it is unaffected by extreme imbalance. – Frank Harrell Nov 24 '20 at 12:43

Context

How much the imbalance matters also depends on the dataset size.

A model trained on 5-10% positives and 90-95% negatives with 50 or 500 samples is in a very different situation from one trained on 10,000 samples.
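To make the sample-size point concrete, here is a small simulation (my own toy setup, NumPy only): at the same 10% prevalence and the same signal strength, the ROC AUC estimated from 50 samples fluctuates far more across repeated datasets than the estimate from 2,000 samples.

```python
import numpy as np

def roc_auc(y, s):
    """P(random positive outscores random negative)."""
    pos, neg = s[y == 1], s[y == 0]
    return (pos[:, None] > neg[None, :]).mean()

def auc_spread(n, prevalence=0.10, trials=200, seed=1):
    """Std. dev. of the AUC estimate across repeated datasets of size n."""
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(trials):
        y = (rng.random(n) < prevalence).astype(int)
        if y.sum() in (0, n):                 # need both classes present
            continue
        s = rng.normal(loc=y.astype(float))   # same signal strength each run
        aucs.append(roc_auc(y, s))
    return np.std(aucs)

small, large = auc_spread(50), auc_spread(2000)
# With 50 samples (~5 positives) the AUC estimate is far noisier than
# with 2000 samples (~200 positives), even though nothing else changed.
```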

Opinion

A model that sees 1 positive sample and tries to learn from it is in a different situation from one that sees hundreds of positive samples (even if those represent only 5% of the whole dataset).

As a rough rule of thumb: anything between 20-40% positives is considered imbalanced, around 5-10% is heavily imbalanced, and below 5% is extremely imbalanced.
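One way to motivate thresholds like these: the baseline of a PR curve is the positive prevalence itself, while the ROC baseline stays at 0.5 no matter how imbalanced the data is. A short NumPy sketch (my own illustration, using deliberately uninformative scores so the metric sits at its baseline):

```python
import numpy as np

def average_precision(y, s):
    """Mean precision at the rank of each positive (area under PR curve)."""
    order = np.argsort(-s)
    ys = y[order]
    hits = np.cumsum(ys)
    ranks = np.arange(1, len(y) + 1)
    return (hits[ys == 1] / ranks[ys == 1]).mean()

rng = np.random.default_rng(7)
n = 20000
aps = {}
for prevalence in (0.40, 0.20, 0.05):
    y = (rng.random(n) < prevalence).astype(int)
    s = rng.random(n)            # scores carry no information about y
    aps[prevalence] = average_precision(y, s)
# Each AP lands near the prevalence itself (~0.40, ~0.20, ~0.05), while a
# random ranker's ROC AUC would stay near 0.5 at every prevalence. Only
# the PR curve's shifting baseline makes the imbalance visible.
```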

Resampling

Multiple resampling methods exist; however, whether they actually improve your model is tricky to judge, since an increase in recall usually comes with a large decrease in precision (when you oversample the minority class).
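For illustration, the simplest such scheme, random oversampling of the minority class, just duplicates minority rows until the classes are balanced (a plain-NumPy sketch; the function name is my own, and `imbalanced-learn`'s `RandomOverSampler` does the same job in practice). It changes the class prior the model sees, which is precisely why recall tends to rise while precision falls:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many samples as the majority class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    keep = []
    for c, n_c in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=n_max - n_c, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]

# 5% positives before resampling, 50% after.
X = np.arange(200).reshape(100, 2)
y = np.array([1] * 5 + [0] * 95)
X_bal, y_bal = random_oversample(X, y)
```

Note that only the training set should ever be resampled; evaluating on resampled data would hide exactly the precision cost described above.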

ombk