What is the expected value of AUROCC for random predictions?

Question

I was having a debate with co-workers today about the dependence of AUC on class imbalance, i.e., the proportion of positive/negative instances in the response variable. It was suggested that when classes are highly imbalanced, the average AUROCC value for repeated random predictions would not necessarily converge to 0.5 in the limit.

For models fit using random data or using randomly permuted class labels, I would expect the class probabilities of each instance in all models to be uniformly distributed between zero and one, and consequently the AUC to vary between zero (when all positive samples are assigned a class probability of zero, and all negative samples non-zero) and one (when all positive samples are assigned a probability of one, and all negative samples less than one). Under this model, I would expect the AUC to be relatively uniformly distributed between zero and one. I am curious as to whether I am missing some aspect of this.

I would be grateful for any literature on the topic if this is not the case.

Can you please check this thread on "[What is “baseline” in precision recall curve](https://stats.stackexchange.com/questions/251175)"? I think it is relevant to what you ask. – usεr11852, May 14 '18 at 19:50

score 2 · Accepted Answer · answered May 14 '18 at 17:34

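ROC AUC is insensitive to class imbalance: the true positive rate and the false positive rate are each computed within a single class, so changing the class proportions does not shift the expected curve.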

Tom Fawcett, "An introduction to ROC analysis"
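A quick simulation can make this concrete. The sketch below is only an illustration under stated assumptions: scores are drawn independently of the labels from a uniform distribution, the AUC is computed with the rank-based Mann-Whitney formula, and the helper names `auroc` and `simulate` are made up for this example. Under those assumptions the mean AUC should sit close to 0.5 at every class ratio. The values are not uniformly spread over [0, 1]; they cluster around 0.5, with variance (n_pos + n_neg + 1) / (12 · n_pos · n_neg), so the spread widens as the minority class shrinks while the mean stays put.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic (assumes no tied scores)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    # Assign ranks 1..n in order of increasing score.
    ranks = np.empty(scores.size, dtype=float)
    ranks[np.argsort(scores)] = np.arange(1, scores.size + 1)
    # U counts the (positive, negative) pairs where the positive scores higher.
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def simulate(n_pos, n_neg, n_reps=2000, seed=0):
    """Mean and standard deviation of AUROC for label-independent scores."""
    rng = np.random.default_rng(seed)
    labels = np.r_[np.ones(n_pos, dtype=bool), np.zeros(n_neg, dtype=bool)]
    aucs = np.array(
        [auroc(labels, rng.uniform(size=labels.size)) for _ in range(n_reps)]
    )
    return aucs.mean(), aucs.std()


if __name__ == "__main__":
    # Same total sample size, increasingly imbalanced classes.
    for n_pos, n_neg in [(500, 500), (50, 950), (10, 990)]:
        mean, sd = simulate(n_pos, n_neg)
        print(f"n_pos={n_pos:4d} n_neg={n_neg:4d} mean AUC={mean:.3f} sd={sd:.3f}")
```

The standard deviation column is the part that does depend on imbalance, which may be what gives the impression that the AUC itself drifts when one class is rare.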

Sycorax


Thanks! That was also my intuition, but I am not knowledgeable enough about the topic to be confident in asserting it. As I understand it, the value of AUPR and other measures is in allowing you to optimise for measures other than correct positive prediction (edit: and lack of incorrect positive prediction, of course) – alan ocallaghan May 14 '18 at 17:41
