
I've trained a logistic regression using a small number of predictors. The pseudo-R-squared is only 0.1, but I have significant terms and a nice low p-value for the model. However, even on its own training data, the model's AUC is only 0.28:

[Image: ROC curve of the model]

I thought this was impossible, and my only intuition for what's going on is that the class imbalance (only 5% of the observations are in the positive class) means that the no information rate is pretty high. I still think my model should beat random guessing, at least on its own training data.
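For what it's worth, a quick simulation (I'm assuming numpy and scikit-learn here, since I haven't shared my code) suggests the imbalance alone can't explain it - random scores still give an AUC of about 0.5 even with a 5% positive rate:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 100_000
y = (rng.random(n) < 0.05).astype(int)  # ~5% positive class, like my data
scores = rng.random(n)                  # pure-noise "predictions"

# AUC of random scores is ~0.5 regardless of the class balance
print(roc_auc_score(y, scores))
```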

I assumed this was a coding error but I think I've ruled that out now, so can anyone explain to me how this is possible?!

I've looked for other discussions on here; most seem to focus on AUC < 0.5 on the test set, which I can more easily understand (e.g. here and here). This one came the closest, demonstrating that a predictor that's really just noise can come out below 0.5 - but I think my model has found some real relationships in the data...

Tom Wagstaff
  • Switch the labels. :) – usεr11852 Dec 20 '21 at 20:02
  • Thanks @usεr11852 but I've tried flipping the labels, respecifying the target variables and other coding errors but I just keep reproducing this result. – Tom Wagstaff Dec 20 '21 at 20:10
  • Thanks @Sycorax, I'd get that if my terms were all insignificant, but given I've got significant terms and a decent p value for the model overall, seems like there's some signal in the features. I realise I haven't shared enough to rule out a code error, but I've satisfied myself it's unlikely. – Tom Wagstaff Dec 20 '21 at 20:12
  • That's a good clue but I'm not fully understanding. My expectation at least would be that significant terms should tend to rank order the observations correctly. Is there a link you could throw me on this @Sycorax? – Tom Wagstaff Dec 20 '21 at 20:27
  • Thanks but I know what the 2 concepts mean, and that they're different, but surely one leads to the other? I was actually referring to the p value of the whole model, but let's say it's a model with a single significant linear predictor. The fact that the model achieves a good fit with that variable, should mean that predictions based on that linear relationship tend to rank order the observations better than randomly too. And in any case, not worse! :-) – Tom Wagstaff Dec 20 '21 at 20:46
  • @usεr11852 you were totally right - the labels were still flipped in the final plotting command, the one bit of code I didn't check - silly me, thanks for your help – Tom Wagstaff Dec 21 '21 at 11:39
  • No worries Tom, it has happened to all of us in the past too. :) (I will write it as an answer so the question isn't unanswered) – usεr11852 Dec 21 '21 at 11:42

2 Answers


AUC-ROC can be below 0.5, but when it is substantially below 0.5, as in the case shown here (~0.28), there is a good chance that the labels are flipped/reversed at some point in our modelling pipeline. Such a low AUC-ROC score would suggest we are consistently wrong; that can happen, but usually it just means the labels got swapped somewhere!
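A quick way to see the symmetry: flipping the labels maps an AUC of a to 1 - a, so an AUC of ~0.28 becomes ~0.72 once the labels are put right. A minimal sketch (assuming scikit-learn; the same identity holds whatever tooling you use):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y = (rng.random(5_000) < 0.05).astype(int)   # imbalanced target, ~5% positives
scores = rng.normal(loc=y, scale=1.5)        # scores carrying some real signal

auc = roc_auc_score(y, scores)
auc_flipped = roc_auc_score(1 - y, scores)   # same scores, labels swapped
print(auc, auc_flipped)                      # the two AUCs always sum to 1
```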

usεr11852

It turned out this was a coding error. Here is the correct ROC curve, with an AUC of 0.72: [Image: corrected ROC curve]

Moral of the story: if you see this, triple- and quadruple-check your code, because an AUC this far below 0.5 on your own training data is essentially impossible without a bug.
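Since then I've added a cheap guard before any plotting: compute the AUC directly from the fitted probabilities and fail loudly if it comes out below 0.5. A minimal sketch on synthetic data (I'm assuming scikit-learn here, not my actual pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
X = rng.normal(size=(5_000, 3))
# Synthetic stand-in for my data: ~5% positives with a real linear signal
p_true = 1 / (1 + np.exp(-(X @ np.array([1.0, 0.5, -0.5]) - 3.5)))
y = (rng.random(5_000) < p_true).astype(int)

model = LogisticRegression().fit(X, y)
p_train = model.predict_proba(X)[:, 1]
auc = roc_auc_score(y, p_train)

# A fitted model should beat chance on its own training data; an AUC
# below 0.5 here almost always means the labels got flipped somewhere.
assert auc > 0.5, f"Training AUC {auc:.2f} < 0.5 - check label orientation!"
print(f"Training AUC: {auc:.2f}")
```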

Tom Wagstaff