In the context of binary classification, how do you interpret a ROC curve? More precisely:
1) Why does the diagonal stand for a random classifier?
[Edit] Let's imagine a random classifier: each time it sees an observation, it labels that observation 1 with probability 0.5.
So, with many observations:
- among the observations with true label 1, half of them will be correctly classified (a 0.5 true positive rate);
- in the same way, there is a 0.5 false positive rate (among the observations with true label 0, half of them will be labelled 1).
So on the ROC curve this classifier sits at the point (0.5, 0.5), which is on the diagonal. But I can only see this one case, unless I don't understand well what "random classifier" means in this context...
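To check my reasoning, here is a quick simulation (pure Python, a hypothetical setup of my own: true labels drawn fairly at random, and a classifier that ignores its input and predicts 1 with probability p). I also try values of p other than 0.5 to see where such classifiers land:

```python
import random

random.seed(0)

def roc_point_random_classifier(p, n=100_000):
    """Empirical (FPR, TPR) of a classifier that predicts 1 with
    probability p, ignoring the input entirely."""
    tp = fp = pos = neg = 0
    for _ in range(n):
        y = random.random() < 0.5      # true label, balanced classes
        yhat = random.random() < p     # prediction, independent of y
        if y:
            pos += 1
            tp += yhat                 # true positive
        else:
            neg += 1
            fp += yhat                 # false positive
    return fp / neg, tp / pos

for p in (0.2, 0.5, 0.8):
    fpr, tpr = roc_point_random_classifier(p)
    # both rates come out close to p, i.e. the point (p, p) on the diagonal
    print(f"p={p}: FPR={fpr:.3f}, TPR={tpr:.3f}")
```

So each such classifier gives a single point near (p, p); my confusion is that this only accounts for isolated points on the diagonal, not the whole line.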
2) Why is the ROC curve insensitive to class skew? That is, why is the ROC curve insensitive to the class distribution and to error costs?
Let's imagine we have a sample of observations and we draw the ROC curve. Now let's add many observations labelled 0: does the ROC curve stay the same? What does that mean, and how can it be explained?
Here I found an explanation: https://www.quora.com/Why-is-AUC-Area-under-ROC-insensitive-to-class-distribution-changes; the answers seem good if we assume that, when we increase the number of negative samples, their scores follow the same distribution as those of the previous negative samples.
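Here is a small sketch of that assumption as I understand it: if I replicate the negative samples (which trivially preserves their score distribution), the empirical TPR and FPR at every threshold, and hence the ROC curve, stay exactly the same, because both are per-class rates. The Gaussian score distributions are an arbitrary choice of mine, just for illustration:

```python
import random

random.seed(1)

def roc_points(scores_pos, scores_neg, thresholds):
    """(FPR, TPR) at each threshold: predict 1 when score >= threshold."""
    pts = []
    for t in thresholds:
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        pts.append((fpr, tpr))
    return pts

# hypothetical classifier scores: positives score higher on average
pos = [random.gauss(1.0, 1.0) for _ in range(1000)]
neg = [random.gauss(0.0, 1.0) for _ in range(1000)]

thresholds = [-1.0, 0.0, 0.5, 1.0, 2.0]
before = roc_points(pos, neg, thresholds)
after = roc_points(pos, neg * 10, thresholds)  # 10x more negatives, same distribution
assert before == after  # per-class rates are unchanged, so the ROC curve is too
```

Of course this only covers exact replication; the claim in the Quora answers seems to be that any enlargement of the negative class with the same score distribution leaves the curve unchanged (in expectation), which is what I would like to see proved formally.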
PS: A mathematical paper (with proofs) would help me understand the previous assertion properly, since everything would be formally defined there. I found this one (in my language), but I am not sure it is correct, since there seems to be a mistake from the very beginning in the RECALL definition: http://www.xavierdupre.fr/app/mlstatpy/helpsphinx/c_metric/roc.html. So if you have any reference, please share it.