If we don't have access to the model and have only the actual and predicted labels, without probabilities, is it still possible to plot the AUC/ROC curve?

For example, can we get the curve from the following information (the real arrays have more than 1000 values)?

actual = ["C1","C1","C2","C1","C2"]
predicted = ["C2","C1","C2","C1","C1"]

Or is it necessary to have access to probabilities instead of predicted labels?

A.B
  • 1
    Your example gives one point on the ROC curve. You can invent two other points, one with all the predictions `C1` and another with all the predictions `C2`. Whether you think three points leading to two line-segments is really an ROC curve is up to you; I would say probably not, as I think you should be adjusting discrimination thresholds with a little more sophistication than this – Henry Aug 05 '20 at 14:39
  • 1
    If you assign C1 and C2 distinct numerical values, then you can follow the instructions here https://stats.stackexchange.com/questions/145566/how-to-calculate-area-under-the-curve-auc-or-the-c-statistic-by-hand to draw a three-point ROC curve with AUC $\approx 0.583$. – Sycorax Aug 05 '20 at 16:32
  • @Sycorax, thank you for the comment, but where would I get the probability (0-1) that the author of that question has as "predicted retention status"? I don't have access to such a number. – A.B Aug 06 '20 at 03:02
  • You don't need it, because ROC curves and AUC are statistics of *ranks*. Replace all instances of C1 with $0$ and all instances of C2 with $1$. That's what I mean by "assign C1 and C2 distinct numerical values." – Sycorax Aug 06 '20 at 03:10
  • Okay, thanks. Do I need to replace them in both `actual` and `predicted`? – A.B Aug 06 '20 at 03:30
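
The suggestion in the comments can be sketched roughly as follows. This is a minimal, plain-Python illustration rather than a definitive implementation: it assumes `C2` is treated as the positive class, maps both `actual` and `predicted` to 0/1, and uses the hard predictions as if they were scores. The helper names `roc_points` and `trapezoid_auc` are hypothetical, and the sketch assumes both classes occur in `actual`.

```python
actual = ["C1", "C1", "C2", "C1", "C2"]
predicted = ["C2", "C1", "C2", "C1", "C1"]

# Assumption: C2 is the positive class (1), C1 the negative class (0).
to_num = {"C1": 0, "C2": 1}
y_true = [to_num[a] for a in actual]
y_score = [to_num[p] for p in predicted]  # hard labels used as scores


def roc_points(y_true, y_score):
    """Return (FPR, TPR) points, one per distinct score threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the given points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


points = roc_points(y_true, y_score)
auc = trapezoid_auc(points)
print(points)         # three points: (0, 0), (FPR, TPR) of the labels, (1, 1)
print(round(auc, 3))  # 0.583
```

With only two distinct "scores", the curve has a single interior point, so the AUC reduces to the average of sensitivity and specificity. Choosing `C1` as the positive class instead would give the same area for this example.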
