
I have a question regarding statistical evaluation of the AUC. In their paper (http://www.jstor.org/stable/2531595), DeLong et al. describe a method to evaluate the AUC of ROC curves. Another good explanation can be found in the book "Statistics with Confidence: Confidence Intervals and Statistical Guidelines" by Altman et al.

As far as I understand, we compute the $\text{AUC}$ and the standard deviation $\sigma$ of the kernel matrix. Assuming the normal distribution $\mathcal{N}(\text{AUC},\sigma)$, it is then possible to compute confidence intervals.

My question is about the normality assumption:

  1. The $\text{AUC}$ lies in the interval $[0,1]$, but the support of the normal distribution is $(-\infty, \infty)$. Is this problem really negligible? (The pROC package, for example, handles it by simply restricting the CI to $[0,1]$.)

  2. The $Beta$ distribution is defined on the interval $[0,1]$ and has the shape parameters $\alpha$ and $\beta$. Can we estimate them from the data, as we can for the AUC?

To give an example: given the vector `c(T,F,F,F,T,F,F,T,F,F)`, we get $\text{AUC} = 0.619$ and $\sigma = 0.237$, which results in a 95% CI of $(0.156, 1.083)$.

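    library(pROC)

    # TRUE/FALSE labels; their positions serve as the scores
    temp.in <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)

    # Build the ROC curve once from the positions of each class
    temp.roc <- pROC::roc(controls = which(temp.in), cases = which(!temp.in))

    pROC::auc(temp.roc)     # area under the curve
    pROC::ci.auc(temp.roc)  # 95% CI (DeLong method by default)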
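
The normal-approximation interval quoted above can also be reproduced by hand. Here is a minimal base-R sketch, assuming the rounded values $\text{AUC} = 0.619$ and $\sigma = 0.237$ from the example, so the last digit may differ slightly from the unrounded result. The names `auc_hat`, `se_hat`, `wald_ci` and `wald_ci_clipped` are only illustrative and not part of pROC. The last step mimics restricting the interval to $[0,1]$.

```r
# Rounded values taken from the example above
auc_hat <- 0.619
se_hat  <- 0.237

# Two-sided 95% quantile of the standard normal distribution
z <- qnorm(1 - 0.05 / 2)

# Normal-approximation interval: AUC +/- z * sigma
wald_ci <- auc_hat + c(-1, 1) * z * se_hat

# Same interval restricted to the admissible range [0, 1]
wald_ci_clipped <- pmin(pmax(wald_ci, 0), 1)

print(round(wald_ci, 3))
print(round(wald_ci_clipped, 3))
```

The unclipped upper bound exceeds 1, which is exactly the issue raised in point 1.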

Instead of using the normal distribution, I would like to use the $Beta$ distribution. But how can we estimate $\alpha$ and $\beta$ for the $Beta$ distribution given `c(T,F,F,F,T,F,F,T,F,F)`?

Drey

1 Answer


An alternative given by [1] is to compute the interval for the logit AUC:

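$ \log \left( \frac{AUC}{1-AUC} \right) \pm \phi^{-1} \left( 1 - \frac{\alpha}{2} \right) \frac{\sqrt{\widehat{\text{var}}(AUC)}}{AUC(1 - AUC)} $

where $\widehat{\text{var}}(AUC)$ is the estimated variance of the AUC ($\sigma^2$ in your notation), $\log$ is the natural logarithm, and $\alpha$ is the significance level (not the Beta shape parameter).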

Transforming the two endpoints back with the inverse logit gives an asymmetric interval that stays within $(0, 1)$. In your case, you would get a 95% CI $(0.38, 0.81)$.
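
As a rough sketch of this computation in base R, assuming the question's $\sigma = 0.237$ is used as the square root of the estimated variance of the AUC (the names `auc_hat`, `se_hat`, `se_logit`, `logit_ci` and `auc_ci` are only illustrative): the interval is built on the logit scale with the natural logarithm and then mapped back with `plogis()`. The resulting endpoints depend on the variance estimate that is plugged in.

```r
# Rounded values taken from the question; se_hat is assumed to be
# the square root of the estimated variance of the AUC
auc_hat <- 0.619
se_hat  <- 0.237
alpha   <- 0.05

# Standard error on the logit scale: se / (AUC * (1 - AUC))
se_logit <- se_hat / (auc_hat * (1 - auc_hat))

# Interval on the logit scale; qlogis(p) equals log(p / (1 - p)),
# using the natural logarithm
logit_ci <- qlogis(auc_hat) + c(-1, 1) * qnorm(1 - alpha / 2) * se_logit

# Back-transform with the inverse logit so the bounds lie in (0, 1)
auc_ci <- plogis(logit_ci)

print(round(auc_ci, 3))
```

Because `plogis()` maps every real number into $(0, 1)$, neither bound can leave the unit interval, however large the standard error is.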

If you are frequently dealing with high AUCs and small sample sizes, you may want to have a look at [2], which shows that no single method computes optimal confidence intervals for all ROC curves.


[1] Pepe MS, The Statistical Evaluation of Medical Tests for Classification and Prediction, OUP 2003, p. 107

[2] Obuchowski NA, Lieber ML, Confidence bounds when the estimated ROC area is 1.0, Acad Radiol. 2002, 9 (5) p. 526-30

Calimo
  • Thank you very much for the alternative way and the hints. Although asymmetric intervals are probably better suited to model the CI of the AUC, I think that they can still be greater than 1 in some cases. How I would compute $\alpha$ and $\beta$ estimates for the $Beta$ distribution is still open. – Drey Feb 27 '14 at 11:40
  • @Drey no they can't possibly be outside [0,1] – Calimo Feb 27 '14 at 13:05
  • Okay, I think I don't understand what the $\phi^{-1}$ stands for. – Drey Feb 28 '14 at 13:58
  • @Drey it is the inverse (or quantile) of the normal CDF... – Calimo Feb 28 '14 at 14:54
  • @Drey $\phi$ being the CDF of $\mathcal{N}$ – Calimo Feb 28 '14 at 16:07
  • Thank you for the explanation. Sorry for my late comment. E.g. for the upper confidence limit I try `log10(temp.auc/(1-temp.auc)) + qnorm((1-0.05/2)) * sqrt(temp.auc)/(temp.auc*(1-temp.auc))` in R, which evaluates to 6.749. Unfortunately I cannot access [1] to see what I'm doing wrong. Any ideas? Thanks! – Drey Mar 06 '14 at 11:27
  • Hi, a question from my side: is AUC here a vector of AUC values from cross-validation? – sveer Dec 11 '20 at 10:36
  • @sveer No, AUC is the scalar value of the AUC for which you want to compute the confidence interval. – Calimo Dec 11 '20 at 10:39
  • @Calimo Thanks for answering, I am a bit confused. Is it possible to calculate a CI from a single AUC value? I have about 10 AUCs obtained from cross-validation of LR and am looking for a way to calculate a CI for the AUC. As mentioned by Drey, can I use the same formula to obtain the CI? – sveer Dec 11 '20 at 15:29