
I am conducting a meta-analysis of diagnostic test accuracy studies comparing myocardial perfusion scintigraphy vs coronary angiography using the mada R package. I have completed the computations for the Reitsma bivariate model, as follows:

install.packages("mada")  # the package name must be quoted
library(mada)

author_year <- c("Ben-Haim 2010", "Duvall 2011", "Fiechter 2011", "Ben Haim 2014", 
                 "Chowdhury 2014", "Duvall 2014", "Goto 2014", "Mouden 2014",
                 "Nishiyama 2014", "Barone-Rochette 2015", "Gimelli 2015",
                 "Liu 2015", "Nakazato 2015", "Perrin 2015", "Shiraishi 2015", 
                 "Sharir 2016")
TP <- c(4, 121, 44, 7, 74, 31, 51, 12, 46, 73, 103, 27, 31, 93, 12, 122)
TN <- c(0, 38, 10, 10, 61, 37, 187, 61, 18, 13, 14, 130, 23, 26, 32, 111)
FP <- c(1, 65, 5, 1, 16, 23, 73, 19, 4, 13, 10, 46, 5, 17, 9, 23)
FN <- c(0, 6, 7, 1, 14, 24, 11, 8, 8, 5, 10, 8, 8, 13, 2, 15)
mydata <- data.frame(author_year, cbind(TP, TN, FP, FN))
mydata

fit.reitsma <- reitsma(mydata)
summary(fit.reitsma)

However, the AUC is only provided as a point estimate (0.833).

I would also like to obtain 95% confidence intervals for the AUC, but the package does not seem to provide them, nor a standard error from which to derive them. I guess it could be done with the bootstrap (e.g. the boot package), but I have not been able to do it.

Can you give me some suggestions?

Giuseppe Biondi-Zoccai

1 Answer


Most of the following discussion draws from Wojtek J. Krzanowski and David J. Hand, ROC Curves for Continuous Data, except where noted otherwise. This is intended as a summary, rather than a comprehensive reproduction of the chapter.

Confidence intervals are constructed from sampling distributions, that is, the distribution of possible results under repeated sampling. The essential intuition here is that the ROC curve could have a different shape, and therefore a different area, had the model been fitted to different data or evaluated on a different holdout set.

The ROC curve is a complicated function: it's shape-constrained to be non-decreasing, and it must go from $(0,0)$ to $(1,1)$. This is why there's no simple answer.

The beta model

In this question, I develop the exact sampling distribution of ROC curves under a beta model for each operating point. The essential contribution is that ROC curves can be viewed as shape-constrained mixture data. These results are superior to bootstrapping even in small samples (because the justification for bootstrapping is asymptotic) as well as being computationally cheaper.

The binomial model

A point $(\hat{fp},\hat{tp})$ on a specific ROC curve estimates the "true" point $(fp,tp)$, which arises when a particular classification threshold is applied to the model score. Variability can occur in $\hat{fp}$, in $\hat{tp}$, or in both; "both" is simply the joint outcome of the two separate sources. If the negative and positive populations are independent, the counts behind $\hat{tp}$ and $\hat{fp}$ are independent binomial variables, so we can use standard exact binomial results. A $100(1-\alpha)\%$ joint region for both estimates is then a rectangle whose sides are $100(1-\tilde{\alpha})\%$ intervals for $tp$ and $fp$ separately, where $\tilde{\alpha}=1-\sqrt{1-\alpha}$, since $(1-\tilde{\alpha})^2 = 1-\alpha$.
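As a minimal sketch of the rectangle described above, the following uses base R's `binom.test` to get exact (Clopper-Pearson) intervals at the adjusted level. The function name `roc_point_rectangle` is made up for illustration, and the counts are those of a single study from the question (Chowdhury 2014), treated as one operating point under the independence assumption.

```r
# Joint confidence rectangle for one ROC operating point (fp, tp),
# assuming the positive and negative samples are independent.
# `roc_point_rectangle` is an illustrative name, not part of any package.
roc_point_rectangle <- function(TP, FN, FP, TN, alpha = 0.05) {
  # Per-side level chosen so that (1 - alpha_tilde)^2 = 1 - alpha
  alpha_tilde <- 1 - sqrt(1 - alpha)
  level <- 1 - alpha_tilde
  # Exact (Clopper-Pearson) binomial intervals from base R
  tp_ci <- binom.test(TP, TP + FN, conf.level = level)$conf.int
  fp_ci <- binom.test(FP, FP + TN, conf.level = level)$conf.int
  list(alpha_tilde = alpha_tilde,
       tp = c(estimate = TP / (TP + FN), lower = tp_ci[1], upper = tp_ci[2]),
       fp = c(estimate = FP / (FP + TN), lower = fp_ci[1], upper = fp_ci[2]))
}

# Counts for one study from the question (Chowdhury 2014)
rect <- roc_point_rectangle(TP = 74, FN = 14, FP = 16, TN = 61)
print(rect)
```

Each side is wider than an ordinary 95% interval, because the two sides together must cover the true point with probability $1-\alpha$.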

If we cannot assume independence in the negative and positive populations, then we have to use data-based methods.
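One data-based option, and the one the question itself raises, is a bootstrap that resamples whole rows (for a meta-analysis, whole studies), so the counts within a row stay together. The sketch below is a generic percentile bootstrap in base R. The helper name `boot_percentile_ci`, the toy counts, and the stand-in statistic are all hypothetical and only demonstrate the mechanics.

```r
# Percentile bootstrap that resamples whole rows (e.g. studies), so any
# dependence between the counts within a row is preserved.
# `boot_percentile_ci` is an illustrative helper, not a package function.
boot_percentile_ci <- function(data, stat_fn, B = 1000, alpha = 0.05) {
  n <- nrow(data)
  stats <- vapply(seq_len(B), function(b) {
    idx <- sample.int(n, n, replace = TRUE)
    # A refit can fail on an unlucky resample; record NA and move on
    tryCatch(as.numeric(stat_fn(data[idx, , drop = FALSE])),
             error = function(e) NA_real_)
  }, numeric(1))
  stats <- stats[!is.na(stats)]
  list(ci = quantile(stats, c(alpha / 2, 1 - alpha / 2), names = FALSE),
       n_ok = length(stats))
}

# Hypothetical toy counts and a stand-in statistic (pooled sensitivity)
toy <- data.frame(TP = c(40, 55, 30, 62, 48), FN = c(10, 5, 12, 8, 9))
pooled_sens <- function(d) sum(d$TP) / sum(d$TP + d$FN)

set.seed(1)
res <- boot_percentile_ci(toy, pooled_sens, B = 500)
print(res)
```

For the question's data, the statistic would be the AUC of a refitted model, for example `boot_percentile_ci(mydata, function(d) AUC(reitsma(d))$AUC)`. This assumes that `AUC()` applied to a `reitsma` fit returns a list with an `AUC` element, which is worth checking against the mada documentation. With only 16 studies, the resulting interval should be read as approximate.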

There are also results for the case of considering only variation in $tp$ and $fp$ separately, but I don't think that's of interest here. They're also reported in the book.

Parametric curve-fitting

I don't know much about the Reitsma bivariate model, but it is a maximum likelihood method, which therefore assumes a specific distribution for each of the positive and negative populations. The vector of estimated parameters $\hat{\theta}$ then has, asymptotically, the distribution $\mathcal{N}(\theta,\mathcal{I}^{-1})$, where $\mathcal{I}$ is the Fisher information. The ROC curve, and hence its area, is a function of all the estimated model parameters. Thus, we can get confidence intervals from the assumption that the two samples are independent, the delta method to obtain the variance, and some tedious manipulation and calculation.
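To make the delta-method step concrete, here is a small sketch in base R. It does not use the Reitsma model's own SROC formula. As a swapped-in illustration it takes a binormal ROC curve, $tp = \Phi(a + b\,\Phi^{-1}(fp))$, whose area is $\Phi\left(a/\sqrt{1+b^2}\right)$. The helper `delta_ci`, the parameter values, and the covariance matrix are hypothetical. In practice the estimates and their covariance would come from the fitted model.

```r
# Delta-method interval for a smooth scalar function g(theta) of
# estimated parameters with covariance matrix V.
# `delta_ci` is an illustrative helper, not a package function.
delta_ci <- function(g, theta_hat, V, alpha = 0.05, h = 1e-6) {
  # Central-difference gradient of g at theta_hat
  grad <- vapply(seq_along(theta_hat), function(i) {
    step <- replace(numeric(length(theta_hat)), i, h)
    (g(theta_hat + step) - g(theta_hat - step)) / (2 * h)
  }, numeric(1))
  # Var(g(theta_hat)) is approximately grad' V grad
  se <- sqrt(drop(t(grad) %*% V %*% grad))
  est <- g(theta_hat)
  z <- qnorm(1 - alpha / 2)
  c(estimate = est, se = se, lower = est - z * se, upper = est + z * se)
}

# Binormal ROC: TPR = pnorm(a + b * qnorm(FPR)), AUC = pnorm(a / sqrt(1 + b^2))
binormal_auc <- function(theta) pnorm(theta[[1]] / sqrt(1 + theta[[2]]^2))

# Hypothetical estimates (a, b) and covariance, for illustration only
theta_hat <- c(1.4, 0.9)
V <- matrix(c(0.040, 0.005,
              0.005, 0.020), nrow = 2)

res <- delta_ci(binormal_auc, theta_hat, V)
print(res)
```

A symmetric interval on the AUC scale can spill past 1 when the estimate is close to the boundary. Applying the same helper to a logit-transformed AUC and back-transforming the limits is one common way around that.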

Sycorax