I would like to calculate the variance of the AUC of readers (for each reader and for the reader-averaged result). Each reader gives a score (1-5) to specific areas (1-5) of cases, using two different modalities.
Test data:
df <- data.frame(
  modality = rep(c("A", "B"), each = 25, times = 2),
  reader = rep(c("reader1", "reader2"), each = 50),
  case = rep(c("1", "2", "3", "4", "5"), each = 5, times = 2),
  area = rep(c("a1", "a2", "a3", "a4", "a5"), 20),
  score = sample(1:5, 100, replace = TRUE),
  disease = rep(sample(0:1, 25, replace = TRUE), 4),
  stringsAsFactors = FALSE)
> df
modality reader case area score disease
1 A reader1 1 a1 5 0
2 A reader1 1 a2 3 0
3 A reader1 1 a3 4 1
4 A reader1 1 a4 2 0
5 A reader1 1 a5 5 1
6 A reader1 2 a1 1 1
7 A reader1 2 a2 3 1
Suggested Method (from 1):
DBM (Dorfman, Berbaum and Metz) refers to (2)
Davison and Hinkley refers to (3)
BWC (Beiden, Wagner and Campbell) refers to (4)
What I know
I found a solution for the first step (the two-way bootstrap) in the question "Bootstrapping hierarchical/multilevel data (resampling clusters)":
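# dplyr
library(dplyr)
# Resample whole cases (clusters) with replacement, 100 bootstrap replicates.
# Draw from the unique case IDs; sampling df$case directly would draw 100 IDs.
boot_samples <- replicate(100, {
  cluster_sample <- data.frame(case = sample(unique(df$case), replace = TRUE))
  # A case drawn k times contributes k copies of all its rows
  # (dplyr >= 1.1.0 may warn about a many-to-many join; that is expected here)
  dat_sample <- df %>% inner_join(cluster_sample, by = "case")
  dat_sample
}, simplify = FALSE)  # keep each replicate as a data frame in a list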
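The snippet above resamples cases only. A minimal base-R sketch of resampling both readers and cases is given below, assuming that "two-way bootstrap" means drawing readers and cases independently with replacement. The helper `two_way_resample`, the data set `toy` and the columns `reader_draw` and `case_draw` are hypothetical names introduced for illustration, not taken from (1).

```r
set.seed(1)

# Toy data with the same layout as df (hypothetical values)
toy <- expand.grid(area = paste0("a", 1:5), case = as.character(1:5),
                   modality = c("A", "B"), reader = c("reader1", "reader2"),
                   stringsAsFactors = FALSE)
toy$score   <- sample(1:5, nrow(toy), replace = TRUE)
toy$disease <- rep(sample(0:1, 25, replace = TRUE), 4)

# Hypothetical helper: draw readers and cases independently with replacement
two_way_resample <- function(d) {
  readers <- sample(unique(d$reader), replace = TRUE)
  cases   <- sample(unique(d$case), replace = TRUE)
  # Every drawn reader is paired with every drawn case; the *_draw columns
  # keep repeated draws of the same reader or case apart
  grid <- expand.grid(reader_draw = seq_along(readers),
                      case_draw   = seq_along(cases))
  grid$reader <- readers[grid$reader_draw]
  grid$case   <- cases[grid$case_draw]
  # merge() returns all matching rows, so repeated draws are duplicated
  merge(grid, d, by = c("reader", "case"))
}

boot_list <- replicate(100, two_way_resample(toy), simplify = FALSE)
```

When a statistic is computed on a replicate, `reader_draw` and `case_draw` (rather than `reader` and `case`) would identify the resampled units.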
I know how to calculate the AUC for reader 1 and reader 2 and the average AUC:
library(pROC)
# Modality A: one ROC curve per reader
dA1 <- subset(df, reader == "reader1" & modality == "A")
dA2 <- subset(df, reader == "reader2" & modality == "A")
roc1 <- roc(dA1$disease, dA1$score)
roc2 <- roc(dA2$disease, dA2$score)
# Reader-averaged AUC: mean of the per-reader AUCs
# (multiclass.roc on the pooled data would give a pooled AUC, not the reader average)
auc_mean <- mean(c(auc(roc1), auc(roc2)))
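As a package-free cross-check, the empirical AUC can also be computed as the Mann-Whitney statistic. The sketch below assumes that higher scores indicate disease and counts ties as 1/2; `auc_mw` and `toy` are hypothetical names. The result should match pROC only when `roc()` is called with `direction = "<"`, since by default pROC chooses the direction automatically.

```r
# Hypothetical helper: empirical AUC as the Mann-Whitney statistic,
# assuming higher scores indicate disease; ties count as 1/2
auc_mw <- function(disease, score) {
  pos <- score[disease == 1]
  neg <- score[disease == 0]
  mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
}

set.seed(2)
# Toy data (hypothetical values): 25 rated areas per reader, same truth for both
toy <- data.frame(reader  = rep(c("reader1", "reader2"), each = 25),
                  disease = rep(rep(0:1, length.out = 25), 2),
                  score   = sample(1:5, 50, replace = TRUE))

# One AUC per reader, then the reader-averaged AUC
auc_by_reader <- sapply(split(toy, toy$reader),
                        function(d) auc_mw(d$disease, d$score))
auc_mean <- mean(auc_by_reader)
```

Computing the AUC with a fixed direction avoids the upward bias that automatic direction selection can introduce when the same function is applied to many bootstrap replicates.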
What I cannot replicate
I do not know how to apply the second part of the method above (the part marked in yellow) to get the correct variances.
References
(1) Gallas BD, Bandos A, Samuelson FW, Wagner RF. A Framework for Random-Effects ROC Analysis: Biases with the Bootstrap and Other Variance Estimators. Communications in Statistics - Theory and Methods. 2009 Jul 23;38(15):2586–603. https://www.tandfonline.com/doi/abs/10.1080/03610920802610084
(2) Dorfman DD, Berbaum KS, Metz CE. Receiver operating characteristic rating analysis. Generalization to the population of readers and patients with the jackknife method. Invest Radiol. 1992 Sep;27(9):723–31.
(3) Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge University Press; 1997.
(4) Beiden SV, Wagner RF, Campbell G. Components-of-variance models and multiple-bootstrap experiments: An alternative method for random-effects, receiver operating characteristic analysis. Academic Radiology. 2000 May;7(5):341–9. https://www.academicradiology.org/article/S1076-6332(00)80008-2/pdf