Setting:
In my study, each of three readers (A, B, C) applies three different qualitative scores (score1, score2, score3, each on an ordinal scale) to the same set of 95 cases. For example, score1 is the reader's rating of the severity of a case (1 = not severe, 10 = extremely severe).
For some of the cases (cases 92-95 in the example data), they apply the scores multiple times (i.e., at different time points, with no special event between the time points).
Example data:
library("lme4")
#> Loading required package: Matrix
# example data
set.seed(1)
df <- data.frame(reader = rep(c("A", "B", "C"), each = 100),
                 case   = rep(c(1:91, 92, 92, 93, 93, 94, 94, 95, 95, 95), 3),
                 score1 = sample(1:10, 300, replace = TRUE),
                 score2 = sample(5:10, 300, replace = TRUE),
                 score3 = sample(2:10, 300, replace = TRUE),
                 class  = sample(0:1, 300, replace = TRUE))
str(df)
#> 'data.frame': 300 obs. of 6 variables:
#> $ reader: chr "A" "A" "A" "A" ...
#> $ case : num 1 2 3 4 5 6 7 8 9 10 ...
#> $ score1: int 9 4 7 1 2 7 2 3 1 5 ...
#> $ score2: int 5 8 8 9 8 5 9 10 9 6 ...
#> $ score3: int 5 6 7 5 3 2 4 2 4 3 ...
#> $ class : int 0 1 1 1 1 1 1 1 0 1 ...
Created on 2022-02-23 by the reprex package (v2.0.1)
Aim:
Now I would like to investigate the association between score1-3 (independent variables) and class (dependent variable, 1 or 0).
I could do this with a simple logistic regression in R:
glm(class ~ score1 + score2 + score3, family="binomial", data = df)
However, since the data are clustered/nested, I think the p-values I get for the independent variables are too low.
Question:
What analysis is most appropriate to account for the nesting in my data?
My solutions:
Averaging
Average each of score1-3 across the readers and across the repeated measurements of a case, then perform a simple logistic regression as mentioned above.
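To make the averaging idea concrete, here is a minimal sketch in base R. It rebuilds the example data and collapses it to one row per case. It assumes that class is a property of the case; in the simulated data class is drawn at random per row, so the sketch simply takes the first value per case as a stand-in. The names avg, cls and fit_avg are only illustrative.

```r
# Rebuild the example data from the question
set.seed(1)
df <- data.frame(reader = rep(c("A", "B", "C"), each = 100),
                 case   = rep(c(1:91, 92, 92, 93, 93, 94, 94, 95, 95, 95), 3),
                 score1 = sample(1:10, 300, replace = TRUE),
                 score2 = sample(5:10, 300, replace = TRUE),
                 score3 = sample(2:10, 300, replace = TRUE),
                 class  = sample(0:1, 300, replace = TRUE))

# Mean of each score per case, pooling readers and repeated measurements
avg <- aggregate(cbind(score1, score2, score3) ~ case, data = df, FUN = mean)

# ASSUMPTION: class is constant within a case in the real study.
# The simulated class varies within a case, so take the first value as a stand-in.
cls <- aggregate(class ~ case, data = df, FUN = function(x) x[1])
avg <- merge(avg, cls, by = "case")

# Simple logistic regression on the case-level averages (one row per case)
fit_avg <- glm(class ~ score1 + score2 + score3, family = binomial, data = avg)
summary(fit_avg)
```

This leaves one row per case, so the observations entering the regression are no longer repeated within reader or case, at the price of discarding the between-reader variability.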
Use a mixed-effects model
I found some advice for nested data in "Mixed Effects Model with Nesting" and "What is the difference between fixed effect, random effect and mixed effect models?"
However, since I am new to mixed-effects models, I am not sure which variables should be treated as fixed and which as random:
Only reader as random effect
mod1 <- glmer(class ~ score1 + score2 + score3 + (1 | reader), family = "binomial", data = df)
#> boundary (singular) fit: see help('isSingular')
Reader and case as random effect
mod2 <- glmer(class ~ score1 + score2 + score3 + (1 | reader/case), family = "binomial", data = df)
#> boundary (singular) fit: see help('isSingular')
I think I get the message "boundary (singular) fit: see help('isSingular')" because the effects are very small in the simulated example data.
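One way to look into the singular-fit message is to inspect the estimated random-effect variances. The sketch below refits the reader-only model on the example data and uses isSingular() and VarCorr() from lme4. It is only a diagnostic sketch under the assumption that the simulated data behave as shown above, not a recommendation for the final model.

```r
library(lme4)

# Rebuild the example data from the question
set.seed(1)
df <- data.frame(reader = rep(c("A", "B", "C"), each = 100),
                 case   = rep(c(1:91, 92, 92, 93, 93, 94, 94, 95, 95, 95), 3),
                 score1 = sample(1:10, 300, replace = TRUE),
                 score2 = sample(5:10, 300, replace = TRUE),
                 score3 = sample(2:10, 300, replace = TRUE),
                 class  = sample(0:1, 300, replace = TRUE))

# Reader-only random intercept, as in mod1
mod1 <- glmer(class ~ score1 + score2 + score3 + (1 | reader),
              family = binomial, data = df)

# Logical flag: is a variance component estimated at (or very near) the boundary?
isSingular(mod1)

# Estimated random-effect standard deviations; a value of (about) zero for
# reader would indicate that no between-reader variance was detected
print(VarCorr(mod1))
```

In purely random data like this, a reader variance of essentially zero would be expected, because class was generated independently of reader.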