I am analysing the effect of a protein level on a joint score, adjusting for confounding variables (age and gender). To obtain the joint score, each joint in the hand is evaluated and all values are summed. The scoring is performed by two independent physicians, and the two scores are averaged. There are no missing values. The score appears to follow a negative binomial distribution, so I fitted this model:
# glm.nb() is provided by MASS (lme4 provides glmer.nb() for mixed models)
mod <- MASS::glm.nb(score ~ age + gender + protein_level,
                    data = d01)
The data frame with the averaged scores:
d01 <- data.frame(
  subject = 1:10,
  gender = c("female", "female", "male", "female", "female",
             "female", "male", "female", "female", "male"),
  age = c(74, 78, 62, 62, 77, 66, 66, 52, 60, 60),
  protein_level = c(1.23, 1.07, 12.79, 1.75, 11.63,
                    10.13, 0.89, 7.18, 1.23, 0.29),
  score = c(18, 9, 30, 24.5, 41, 54.5, 2.5, 11.5, 21.5, 5.5)
)
However, the averaged scores are not integers, so I receive this warning:
Warning messages: 1: In dpois(y, mu, log = TRUE) : non-integer x = 0.500000, etc.
I suppose that one of the options below could work:
- Ignore the warning, since the model still fits.
- Use the sum of the two scores instead of the average. However, I need the average score for a publication. Could I then divide the model estimates (the estimated mean and its confidence interval) by two, since there are two physicians?
- I found that it is possible to use offset(log()), but I do not know how to apply it to my model.
Could you please recommend the best option, or suggest a better one?
Thank you in advance for any answers.
I am also attaching the data frame with the individual scores from both physicians:
d02 <- data.frame(
  subject = rep(1:10, 2),
  eval = rep(c("eval1", "eval2"), each = 10),
  gender = rep(c("female", "female", "male", "female", "female",
                 "female", "male", "female", "female", "male"), 2),
  age = rep(c(74, 78, 62, 62, 77, 66, 66, 52, 60, 60), 2),
  protein_level = rep(c(1.23, 1.07, 12.79, 1.75, 11.63,
                        10.13, 0.89, 7.18, 1.23, 0.29), 2),
  score = c(17, 8, 30, 24, 40, 43, 2, 11, 21, 6,
            19, 10, 30, 25, 42, 66, 3, 12, 22, 5)
)
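To make the offset(log()) option concrete, here is a minimal sketch of how it might be applied. It assumes the two physicians' scores are summed per subject, so the response is an integer, and that the number of evaluations (2) enters as a log offset. The names d_sum, eval1, eval2, score_sum, n_eval, mod_sum and mod_raw are illustrative, not part of the original analysis. The per-physician values are the ones in the data frame above, arranged one row per subject.

```r
library(MASS)

# One row per subject; eval1/eval2 hold the two physicians' scores
# (hypothetical wide layout of the per-physician data shown above)
d_sum <- data.frame(
  gender = c("female", "female", "male", "female", "female",
             "female", "male", "female", "female", "male"),
  age = c(74, 78, 62, 62, 77, 66, 66, 52, 60, 60),
  protein_level = c(1.23, 1.07, 12.79, 1.75, 11.63,
                    10.13, 0.89, 7.18, 1.23, 0.29),
  eval1 = c(17, 8, 30, 24, 40, 43, 2, 11, 21, 6),
  eval2 = c(19, 10, 30, 25, 42, 66, 3, 12, 22, 5)
)

# The sum of two integer scores is an integer, so the
# non-integer warning from dpois() no longer applies
d_sum$score_sum <- d_sum$eval1 + d_sum$eval2
d_sum$n_eval <- 2  # number of evaluations behind each sum

# Assumption: with the log link, offset(log(n_eval)) makes the linear
# predictor describe the mean score per evaluation (the average)
mod_sum <- MASS::glm.nb(
  score_sum ~ age + gender + protein_level + offset(log(n_eval)),
  data = d_sum
)

# For comparison: the same model on the raw sum, without the offset
mod_raw <- MASS::glm.nb(
  score_sum ~ age + gender + protein_level,
  data = d_sum
)

# Coefficients are on the log scale; exp() gives rate ratios
exp(coef(mod_sum))
```

Under the log link, the constant offset only shifts the intercept by log(2) and leaves the slopes for age, gender and protein_level unchanged, so dividing the coefficients of the sum model by two would not give estimates on the average scale. Comparing coef(mod_sum) with coef(mod_raw) should show this.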