Consider an experimental situation where each of $n$ participants are randomised to one of two treatments (e.g. active versus placebo). Our measuring device is known to have some variability in the results it gives so we take a series of (repeat) measurements from each participant. We want to examine the treatment effect and, to do so, we resist the urge to average each subject's measurements and instead fit a varying effects model.
My go-to approach in this situation would be to consider a subject level random intercept model that accounts for the repeat measures.
What I am wondering is whether it could make any sense to include a subject level slope for treatment? This would seem at least somewhat reasonable if the participants were given both the placebo and the treatment and data were collected under both conditions. Under that setting, there clearly would be within-subject variation in the responses under the placebo and active treatments. However, at least superficially, this idea seems to not fit when a participant is only ever allocated to either the active treatment or the placebo.
Is my intuition correct or is it misguided? I have reviewed several questions, but am yet to find one that addresses this specific question.
library(lmerTest)
library(mvtnorm)
library(data.table)
library(ggplot2)
getdat <- function(n = 100, mmt = 4){
sd_u <- c(1, 1.3)
rho <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
S <- diag(sd_u) %*% rho %*% diag(sd_u)
u <- mvtnorm::rmvnorm(n, rep(0, 2), S)
id <- rep(1:n, each = mmt)
trt <- rbinom(n, 1, prob = 0.5)
trt <- trt[id]
b <- c(2, 2.5)
e <- rnorm(n * mmt, 0, 0.7)
y <- b[1] + u[id, 1] + (b[2] + u[id, 2]) * trt + e
d <- data.table(id, trt,
b1 = b[1], b2 = b[2],
u1 = u[id, 1], u2 = u[, 2], e = e, y = y )
d
}
set.seed(30)
d <- getdat(n = 30, mmt = 30)
# number of observations on each patient
table(treatment = d$trt, subject = d$id)
# wrong
lm1 <- lm(y ~ trt, d); summary(lm1)
# reasonable
lm2 <- lmerTest::lmer(y ~ trt + (1 | id), data = d); summary(lm2)
# misguided?
lm3 <- lmerTest::lmer(y ~ trt + (1 + trt | id), data = d);summary(lm3)
The call to lmer
results in a warning indicating that it did not converge, but does return parameter estimates.
bb <- fixef(lm3)
re <- data.table(id = sort(unique(d$id)),
ui = ranef(lm3)$id[, 1],
us = ranef(lm3)$id[, 2])
re[, mu0 := bb[1] + ui]
re[, mu1 := bb[1] + ui + bb[2] + us]
re <- melt(re[, .(id, mu0, mu1)], id.vars = "id")
re[, trt := rep(0:1, each = length(unique(d$id)))]
ggplot(re, aes(x = trt, y = value, group = id)) +
geom_line() + theme_bw() + scale_x_continuous(breaks = 0:1)
Yielding
Below is code that reallocate participants to both groups and then fits the model on the same data. Note that I need to check this again as it is not doing what I anticipated it would. So, I will update if/when I can figure it out.
# realloc patients to both trts
d[, trt := rbinom(.N, 1, prob = 0.5)]
table(treatment = d$trt, subject = d$id)
d[, y := b1 + u1 + (b2 + u2) * trt + e]
# seems more reasonable
# equivalent to treatment by id interaction?
lm4 <- lmer(y ~ trt + (1 + trt | id), data = d); summary(lm4)