2

Consider an experimental situation where each of $n$ participants are randomised to one of two treatments (e.g. active versus placebo). Our measuring device is known to have some variability in the results it gives so we take a series of (repeat) measurements from each participant. We want to examine the treatment effect and, to do so, we resist the urge to average each subject's measurements and instead fit a varying effects model.

My go-to approach in this situation would be to consider a subject level random intercept model that accounts for the repeat measures.

What I am wondering is whether it could make any sense to include a subject level slope for treatment? This would seem at least somewhat reasonable if the participants were given both the placebo and the treatment and data were collected under both conditions. Under that setting, there clearly would be within-subject variation in the responses under the placebo and active treatments. However, at least superficially, this idea seems to not fit when a participant is only ever allocated to either the active treatment or the placebo.

Is my intuition correct or is it misguided? I have reviewed several questions, but am yet to find one that addresses this specific question.

library(lmerTest)
library(mvtnorm)
library(data.table)
library(ggplot2)

getdat <- function(n = 100, mmt = 4){
    sd_u <- c(1, 1.3)
    rho <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
    S <- diag(sd_u) %*% rho %*% diag(sd_u)
    u <- mvtnorm::rmvnorm(n, rep(0, 2), S)
    id <- rep(1:n, each = mmt)
    trt <- rbinom(n, 1, prob = 0.5)
    trt <- trt[id]
    b <- c(2, 2.5)
    e <- rnorm(n * mmt, 0, 0.7)
    y <- b[1] + u[id, 1] + (b[2] + u[id, 2]) * trt + e
    d <- data.table(id, trt, 
              b1 = b[1], b2 = b[2],  
              u1 = u[id, 1], u2 = u[, 2], e = e, y = y )
    d
}

set.seed(30)
d <- getdat(n = 30, mmt = 30)
# number of observations on each patient
table(treatment = d$trt, subject = d$id)
# wrong
lm1 <- lm(y ~ trt, d); summary(lm1)
# reasonable
lm2 <- lmerTest::lmer(y ~ trt + (1 | id), data = d); summary(lm2)
# misguided?
lm3 <- lmerTest::lmer(y ~ trt + (1 + trt | id), data = d);summary(lm3)

The call to lmer results in a warning indicating that it did not converge, but does return parameter estimates.

bb <- fixef(lm3)
re <- data.table(id = sort(unique(d$id)),
                 ui = ranef(lm3)$id[, 1],
                 us = ranef(lm3)$id[, 2])
re[, mu0 := bb[1] + ui]
re[, mu1 := bb[1] + ui + bb[2] + us]
re <- melt(re[, .(id, mu0, mu1)], id.vars = "id")
re[, trt := rep(0:1, each = length(unique(d$id)))]

ggplot(re, aes(x = trt, y = value, group = id)) +
    geom_line() + theme_bw() + scale_x_continuous(breaks = 0:1)

Yielding

enter image description here

Below is code that reallocate participants to both groups and then fits the model on the same data. Note that I need to check this again as it is not doing what I anticipated it would. So, I will update if/when I can figure it out.

# realloc patients to both trts
d[, trt := rbinom(.N, 1, prob = 0.5)]
table(treatment = d$trt, subject = d$id)
d[, y := b1 + u1 + (b2 + u2) * trt + e]
# seems more reasonable
# equivalent to treatment by id interaction?
lm4 <- lmer(y ~ trt + (1 + trt | id), data = d); summary(lm4) 
kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
t-student
  • 720
  • 5
  • 16
  • Note that when running simulations it's a good idea to set a seed, for reproducibility. Your `lme3` model converged without warning the first time I ran it. If you `set.seed(1)` prior to the call to `d – Robert Long Mar 06 '21 at 10:39
  • Take a look at this answer to see how I run these kinds of simulations for mixed models, where it's also necesary to check for non-convergence and convergence to a singular fit.: https://stats.stackexchange.com/questions/489059/obtaining-correlation-between-random-effects-separately-for-2-groups/489181#489181 – Robert Long Mar 06 '21 at 10:44
  • @Robert Long, thanks for your comments. The code was never meant as a simulation, but more as a supplement to the question. However, I get your point and as the code does simulate data I have introduced a seed to provide a reproducible example. I am more interested in whether it is conceptually reasonable to consider random intercept and slope in the setting described. – t-student Mar 07 '21 at 01:18

0 Answers0