First:
what is the difference between the following models in terms of their random-effects structure?
the model:
lmer(math ~ ses*sector + (ses+sector|sch.id), data = hsb)
specifies:
- random intercepts for
sch.id
- random slopes for
ses
and sector
, both varying within sch.id
- the 3 correlations between each of those.
The model:
lmer(math ~ ses*sector + (ses*sector|sch.id), data = hsb)
specifies:
- random intercepts for
sch.id
- random slopes for
ses
, sector
and the interaction between them, all varying within sch.id
- the 6 correlations between each of those.
The model:
lmer(math ~ ses*sector + (ses:sector|sch.id), data = hsb)
specifies:
- random intercepts for
sch.id
- random slopes for the interaction between
ses
and sector
- that is, a slope for each unique combination of ses
and sector
- varying within sch.id
. Note that since sector
is categorical with 2 levels, this will result in 2 random slopes estimates.
- the 3 correlations between each of those. The correlation between each of the random slopes and the intercept should answer the next question:
How should I define my model so that Corr, the correlation between intercepts and slopes, be separately reported for public & private sectors?
Edit:
So, it appears that
lmer(math ~ ses*sector + (ses:sector|sch.id), data = hsb)
will achieve this.
Here is a monte carlo simulation to demonstrate it:
We simulate two datasets, one for each sector, with 20 schools in each.
We set the standard deviations of the random intercepts and slopes to be the same among all schools, but the correlation between them to be different. Here we use -0.4 and 0.4 but the code can easily be adjusted:
First we set up a helper function to let us know if the models have converged normally and if not, why:
# helper function
# Has the model converged ?
hasConverged <- function (mm) {
if ( !class(mm)[1] == "lmerMod") stop("Error must pass a lmerMod object")
retval <- NULL
if(is.null(unlist(mm@optinfo$conv$lme4))) {
retval = 1
}
else {
if (isSingular(m0)) {
retval = 0
} else {
retval = -1
}
}
return(retval)
}
Now we set up the parameters and data needed:
n.school <- 20
dt0 <- expand.grid(sch.id = 1:n.school, ses = 1:10, sector = c("0"))
dt1 <- expand.grid(sch.id = (n.school+1):(n.school*2), ses = 1:10, sector = c("1"))
dt0$Y <- 1
X <- model.matrix( ~ ses, data = dt0)
myFormula <- "Y ~ ses + (ses | sch.id)"
foo <- lFormula(eval(myFormula), dt0)
Z <- t(as.matrix(foo$reTrms$Zt))
betas_0 <- c(1, 2)
betas_1 <- c(1, 3) # use different fixed effect for ses to create an interaction in the combined model
s1 <- 2 # SD of random intercepts
s2 <- 1 # SD of random slopes
rho_0 <- 0.4 # correlation between intercepts and slopes for sector 0
cormat_0 <- matrix(c(s1, rho_0, rho_0, s2), 2, 2) # correlation matrix for sector 0
covmat_0 <- lme4::sdcor2cov(cormat_0) # covariance matrix (needed for mvrnorm)
rho_1 <- -0.4 # correlation between intercepts and slopes for sector 1
cormat_1 <- matrix(c(s1, rho_1, rho_1, s2), 2, 2) # correlation matrix for sector 1
covmat_1 <- lme4::sdcor2cov(cormat_1) # covariance matrix (needed for mvrnorm)
then perform the simulations:
n.sim <- 250
vec.sim.rho_0.sep <- numeric(n.sim)
vec.sim.rho_1.sep <- numeric(n.sim)
vec.sim.rho_0.comb <- numeric(n.sim)
vec.sim.rho_1.comb <- numeric(n.sim)
for (i in 1:n.sim) {
set.seed(i)
# simulate the random effects
umat_0 <- MASS::mvrnorm(n.school, c(0, 0), covmat_0, empirical = TRUE)
u_0 <- c(rbind(umat_0[, 1], umat_0[, 2])) # lme4 needs the random effects in this order interleaved)
e_0 <- rnorm(nrow(dt0), 0, 2) # residual error
dt0$Y <- X %*% betas_0 + Z %*% u_0 + e_0
# fit the model for sector 0
m0 <- lmer(myFormula, dt0)
# simulate random effects for sector 1
umat_1 <- MASS::mvrnorm(n.school, c(0, 0), covmat_1, empirical = TRUE)
u_1 <- c(rbind(umat_1[, 1], umat_1[, 2])) # lme4 needs the random effects in this order interleaved)
e_1 <- rnorm(nrow(dt1), 0, 2) # residual error
dt1$Y <- X %*% betas_1 + Z %*% u_1 + e_1
# fit the model for sector 1
m1 <- lmer(myFormula, dt1)
# combine data:
dt <- rbind(dt0, dt1)
# fit the full model
m <- lmer(Y ~ ses*sector + (ses:sector | sch.id), dt)
If all models converged properly then extract the correlations
if (hasConverged(m0) == 1 && hasConverged(m1) == 1 && hasConverged(m) == 1) {
print(paste(i, "OK", sep = ": "))
vc_0 <- as.data.frame(VarCorr(m0))
vc_1 <- as.data.frame(VarCorr(m1))
vc <- as.data.frame(VarCorr(m))
vec.sim.rho_0.sep[i] <- vc_0$sdcor[3]
vec.sim.rho_1.sep[i] <- vc_1$sdcor[3]
vec.sim.rho_0.comb[i] <- vc$sdcor[4]
vec.sim.rho_1.comb[i] <- vc$sdcor[5]
} else {
print(paste(i, "Not converged", sep = ": "))
vec.sim.rho_0.sep[i] <- NA
vec.sim.rho_1.sep[i] <- NA
vec.sim.rho_0.comb[i] <- NA
vec.sim.rho_1.comb[i] <- NA
}
}
The first two vectors, vec.sim.rho_0.sep
and vec.sim.rho_1.sep
are the estimated correlations from the models run on the seperate datasets.
vec.sim.rho_0.comb
and vec.sim.rho_1.comb
are the estimated correlations from the models run on the full dataset
> mean(vec.sim.rho_0.sep, na.rm = TRUE) ; mean(vec.sim.rho_0.comb, na.rm = TRUE)
[1] 0.4355072
[1] 0.4320562
> mean(vec.sim.rho_1.sep, na.rm = TRUE) ; mean(vec.sim.rho_1.comb, na.rm = TRUE)
[1] -0.4015803
[1] -0.4039854