Reading multilevel model syntax in intuitive ways in R (lme4)

Question

Below, I have 3 lme4 longitudinal mixed-models. Throughout, y is the response variable, group is a binary indicator for "control" vs. "treatment", therapist (a clustering indicator), subjects (a clustering indicator), and time is the measurement time indicator (e.g., 0, 1, 2, 3).

Question: I was wondering if there is an intuitive way to understand the data/design structure each model describes OR at least the way lme4 understands each model syntax to mean?

# FIRST:
lmer(y ~ time * group + 
                (time | therapist:subjects) +
                (time * group || therapist), 
                 data = data)

# SECOND:
lmer(y ~ time * group + 
                (time | therapist:subjects) +
                (time | therapist) +
                (0 + group + time:group | therapist), 
                 data = data)

# THIRD:
lmer(y ~ time * group + 
                (1 | therapist:subjects) +  
                (0 + time | therapist:subjects) +
                (0 + time:group | therapist) + 
                (0 + group | therapist),
                 data = data)

Does [R's lmer cheat sheet](https://stats.stackexchange.com/q/13166/) help? — gung - Reinstate Monica, Jul 15 '20 at 18:40

Robert Long · Accepted Answer · 2020-07-16T04:37:41.167

Intuitively, the understanding can start with grouping variables or grouping terms. These are the terms that appear on the right side of the | or || in the random parts of the formula. They are often factor variables for which there are repeated measurements, or combinations (interactions) of a factor with another random factor (or indeed a fixed factor). These should be a single term - a vaiable or an combination/interaction, and not multiple terms (ie, you can have (1|A) or (1|A:B) but not (1|A+B) (that would be invalid). In all cases, they can be interpreted as: there is some random variation in the data at the "level" of the variable or combination of variables. We usually want to fit random intercepts for these, which will account for any non-independence due to this variation.

We can think about the number of "levels" in the model by considering the number of unique grouping terms. With a single groupig term we have a 2-level model, with 3 terms, a 3-level model. Some care is needed here because these "levels" might not correspond to levels in a multilevel model. In the analysis of experiments with a factorial design, "levels" don't really apply, so you can have things like y ~ A + B + (1|id) + (1|id:A) + (1|id:B) and there are 3 "levels" of variation, but not in a multilevel modelling sense. See this answer for further details on this. Even in multilevel modelling this can be tricky because of nested and crossed factors. See here for further details on this. Nesting is a property of the study design, not the model.

The terms on the left side of the | or || specify which variable(s) are allowed to vary at the different levels of the grouping term. In the simplest case it is just "1" which means we want only the intercept to vary (so, just random intercepts). If we have other variables in place of the "1" then it means we want those variables (which are typically fixed effects) to vary at each level of the grouping term in addition to the intercept (that is, (time|group) is the same as (1 + time|group). These are random slopes.

If "0" appears on the left hand side of the | or || then it means that random intercepts for the groupng term at not to be fitted (usually because they are fitted by a different term in the model).

Lastly, by default lmer will attempt to estimate a correlation between the random effects. However, if || is specified, then the correlations are not estimated (is they are fixed at zero). This is actually just shorthand. For example (time||group) is the same as (1|group) + (0+time|group) which means we fit random intercepts for group, and we fit random slopes for time, but no random intercept for time, so taken together it means random intercepts for group and slopes for time but with no correlation between them. This also means that randon slopes and random intercepts will be correlated only when they are on opposite sides of a single |.

So, for your specific examples:

lmer(y ~ time * group + 
            (time | therapist:subjects) +
            (time * group || therapist), 
            data = data)

First note we have 2 different grouping terms: the therapist:subjects combination and therapist and this is the case for all 3 models. For the former we also fit random slopes for time (correlated with the random intercepts). For the latter we fit random slopes for time * group but these are not correlated with the random intercept for therapist.

lmer(y ~ time * group + 
            (time | therapist:subjects) +
            (time | therapist) +
            (0 + group + time:group | therapist), 
            data = data)

As with the first model, we have 2 different grouping terms: the therapist:subjects combination and therapist and again we have time as a random slope for the former. For the latter, this time we have random slopes for time (correlated with the random intercepts for therapist, and uncorrelated random slopes for group and time:group.

lmer(y ~ time * group + 
            (1 | therapist:subjects) +  
            (0 + time | therapist:subjects) +
            (0 + time:group | therapist) + 
            (0 + group | therapist),
            data = data)

Again we have 2 different grouping terms: the therapist:subjects combination and therapist. For the former, we have random intercepts for therapist:subjects, and random slopes for time (uncorrelated with the random itercepts). As noted above these could also be written in shorthand as (time || therapist:subjects). For the latter, there are no random intercepts at all (because 0 appears in both fomulae on the left hand side), but we fit random slopes for time:group and group

A few final points to highlight things that are invalid:

lmer(y ~ time * group + (0 | therapist)

is invalid because you can't have a single zero on the left of |. That would mean therapist is a grouping variable (implying we want random intercepts) but then the "0" means don't fit random intercepts. It's a conflict and should generate an error.

lmer(y ~ time * group + (1 | therapist*subject)

is an error because therapist*subject is not a single term - it is shorthand for therapist + subject + therapist:subject so it is eqivalent to

lmer(y ~ time * group + (1 | therapist + subject + therapist:subject)

which is invalid. If you actually wanted each of those terms to be grouping terms then you would use:

lmer(y ~ time * group + (1 | therapist) + (1 subject) + (1| therapist:subject)

Thank you Robert! I have a few questions. So, regarding `(1|group) + (0+time|group)` means we fit random intercepts for `group`, and we fit random slopes for `time`, but no random intercept for `time`, *How would this imply that random intercepts and slopes are uncorrelated?* — rnorouzian, Jul 15 '20 at 23:12
Second question, what is the difference between `time:group` and `time*group` (left of `|`) on the one hand, and `therapist:subjects` and possibly `therapist*subjects` (right of `|`)? — rnorouzian, Jul 16 '20 at 03:26
No problem. Because slopes and intercepts are correlated only when using the `|` operator in the same expression. So the first expression says we fit random intercepts then in the 2nd expression says we fit random slopes — Robert Long, Jul 16 '20 at 03:42
2nd question: the difference between `time:group` and `time*group` (left of `|`) is that in the former only the interaction will have random slopes but n the latter the interaction and both main effrcts will have random slopes... — Robert Long, Jul 16 '20 at 03:47
...and `therapist:subjects` and possibly `therapist*subjects` (right of `|`). The former only the interaction is the grouping term but in the latter the interaction and main effects are the grouping term (which should be invalid as a grouping term should be a single variable or an interaction) — Robert Long, Jul 16 '20 at 03:50
So sorry:) And many thanks, 3rd question: how can you say `"how many levels"` each model entails by looking at its syntax? — rnorouzian, Jul 16 '20 at 03:53
That's simply the number of expressions in the formula that include *unique* grouping terms. So in all 3 of your examples there are 2 levels and since one of them is a main effect and the other an interaction, this implies that subjects are nested in therapists. In a multilevel model framework we would say that subjects are level 1 and therapists level 2. — Robert Long, Jul 16 '20 at 03:58
Btw these are good questions. I will edit the answer to make it a bit more clear. — Robert Long, Jul 16 '20 at 04:01
humm..., but Robert, all these models are longitudinal so we have repeated measures (Level 1), subjects (Level 2), and therapist (Level 3), no? — rnorouzian, Jul 16 '20 at 04:03
True. Yes, my mistake, these are 3-level models ! You are quite right. — Robert Long, Jul 16 '20 at 04:15
Dear Robert, a follow-up, I often see piecewise models (e.g., with `time1`, `time2`) written as `y ~ time1 + time2 + (time1 + time2 | subjects)`. This means that both `time` pieces' slopes and their intercepts vary among subjects, right? Are `(time1 * time2 | subjects)` and `(gender + time1 + time2 | subjects)` also valid specifications? -- Thanks! — rnorouzian, Jul 19 '20 at 21:21
I'd suggest asking a new question on this. In principle they are valid, but in practice there could be issues (does the data support an interaction terms for `time1:time2` varying among subjects - this is at the heart of the "keep it maximal" vs keep it parsimoneous debate) and does it make sense for a between subject variable such as `gender` to vary among subjects ? — Robert Long, Jul 20 '20 at 04:22
Hi Robert, just a quick thing, I believe in the 2nd paragraph (2nd line) in your answer where you say: `*With a single grouping term we have a 2-level model, with 3 terms, a 3-level model*`, you meant to say with **2** terms, a **3-level** model, right? — rnorouzian, Sep 07 '20 at 20:28
Yes indeed that looks like a typo. Feel free to make an edit:) — Robert Long, Sep 07 '20 at 20:39
Yes, because if you combine them, the model will estimate the correlation between the slopes, whereas in `0 + time:group | therapist) + (0 + group | therapist)` there will be no correlation estimated. — Robert Long, Oct 09 '20 at 19:02

Reading multilevel model syntax in intuitive ways in R (lme4)

1 Answers1

Linked