My colleagues and I are working on a suite of lmer post-estimation tools for an R package we are developing. One of the tools is an ICC function that calculates the appropriate ICC for models with one or two specified random factors. Our challenge is to identify whether the factors specified in the model are purely nested or crossed, because the ICCs one would calculate for a nested design and for a crossed design are different.
The fundamental problem is that we want to be able to tell whether an lmer model fit with the random effects specification (1|factor1) + (1|factor2) is nested or crossed.
How would you suggest we tackle this problem?
Below is some context on lmer model specification in nested and crossed situations that may be useful if you are not familiar with the package.
lme4 is a very clever package that seems to infer from the data structure the appropriate calculation of the random effect variances. For example, and as documented in other threads, lmer treats the following nested random effect structures as equivalent, provided the data structure really is nested and the data are coded appropriately:
(1|School) + (1|Student) #Technically this is a crossed specification!
(1|School/Student)
(1|School) + (1|School:Student)
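To make the coding condition concrete, here is a minimal base-R sketch on made-up toy data (the variable names and the helper same_partition are hypothetical, not part of lme4). It only illustrates why the specifications can coincide: when student labels are unique across schools, the School:Student grouping splits the observations into exactly the same groups as Student alone, and when labels are reused it does not.

```r
# Hypothetical toy data: 2 schools, 3 students per school, 2 observations each.
school <- factor(rep(c("A", "B"), each = 6))
student_id <- rep(rep(1:3, each = 2), times = 2)

# Unique coding: every student label occurs in exactly one school (A1, ..., B3).
student_unique <- factor(paste0(school, student_id))

# Reused coding: labels 1, 2, 3 appear in both schools.
student_reused <- factor(student_id)

# School:Student, built by hand as the combination of the two factors.
nested_unique <- interaction(school, student_unique, drop = TRUE)
nested_reused <- interaction(school, student_reused, drop = TRUE)

# Two factors define the same groups when their joint partition has
# as many levels as each factor on its own.
same_partition <- function(f, g) {
  f <- droplevels(f)
  g <- droplevels(g)
  joint <- nlevels(interaction(f, g, drop = TRUE))
  joint == nlevels(f) && joint == nlevels(g)
}

same_partition(student_unique, nested_unique)  # unique labels
same_partition(student_reused, nested_reused)  # reused labels
```

With the unique coding the first call should return TRUE, and with the reused coding the second should return FALSE, which is the situation where the explicit nesting syntax is needed.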
As @BenBolker clearly states here:
Whether you explicitly specify a random effect as nested or not depends (in part) on the way the levels of the random effects are coded. If the ‘lower-level’ random effect is coded with unique levels, then the two syntaxes (1|a/b) (or (1|a)+(1|a:b)) and (1|a)+(1|b) are equivalent. If the lower-level random effect has the same labels within each larger group (e.g. blocks 1, 2, 3, 4 within sites A, B, and C) then the explicit nesting (1|a/b) is required. It seems to be considered best practice to code the nested level uniquely (e.g. A1, A2, …, B1, B2, …) so that confusion between nested and crossed effects is less likely.
The answer by @RobertLong in the CV thread linked above shows this problem and an alternative solution to it.
We want our ICC function to accurately report ICCs for truly nested vs. truly crossed (or cross-classified) models. And there seem to be a lot of moving pieces that we need to figure out. Any thoughts folks have on this are greatly appreciated.
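One direction we have been considering, shown here only as a rough sketch and not as a settled approach, is to classify the design from the cross-tabulation of the two grouping factors. The function name classify_design and its return labels are our own invention. We assume the grouping factors of a fitted model could be pulled out with something like lme4::getME(fit, "flist"), and we understand lme4 also exports isNested() for a pairwise check, but the sketch below uses base R only.

```r
# Classify the relationship between two grouping factors from the data alone.
# classify_design() is a hypothetical helper, not an lme4 function.
classify_design <- function(f1, f2) {
  f1 <- droplevels(as.factor(f1))
  f2 <- droplevels(as.factor(f2))
  occupied <- table(f1, f2) > 0            # which level combinations occur
  f1_in_f2 <- all(rowSums(occupied) == 1)  # each f1 level sits in one f2 level
  f2_in_f1 <- all(colSums(occupied) == 1)  # each f2 level sits in one f1 level
  if (f1_in_f2 && f2_in_f1) {
    "confounded"          # one-to-one: the two factors are the same grouping
  } else if (f1_in_f2 || f2_in_f1) {
    "nested"
  } else if (all(occupied)) {
    "fully crossed"
  } else {
    "partially crossed"
  }
}

# Made-up toy data for illustration.
school <- rep(c("A", "B"), each = 6)
student_id <- rep(rep(1:3, each = 2), times = 2)

classify_design(paste0(school, student_id), school)        # unique student labels
classify_design(student_id, school)                        # labels reused across schools
classify_design(c("a", "a", "b", "c"), c("x", "y", "y", "y"))
```

One caveat with this idea: it sees only the labels, so a truly nested design whose lower-level labels are reused across groups would be reported as crossed. That is exactly the ambiguity described above, so some input from the user about the intended design may still be needed.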