Do the levels of a random effect need to be present in all the levels of a fixed effect?

Question

Study design

In our study, we have a 2x2x2 design with factors Prime (determiner, pronoun), Category (noun, verb) and Masking (masked presentation, unmasked presentation). Our stimulus set included two prime words (determiner 'a' and pronoun 'he') and 80 target words (40 nouns and 40 verbs).

Linear mixed models in R

We applied linear mixed models to the log-transformed reaction times, using the lmer() function from the lme4 package in R with REML set to FALSE. To compare models, we performed likelihood ratio tests using the anova() function.

As random effects, we included random intercepts for participant (1|Participant), target word (1|Target word) and block number (1|Block).

Question

For example, in the model

lmer(lgRT ~ Category*Prime + (1|Participant) + (1|Target_word) +
        (1|Block), data=my_data_clean_masked, REML=FALSE)

Is it legitimate to include (1|Target word) as a random effect given that:

(1) the fixed effect "Category" has two levels: noun and verb;

(2) however, not all target words fall in the category NOUN (or VERB), because half of the target words were nouns and the other half were verbs?

In other words, nouns among the target words never belong to the word category VERB and verbs among the target words never belong to the word category NOUN.

The residuals plot looks fine when three random effects are included ((1|Participant) + (1|Target word) + (1|Block)):

[Image: residuals plot for the model with all three random effects]

However, excluding the random effect (1|Target word) creates patterns on the residuals plot (the homoscedasticity seems to be violated, right?):

[Image: residuals plot for the model without (1|Target word)]

Ben Bolker · Accepted Answer · 2021-11-04T20:26:42.163

This is fine. Another, slightly more formal way of describing your design is "The fixed-effect covariate Category varies within levels of the grouping variables Participant and Block, but only across the levels of the grouping variable Target_word." In the classical ANOVA world this would be referred to as a split-plot design. The good news is that 'modern' mixed-model packages (lme4, nlme, etc etc in R, SAS PROC MIXED ...) can handle these designs without any special input from the user.
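As a minimal sketch of what "varies within" versus "only across" means here, the cross-tabulations below use a made-up toy design (two hypothetical participants, four hypothetical target words); none of these names or values come from the original data. It only illustrates how one might check the structure of a design, not how the model is fitted.

```r
# Hypothetical toy design: 2 participants x 4 target words (2 nouns, 2 verbs).
# All names and values are illustrative assumptions, not the original data.
words <- data.frame(
  Target_word = c("dog", "cup", "run", "eat"),
  Category    = c("noun", "noun", "verb", "verb")
)
# merge() with no shared columns returns the full crossing (8 rows).
toy <- merge(data.frame(Participant = c("P1", "P2")), words)

# Every participant has trials in both categories:
# Category varies within levels of Participant.
by_participant <- table(toy$Participant, toy$Category)

# Every word has trials in exactly one category:
# Category varies only across levels of Target_word.
by_word <- table(toy$Target_word, toy$Category)
categories_per_word <- rowSums(by_word > 0)

print(by_participant)
print(by_word)
print(categories_per_word)
```

Running the same two table() calls on the real data frame (with its actual column names) should show the same pattern: no empty cells for Participant, and exactly one non-empty cell per row for the target words.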

The patterns that you see in your second fitted-vs-residuals plot are harmless, I think. They look more like patterns in the distribution of the fitted values, i.e. the marginal distribution along the x-axis, which are irrelevant; the mixed model framework makes no particular assumptions about this distribution.

Heteroscedasticity is generally assessed using a location-scale plot, which you can generate using this function:

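scale_loc_plot <- function(m, line.col = "red", line.lty = 1,
                           line.lwd = 2) {
  ## m: a fitted mixed model (merMod object, e.g. from lmer())
  ## plots sqrt(|residual|) against fitted values, with a smooth line
  plot(m, sqrt(abs(resid(.))) ~ fitted(.),
       type = c("p", "smooth"),
       par.settings = list(plot.line =
                             list(alpha=1, col = line.col,
                                  lty = line.lty, lwd = line.lwd)))
}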
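For a concrete illustration, the sketch below draws the same kind of location-scale plot by calling the plot method directly on a model fitted to the sleepstudy data that ships with lme4. The model and data are stand-ins chosen only because they are readily available; they are an assumption for this example and are not the model from the question.

```r
library(lme4)

# Stand-in example model on lme4's built-in sleepstudy data
# (an assumption for illustration; not the model from the question).
fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Location-scale plot: sqrt(|residuals|) against fitted values,
# with points and a smooth trend line.
p <- plot(fm, sqrt(abs(resid(.))) ~ fitted(.),
          type = c("p", "smooth"))
print(p)  # draws the lattice plot on the active graphics device
```

A roughly flat smooth line suggests approximately constant residual spread; a clear upward or downward trend would point towards heteroscedasticity.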

Dear Ben, thank you very much for your answer! – Elena, Nov 05 '21 at 09:59
