0

Using the sleepstudy data from the lme4 package I want to do pairwise comparison using the emmeans package.

This is the model:

library(lme4)
lmm <- lmer(Reaction ~ Days + (1 + Days | Subject), sleepstudy)

Now when I want to do pairwise comparison like this, I only get NAs, no pairwise comparisons:

library(emmeans)
lmm.pw <- emmeans(lmm, "Days")
pairs(lmm.pw)

According to the emmeans FAQs, this is due to "Days" being numeric. However, when I try to fit the model with "Days" as factor it gives me the following error:

lmm <- lmer(Reaction ~ as.factor(Days) + (1 + as.factor(Days) | Subject), sleepstudy)

number of observations (=180) <= number of random effects (=180) for term (0 + fDays | Subject); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable

This was already solved on stackexchange but I don't really understand why I should be able to fit the model when "Days" is numeric, but not when it is a factor?

Likewise, why do I get different summary() outputs for these two simpler models, when the only difference is that "Days" is either a factor or numeric?

lmm1 <- lmer(Reaction ~ Days + (1|Subject), sleepstudy)
lmm1f <- lmer(Reaction ~ as.factor(Days) + (1|Subject), sleepstudy)

I am able to get pairwise comparisons from the lmm1f-model. But the fixed effects table that I get by running summary(lmm1f) seems to treat every level of as.factor(Days) as a separate fixed effect (EDIT Dec 30 2020: see here for an explanation of the summary output), which makes me wonder if the model truly does what I want it to do.

Pretty sure I'm overlooking/not understanding some basic stuff. Any help is appreciated, thanks!


I'm using R version 4.0.2 in RStudio Version 1.3.1073 on macOS 11.1.

Packages: lme4 (1.1-23), emmeans (1.5.0)

  • 1
    I think your first step is to figure out what you really want. Since Days is numeric, your initial model fits a fixed slope for Days and a random intercept and slope for each subject. In that context, pairwise comparisons of Day are rather meaningless because you are just comparing points on a straight line, thus the t test for any such comparison is just the t test for the slope of the fitted line, which you can get from the model summary. Your second model appears to have several times as many parameters as observations, so it makes no sense at all. What is your research plan? – Russ Lenth Dec 29 '20 at 20:15
  • @RussLenth Thanks, I think I understand what was meant in that quote now. I am trying to compile an overview on some statistic procedures in R for my work using the sleepstudy data. In this case, my goal is to get an output that gives me p-values between days, as a post hoc test on a repeated measures anova would. Any ideas? – J. Ziegler Dec 30 '20 at 13:18
  • Seems like that is not a suitable dataset for such purposes. – Russ Lenth Dec 30 '20 at 18:06

0 Answers0