
I am currently running some analyses on a linguistic data set with a mixed-effects model. The problem is that I think one random effect should be excluded, while my colleague thinks it should be included. The two options are:

lmer(intdiff ~ stress * vowel_group + (1|speaker) + (1|word), data)
lmer(intdiff ~ stress * vowel_group + (1|speaker), data)

How do we check which model best fits our data set? It was suggested that I use a likelihood ratio test, but as far as I can tell, there isn't a function in R that can be applied to two linear mixed-effects models. Is there another way to tell which model is more predictive?

Thanks

Shakesbeery
    Store your first model in `fit1` and your second model in `fit0` then type `anova(fit0,fit1)` but you need to fit with the `ML` method, not `REML` – Stéphane Laurent Feb 12 '13 at 22:33
    "fits best" and results of significance tests are not necessarily particularly related concepts. What do you really mean by 'fits best' in this case? – Glen_b Feb 12 '13 at 22:43
    @StéphaneLaurent I was thinking that that was true for LRTs on the fixed effects, but that for tests on the random effects it was still acceptable to use REML. I quickly tried just now to find a source for this and couldn't, but certainly I feel that I have been led to believe this. Not so? – Jake Westfall Feb 13 '13 at 07:33
  • @Jake that's possible. The easiest way to check consists in trying `anova(fit0,fit1)`:) – Stéphane Laurent Feb 13 '13 at 08:45
  • @Jake Glen_b's comment is pertinent. A likelihood ratio test is not really a test for "better fitting". – Stéphane Laurent Feb 13 '13 at 08:46
  • At first thought I would go with Stéphane's original intuition on this; I would also use `pvals.fnc` in the `languageR` package to get pMCMC values to see if they convey similar information. I would also assume that just comparing the AIC scores of the ML-fitted models will give you an idea of whether it's worth including that random effect or not. – usεr11852 Feb 13 '13 at 08:59
  • @Shakesbeery: Check Baayen's [Mixed-effects modeling with crossed random effects for subjects and items][1] if you haven't already. It is really a classic on the matter. And finally, if you use the `pvals.fnc` that I mentioned earlier, you'll be able to see your random effects' standard deviations; if those are below a JND (just-noticeable-difference) threshold, most probably they won't really matter for your modelling procedure anyway. [1]: http://www.ualberta.ca/~baayen/publications/baayenDavidsonBates.pdf – usεr11852 Feb 13 '13 at 09:09
  • I think a brief description of the study design and the data structure from Shakesbeery would be more helpful in understanding whether a random effect for word is valid or not. – AdamO Feb 14 '13 at 16:15

1 Answer


You can run a likelihood ratio test with `anova()`, just as Stéphane and the package's help file suggest:

fm1 <- lmer(intdiff ~ stress * vowel_group + (1|speaker) + (1|word), data)
fm2 <- lmer(intdiff ~ stress * vowel_group + (1|speaker), data)
anova(fm1, fm2)

It doesn't matter whether you put the model with the fewer degrees of freedom first or second in the `anova()` call. For the interpretation, however, the direction does matter: a significant difference favours the larger model (here, the one including `(1|word)`), whereas a non-significant result means the simpler model fits the data about as well and is preferred on parsimony grounds.
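A minimal, self-contained sketch of this workflow, using simulated data as a stand-in for the original data set (which isn't available here) and assuming the `lme4` package. Per Stéphane's comment, both models are fitted with maximum likelihood (`REML = FALSE`) so their likelihoods are directly comparable:

```r
library(lme4)

# Simulated stand-in for the real data: 20 speakers x 15 words x 2 stress levels
set.seed(1)
d <- expand.grid(speaker = factor(1:20), word = factor(1:15),
                 stress = factor(c("stressed", "unstressed")))
d$vowel_group <- factor(sample(c("high", "low"), nrow(d), replace = TRUE))
d$intdiff <- rnorm(nrow(d)) +                 # residual noise
  rnorm(20, sd = 0.5)[d$speaker] +            # speaker random intercepts
  rnorm(15, sd = 0.5)[d$word]                 # word random intercepts

# Fit both candidate models with ML rather than REML
fm1 <- lmer(intdiff ~ stress * vowel_group + (1|speaker) + (1|word),
            data = d, REML = FALSE)
fm2 <- lmer(intdiff ~ stress * vowel_group + (1|speaker),
            data = d, REML = FALSE)

# Likelihood ratio test; the table also reports AIC and BIC for both models
anova(fm2, fm1)
```

One caveat worth knowing: the null hypothesis here (the variance of the `word` intercepts equals zero) lies on the boundary of the parameter space, so the chi-squared p-value from `anova()` is conservative for a single variance component; the effective p-value is roughly half the reported one.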

gung - Reinstate Monica
Huub Hoofs