I am fitting a linear mixed effect model (using lmer), and want to obtain p-values via model comparison. As far as I understand it, the procedure is: (i) create a nested model identical to the big model minus one of the variables you want to test; (ii) compare the big model to the nested model, the p-value of this comparison is attributed to the excluded variable; (iii) repeat this for each of the variables (i.e. each nested model is compared to the big model).

What happens when I test for a main effect, and the model includes an interaction term of this factor with some other factor? I learned from other threads that it is wrong to have higher-level terms without lower-level ones (this one, or this one, but my mathematical understanding is limited). And yet, it seems that model comparison is standardly done to obtain p-values of main effects even when interaction terms are included (e.g. by using the function mixed() from the afex package with the LRT method, or by manually constructing a nested model with an interaction but without a main effect).

My question is actually twofold:

  1. Is it wrong to rely on model comparison for p-values of main effects when I have an interaction? Or does the function mixed() (with the LRT method) do something other than the procedure described above?

  2. Assuming all contrasts are orthogonal, I don't understand why it is wrong to have an interaction without a main effect. With orthogonal contrasts, isn't the variability explained by interaction independent of the variability of the main effects?
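
To make the procedure and the orthogonality claim in item 2 concrete, here is a minimal sketch in plain Python rather than R. It rests on loud assumptions: ordinary least squares with no random effects (so it is not what lmer or mixed() computes), a made-up balanced 2x2 data set with sum (+1/-1) coding, and hypothetical helpers `rss` and `drop_in_rss` written only for this illustration. It compares the full model with a nested model that drops the X column but keeps the interaction, and checks whether the drop in residual sum of squares depends on the interaction being present.

```python
import math

def rss(columns, y):
    """Residual sum of squares of the least-squares fit of y on the columns."""
    k, n = len(columns), len(y)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [
        [sum(columns[i][t] * columns[j][t] for t in range(n)) for j in range(k)]
        + [sum(columns[i][t] * y[t] for t in range(n))]
        for i in range(k)
    ]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [u - factor * v for u, v in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(beta[j] * columns[j][t] for j in range(k)) for t in range(n)]
    return sum((y[t] - fitted[t]) ** 2 for t in range(n))

def drop_in_rss(y, one, x, z, xz):
    """Increase in RSS when the X column is removed, with and without X:Y."""
    with_interaction = rss([one, z, xz], y) - rss([one, x, z, xz], y)
    without_interaction = rss([one, z], y) - rss([one, x, z], y)
    return with_interaction, without_interaction

# Made-up data: two observations per cell, in the order a1, a2, b1, b2.
x = [1, 1, 1, 1, -1, -1, -1, -1]        # factor X: a = +1, b = -1
z = [1, 1, -1, -1, 1, 1, -1, -1]        # factor Y: 1 = +1, 2 = -1
xz = [xi * zi for xi, zi in zip(x, z)]  # interaction column
one = [1] * 8                           # intercept
y = [5.1, 4.7, 7.4, 6.8, 3.1, 3.5, 9.2, 8.6]

balanced = drop_in_rss(y, one, x, z, xz)

# Likelihood-ratio test of "full" vs "full minus X, interaction kept".
# For Gaussian models fitted by ML the statistic is n * log(RSS_reduced / RSS_full);
# with one dropped column it is referred to a chi-square with 1 df,
# whose upper tail is erfc(sqrt(stat / 2)). Asymptotic, so only illustrative here.
n = len(y)
lrt_stat = max(n * math.log(rss([one, z, xz], y) / rss([one, x, z, xz], y)), 0.0)
p_value = math.erfc(math.sqrt(lrt_stat / 2))

# Same comparison after deleting one observation (unbalanced design).
unbalanced = drop_in_rss(y[:-1], one[:-1], x[:-1], z[:-1], xz[:-1])

print("balanced:  ", balanced)
print("unbalanced:", unbalanced)
print("LRT statistic and p-value:", lrt_stat, p_value)
```

On the balanced data the two drops agree, which is the orthogonality argument. Once one observation is removed the columns are no longer orthogonal and the two drops differ, so the result then depends on which nested model is used.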

Galit
  • When the interaction between A and B exists, it is hard to define the "main effects" of A and B. So there is nothing to estimate/test. Of course, if you can define them, you can estimate/test them. – user158565 Nov 25 '18 at 16:38
  • The main effects are defined as the mean effect across all the levels of the other effect. But how does this answer my question regarding a nested model which is identical to the big model in everything but a main effect? – Galit Nov 25 '18 at 17:48
  • Suppose there are 2 factors, X and Y. X has levels a and b, Y has levels 1 and 2, and an interaction exists. For the response variable Z, the means for the 4 combinations of X and Y are $\mu_{a1}, \mu_{a2}, \mu_{b1}, \mu_{b2}$. How do you define the main effects of X and Y? – user158565 Nov 25 '18 at 18:01
  • It depends on the coding, but assuming sum coding (as I did) and a balanced sample, the main effect of X is the contrast between $(\mu_{a1}+\mu_{a2})/2$ and $-(\mu_{b1}+\mu_{b2})/2$. Likewise, for Y: $(\mu_{a1}+\mu_{b1})/2$ and $-(\mu_{a2}+\mu_{b2})/2$. – Galit Nov 25 '18 at 18:48
  • If $\mu_{a1} = -1, \mu_{a2} =1, \mu_{b1} = 1, \mu_{b2} = -1$, then $(\mu_{a1}+\mu_{a2})/2 = 0$ and $-(\mu_{b1}+\mu_{b2})/2 =0$. Is that what you want? Mathematically, it is correct; in practice, it always misleads. – user158565 Nov 25 '18 at 19:04
  • Exactly, this is a coding for the interaction. And as you showed, it is orthogonal to the main effect. Therefore, I don't see any reason why not to fit a model without the main effect but leaving in the interaction, and then compare it to the big model that includes everything, thus estimating the amount of independent variability explained by the main effect alone. Then, you are saying it is correct? Why is it misleading? It is only to obtain a p-value. – Galit Nov 25 '18 at 19:55
  • Suppose there is a new drug for reducing people's weight. After testing, it is confirmed that the drug decreases weight by 10 pounds for males and increases it by 10 pounds for females. According to the "main effect", this drug has no effect on weight. But the fact is that this drug is good for males. – user158565 Nov 25 '18 at 22:13
  • This is a good example. There would be no main effect, but there will be an interaction. Now, leaving interpretation aside, the interaction term captures variability in the data that the main effect (as defined) does not. This should result in a low p-value of the interaction, whether a term for the main effect is included in the model or not. – Galit Nov 26 '18 at 08:16
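
For reference, the contrasts discussed in this comment thread can be written out explicitly. This is only a sketch, assuming a balanced 2x2 design with sum (+1/-1) coding; the symbols $\alpha_X$, $\beta_Y$ and $\gamma_{XY}$ are labels introduced here and are not notation from the thread.

$$
\begin{aligned}
\alpha_X &= \tfrac{1}{4}\,(\mu_{a1} + \mu_{a2} - \mu_{b1} - \mu_{b2}) \\
\beta_Y &= \tfrac{1}{4}\,(\mu_{a1} + \mu_{b1} - \mu_{a2} - \mu_{b2}) \\
\gamma_{XY} &= \tfrac{1}{4}\,(\mu_{a1} - \mu_{a2} - \mu_{b1} + \mu_{b2})
\end{aligned}
$$

With the values from the comment, $\mu_{a1} = -1, \mu_{a2} = 1, \mu_{b1} = 1, \mu_{b2} = -1$, this gives $\alpha_X = 0$, $\beta_Y = 0$ and $\gamma_{XY} = -1$: both main-effect contrasts vanish while the interaction contrast does not.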

1 Answer

I don't know what the function mixed does, but if you have an interaction then the test of the main effect is the effect when other variable(s) in the interaction are 0. It is not the usual main effect.
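
A small numeric sketch of this point, reusing the drug example from the comments on the question. The assumptions are mine: a hypothetical placebo group with zero weight change is added so that "drug" becomes a two-level factor, the model is the saturated 2x2 model on cell means (no noise, no random effects), and `drug_coefficient` is a helper invented for this illustration. It shows that the coefficient labelled as the drug "main effect" is the drug effect at the point where the sex variable is coded 0, so it changes with the coding.

```python
# Mean weight change in pounds per cell; the placebo cells are hypothetical.
cell_means = {
    ("placebo", "male"): 0.0,
    ("placebo", "female"): 0.0,
    ("drug", "male"): -10.0,
    ("drug", "female"): 10.0,
}

def drug_coefficient(drug_codes, sex_codes):
    """Coefficient b1 in mean = b0 + b1*d + b2*s + b3*d*s for the given codes."""
    span = drug_codes["drug"] - drug_codes["placebo"]
    # Slope of the response in the drug code, separately for each sex.
    slope = {
        sex: (cell_means[("drug", sex)] - cell_means[("placebo", sex)]) / span
        for sex in sex_codes
    }
    # The slope is b1 + b3*s, so b1 is that line evaluated at s = 0.
    b3 = (slope["female"] - slope["male"]) / (sex_codes["female"] - sex_codes["male"])
    return slope["male"] - b3 * sex_codes["male"]

treatment_male_ref = drug_coefficient(
    {"placebo": 0, "drug": 1}, {"male": 0, "female": 1}
)
treatment_female_ref = drug_coefficient(
    {"placebo": 0, "drug": 1}, {"male": 1, "female": 0}
)
sum_coded = drug_coefficient(
    {"placebo": -1, "drug": 1}, {"male": -1, "female": 1}
)

print("treatment coding, male reference:  ", treatment_male_ref)
print("treatment coding, female reference:", treatment_female_ref)
print("sum coding:                        ", sum_coded)
```

With treatment coding the coefficient is the drug effect in the reference sex; with sum coding, where 0 lies midway between the two sexes, it is the effect averaged over the sexes (scaled by the coding), which is zero in this example.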

As to why you should include the main effects when you have an interaction, you already linked to threads that answer that question. Was there something specific in those threads that you did not understand? (You note that your "mathematical understanding is limited" - that's OK, so is mine - but it's hard to know how to help if we don't know where you are confused.)

Peter Flom
  • Do you mean that neither of the two models that I compare when testing for the main effect should include an interaction term? As for the second question - I saw that people answered about how an interaction without a main effect can be spuriously significant if the main effect is significant, because it "swallows" the variability of the main effect. But if their variabilities are independent - how does that happen? When contrasts are orthogonal, each accounts for a unique amount of variability. Am I wrong? – Galit Nov 25 '18 at 11:13
  • If your model needs an interaction, then the interaction should be included. But testing for main effects then means something else. But don't drop the interaction if it is needed. As to why to include main effects, it's because the interpretation of an interaction without main effects is very problematic. Maybe I will write a blog post on this - it's a little too long for here. – Peter Flom Nov 25 '18 at 11:21
  • I'd be happy to read this blog post if ever you write it! Just to make sure that I understood your intention - (i) the nested model includes an interaction, but not a main effect (and that's fine for purposes of model comparison?). (ii) the p-value obtained by comparing these two models is not of the main effect, but of something else? .... – Galit Nov 25 '18 at 11:29
  • OK, I'll think about the blog post, but 1) No. You should always include the main effects when you include the interaction. Otherwise, the model will not capture the right things. The estimates will be screwy. 2) If you don't include a main effect there will be no estimates of it. – Peter Flom Nov 25 '18 at 14:56
  • Now I am confused. If a main effect should always be included when there is an interaction, then there is no way to fit a nested model which differs from the big model by the main effect alone (because the nested model will have the interaction). Then, how is model comparison employed to estimate the contribution of only the main effect to the model? I.e.: big_model – Galit Nov 25 '18 at 17:46
  • You can't test that. You can test adding the interaction to a model with main effects, but not vice versa. Nor is there any reason to test it, as the model with only the interaction should not be used. – Peter Flom Nov 25 '18 at 22:24
  • Oh and here is the blog post: http://www.statisticalanalysisconsulting.com/why-you-should-include-main-effects-when-you-include-an-interaction/ – Peter Flom Nov 25 '18 at 22:25
  • Hi Peter, many thanks for the blog post. I left you a comment there, but I'm not sure it registered. – Galit Dec 06 '18 at 07:48