I have a linear mixed model specified as follows

model1<-lmer(y~A*B+(1|region/plot)+(A*B|region),data=mydata)

where A is a categorical variable with two levels and B is a continuous variable. The dataset contains 80 observations from 40 plots and 3 regions. On each plot, each of the two levels of A is present. I hypothesize that there is a significant interaction between A and the actual variable of interest, B. I also assume there are regional differences not only in the response to B, but in the response of B in interaction with A, which is why I'm including the term "+(A*B|region)" in addition to the random intercept term regarding regions and plots (there are two observations for every plot and 25-30 plots per region).

The results differ depending on the type of df approximation used:

> anova(model1,type=1)
Analysis of Variance Table of type I  with  Satterthwaite 
approximation for degrees of freedom
         Sum Sq Mean Sq NumDF   DenDF F.value    Pr(>F)    
A      0.3686  0.3686     1  2.1079  1.0947 0.4004060    
B      1.4547  1.4547     1  5.6317  4.3209 0.0859361 .  
A:B    6.5694  6.5694     1 25.2556 19.5133 0.0001654 ***

> anova(model1,type=1,ddf="Kenward-Roger")
Analysis of Variance Table of type I  with  Kenward-Roger 
approximation for degrees of freedom
         Sum Sq Mean Sq NumDF   DenDF F.value Pr(>F)
A      0.2364  0.2364     1 1.80541  0.7022 0.4983
B      0.5225  0.5225     1 1.03026  1.5520 0.4260
A:B    1.2586  1.2586     1 0.57113  3.7385 0.4223

Am I correctly specifying the random effects terms? Also, why are the ddf so low in the Kenward-Roger approximation? With a different dependent variable I even get 0 ddf with Satterthwaite approximation, resulting in p-values not being computed.

> anova(model2,type=1)
Analysis of Variance Table of type I  with  Satterthwaite 
approximation for degrees of freedom
          Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
A      0.27907 0.27907     1     0  0.9559       
B      0.18727 0.18727     1     0  0.6414       
A:B    2.04150 2.04150     1     0  6.9924   

Warning messages:
1: In pf(F.stat, qr(Lc)$rank, nu.F) : NaNs produced
2: In pf(F.stat, qr(Lc)$rank, nu.F) : NaNs produced
3: In pf(F.stat, qr(Lc)$rank, nu.F) : NaNs produced

I feel like I am committing some serious mistake concerning the second random effects term. Is it eating up too many df? Would a specification as "+(A*B|region)" be correct?
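
One way to gauge whether the random-effects structure is too rich is to count the variance and covariance parameters it implies. The sketch below is plain arithmetic in base R, not output from the fitted model, and the variable names are made up for illustration. It relies on the usual lme4 formula expansions: `(A*B|region)` is shorthand for `(1 + A + B + A:B | region)`, and `(1|region/plot)` expands to `(1|region) + (1|region:plot)`.

```r
# Sketch: count the variance-covariance parameters implied by the
# random-effects terms of model1. No model is fitted here.

# (A*B | region) == (1 + A + B + A:B | region).
# A has two levels (one contrast) and B is continuous, so there are
# 4 correlated random effects per region.
k_effects <- 4
n_cov_params <- k_effects * (k_effects + 1) / 2  # variances + covariances

# (1 | region/plot) == (1 | region) + (1 | region:plot):
# one variance for each of the two intercept terms.
n_intercept_params <- 2

# Number of region levels available to estimate the region-level terms
# (taken from the question).
n_regions <- 3

cat("Parameters in (A*B | region):   ", n_cov_params, "\n")
cat("Parameters in (1 | region/plot):", n_intercept_params, "\n")
cat("Region levels:                  ", n_regions, "\n")
```

Under these expansions the region-level intercept appears in both terms, and the 4 x 4 covariance matrix would have to be estimated from only 3 regions.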

Jan
  • You really should not use region as a random factor if it only has 3 levels. That's too few. For KR method, you effectively have a sample size of 3 and of course nothing is significant. Try `y ~ A*B*region + (1|plot)` – amoeba Apr 23 '18 at 09:35
  • Hi amoeba, thanks for your quick answer! I was trying to avoid a triple interaction as well as having to discuss the (known) regional differences. But I guess there is no way around including them as a fixed effect. I'll go with your proposed model. – Jan Apr 23 '18 at 10:16
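
The model proposed in the comments could be fitted roughly as follows. This is a minimal sketch on simulated data, since the original `mydata` is not available: the data-generating code, the object name `model3`, and the effect sizes are assumptions made purely for illustration. It assumes the `lmerTest` package is installed (and `pbkrtest` for the Kenward-Roger line), and that each plot has its own unique label, so `(1|plot)` needs no explicit nesting in region.

```r
library(lmerTest)  # loads lme4 and adds df approximations to anova()

set.seed(1)

# Hypothetical data mimicking the described layout:
# 40 plots in 3 regions, both levels of A observed on every plot.
n_plots <- 40
region_of_plot <- rep(c("r1", "r2", "r3"), length.out = n_plots)

mydata <- data.frame(
  plot   = factor(rep(seq_len(n_plots), each = 2)),
  region = factor(rep(region_of_plot, each = 2)),
  A      = factor(rep(c("a1", "a2"), times = n_plots))
)
mydata$B <- rnorm(nrow(mydata))  # arbitrary continuous covariate

# Response: plot-level random intercept plus residual noise (no true effects).
plot_effect <- rnorm(n_plots, sd = 0.5)
mydata$y <- plot_effect[as.integer(mydata$plot)] + rnorm(nrow(mydata))

# Region as a fixed effect, plot as the only random effect.
model3 <- lmer(y ~ A * B * region + (1 | plot), data = mydata)

anova(model3, type = 1)                         # Satterthwaite (default)
anova(model3, type = 1, ddf = "Kenward-Roger")  # needs pbkrtest
```

With region moved into the fixed part, the denominator degrees of freedom no longer depend on variance components estimated from only three groups.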