I have a linear mixed model specified as follows:
model1<-lmer(y~A*B+(1|region/plot)+(A*B|region),data=mydata)
where A is a categorical variable with two levels and B is a continuous variable. The dataset contains 80 observations from 40 plots in 3 regions, and both levels of A are present on every plot. I hypothesize that there is a significant interaction between A and the actual variable of interest, B. I also assume that the regions differ not only in their response to B but also in how that response depends on A, which is why I include the term "+(A*B|region)" in addition to the random intercepts for regions and plots (there are two observations for every plot and 25-30 plots per region).
The results differ depending on the type of df approximation used:
> anova(model1,type=1)
Analysis of Variance Table of type I with Satterthwaite
approximation for degrees of freedom
    Sum Sq Mean Sq NumDF   DenDF F.value    Pr(>F)
A   0.3686  0.3686     1  2.1079  1.0947 0.4004060
B   1.4547  1.4547     1  5.6317  4.3209 0.0859361 .
A:B 6.5694  6.5694     1 25.2556 19.5133 0.0001654 ***
> anova(model1,type=1,ddf="Kenward-Roger")
Analysis of Variance Table of type I with Kenward-Roger
approximation for degrees of freedom
    Sum Sq Mean Sq NumDF   DenDF F.value Pr(>F)
A   0.2364  0.2364     1 1.80541  0.7022 0.4983
B   0.5225  0.5225     1 1.03026  1.5520 0.4260
A:B 1.2586  1.2586     1 0.57113  3.7385 0.4223
Am I correctly specifying the random effects terms? Also, why are the ddf so low in the Kenward-Roger approximation? With a different dependent variable I even get 0 ddf with Satterthwaite approximation, resulting in p-values not being computed.
> anova(model2,type=1)
Analysis of Variance Table of type I with Satterthwaite
approximation for degrees of freedom
     Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
A   0.27907 0.27907     1     0  0.9559
B   0.18727 0.18727     1     0  0.6414
A:B 2.04150 2.04150     1     0  6.9924
Warning messages:
1: In pf(F.stat, qr(Lc)$rank, nu.F) : NaNs produced
2: In pf(F.stat, qr(Lc)$rank, nu.F) : NaNs produced
3: In pf(F.stat, qr(Lc)$rank, nu.F) : NaNs produced
I feel like I am committing some serious mistake concerning the second random effects term. Is it eating up too many df? Would a specification as "+(A*B|region)" be correct?
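To make the size of the random-effects structure concrete, here is a minimal sketch that parses the model formula without fitting it and counts the variance-covariance parameters it asks for. It assumes the lme4 package is installed. The data frame `sim` is a simulated, hypothetical stand-in for `mydata` (random numbers for `y` and `B`, and an arbitrary 13/13/14 split of plots across regions), so only the structure is meaningful, not the values.

```r
library(lme4)  # lFormula() parses a mixed-model formula without fitting it

set.seed(1)
# Hypothetical stand-in for mydata: 40 plots in 3 regions,
# with both levels of A observed on every plot.
plots <- data.frame(
  plot   = factor(1:40),
  region = factor(rep(c("r1", "r2", "r3"), times = c(13, 13, 14)))
)
# No shared column names, so merge() returns the Cartesian product (80 rows).
sim <- merge(plots, data.frame(A = factor(c("a1", "a2"))))
sim$B <- rnorm(nrow(sim))  # placeholder covariate (assumption)
sim$y <- rnorm(nrow(sim))  # placeholder response (assumption)

lf <- lFormula(y ~ A * B + (1 | region / plot) + (A * B | region), data = sim)

# Number of random-effect columns in each random-effects term
lapply(lf$reTrms$cnms, length)
# Number of levels of each grouping factor
sapply(lf$reTrms$flist, nlevels)
# Total number of variance-covariance parameters to be estimated
length(lf$reTrms$theta)
```

The term (A*B|region) has four random-effect columns (intercept, A, B, A:B), which implies a 4 x 4 covariance matrix with 4 * 5 / 2 = 10 parameters, all estimated from only three region levels. The region intercept also appears twice, once through (1|region/plot) and once inside (A*B|region).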