6

I am sorry if this answer has been answered before but most answers here (e.g. here, or here ) do not really adress my issue (or maybe I just do not see correctly how they do. I want to use a (linear) mixed effects analysis on my data. My design is quite straightforward:

  • I have two kinds of bacteria (M1 and M2) that are either paired with 8 other bacteria (S1) or not (S2).
  • In case of S1 quadruplicates bottles were measured and in case of S2 just triplicates.
  • The output variable is measured at multiple timepoints (repeated measures) for approx 72h.
  • Then a defined amount of biomass is taken to inoculate new bottles and so on for a total of 4 cycles (SB1 to 4). Hence the measurements of consecutive cycles are not independent (corelation).
  • The amount of timepoints per cycle is not always the same (unbalanced)

Ok, maybe not that straightforward, let's clarify with a picture: Design According to one of my colleagues with a lot of experience in mixed models this is a split-split plot design with repeated measures. Sadly, she is a SAS user and I would love to use R (lme or lmer) for the analysis. However, I'm not here for the translation (that would be something for stack overflow I suppose, although if anyone here could answer it would be nice) but rather for understanding whether my design is indeed split-split with unbalanced repeated measures and how I can construct an adequate model to analyze it in terms of nested, fixed and random effects.

Here is wat I have so far with some dummy data (I try to use split-split plot nomenclature):

output <- rnorm(266)
mainplot <- c(rep("M1",133),rep("M2",133))
subplot <- c(rep("S1",76),rep("S2",57),
             rep("S1",76),rep("S2",57))
subsubplot <-c(rep("SB1",24),rep("SB2",16),rep("SB3",20),rep("SB4",16),
               rep("SB1",18),rep("SB2",12),rep("SB3",15),rep("SB4",12),
               rep("SB1",24),rep("SB2",16),rep("SB3",20),rep("SB4",16),
               rep("SB1",18),rep("SB2",12),rep("SB3",15),rep("SB4",12))
time<-rnorm(266)

mockset <- data.frame(output,mainplot,subplot,subsubplot,time)

mock.lmefit1 <- lme(output ~ 0 + mainplot * subplot * subsubplot, 
               data = mockset,
               random = ~ time|mainplot/subplot/subsubplot)

The mockset really represents my "unbalanced repeated measures" data structure. In the model above I do not account for my "autocorrelation" between subsubplots (i.e. Cycles SB1 through 4) nor do I correct explicitely for the "unbalancedness" of my repeated measures. I am not interested in my intercept.

My colleague suggested to use the following SAS code:

proc mixed data=rdata;
class mainplot subplot subsubplot time;
model output=  mainplot subplot
               subsubplot subplot*subsubplot
               time subplot*time
               subplot*subsubplot*time/ddfm=KENWARDROGER;
random mainplot*subplot mainplot*subplot*subsubplot;
lsmeans subsubplot subplot subplot*subsubplot
        time subplot*time subsubplot*time
        subplot*subsubplot*time;
run;

If I got it right the * in SAS is analogous to the R :, i.e. it shows the interaction between terms. To me it is not exactly clear how to port this model from SAS to R. I do also not really get why specific effects are fixed here and others are random. One more thing that I do not have an option to give in into R is the ddfm (denominator degrees of freedom for the tests of fixed effects) option of SAS. What is the importance of this? Does anyone know how lme deals with this?

mock.lmefitSAS <- lme(output ~ mainplot + subplot +
                        subsubplot + subplot:subsubplot +
                        time + subplot:time + 
                        subsubplot:time +
                    subplot:subsubplot:time,
                data = mockset,
                random = ~ 1|mainplot:subplot + mainplot:subplot:subsubplot)

This code does not work, because this is not the proper way to give in interaction terms in lme I guess (? error is "invalid formula for groups") but as stated before I'm not here for a SAS-to-R translation... I would just like to know if and why the SAS model is more correct given my design and what my R implementation is lacking.

FM Kerckhof
  • 183
  • 1
  • 6

1 Answers1

3

The one practical thing I can tell you is that the denominator degrees-of-freedom business is available for lme4 models, using the pbkrtest package or various wrappers for it: see the ?pvalues man page from recent versions of lme4.

library("lme4")
options(contrasts=c("contr.sum","contr.poly"))
m1 <- lmer(output~0+mainplot*subplot*time +
 (time|mainplot:subplot:subsubplot),
           data=mockset)
library("car")
Anova(m1,type="3",test="F")

For the rest, I pretty much just have to agree with you. I'm a bit surprised by your colleague's specification -- I independently reconstructed your R-style syntax before I saw yours, and it doesn't make much sense to me to treat terms involving subsubplot (bottle) as fixed, or to treat mainplot*subplot as random (for both philosophical and practical, i.e. insufficient-replication, reasons). Perhaps she could chime in?

Ben Bolker
  • 34,308
  • 2
  • 93
  • 126
  • Hello, why use (time|mainplot:subplot:subsubplot) and not (time|mainplot/subplot/subsubplot) ? – skan Jul 28 '16 at 19:57
  • because the effects of `mainplot:time` and `mainplot:subplot:time` are already included in the fixed effect term. Including them in the random effects (as would be done if you used the nested notation, which expands to `(time|mainplot) + (time|mainplot:subplot) + (time|main:subplot:subsubplot)`) would lead to an overdetermined model. – Ben Bolker Jul 28 '16 at 20:02
  • And what's the difference between (time|mainplot/subplot/subsubplot) and (time|mainplot*subplot*subsubplot)? I know the first one is the nested notation and the second one is the interaction, but when you expand them they look the same. – skan Jul 29 '16 at 10:51
  • 1
    I think the commenting system mangled your question a little bit, but `(x|a*b*c)` is a *crossed random effect*. `lme4` doesn't actually know how to deal with this correctly at present, but it should expand to `(x|a)+(x|b)+(x|c) + (x|a:b) + (x|a:c) + (x|b:c) + (x|a:b:c)`. I'd encourage you to ask another question! – Ben Bolker Jul 29 '16 at 15:19