11

Beginner user of R here struggling with a repeated measures ANOVA.

I have a dataset that consists of one between subjects factor with 4 levels (coded in a single variable called 'groups'), and one within subjects factor with 4 levels (coded in four separate variables 'DV1', 'DV2', 'DV3', 'DV4').

I have the following objectives:

  1. Run an overall repeated measures ANOVA.
  2. Compare groups using custom contrasts (as in an LMATRIX command in SPSS).
  3. Compare different levels of DV using custom contrasts (as in an MMATRIX command in SPSS).
  4. Do a combination of 2) and 3) simultaneously so I am comparing only certain groups at certain levels of the within-subjects factor.
  5. Run a set of contrasts that do NOT sum to zero.

I know I can do this in SPSS without much of a problem, but I can't get a clear idea of how to do this in R. I've seen how parts of this could work in different packages, but I have not so far seen how this could work within one procedure or a set of related procedures in R.

Jeromy Anglim
  • 42,044
  • 23
  • 146
  • 250
aquadhere
  • 113
  • 1
  • 5

1 Answers1

18

A sketch of one solution (for another see below):

  1. Data needs to be in the long format (i.e., on value per row) instead of in the wide format as in SPSS (i.e., one subject per row), see the reshape package, or ?reshape. That includes that there needs to be a variable indicating the subject identifier (i.e., subject id).
  2. All factors (including the subject identifier) must be of class factor (run str on your data frame to check this). If you don't do this your results will be wrong.
  3. If you want to obtain type-III sums of squares set the default contrasts to effect coding:
    options(contrasts=c("contr.sum","contr.poly"))
  4. Specify the desired model with lme from the nlme package (install and load the package beforehands via install.packages("nlme") and library(nlme)) using a compund symmetric correlation structure. See the answer and especially my comment to the accepted answer to this question. In your case that could be something like (if you would have provided sample data, which is strongly recommended, you would have received the correct code):
    my.anova <- lme(dv ~ group*within, data = your.df, random = ~1|id, correlation = corCompSymm(form = ~1|id))
  5. Use the generic anova function to obtain the anova table (see ?anova.lme):
    anova(my.anova)
    To obtain type-III sums of squares use the anova command with argument type set to "marginal" (this only works if contrasts are set to effects coding, see point 3):
    anova(my.anova, type = "marginal")
  6. The fitted object of type lme now allows diverse functions to perform contrasts. The most flexible solution (but a rather unhandy one) is the L argument in a call to anova.lme (see again ?anova.lme).
    Other solutions also require a fitted lme object as an argument:
    Also very flexible is the estimable function from the gmodels package. This package also offers the fit.contrasts function.
    The multcomp package allows contrasts using alpha-error adjustment (but you can only perform contrasts using one of your factors), using the glht function.
    A new and promising approach is the contrast package, however, so far it does not seem to privde all possible contrasts.

An alternative solution is to use standard ANOVA via the combination of afex and lsmeans as outlined in the afex-vignette.

Henrik
  • 13,314
  • 9
  • 63
  • 123
  • (+1) Great and very instructive response. Waiting for the blog post... – chl Sep 14 '11 at 10:49
  • `?anova.lme` doesn't work for me and `methods(anova)` lists it as a non-visible function. – John Sep 14 '11 at 12:04
  • @John Did you load `nlme` before? If not, run `library(nlme)`, then it should work. If it is still not working, `install.packages("nlme")` first. – Henrik Sep 14 '11 at 12:06
  • ah... I hadn't apparently, only lme4 – John Sep 14 '11 at 13:13
  • @ Henrick, I can go up to step 5, but I am not able to set up contrasts comparing specific between-subjects factors at specific within-subject factor levels. All of the online documentation I've seen has to do with testing differences at different levels of between-subjects factors. Do you have any example code? I can just wait for the blog post if you are going to cover it there. – aquadhere Sep 15 '11 at 14:18
  • @aquadhere Actually there should be no difference on whether a variable is between- or within-subjects. If your within variable is just a single factor or column in your data frame (which it should be). Then just use this one as if it would be between. – Henrik Sep 15 '11 at 15:09
  • @Henrick - ah yes, I figured as much, but all the examples I've seen only test one factor - is there a way to test two factors simultaneously? That's where I get stuck. – aquadhere Sep 15 '11 at 17:18
  • @aquadhere that is where it gets tricky. As far as I get it, the `L` argument and `estimable()` should be able to do so. Also the `contrast` package. The best thing is, try to understand the `L` argument. Therefore have a look at `summary(my.anova)`. – Henrik Sep 16 '11 at 07:07
  • +1. But "I will update this answer with a link to this post when it is done (hopefully this week)" line sounds a bit strange when it's almost 5 years later :-) Apart from that, here you essentially suggest to use a mixed model with enforced compound symmetry to simulate the RM-ANOVA behaviour. I know they give identical results when the data are balanced (same $n$ in all ANOVA cells), but do they also coincide in the unbalanced case? I am reading up on that, but I was under impression that they do NOT coincide then (I am not sure though). – amoeba May 09 '16 at 13:53
  • @amoeba I have removed the reference. It is true, that was a little odd. I also have meanwhile found a better solution outlined in the [afex vignette](https://cran.rstudio.com/web/packages/afex/vignettes/anova_posthoc.html). – Henrik May 11 '16 at 14:24
  • Thanks @Henrik. I would still be interested in your answer to my question about equivalence of RM-ANOVA and a mixed-model when the data are unbalanced. Or perhaps you can point me to some references where I can read up on that? – amoeba May 11 '16 at 14:36
  • @amoeba no, they are not equivalent due to enforcing a correlation above 0, see [here](http://stats.stackexchange.com/q/14088/442). Interestingly, the same also applies to `lme4` models, see [here](http://stats.stackexchange.com/q/117660/442). – Henrik May 11 '16 at 14:45
  • @Henrik, yes, I saw these threads and even gave bounty to Aaron's answer in the first one. This issue of positive correlation is general and relates to both balanced and unbalanced designs. I am asking about a situation in which this is not an issue (either because correlations happen to be positive or because lmer is run with enforcing compound symmetry that allows negative correlations). So, in this situation when data are balanced the results of RM-ANOVA and of mixed-model are the same (by "results" I guess I mean F- and p-values). I am wondering if this is still true for unbalanced data. – amoeba May 11 '16 at 16:15
  • @amoeba Sorry, I cannot give you an authoritative answer on this. – Henrik May 11 '16 at 18:42