I have been looking through this overview of lm/lmer R formulas by @conjugateprior and got confused by the following entry:
Now assume A is random, but B is fixed and B is nested within A.
aov(Y ~ B + Error(A/B), data=d)
Below it, the analogous mixed-model formula is provided for the same case:
lmer(Y ~ B + (1 | A:B), data=d)
I do not quite understand what it means. In an experiment where subjects are divided into several groups, we would have a random factor (subjects) nested within a fixed factor (groups). But how can a fixed factor be nested within a random factor? Something fixed nested within random subjects? Is it even possible? If it is not possible, do these R formulas make sense?
The overview says it is partially based on the personality-project's pages on doing ANOVA in R, which are in turn based on this tutorial on repeated measures in R. There, the following example of a repeated-measures ANOVA is given:
aov(Recall ~ Valence + Error(Subject/Valence), data.ex3)
Here subjects are presented with words of varying valence (a factor with three levels) and their recall time is measured. Each subject is presented with words of all three valence levels. I do not see anything nested in this design (it appears crossed, as per the great answer here), and so I would naively think that Error(Subject) or (1 | Subject) should be the appropriate random term in this case. The Subject/Valence "nesting" (?) is confusing.
Note that I do understand that Valence is a within-subject factor. But I think it is not a "nested" factor within subjects (because all subjects experience all three levels of Valence).
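One way to probe the crossed-versus-nested intuition is to tabulate the design and to ask R how it expands the Subject/Valence term. The sketch below uses a made-up data frame (ex3_sim, with hypothetical sizes and simulated Recall values); it is not the tutorial's data.ex3.

```r
# Hypothetical stand-in for data.ex3: 5 subjects, each seeing all 3 valence levels
set.seed(1)
ex3_sim <- expand.grid(Subject = factor(1:5),
                       Valence = factor(c("Neg", "Neu", "Pos")))
ex3_sim$Recall <- rnorm(nrow(ex3_sim), mean = 10, sd = 2)

# Every Subject x Valence cell is filled exactly once, i.e. the factors are crossed
design <- table(ex3_sim$Subject, ex3_sim$Valence)
print(design)

# In R's formula language, a/b is shorthand for a + a:b
error_terms <- attr(terms(~ Subject/Valence), "term.labels")
print(error_terms)
```

So, as far as the formula syntax goes, Error(Subject/Valence) asks for two error strata, Subject and Subject:Valence, whether or not the design is nested in the experimental sense.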
Update. I am exploring questions on CV about coding repeated measures ANOVA in R.
Here the following is used for a fixed within-subject/repeated-measures factor A and random subject:
summary(aov(Y ~ A + Error(subject/A), data = d))
anova(lme(Y ~ A, random = ~1|subject, data = d))
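Whether Error(subject/A) and Error(subject) lead to different tests of A can be checked empirically. The following is a minimal sketch on simulated data (d_sim, its sizes, and its effect values are all assumptions made for illustration), with one observation per subject-by-A cell. The helper f_for_A is hypothetical and simply reads the F statistic from the last error stratum, which is where A is tested in both fits.

```r
# Simulated data (hypothetical): 8 subjects, within-subject factor A with 3 levels,
# one observation per subject x A cell
set.seed(42)
d_sim <- expand.grid(subject = factor(1:8), A = factor(c("a1", "a2", "a3")))
subj_effect <- rnorm(8, sd = 2)
d_sim$Y <- 5 + subj_effect[as.integer(d_sim$subject)] +
  0.5 * as.integer(d_sim$A) + rnorm(nrow(d_sim))

fit_nested <- aov(Y ~ A + Error(subject/A), data = d_sim)
fit_simple <- aov(Y ~ A + Error(subject),   data = d_sim)

# Read the F statistic for A from the last stratum of an aov fit with Error()
f_for_A <- function(fit) {
  strata <- summary(fit)
  last_table <- strata[[length(strata)]][[1]]
  last_table[["F value"]][1]
}

f_nested <- f_for_A(fit_nested)
f_simple <- f_for_A(fit_simple)
print(c(nested = f_nested, simple = f_simple))
```

In this balanced layout with a single observation per cell, the two F statistics coincide, because the Within stratum of the simpler model spans the same residual space as subject:A. With several observations per cell the two specifications would be expected to differ, so this sketch does not settle the general case.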
Here for two fixed within-subject/repeated-measures effects A and B:
summary(aov(Y ~ A*B + Error(subject/(A*B)), data=d))
lmer(Y ~ A*B + (1|subject) + (1|A:subject) + (1|B:subject), data=d)
Here for three within-subject effects A, B, and C:
summary(aov(Y ~ A*B*C + Error(subject/(A*B*C)), data=d))
lmer(Y ~ A*B*C + (1|subject) + (0+A|subject) + (0+B|subject) + (0+C|subject) + (0+A:B|subject) + (0+A:C|subject) + (0+B:C|subject), data = d)
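To compare the aov error terms with the random terms in the lmer calls, it may help to see how R expands the right-hand side of Error(). This small sketch only uses base R's terms(); the variable names two_way and three_way are introduced here for illustration.

```r
# Expand the Error() argument for two within-subject factors
two_way <- attr(terms(~ subject/(A*B)), "term.labels")
print(two_way)

# Same for three within-subject factors: subject plus one term for each of
# A, B, C, A:B, A:C, B:C, A:B:C crossed with subject
three_way <- attr(terms(~ subject/(A*B*C)), "term.labels")
print(three_way)
```

The aov calls therefore include a highest-order stratum (subject:A:B or subject:A:B:C) that has no counterpart among the random terms of the quoted lmer calls. With one observation per cell, that stratum would play the role of the residual in the mixed model, which may be why it is not listed there.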
My questions:
- Why Error(subject/A) and not Error(subject)?
- Is it (1|subject) or (1|subject)+(1|A:subject) or simply (1|A:subject)?
- Is it (1|subject) + (1|A:subject) or (1|subject) + (0+A|subject), and why not simply (A|subject)?
By now I have seen some threads that claim that some of these things are equivalent (e.g., the first: a claim that they are the same but an opposite claim on SO; the third: kind of a claim that they are the same). Are they?