I'm working with the following data frame using R. It consists of measurements obtained from 7 subjects with two independent variables (IV1 and IV2) with two levels each (OFF/ON and ALT/ISO, respectively):
>myData
Subject DV IV1 IV2
1 2.567839 OFF ALT
1 58.708027 ON ALT
1 44.504265 OFF ISO
1 109.555701 ON ISO
2 99.043735 OFF ALT
2 75.958737 ON ALT
2 182.727396 OFF ISO
2 364.725795 ON ISO
3 45.788988 OFF ALT
3 52.941263 ON ALT
3 54.719013 OFF ISO
3 41.909909 ON ISO
4 116.145279 OFF ALT
4 162.927971 ON ALT
4 34.162077 OFF ISO
4 74.029748 ON ISO
5 114.412913 OFF ALT
5 121.127983 ON ALT
5 192.379708 OFF ISO
5 229.192453 ON ISO
6 213.421076 OFF ALT
6 526.739206 ON ALT
6 150.596812 OFF ISO
6 217.931951 ON ISO
7 117.931273 OFF ALT
7 102.467813 ON ALT
7 57.823062 OFF ISO
7 85.181033 ON ISO
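For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame above and tabulates how many observations fall in each Subject x IV1 x IV2 cell. Storing Subject as a factor is my assumption (the printout does not show column types), and cellCounts and fullyCrossed are names made up for this illustration.

```r
# Rebuild the data frame shown above.
# Assumption: Subject, IV1 and IV2 are stored as factors.
myData <- data.frame(
  Subject = factor(rep(1:7, each = 4)),
  DV = c(2.567839, 58.708027, 44.504265, 109.555701,
         99.043735, 75.958737, 182.727396, 364.725795,
         45.788988, 52.941263, 54.719013, 41.909909,
         116.145279, 162.927971, 34.162077, 74.029748,
         114.412913, 121.127983, 192.379708, 229.192453,
         213.421076, 526.739206, 150.596812, 217.931951,
         117.931273, 102.467813, 57.823062, 85.181033),
  IV1 = factor(rep(c("OFF", "ON"), times = 14)),
  IV2 = factor(rep(rep(c("ALT", "ISO"), each = 2), times = 7))
)

# Count observations per Subject x IV1 x IV2 cell.
cellCounts <- xtabs(~ Subject + IV1 + IV2, data = myData)

# TRUE when every subject contributes exactly one value
# to every IV1/IV2 combination.
fullyCrossed <- all(cellCounts == 1)

print(ftable(cellCounts))
print(fullyCrossed)
```

A table of all ones would indicate that every subject was measured once under each of the four IV1/IV2 combinations.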
(1) Is this a repeated measures (RM) design? Some folks have told me it is not, since it isn't a longitudinal study, but I thought that as long as each experimental unit is measured at every level of a factor, one can call it an RM design. Which is correct? Also, is an RM design synonymous with having a within-subject factor?
(2) I'm interested in both the main and the interaction effects of IV1 and IV2, but because I have measurements from each subject for all level combinations, I think I have to include Subject as a random effect. I have looked at aov and lmer, but I'm confused about the difference in syntax:
This cheat sheet recommends:
m1 <- aov(DV ~ IV1 * IV2 + Error(Subject / (IV1 * IV2)), myData)
However, it's not clear to me whether Error(x / (y * z)) means x is a random effect and y and z are nested in x. Is this interpretation correct? If so, would m1 be inappropriate for my data, since my data isn't nested but fully crossed? And if so, would
m2 <- aov(DV ~ IV1 * IV2 + Error(Subject), myData)
be the correct syntax? I have also been told that in m2 the Error term should be dropped. Is this correct?
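To make the comparison between m1 and m2 concrete, here is a minimal sketch that fits both and lists the error strata each one produces. It does not settle which specification is appropriate; it only shows how the two Error() terms partition the data differently. Converting Subject to a factor is my assumption (otherwise I believe Error() would treat it as a numeric covariate), and the data are rebuilt here so the snippet runs on its own.

```r
# Rebuild the data; assumption: Subject is a factor, not a number.
myData <- data.frame(
  Subject = factor(rep(1:7, each = 4)),
  DV = c(2.567839, 58.708027, 44.504265, 109.555701,
         99.043735, 75.958737, 182.727396, 364.725795,
         45.788988, 52.941263, 54.719013, 41.909909,
         116.145279, 162.927971, 34.162077, 74.029748,
         114.412913, 121.127983, 192.379708, 229.192453,
         213.421076, 526.739206, 150.596812, 217.931951,
         117.931273, 102.467813, 57.823062, 85.181033),
  IV1 = factor(rep(c("OFF", "ON"), times = 14)),
  IV2 = factor(rep(rep(c("ALT", "ISO"), each = 2), times = 7))
)

# m1: separate error strata for Subject and for each within-subject term.
m1 <- aov(DV ~ IV1 * IV2 + Error(Subject / (IV1 * IV2)), data = myData)

# m2: one between-subject stratum plus a single pooled within stratum.
m2 <- aov(DV ~ IV1 * IV2 + Error(Subject), data = myData)

# The names of an aovlist object are its error strata.
print(names(m1))
print(names(m2))

# Each effect is tested against the residual of the stratum it falls in.
print(summary(m1))
print(summary(m2))
```

Comparing the two summaries shows which residual term each of IV1, IV2 and IV1:IV2 is tested against under either syntax.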
(3) In a previous question I was told the linear mixed effects model
m3 <- lmer(DV ~ IV1 * IV2 + (1 | Subject), myData)
was appropriate for my data. Just to better understand lmer syntax: if I had n subjects and for each subject measurements were obtained for both levels of IV2, but half of the subjects were OFF and the other half ON for IV1, would the model be
m4 <- lmer(DV ~ IV1 * IV2 + (1 | Subject / IV1), data = myData)
? And if there were only one measurement per IV1*IV2 combination, would that mean this is no longer a repeated-measures design and therefore the model is just
m5 <- lmer(DV ~ IV1 * IV2, data = myData)
? In which case lm would probably suffice.