Note: This question is heavy on R programming, but it was recommended that I post it here after I posted an almost identical question on Stack Overflow.
Main Question
I'm looking for help correctly setting up a one-way within-subjects MANOVA in R for a data set that has no between-subjects factors.
Detailed Question
I'm trying to figure out how to set up a one-way within-subjects MANOVA in R, where my design has a single within-subjects IV (with 2 levels) and 3 DVs. It has come down to a question of whether this is best done with the standard manova() function or with Anova() from the car package. Using a toy example (replicated below), I have done both but get different results, and these differences seem to be associated with how each function determines the degrees of freedom for the final F-test.
Example
To demonstrate the problem, I'll use a subset of the OBrienKaiser data set, and I'll assume that each level of the Hours within-subjects factor instead represents the measurement of a different dependent variable. I'll then take the pre and post conditions to be the two levels of my single within-subjects independent variable. To keep things concise, I'll only look at the first three levels of Hours.
So what I have for my data set is 16 subjects measured in two different conditions (pre and post) on 3 different dependent variables (1, 2, and 3).
library(car)  # also attaches carData, which provides OBrienKaiser
data <- subset(OBrienKaiser, select = c(pre.1, pre.2, pre.3, post.1, post.2, post.3))
car::Anova( )
To perform this analysis with Anova(), I have primarily relied on a combination of the documentation provided with car and the slightly more detailed examples found here...
http://socserv.mcmaster.ca/jfox/Books/Companion/appendix/Appendix-Multivariate-Linear-Models.pdf
First, define the within-subjects factor and create the data structure for the linear model.
condition <- as.factor(rep(c('pre','post'),each=3))
idata <- data.frame(condition)
data.model <- with(data,cbind(pre.1,pre.2,pre.3,post.1,post.2,post.3))
Next, define the multivariate-linear model.
mod.mlm <- lm(data.model ~ 1)
Finally, perform the MANOVA using a call to Anova() and print the results.
mav.car <- Anova(mod.mlm,idata=idata,idesign=~condition,type=3)
print(mav.car)
The output is...
Type III Repeated Measures MANOVA Tests: Pillai test statistic
            Df test stat approx F num Df den Df    Pr(>F)
(Intercept)  1   0.91438  160.189      1     15  2.08e-09 ***
condition    1   0.37062    8.833      1     15  0.009498 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
My issue here is that I don't think the DF have been properly calculated. I remember learning something about MANOVAs losing DF for each DV included in the analysis, but the DF here seem to be typical for a univariate ANOVA of the same design (i.e., if I didn't have multiple DVs). However, in trying to answer this question myself, I came across a PDF of a user manual for Stata (http://www.stata.com/manuals13/mvmanova.pdf). It presents a problem of measuring 4 DVs for each of 8 trees from 6 different rootstocks (i.e., N=48, one between-subjects factor with 6 levels, and 4 DVs). They state that for the one-way MANOVA, the DF...
"are just as they would be for an ANOVA. Because there are six rootstocks, we have 5 degrees of freedom for the hypothesis. There are 42 residual degrees of freedom and 47 total degrees of freedom."
stats::manova( )
This method actually comes from the answer to this posted question...
What is the best approach for this set-up: RM ANOVA / MANOVA / Mixed-Models?
...given by @Chris Novak. For demonstration, I'll use the same data set, but cast it to long format to accommodate the requirements of the stats::manova() function and rename it data2. I'll omit the actual casting, but the result looks like this...
>some(data2,4)
   Subject Condition V1 V2 V3
3        3       pre  5  6  5
16      16       pre  4  5  7
23       7      post  7  7  8
25       9      post  4  5  6
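For readers who want to reproduce data2, the omitted casting step could look something like the sketch below. This is an assumption about how the long-format data was built, not the code actually used: it stacks the pre rows on top of the post rows (which is consistent with row 23 belonging to subject 7 in the printout above) and stores Subject as a factor so that it can be used inside Error().

```r
library(car)  # also attaches carData, which provides OBrienKaiser

data <- subset(OBrienKaiser,
               select = c(pre.1, pre.2, pre.3, post.1, post.2, post.3))
n <- nrow(data)  # 16 subjects

# Assumed layout: all "pre" rows first, then all "post" rows.
data2 <- data.frame(
  Subject   = factor(rep(seq_len(n), times = 2)),
  Condition = factor(rep(c("pre", "post"), each = n),
                     levels = c("pre", "post")),
  V1 = c(data$pre.1, data$post.1),
  V2 = c(data$pre.2, data$post.2),
  V3 = c(data$pre.3, data$post.3)
)

str(data2)
```

Each subject then contributes exactly two rows, one per condition, with the three DVs side by side.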
Setting up the MANOVA using stats::manova() is very similar to setting up a typical repeated-measures ANOVA with that function.
mav.stat <- with(data2,manova(cbind(V1,V2,V3) ~ Condition + Error(Subject/Condition)))
The output looks like this:
Error: Subject
          Df Pillai approx F num Df den Df Pr(>F)
Residuals 15

Error: Subject:Condition
          Df  Pillai approx F num Df den Df  Pr(>F)
Condition  1 0.40717   2.9762      3     13 0.07066 .
Residuals 15
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
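As a sanity check on the two tables, the printed F values can be recovered from the printed Pillai statistics and DF alone. The sketch below assumes the standard result that, when the hypothesis has 1 DF, Pillai's trace V converts to an exact F via F = V / (1 - V) * den Df / num Df; pillai_to_F is a hypothetical helper name introduced only for this illustration.

```r
# Hypothetical helper: exact F from Pillai's trace V when the hypothesis has 1 df.
pillai_to_F <- function(V, num_df, den_df) V / (1 - V) * den_df / num_df

# manova() table: p = 3 DVs and 15 error df give
# num Df = p = 3 and den Df = 15 - p + 1 = 13.
F_manova <- pillai_to_F(0.40717, num_df = 3, den_df = 13)

# Anova() table: num Df = 1 and den Df = 15, exactly as printed.
F_car <- pillai_to_F(0.37062, num_df = 1, den_df = 15)

round(c(manova = F_manova, car = F_car), 3)
# manova: 2.976, car: 8.833
```

Both values agree with the printed tables to rounding, so each function is internally consistent. The two calls differ in how many response dimensions enter the test: three in the manova() call, but apparently only one in the Anova() call.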
The p-values are clearly different, as are the num Df and den Df used in the calculations. While I'm inclined to think that stats::manova() is the correct way of performing the within-subjects MANOVA, I'd like to know what I'm doing wrong in car::Anova() and how to correctly perform the MANOVA with car::Anova(). I'd also like to understand how the DF get treated/calculated in the computation of a within-subjects MANOVA. Thanks so much for the guidance.
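One possible cross-check on the DF, offered as a sketch and not as a definitive answer: with only two levels of the within-subjects factor, the design can be treated as a one-sample Hotelling T² test on the post minus pre difference scores. That equivalence is an assumption of this sketch. Under it, p = 3 DVs and n = 16 subjects give an F with p and n - p degrees of freedom, i.e. 3 and 13. The variable names below (D, dbar, S, T2, and so on) are introduced only for illustration.

```r
library(car)  # also attaches carData, which provides OBrienKaiser

data <- subset(OBrienKaiser,
               select = c(pre.1, pre.2, pre.3, post.1, post.2, post.3))

# Difference scores: one row per subject, one column per DV.
D <- as.matrix(data[, c("post.1", "post.2", "post.3")]) -
     as.matrix(data[, c("pre.1", "pre.2", "pre.3")])
n <- nrow(D)  # number of subjects
p <- ncol(D)  # number of DVs

dbar <- colMeans(D)  # mean difference vector
S    <- cov(D)       # covariance matrix of the differences

# One-sample Hotelling T^2 against a zero mean difference.
T2 <- n * drop(t(dbar) %*% solve(S) %*% dbar)

# Exact F transformation with df (p, n - p).
Fstat  <- (n - p) / (p * (n - 1)) * T2
pval   <- pf(Fstat, df1 = p, df2 = n - p, lower.tail = FALSE)
pillai <- T2 / (T2 + n - 1)  # Pillai's trace for a 1-df hypothesis

c(pillai = pillai, F = Fstat, num_df = p, den_df = n - p, p_value = pval)
```

If the assumed equivalence holds, this loses one denominator DF per additional DV (15, 14, 13 for 1, 2, 3 DVs), which matches the recollection about MANOVAs losing DF for each DV.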