I have a data frame with 10k rows on which I want to perform an ANCOVA so I can get adjusted means.
Please note that I've never done this before, so I have been jumping from one tutorial to another, but I still want to do it the right way.
So my model is Y ~ X * sex, with:
Y: the dependent variable (continuous)
X: the continuous independent variable
sex: the discrete independent variable (here, the sex)
Following this tutorial, I could calculate the mean of Y adjusted for X for each sex:
model <- aov(Y ~ sex * X, data = x)
data.predict <- data.frame(sex = c("Male", "Female"), X = mean(x$X, na.rm = TRUE))
data.frame(data.predict, Y = predict(model, data.predict))
This gives realistic results, but I realized that anova(aov(Y~sex*X, data=x)) and anova(aov(Y~X*sex, data=x)) give very different results. The calculated means are the same with both models, though.
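To illustrate the behaviour described above, here is a minimal sketch on simulated, deliberately unbalanced data. The data frame `d`, the group sizes, and the coefficients are all made up for illustration and are not my real data. As far as I understand, `anova()` on an `aov` fit reports sequential (Type I) sums of squares, so each term is tested after the terms listed before it, while the fitted model itself does not depend on the order.

```r
# Simulated, deliberately unbalanced data (hypothetical; not the real dataset)
set.seed(1)
n_male <- 300
n_female <- 100
d <- data.frame(
  sex = factor(rep(c("Male", "Female"), times = c(n_male, n_female)))
)
# X is shifted between sexes, so X and sex are correlated
d$X <- rnorm(nrow(d), mean = ifelse(d$sex == "Male", 12, 10), sd = 2)
d$Y <- 1 + 0.5 * d$X + ifelse(d$sex == "Male", 2, 0) + rnorm(nrow(d))

# Same model, terms written in a different order
m_sex_first <- aov(Y ~ sex * X, data = d)
m_x_first   <- aov(Y ~ X * sex, data = d)

# Sequential (Type I) tables: the rows for sex and X depend on the order
tab_sex_first <- anova(m_sex_first)
tab_x_first   <- anova(m_x_first)
print(tab_sex_first)
print(tab_x_first)

# Adjusted means: predictions at the overall mean of X, one per sex
new_d <- data.frame(sex = c("Male", "Female"), X = mean(d$X, na.rm = TRUE))
adj_sex_first <- predict(m_sex_first, new_d)
adj_x_first   <- predict(m_x_first, new_d)

# Compare the adjusted means from the two fits
all.equal(unname(adj_sex_first), unname(adj_x_first))
```

In this sketch the residual and interaction rows agree between the two tables; only the rows for sex and X change, because each is adjusted for whatever comes before it.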
After reading EdM's answer at https://stats.stackexchange.com/a/213358/81974, I tried the car package with Anova(model, type="III"), and this time both give the same results.
I don't really understand how the order could matter, but it seems that my data are unbalanced (the "Note" in the aov help says that the results could be misleading in that case).
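A quick way to check for imbalance could look like the sketch below. It uses a simulated stand-in called `x_demo` (a hypothetical name with made-up values), since I can't share the real data frame; the same two calls would be applied to `x`.

```r
# Hypothetical stand-in for the real data frame x (values are simulated)
set.seed(42)
x_demo <- data.frame(
  sex = factor(rep(c("Male", "Female"), times = c(70, 30)))
)
x_demo$X <- rnorm(nrow(x_demo), mean = ifelse(x_demo$sex == "Male", 12, 10), sd = 2)

# Group sizes: unequal counts mean the design is unbalanced
counts <- table(x_demo$sex)
print(counts)

# Mean of the covariate per sex: if these differ, X and sex are correlated,
# and the sequential sums of squares for X and sex depend on their order
x_means <- tapply(x_demo$X, x_demo$sex, mean, na.rm = TRUE)
print(x_means)
```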
Knowing this, are the previously calculated adjusted means still usable?