Suppose we have a naïve single-arm pretest-posttest design without a control group. Every subject has a `pretest` score and a `posttest` score. If we want to determine whether there is a significant difference between the means of `pretest` and `posttest`, and how large the difference (or effect size) is, we can carry out a paired t-test. However, there are potential confounding variables, such as `gender` and `age`. How can we control for them?
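For reference, the unadjusted paired t-test and one common standardized effect size (the mean difference divided by the standard deviation of the differences, often written d_z) can be sketched as follows. The scores are made-up illustrative values, and d_z is only one of several effect-size definitions used for paired data.

```python
import math
import numpy as np

# Hypothetical scores for 8 subjects (illustrative values only).
pretest = np.array([12.0, 15.0, 11.0, 18.0, 14.0, 16.0, 13.0, 17.0])
posttest = np.array([14.0, 18.0, 11.0, 21.0, 15.0, 19.0, 16.0, 18.0])

diff = posttest - pretest
n = diff.size
mean_diff = diff.mean()
sd_diff = diff.std(ddof=1)  # sample SD of the within-subject differences

# Paired t statistic with n - 1 degrees of freedom.
t_stat = mean_diff / (sd_diff / math.sqrt(n))

# Standardized mean difference of the paired scores (d_z).
d_z = mean_diff / sd_diff

print(f"mean difference = {mean_diff:.3f}")
print(f"t({n - 1}) = {t_stat:.3f}, d_z = {d_z:.3f}")
```

Note that the t statistic is just d_z multiplied by the square root of n, so neither quantity adjusts for `gender` or `age`.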
I have come up with three models:
- Transform `pretest` and `posttest` into a factor `test` (with `pre` and `post` as its possible values) and a continuous variable `score` (that is, reshape the data from wide to long format). Then build the mixed-effects model `score ~ test + gender + age + (1 | subject)`. If we omit the `gender` and `age` terms, the result (the slope of `test`) will be exactly the same as the result of a paired t-test (see Paired data comparison: regression or paired t-test?).
- Treat `posttest` as the response variable and `pretest` as a covariate. Build the model `posttest ~ pretest + gender + age` (see Repeated measure t test with covariates in R). If we have both a treatment group and a control group, and we want to measure the `treatment` effect, this (`posttest ~ treatment + pretest + gender + age`) will be the preferred way to build models (see Best practice when analysing pre-post treatment-control designs). However, what if we have no control group and just want to measure the pretest-posttest difference? And if we do use this model, what will the effect size be? (I assume the intercept can be used to calculate the effect size when the slope of `pretest` is 1; what if it isn't 1?)
- Build the model `(posttest - pretest) ~ 1 + I(pretest - mean(pretest)) + gender + age` according to this paper. The intercept can be used to calculate the effect size.
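To see how the second and third models relate, here is a minimal sketch that fits both by ordinary least squares on simulated data. Everything in it is an assumption made for illustration: the data-generating numbers, the 0/1 coding of `gender`, and the hypothetical helpers `ols` and `center`. It uses numpy's `lstsq` instead of R's `lm`, and it does not fit the mixed-effects model from the first item.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data; every number below is an illustrative assumption.
gender = rng.integers(0, 2, size=n).astype(float)  # 0/1 dummy coding
age = rng.normal(40.0, 10.0, size=n)
pretest = 50.0 + 3.0 * gender + 0.1 * age + rng.normal(0.0, 8.0, size=n)
posttest = (10.0 + 0.8 * pretest + 2.0 * gender + 0.05 * age
            + rng.normal(0.0, 5.0, size=n))


def ols(y, *columns):
    """Least-squares coefficients for y ~ 1 + columns (intercept first)."""
    X = np.column_stack([np.ones_like(y), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def center(x):
    """Subtract the sample mean from a variable."""
    return x - x.mean()


# Second model: posttest ~ pretest + gender + age
b2 = ols(posttest, pretest, gender, age)

# Third model: (posttest - pretest) ~ 1 + centered pretest + gender + age
change = posttest - pretest
b3 = ols(change, center(pretest), gender, age)

# Third model again, with gender and age mean-centered as well
b3c = ols(change, center(pretest), center(gender), center(age))

print("model 2 pretest slope           :", b2[1])
print("model 3 pretest slope + 1       :", b3[1] + 1.0)
print("raw mean change                 :", change.mean())
print("model 3 intercept (all centered):", b3c[0])
```

In this sketch the two models are the same fit written differently: the `pretest` slope of the third model is the slope of the second minus 1, and the `gender` and `age` coefficients coincide. The intercept of the third model is the adjusted mean change at the mean pretest score for the reference gender at age 0; only when `gender` and `age` are also centered does it equal the raw mean change.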
Which model is the most suitable one? And why? Thanks!