Is the statistical power of the Between-subjects effects in a Mixed ANOVA higher or the same compared to an independent samples t-test?

Question

We have data on a physiological variable of interest (Metabolic Cost of Walking) from 2 groups of subjects (10 young adults and 10 old adults). We measured each one of them twice, once in the morning and once in the afternoon of the same day. So we have multiple datapoints for every subject for the variable (20 datapoints of the variable for the 10 young adults and 20 datapoints of the same variable for the 10 old adults).

We are using a Mixed ANOVA to test both the within-subjects effect (AM vs PM) and the between-subjects effect (old vs young). Can this design achieve higher statistical power for the between-subjects effect, because we have 2 datapoints per subject, than an independent samples t-test (or its non-parametric counterpart) of old vs young on the same variable with 1 datapoint per subject, as in a typical experiment?

Note: We have not found an interaction effect between age (old-young) and the session (AM-PM) difference, when we checked for the within-subjects effects.

The answer would depend on whether you have found an interaction between age and the AM-PM difference, as discussed with respect to [your related question](https://stats.stackexchange.com/q/462521/28500). That is, is there a _single_ old-versus-young difference, or does it depend on time of day? Please add that information by editing this question, as if it goes in a comment it can be harder to find and might get lost. — EdM, Jun 12 '20 at 13:57

score 1 · Answer 1 · answered Jun 12 '20 at 15:39

First, two measurements per subject will be better than one. The mean of 2 independent observations on an individual has a standard error that is smaller by a factor of $1/\sqrt 2$ (that is, one-half the variance of the mean-value estimate) than what you would have with a single observation.

Assume that observations among individuals are independent and that the variances among observations within an individual are the same for all individuals. Groups are more easily distinguished if the within-group variances are smaller. Within a group, the variance of the estimated mean values will be the sum of the variance among the individuals' true mean values and the within-individual variance of the mean-value estimates. Taking two observations rather than one on each individual reduces that latter contribution to the within-group variance and thus improves power.
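As a sketch of that decomposition, under the assumptions just stated: the symbols $\sigma_b^2$ (variance among individuals' true mean values), $\sigma_w^2$ (within-individual variance of a single observation), $k$ (observations per individual), and $\Delta$ (true difference between group means) are notation introduced here for illustration, not taken from the question.

$$\operatorname{Var}(\bar{y}_i) = \sigma_b^2 + \frac{\sigma_w^2}{k}, \qquad d_k = \frac{\Delta}{\sqrt{\sigma_b^2 + \sigma_w^2/k}}$$

$$\frac{d_2}{d_1} = \sqrt{\frac{\sigma_b^2 + \sigma_w^2}{\sigma_b^2 + \sigma_w^2/2}}, \qquad 1 \le \frac{d_2}{d_1} \le \sqrt{2}$$

Here $\bar{y}_i$ is the mean of the $k$ observations on individual $i$ and $d_k$ is the standardized group difference. The ratio approaches 1 when $\sigma_w^2$ is negligible relative to $\sigma_b^2$ and approaches $\sqrt 2$ when $\sigma_b^2$ is negligible relative to $\sigma_w^2$.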

As to how much it matters: it depends. If the within-individual measurement variance is very small compared with the between-individual variance, the extra measurement won't matter much. But if the within-individual measurement variance is of similar magnitude to the between-individual variance, the extra measurement will help.
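To put rough numbers on "it depends", here is a minimal Python sketch. It assumes independent within-individual errors with a common variance, and it uses a normal approximation to power, which is optimistic for groups as small as 10. The function names and all numeric inputs are hypothetical and are not estimates from the study.

```python
from math import sqrt
from statistics import NormalDist


def effective_d(delta, sd_between, sd_within, k):
    """Standardized group difference when each subject's score is the
    mean of k independent measurements."""
    return delta / sqrt(sd_between ** 2 + sd_within ** 2 / k)


def approx_power(d, n_per_group, alpha=0.05):
    """Normal approximation to the power of a two-sided, two-sample test.
    It is optimistic for small groups, where the t distribution applies."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical, unit-free values chosen only for illustration.
    delta, sd_between, n = 1.0, 1.0, 10
    for sd_within in (0.1, 0.5, 1.0):
        for k in (1, 2):
            d = effective_d(delta, sd_between, sd_within, k)
            print(f"sd_within={sd_within} k={k} "
                  f"d={d:.2f} approx power={approx_power(d, n):.2f}")
```

Running it for a few `sd_within` values shows the pattern described above: the gain from `k=2` over `k=1` shrinks as `sd_within` becomes small relative to `sd_between`.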

Second, if there is a systematic effect of time-of-day on your outcome but no age:time-of-day interaction, including time-of-day as a predictor in your model would tend to reduce the remaining within-age-group variance. That would have to be traded off against the loss of a degree of freedom for estimating the time-of-day effect, which can be important with small studies. So any advantage of that two-predictor model over the model only considering age depends on the relative magnitudes of the age and time-of-day effects and your sample size.

Third, with a random effect for intercept in a model with age and time-of-day as additive effects, you are modeling yet another parameter: a normal-distribution variance of hypothetical baseline values among all your subjects. For example, that hypothetical could be the variance among individuals' morning measurements while correcting the old individuals' values for the age effect. How much might that help or hurt? Again, it depends on the data.
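One way to see how much it depends on the data is to simulate. The sketch below is a simplified Monte Carlo comparison, not a full mixed-model fit. It relies on the fact that, in a balanced design, the between-subjects test of a Mixed ANOVA is equivalent to a two-sample test on each subject's mean across sessions. All names and numeric inputs are hypothetical, the within-individual standard deviation is allowed to differ between groups, and the critical value is hard-coded for 10 subjects per group.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (10 + 10 subjects minus 2). Change it if the group sizes change.
T_CRIT_DF18 = 2.101


def pooled_t(x, y):
    """Equal-variance two-sample t statistic for mean(x) - mean(y)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se = (pooled_var * (1 / nx + 1 / ny)) ** 0.5
    return (statistics.mean(x) - statistics.mean(y)) / se


def simulate_subject(rng, group_mean, sd_between, sd_within, pm_shift):
    """Return hypothetical (AM, PM) measurements for one subject."""
    true_level = rng.gauss(group_mean, sd_between)
    am = true_level + rng.gauss(0.0, sd_within)
    pm = true_level + pm_shift + rng.gauss(0.0, sd_within)
    return am, pm


def estimate_power(delta, sd_between, sd_within_young, sd_within_old,
                   pm_shift=0.0, n_per_group=10, n_sim=2000,
                   t_crit=T_CRIT_DF18, seed=1):
    """Estimate power using the AM value only versus the AM/PM mean."""
    rng = random.Random(seed)
    hits_one = hits_two = 0
    for _ in range(n_sim):
        young = [simulate_subject(rng, 0.0, sd_between,
                                  sd_within_young, pm_shift)
                 for _ in range(n_per_group)]
        old = [simulate_subject(rng, delta, sd_between,
                                sd_within_old, pm_shift)
               for _ in range(n_per_group)]
        # One datapoint per subject: the morning measurement only.
        t_one = pooled_t([am for am, _ in old], [am for am, _ in young])
        # Two datapoints per subject: the subject's mean over sessions.
        t_two = pooled_t([(am + pm) / 2 for am, pm in old],
                         [(am + pm) / 2 for am, pm in young])
        hits_one += abs(t_one) > t_crit
        hits_two += abs(t_two) > t_crit
    return hits_one / n_sim, hits_two / n_sim


if __name__ == "__main__":
    # Hypothetical, unit-free inputs; replace with pilot-based values.
    one, two = estimate_power(delta=1.0, sd_between=0.7,
                              sd_within_young=0.5, sd_within_old=0.9)
    print(f"power, one measurement:  {one:.2f}")
    print(f"power, mean of two:      {two:.2f}")
```

Because an additive time-of-day shift (`pm_shift`) moves both groups' session means equally, it cancels in the comparison of subject means; an age-by-session interaction would need a different simulation.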

I get the point. In our case the within-individual variance is not very small and is present for both young and old adults. In fact, for old adults, it is quite substantial between the two sessions. So the extra measurement might help us, right? Also, I tried using G*Power to calculate the a priori power for the between factors of a Repeated measures ANOVA. From a previous meta-analysis we expect a Cohen's d of 0.99 for our variable between the two age-groups, which I translated to an effect size of 0.5 (a large effect). — Sauvik Das Gupta, Jun 14 '20 at 07:23
I chose a default alpha of 0.05 and kept the power (1-beta) at 0.80. Then I inserted 2 in the number of groups and 4 in the number of measurements (2 measurements for every subject for both the groups). I kept the Corr among rep measures at the default value of 0.5. Am I doing this correctly then? For this I get a total sample size of 22, which probably means I need 22 young and old adults in total to achieve an 80% power to detect this effect, right? — Sauvik Das Gupta, Jun 14 '20 at 07:28
@SauvikDasGupta larger within-individual variance for old versus young adults poses an extra problem (or opportunity, as that might be a finding on its own). Typical power calculations and statistical significance tests assume the same within-individual and within-group variances. You might be better off simulating data sets based on your meta-analysis and pilot data, and sampling in different ways from those simulated data sets. Once situations get complicated, as yours seems to be, simulation can be a better way than standard software programs to do power analysis for experimental design. — EdM, Jun 14 '20 at 17:12
While doing the statistical tests, I did check for equality of variances with Levene's test. Between AM and PM, this test didn't give any statistically significant results, so there were no deviations in variance between the two sessions. Also, as the repeated measure has only two levels, the sphericity assumption was always met. The larger within-individual variance for old compared to young is indeed a new observation for us, as we didn't hypothesize this prior to the experiment and also didn't have any prior reason from the literature to predict it. — Sauvik Das Gupta, Jun 16 '20 at 09:52
