
I have submitted a paper in which I performed a $t$-test between two groups, A and B (coded as a dummy variable), to compare the mean of an Inattention scale derived from a confirmatory factor analysis (CFA) of a self-reported ADHD scale. We observed a significant difference between the two groups (one-way $t$-test, $p < 0.001$). The reviewer asks that we control for the effect of age in our study. Could you tell me how to control for age?

I can use Python, JASP, or jamovi to conduct the data analysis.

COOLSerdash
fredooms

2 Answers


You can do this using a linear regression model.

The model would have the following structure:

$$ y_i = \beta_0 + \beta_1\cdot\text{Group}_{i} + \beta_2\cdot\text{Age}_i + \epsilon_i $$

Here, $y_i$ is the Inattention score of individual $i$, $\text{Group}$ is the dummy variable for the group, and $\beta_1$ estimates the mean difference between the groups adjusted (controlled) for potential age differences between them. A model containing both categorical and continuous predictors is sometimes called an analysis of covariance (ANCOVA).

The model above assumes a linear relationship between age and the outcome. If you suspect that age is nonlinearly related to the outcome, I'd recommend including age using restricted cubic splines (aka natural splines) with between 3 and 5 knots, depending on your sample size. This allows for a very flexible adjustment for age.
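As a rough illustration of the spline adjustment, here is a minimal Python sketch. It assumes `statsmodels` with its patsy formula backend, where `cr()` builds a natural cubic regression spline basis. The data are simulated purely for illustration, and the column names (`inattention`, `group_a`, `age`), the effect sizes, and the choice `df=4` are all assumptions, not taken from the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data for illustration only (not the asker's data).
rng = np.random.default_rng(0)
n = 300
data = pd.DataFrame({
    "group_a": rng.integers(0, 2, size=n),  # dummy variable: 1 = group A
    "age": rng.uniform(18, 65, size=n),
})
# Hypothetical outcome: group effect of 0.5 plus a curved age effect.
data["inattention"] = (
    0.5 * data["group_a"]
    + 0.002 * (data["age"] - 40) ** 2
    + rng.normal(0, 1, size=n)
)

# Linear adjustment for age (the model written out above).
linear_fit = smf.ols("inattention ~ group_a + age", data=data).fit()

# Flexible adjustment: natural cubic spline basis for age.
# constraints='center' keeps the basis from being collinear with the intercept.
spline_fit = smf.ols(
    "inattention ~ group_a + cr(age, df=4, constraints='center')",
    data=data,
).fit()

# Age-adjusted group difference under each specification.
print("linear age adjustment:", linear_fit.params["group_a"])
print("spline age adjustment:", spline_fit.params["group_a"])
```

In both fits the coefficient on `group_a` is the adjusted group difference; only the way age enters the model changes.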

COOLSerdash

I thought I'd complement COOLSerdash's answer by showing what this actually entails in Python (spoiler alert: it's quite simple!).

In a nutshell, you're basically going from this...

    import statsmodels.formula.api as smf

    # `data` is a pandas DataFrame with columns `inattention` and `group_a`
    model1 = smf.ols('inattention ~ group_a', data=data).fit()
    model1.summary()

(where `group_a` is a dummy variable indicating whether the individual belongs to group A)

...to this:

    import statsmodels.formula.api as smf

    # Same DataFrame as before, now also using its `age` column
    model2 = smf.ols('inattention ~ group_a + age', data=data).fit()
    model2.summary()

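That's really all it means to control for age: the coefficient on `group_a` in `model2` is the estimated difference between the groups at a given age. Also, note that the inferences from `model1` are equivalent to those from a two-sample $t$-test that assumes equal variances (Student's, not Welch's).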
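To make the equivalence concrete, here is a small sketch on simulated data (the numbers and effect sizes are invented for illustration; the column names follow the snippets above). It compares `scipy.stats.ttest_ind` with `equal_var=True` against `model1`, and then reads the age-adjusted difference off `model2`.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Simulated data for illustration only.
rng = np.random.default_rng(1)
n = 200
data = pd.DataFrame({
    "group_a": rng.integers(0, 2, size=n),  # dummy variable: 1 = group A
    "age": rng.uniform(18, 65, size=n),
})
data["inattention"] = (
    0.5 * data["group_a"] + 0.02 * data["age"] + rng.normal(0, 1, size=n)
)

# Classic two-sample t-test with pooled (equal) variances.
in_a = data.loc[data["group_a"] == 1, "inattention"]
not_a = data.loc[data["group_a"] == 0, "inattention"]
t_res = stats.ttest_ind(in_a, not_a, equal_var=True)

# Regression on the dummy alone: same t statistic and p-value.
model1 = smf.ols("inattention ~ group_a", data=data).fit()
print("t-test:", t_res.statistic, t_res.pvalue)
print("model1:", model1.tvalues["group_a"], model1.pvalues["group_a"])

# Adding age: the group_a coefficient is now the age-adjusted difference.
model2 = smf.ols("inattention ~ group_a + age", data=data).fit()
print("adjusted difference:", model2.params["group_a"])
print("95% CI:", model2.conf_int().loc["group_a"].tolist())
```

In the `model2` output, the `group_a` row is the one the reviewer is asking about: its coefficient, confidence interval, and p-value describe the group difference after accounting for age.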

Adrià Luz
  • Thank you for your much-appreciated help. Could you tell me how to interpret the results? – fredooms Sep 08 '21 at 12:08
  • @fredooms I don't know what your data and your results look like. I'll be happy to help if you edit your original question to show a sample of your data and the output of `model2.summary()`. – Adrià Luz Sep 08 '21 at 16:56
  • I don't know Python ... maybe you could also show how to control for a possible nonlinear age effect using natural splines? – kjetil b halvorsen Sep 27 '21 at 15:08