Let's say I have outcome data at four time-points (baseline, 3 months, 6 months, 12 months) which I want to regress on an explicit time variable ($t_1 = 0$, $t_2 = 1$, $t_3 = 2$, $t_4 = 3$) to understand linear change.
I typically adjust for baseline differences in the outcome using a random intercept, e.g.:
$$Y_{it} = \beta_0 + \beta_1Time_{it} + U_i + e_{it} $$
where $i$ indexes subjects, $t$ indexes time, $\beta_0$ is the fixed intercept, $\beta_1$ is the slope on the explicit time variable, $U_i$ is the random intercept, and $e_{it}$ is the subject- and time-varying error.
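In case it helps, here is a minimal sketch of how I would fit that first model in Python with statsmodels' `MixedLM` (the file name and the columns `subject`, `time`, `y` are placeholders for my actual data):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Long-format data: one row per subject per time-point.
# Assumed columns: subject (ID), time (coded 0, 1, 2, 3), y (outcome).
df = pd.read_csv("outcome_long.csv")  # hypothetical file name

# Random-intercept model: Y_it = b0 + b1*Time_it + U_i + e_it
m1 = smf.mixedlm("y ~ time", data=df, groups=df["subject"])
fit1 = m1.fit()
print(fit1.summary())
```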
However, my supervisor adjusts for baseline differences by including the baseline measurement as a covariate in addition to a random intercept, e.g.:
$$Y_{it} = \beta_0 + \beta_1Time_{it} + \beta_2Baseline_i + U_i + e_{it} $$
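My supervisor's model would look something like the sketch below. I am assuming here (my assumption, not stated above) that each subject's baseline value is copied onto their follow-up rows and that the baseline row itself is dropped from the outcome, which is a common convention when baseline enters as a covariate:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same long-format data as before (hypothetical file and column names).
df = pd.read_csv("outcome_long.csv").sort_values(["subject", "time"])

# Carry each subject's t = 0 measurement onto all of their rows.
df["baseline"] = df.groupby("subject")["y"].transform("first")

# Keep only follow-up rows so the baseline value is not also the outcome.
followup = df[df["time"] > 0]

# Y_it = b0 + b1*Time_it + b2*Baseline_i + U_i + e_it
m2 = smf.mixedlm("y ~ time + baseline", data=followup,
                 groups=followup["subject"])
fit2 = m2.fit()
print(fit2.summary())
```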
I know that other people adjust for baseline variation in the outcome by including only the baseline measurement as a covariate, with no random intercept.
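As I understand it, that third approach would just be an ordinary regression on the follow-up rows with no subject-level random effect, something like this (again a sketch with hypothetical names; note it ignores the within-subject correlation across the follow-up time-points unless the standard errors are adjusted):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same hypothetical data preparation as above.
df = pd.read_csv("outcome_long.csv").sort_values(["subject", "time"])
df["baseline"] = df.groupby("subject")["y"].transform("first")
followup = df[df["time"] > 0]

# Baseline as a covariate, no random intercept: plain OLS on follow-ups.
m3 = smf.ols("y ~ time + baseline", data=followup)
fit3 = m3.fit()
print(fit3.summary())
```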
My questions are:
- Which of the above approaches is valid for adjusting for baseline differences (if any) and why?
- In particular, is it appropriate to adjust for baseline variation with a random intercept and no baseline covariate, and why?
- Do you have any references on the topic?