What are the criteria for including and excluding variables in longitudinal models?

Question

I am working with longitudinal data with two time points, 2018 and 2020. I am modeling two outcomes: BMI (continuous) and anxiety disorder (binary). My main interest is how these outcomes evolve over time. My covariates include time (2018 or 2020), age, gender, region, … For BMI I am fitting a linear mixed model; for anxiety disorder I am fitting a GEE model.

Since I am interested in the evolution over time, should I include only the interactions of the covariates with time, or should I also include the covariates on their own (as main effects)? Should I drop all the interactions that are not significant, or report them as well?

Thank you in advance

score 6 · Answer 1 · answered May 18 '21 at 18:41

I would suggest the following approach:

Write down all the variables you have and draw a DAG (causal diagram) which shows the proposed inter-relationships between all the variables. This should be informed by expert knowledge of the subject domain.
You should potentially include confounders and competing exposures, and exclude mediators. Be careful of over-adjustment for confounders, and also of collider bias. The DAG will inform which variables to include. A very useful online tool to draw a DAG and obtain the minimal sufficient set of variables to adjust for is: http://www.dagitty.net/
For interactions, the DAG will not help, so again you need to use expert knowledge to decide whether interactions between certain variables are likely. If they are likely, then you should include them. Presumably you did a power study before collecting the data, so whatever variables and interactions you used in the power study should be in the analysis model.
Avoid relying on p-values, especially if a variable or an interaction is indicated by expert knowledge.
If you include an interaction then also include the main effects for that interaction. Similarly, if you include a 3-way interaction then also include the 2-way interactions (and the main effects). Similarly for 4-way interactions, though once you get to a 4-way interaction the model becomes very difficult to interpret.
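
As a rough illustration of the last point, the sketch below expands a set of chosen interactions into the full list of terms the hierarchy rule implies. The helper names (`expand_hierarchy`, `to_formula`) and the column names (`bmi`, `time`, `age`, `gender`, `region`) are assumptions made for this example only, and the output is just a formula-style string, not a fitted model. Formula interfaces such as R's or statsmodels' already do this expansion when you write `a*b`, which stands for `a + b + a:b`.

```python
from itertools import combinations


def expand_hierarchy(interactions):
    """Return every term implied by the given interactions, lowest order first.

    Each interaction is a tuple of variable names, e.g. ("time", "age").
    A one-element tuple is simply a main effect.
    """
    terms = set()
    for interaction in interactions:
        variables = sorted(set(interaction))
        # Add every non-empty subset: main effects, 2-way terms, and so on.
        for order in range(1, len(variables) + 1):
            for combo in combinations(variables, order):
                terms.add(combo)
    # Main effects first, then higher-order terms, alphabetical within an order.
    return sorted(terms, key=lambda term: (len(term), term))


def to_formula(outcome, interactions, extra_main_effects=()):
    """Build a formula-style string that respects the hierarchy rule."""
    all_terms = list(interactions) + [(v,) for v in extra_main_effects]
    rhs = " + ".join(":".join(term) for term in expand_hierarchy(all_terms))
    return f"{outcome} ~ {rhs}"


# Hypothetical choice: time-by-age and time-by-gender interactions,
# plus region as a main effect only.
formula = to_formula(
    "bmi",
    [("time", "age"), ("time", "gender")],
    extra_main_effects=["region"],
)
print(formula)

# A 3-way interaction pulls in three main effects, three 2-way terms and itself.
three_way_terms = expand_hierarchy([("time", "age", "gender")])
print(len(three_way_terms))
```

Which interactions go into the list is still a subject-matter decision; the code only makes sure that no lower-order term is left out once that decision has been made.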

For a detailed discussion about DAGs and how they can help inform which variables to include/exclude and avoid biases due to confounding, mediation, colliders and more, see this question and answer:

How do DAGs help to reduce bias in causal inference?
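
To make the roles mentioned above more concrete, here is a minimal sketch that labels each covariate in a toy DAG by its ancestry relative to an exposure and an outcome. The graph, the node names and the `classify` helper are all hypothetical. The classification is deliberately crude: it is not the back-door criterion that a tool such as dagitty applies, and it can mislabel nodes in larger graphs, so treat it as a reading aid and not as a replacement for drawing the DAG properly.

```python
def reachable(edges, start):
    """Return all nodes reachable from `start` by following directed edges."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in children.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def classify(edges, exposure, outcome):
    """Label each other node by a simplified, ancestry-based rule."""
    reversed_edges = [(child, parent) for parent, child in edges]
    desc_x = reachable(edges, exposure)           # descendants of the exposure
    desc_y = reachable(edges, outcome)            # descendants of the outcome
    anc_x = reachable(reversed_edges, exposure)   # ancestors of the exposure
    anc_y = reachable(reversed_edges, outcome)    # ancestors of the outcome

    nodes = {n for edge in edges for n in edge} - {exposure, outcome}
    roles = {}
    for node in sorted(nodes):
        if node in desc_x and node in anc_y:
            roles[node] = "mediator"             # on a path exposure -> ... -> outcome
        elif node in desc_x and node in desc_y:
            roles[node] = "collider"             # common effect of exposure and outcome
        elif node in anc_x and node in anc_y:
            roles[node] = "confounder"           # common cause of exposure and outcome
        elif node in anc_y:
            roles[node] = "competing exposure"   # causes the outcome only
        else:
            roles[node] = "other"
    return roles


# Hypothetical DAG: X is the exposure, Y the outcome.
toy_edges = [
    ("C", "X"), ("C", "Y"),   # C causes both
    ("X", "M"), ("M", "Y"),   # M lies between X and Y
    ("X", "K"), ("Y", "K"),   # K is caused by both
    ("Z", "Y"),               # Z causes Y only
    ("X", "Y"),
]
roles = classify(toy_edges, exposure="X", outcome="Y")
for node, role in roles.items():
    print(node, role)
```

Under the advice in the answer, `C` and possibly `Z` would be candidates for adjustment in this toy graph, while `M` and `K` would be left out.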

@ilyas does this answer your question? If so, please mark it as the accepted answer; if not, please let us know why. I appreciate there has been some discussion on your subsequent question, but if there is anything else to add, let us continue it here. — Robert Long, Jun 04 '21 at 14:51
