I'm trying to build a linear mixed model for 5 outcome variables ...
- Cholesterol 1, Cholesterol 2, Cholesterol 3, Cholesterol 4, Cholesterol 5
which will be melted into a single Cholesterol variable, since statsmodels does not support multivariate LMMs so far.
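To make the melting step concrete, here is a minimal sketch of what I mean, using pandas. The column names (`subject_id`, `Cholesterol_1` to `Cholesterol_5`, `measurement`) and the values are made up for illustration and are not my real data.

```python
import pandas as pd

# Hypothetical wide table: one row per subject, one column per cholesterol outcome.
wide = pd.DataFrame({
    "subject_id": [1, 2, 3],
    "Cholesterol_1": [180.0, 210.0, 195.0],
    "Cholesterol_2": [182.0, 205.0, 199.0],
    "Cholesterol_3": [178.0, 215.0, 190.0],
    "Cholesterol_4": [185.0, 208.0, 197.0],
    "Cholesterol_5": [181.0, 212.0, 193.0],
})

chol_cols = [f"Cholesterol_{i}" for i in range(1, 6)]

# Melt to long format: one row per (subject, outcome) pair,
# with all five outcomes stacked into a single "Cholesterol" column.
long = wide.melt(
    id_vars=["subject_id"],
    value_vars=chol_cols,
    var_name="measurement",
    value_name="Cholesterol",
)

print(long.head())
```

After melting, each subject appears five times, so the rows are no longer independent of each other.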
The independent variables are 38 specific pathogenetic features built from GenePy scores.
I have to correct for the following confounders: Age, Sex, Group, Alcohol, Smoking and Levodopa treatment. All of them might affect the Cholesterol outcome. Sex, Group and Levodopa treatment are binary categorical variables (0 or 1).
My question is: how do I properly set up the equation for my model and express it in statsmodels syntax?
My guess so far: I treat the 38 specific pathogenetic features as fixed effects and the confounders as random effects. All categorical confounders go into the "groups" option of the statsmodels call.
Based on the statsmodels syntax:
model = sm.MixedLM.from_formula("Cholesterol ~ pathogenetic_feature_1 + pathogenetic_feature_2 + ... + pathogenetic_feature_38", data, re_formula="~Age+Alcohol+Smoking", groups=data["Group,Sex,Levodopa"])
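For comparison, here is a small runnable sketch of a different setup that I am unsure about: the confounders enter as fixed effects next to the features, and the grouping variable is the subject, so each subject gets a random intercept across its five melted measurements. Everything in it is an assumption for illustration: the `subject_id` column, the stand-in features `f1` and `f2` (instead of my 38 real ones), the reduced set of confounders, and the synthetic data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_meas = 40, 5

# Hypothetical subject-level table (stand-in for features and confounders).
subj = pd.DataFrame({
    "subject_id": np.arange(n_subjects),
    "f1": rng.normal(size=n_subjects),
    "f2": rng.normal(size=n_subjects),
    "Age": rng.normal(60, 8, size=n_subjects),
    "Sex": rng.integers(0, 2, size=n_subjects),
})

# Long format: repeat every subject row once per melted measurement.
data = subj.loc[subj.index.repeat(n_meas)].reset_index(drop=True)

# Synthetic outcome: feature effect + per-subject offset + noise.
subject_effect = rng.normal(0, 10, size=n_subjects)
data["Cholesterol"] = (
    200
    + 5 * data["f1"]
    + subject_effect[data["subject_id"].to_numpy()]
    + rng.normal(0, 5, size=len(data))
)

# Fixed effects: features and confounders; random intercept per subject.
model = smf.mixedlm(
    "Cholesterol ~ f1 + f2 + Age + C(Sex)",
    data,
    groups=data["subject_id"],
)
result = model.fit()
print(result.summary())
```

In this version `groups` takes a single column that identifies which rows belong together, rather than a combination of categorical confounders.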
Is my guess correct, or is it nonsense? I'm a rookie in this topic and apologize for my weak understanding of it. Thanks so much in advance!