I have paired data (GWAS case/control study) and I have heard that conditional logistic regression or generalized linear mixed models (GLMM) are appropriate. Which should I use in this case, and why would you use one over the other? More importantly, can you point me towards resources for doing these methods in R? I'm finding a lot of material for SAS, which I do not prefer. I can provide more details if necessary.
1 Answer
Conditional logistic regression uses fixed effects (in the econometric sense):
$$ \operatorname{logit}(p_{ij})=\boldsymbol x_{ij}^{\prime}\boldsymbol\beta+u_i, $$
where each pair of subjects has its own intercept ($u_i$). It can be fitted with `clogit()` from the `survival` package or `clogistic()` from the `Epi` package.

Generalized linear mixed models (GLMM) for binary data can use link functions such as `logit`, `probit`, and `cloglog`. The mixed logistic regression model is
$$ \operatorname{logit}(p_{ij})=\boldsymbol x_{ij}^{\prime}\boldsymbol\beta+\boldsymbol z_{ij}^{\prime}\boldsymbol u_i, $$
where the $\boldsymbol u_i$ are random effects with an assumed distribution (e.g. normal). You can of course use a random-intercept model, i.e. $\boldsymbol z_{ij}^{\prime}=1$ and $\boldsymbol u_i$ a scalar. A GLMM can be estimated with `glmer()` from the `lme4` package.

As to the choice between conditional logistic regression and GLMM for binary data, some people favor conditional (fixed-effects) logistic regression and GLMM with a `probit` link, but argue against fixed-effects `probit` or GLMM with a `logit` link. The reason may be that some of the consistency properties break down, especially when the within-cluster sample size is small ($n_i=2$ in your case).

You can find a clarification of fixed effects and random effects (and marginal models) in different contexts here.
-
How can case-control data even be analyzed with a GLMM? Isn't one assumption that observations are independent conditional on the strata? Case-control sampling destroys this assumption... – Maverick Meerkat Sep 07 '20 at 13:59