I have one continuous, time-dependent independent variable X, measured repeatedly (1 to 4 times) in each patient over a period of time. My dependent variable Y is binary and is observed once, at the end of that period (e.g. a diagnosis at the end of the 3 months during which X was measured).
My first step would be to model the time dependency of X properly with a GLMM, using random effects to account for the repeated measures, but this would fail because there are too few measurements per patient (25% have only 1 measure).
So I am somewhere in between:
1.) a simple logistic regression with only one data point per patient (or an aggregated version of X), which is a shame in terms of power loss, and
2.) a proper model of X with random effects accounting for the repeated measures.
I have read this post, which seems to cover a related topic: "Which statistical model to use when trying to find the beginning of a time-dependent increase".
Is it sound to fit a logistic regression that first treats all observations as independent, even though most of them are not, and then adjust the standard errors with a function like robcov() from the rms package? Can this approach give "adjusted" logistic regression coefficients and a logistic regression model that can be used to make new predictions?
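To make the question concrete, here is a minimal sketch of what I have in mind. The data are simulated and the column names (id, x, y) are hypothetical placeholders, not my real dataset; the only rms functions used are lrm(), robcov() and predict().

```r
library(rms)

set.seed(1)

# Simulated long-format data (assumption: one row per measurement,
# hypothetical columns id, x, y; y is repeated on every row of a patient)
n_pat <- 200
n_obs <- sample(1:4, n_pat, replace = TRUE)        # 1 to 4 measures per patient
id    <- rep(seq_len(n_pat), times = n_obs)
u     <- rnorm(n_pat)                               # patient-level component of X
x     <- u[id] + rnorm(length(id), sd = 0.5)        # repeated measures of X
y_pat <- rbinom(n_pat, 1, plogis(-0.5 + 0.8 * u))   # one binary outcome per patient
d     <- data.frame(id = id, x = x, y = y_pat[id])

# Ordinary logistic regression treating all rows as independent.
# x = TRUE, y = TRUE store the design matrix and response that robcov() needs.
fit <- lrm(y ~ x, data = d, x = TRUE, y = TRUE)

# Replace the covariance matrix with a cluster sandwich estimate,
# using the patient identifier as the cluster
fit_rob <- robcov(fit, cluster = d$id)

# Compare coefficients and standard errors before and after the adjustment
rbind(naive = coef(fit), robust = coef(fit_rob))
rbind(naive = sqrt(diag(vcov(fit))), robust = sqrt(diag(vcov(fit_rob))))

# Predicted probabilities for new values of X
predict(fit_rob, newdata = data.frame(x = c(-1, 0, 1)), type = "fitted")
```

With my real data, d would be the long-format table and id the patient identifier.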
Thanks in advance.