I have a dataset of about 300 patients, of whom 100 have repeated observations of the outcome (2-6 observations per patient) and 200 have only 1 observation of the outcome.
To determine the risk factors for change in the outcome over time/age, I have fitted a linear mixed model (with a random intercept per patient as the only random effect) using age as the time variable, which is also the only time-varying variable. Additionally, the model contains interactions between age and a few putative risk factors (also as fixed effects). Ultimately I am interested in these interaction effects with age, as they tell me what influences the progression of my outcome per year of increase in age. An example of how it is coded in R (using lme from the nlme package) is lme(fixed = outcome ~ age + sex*age + smoking*age, random = ~1|patient_id, data = data, method = "REML").
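Written out, I believe the model fitted by the call above is the following. This is only a sketch: it assumes sex and smoking are each coded as a single 0/1 indicator, and the symbols (i for patient, j for observation, the β's, b_i, σ_b, σ) are my own notation rather than anything lme reports under those names.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta_2\,\mathrm{sex}_i + \beta_3\,\mathrm{smoking}_i \\
       &\quad + \beta_4\,(\mathrm{sex}_i \times \mathrm{age}_{ij}) + \beta_5\,(\mathrm{smoking}_i \times \mathrm{age}_{ij}) + b_i + \varepsilon_{ij}, \\
b_i &\sim N(0, \sigma_b^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2).
\end{aligned}
```

Here β4 and β5 are the interaction effects with age that I am interested in.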
As I'm only interested in how my outcome changes with increasing age, it seems logical to include only patients with multiple measurements. However, when I ran one model including all patients and one including only patients with repeated measurements, the latter turned out to be less precise (i.e. wider confidence intervals) and to have vastly different intercepts (which seems logical, as a patient with only 1 measurement does have an intercept, and these might differ). A few of the associations also differ in significance level (but most are similar).
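For anyone who wants to reproduce this kind of comparison, here is a minimal sketch on simulated data (not my real dataset). The data-generating values, the object names (dat, dat_rep, fit_all, fit_rep, comparison) and the 0/1 coding of sex and smoking are all assumptions made purely for illustration. Only the lme call mirrors the model described above.

```r
library(nlme)

set.seed(1)  # simulated data only; all values below are made up for illustration

# 100 patients with 2-6 observations, 200 patients with a single observation
n_obs <- c(sample(2:6, 100, replace = TRUE), rep(1, 200))
n_pat <- length(n_obs)

patient_id <- rep(seq_len(n_pat), times = n_obs)
sex        <- rep(rbinom(n_pat, 1, 0.5), times = n_obs)   # assumed 0/1 coding
smoking    <- rep(rbinom(n_pat, 1, 0.3), times = n_obs)   # assumed 0/1 coding
base_age   <- rep(runif(n_pat, 40, 70), times = n_obs)    # age at first visit
age        <- base_age + 2 * (sequence(n_obs) - 1)        # visits 2 years apart
b          <- rep(rnorm(n_pat, 0, 2), times = n_obs)      # random intercepts

outcome <- 10 + 0.3 * age + 1 * sex + 0.5 * smoking +
  0.05 * sex * age + 0.1 * smoking * age + b + rnorm(length(age), 0, 1)

dat <- data.frame(patient_id = factor(patient_id), age, sex, smoking, outcome)

# Model 1: all patients, including those with a single observation
fit_all <- lme(fixed = outcome ~ age + sex * age + smoking * age,
               random = ~ 1 | patient_id, data = dat, method = "REML")

# Model 2: only patients with at least two observations
repeated_ids <- names(which(table(dat$patient_id) > 1))
dat_rep <- droplevels(dat[dat$patient_id %in% repeated_ids, ])
fit_rep <- lme(fixed = outcome ~ age + sex * age + smoking * age,
               random = ~ 1 | patient_id, data = dat_rep, method = "REML")

# Fixed-effect estimates and standard errors of both models, side by side
tab_all <- summary(fit_all)$tTable
tab_rep <- summary(fit_rep)$tTable
comparison <- cbind(est_all = tab_all[, "Value"],
                    se_all  = tab_all[, "Std.Error"],
                    est_rep = tab_rep[, "Value"],
                    se_rep  = tab_rep[, "Std.Error"])
print(round(comparison, 3))
```

The rows of comparison are the fixed effects (intercept, main effects, and the two interactions with age), so the estimates and standard errors of the two models can be read off next to each other.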
My question is: how does adding patients with only 1 measurement affect the associations I find, other than the model intercept, compared with a model that does not include them? How would they contribute to the different main (fixed) effects?
Edit: I've found this post (Can I fit a mixed model with subjects that only have 1 observation) in which @Macro explains that subjects with only 1 observation should be included, as they 'contribute to estimating the mean structure'. I'd like to understand a bit more about why and how adding these observations influences the results.