Fitting a linear mixed model on repeated measurements data, do I in- or exclude patients with only 1 observation?

Question

I have a dataset of about 300 patients of which 100 have repeated observations of the outcome (2-6 observations per patient) and 200 have only 1 observation of the outcome.

To determine what the risk factors are for change in the outcome over time/age, I have fitted a linear mixed model (with a random intercept per patient as the only random effect) using age as the time variable, which is also the only time-varying variable. Additionally, the model contains interaction effects with age of a few putative risk factors (also as fixed effects). Ultimately I am interested in the interaction effects with age, as these tell me what influences the progression of my outcome per year of increase in age. An example of how it is coded in R is `lme(fixed = outcome ~ age + sex*age + smoking*age, random = ~1|patient_id, data = data, method = "REML")`.

As I'm only interested in investigating the change in my outcome over increasing age, it feels logical to only include patients with multiple measurements. However, when I ran one model including all patients and one including only patients with repeated measurements, it became evident that the latter is less precise (i.e. larger confidence intervals), has vastly different intercepts (this feels logical, as a patient with only 1 measurement dóés have an intercept and so these might differ), and a few of the significance levels of associations differ (but most are similar).

My question is: how does adding patients with only 1 measurement affect the associations I find other than the model intercept, when comparing to a model that does nót include them? How would they contribute to different main (fixed) effects?

Edit: I've found this post (Can I fit a mixed model with subjects that only have 1 observation) in which @Macro explains that subjects with 1 observation should be included, as they 'contribute to estimating the mean structure'. I'd like to understand a bit more about why and how adding these observations influences the results.

You could start by fitting models with/without those 200 1-obs patients, and show us ... — kjetil b halvorsen, Feb 15 '22 at 03:12
I did! That's what I was trying to say in the third paragraph: "it became evident that the latter is less precise (i.e. larger confidence intervals), has vastly different intercepts (this feels logical, as a patient with only 1 measurement dóés have an intercept and so these might differ), and a few of the significance levels of associations differ (but most are similar)". However, I'm trying to understand *why* this happens. — tcvdb1992, Feb 15 '22 at 10:48

score 1 · Accepted Answer · answered Feb 15 '22 at 11:04

Let's denote the group of patients with only one observation with S, and the group of those with more than one observation with M.

Since you have modeled only the intercept as a random effect, and it varies by patient, you are effectively fitting a separate line for each patient. The lines of two patients with the same sex and smoking status must be parallel and can differ only in their intercept. So, for 300 patients, you have 300 (probably mostly different) lines that are parallel within each covariate group.
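To make this concrete, here is a sketch of the model in formula form. The notation is introduced here for illustration and is not taken from the question: $i$ indexes patients, $j$ indexes a patient's observations, and $x_i$ collects the time-constant risk factors (sex, smoking). The normal distributions are the standard assumptions behind a model fitted with `lme`.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{age}_{ij} + x_i^\top \gamma
         + \mathrm{age}_{ij}\, x_i^\top \delta + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

The slope over age for patient $i$ is $\beta_1 + x_i^\top \delta$, which contains no patient-specific random term. The random intercept $b_i$ only shifts the line up or down.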

Now, a line for a patient in S can have any intercept and any slope and still perfectly fit the single measurement. But the model learns from the slopes of the patients from M what a good slope would be. Now, if you use this slope for all 300 patients, you will get 300 intercepts, because, for a given slope, the intercept for patients from S is now "determined", too.

The decisive point is that the model "doesn't like" the random effects (in your case the intercepts) to differ too much: solutions with strongly varying intercepts between patients are penalized. Thus, because of the common slope, the intercepts of patients in S also influence the individually fitted intercepts of all patients, including those of patients in M. There will usually be some kind of "common interception region" for all the lines.
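One way to write down this penalty, as a sketch under the assumption that the variance components are treated as known, is the joint criterion that the fixed-effect estimates and the predicted random intercepts minimize. The symbols are notation introduced here: $\mu_{ij}(\beta)$ is the fixed-effects prediction for observation $j$ of patient $i$, $b_i$ is that patient's random intercept, $\sigma^2$ is the residual variance and $\sigma_b^2$ is the random-intercept variance.

```latex
\min_{\beta,\, b_1, \dots, b_n} \;
\sum_{i} \sum_{j} \frac{\bigl(y_{ij} - \mu_{ij}(\beta) - b_i\bigr)^2}{\sigma^2}
\;+\; \sum_{i} \frac{b_i^2}{\sigma_b^2}
```

The second sum is the penalty: every patient, including each patient in S, pays for an intercept deviation $b_i$ that is large relative to $\sigma_b$. In practice `lme` also estimates the variances themselves (by REML in the question's call), so this criterion describes only part of the fitting procedure.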

This "soft constraint" on the intercepts in turn influences the common slope, since each patient's line must not only fit the slope of its data but also not stray too far from the common interception region.
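A complementary way to see that a single observation carries information about the fixed effects is to integrate out the random intercept. As a sketch in notation introduced here ($\mu_{i1}(\beta)$ is the fixed-effects prediction at the patient's age and risk factors, $\sigma_b^2$ and $\sigma^2$ are the random-intercept and residual variances), the one observation of a patient in S then has the marginal distribution

```latex
y_{i1} \sim N\bigl(\mu_{i1}(\beta),\; \sigma_b^2 + \sigma^2\bigr)
```

Its mean depends on $\beta$, including the age and age-interaction coefficients, so the likelihood contribution of each patient in S is not flat in those coefficients. It is noisier than a within-patient comparison, because both variance components enter, and it reflects differences between patients of different ages rather than change within one patient.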

Thus, those patients in S improve your estimation both for the intercept and the slope.
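A small simulation can illustrate this. The sketch below is not the asker's data: the data set, all parameter values, the seed and the object names (`dat`, `dat_M`, `fit_all`, `fit_M`) are made up, and the model is simplified to a single risk factor (`sex`). It mimics the 100/200 split from the question, fits the same kind of random-intercept model with `nlme::lme` on all patients and on group M only, and puts the fixed-effect standard errors side by side.

```r
library(nlme)

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch

n_M   <- 100          # patients with repeated measurements (group M)
n_S   <- 200          # patients with a single measurement (group S)
n_pat <- n_M + n_S

# Observations per patient: 2-6 in M, exactly 1 in S
n_obs <- c(sample(2:6, n_M, replace = TRUE), rep(1, n_S))
id    <- rep(seq_len(n_pat), times = n_obs)

# Patient-level quantities (all values are hypothetical)
sex  <- rbinom(n_pat, 1, 0.5)   # binary risk factor
b    <- rnorm(n_pat, sd = 2)    # true random intercepts
age0 <- runif(n_pat, 40, 70)    # age at first visit

dat <- data.frame(
  patient_id = factor(id),
  sex        = sex[id],
  age        = age0[id] + sequence(n_obs) - 1  # yearly follow-up visits
)
dat$outcome <- 10 + 0.3 * dat$age + 1 * dat$sex + 0.1 * dat$sex * dat$age +
  b[id] + rnorm(nrow(dat), sd = 1)

# Subset containing only patients with more than one observation
multi <- names(which(table(dat$patient_id) > 1))
dat_M <- droplevels(dat[dat$patient_id %in% multi, ])

fit_all <- lme(outcome ~ sex * age, random = ~ 1 | patient_id,
               data = dat, method = "REML")
fit_M   <- lme(outcome ~ sex * age, random = ~ 1 | patient_id,
               data = dat_M, method = "REML")

# Standard errors of the fixed effects from both fits
se_all <- sqrt(diag(vcov(fit_all)))
se_M   <- sqrt(diag(vcov(fit_M)))
print(round(cbind(all_patients = se_all, M_only = se_M), 4))
```

In a setup like this one would expect the standard errors from `fit_all` to be smaller, including those of `age` and `sex:age`, because the single-observation patients add between-patient information about how the outcome varies with age. How large the gain is depends on the simulated variances and age ranges, so it should not be read as a prediction for the real data.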

Hi @Frank, thanks for this elaborate answer! That made sense. Would what you are referring to be more or less the same as 'estimating the mean structure', as someone else suggested in the question linked below? Link: https://stats.stackexchange.com/questions/24280/can-i-fit-a-mixed-model-with-subjects-that-only-have-1-observation — tcvdb1992, Feb 16 '22 at 09:06
