
According to my research, it is acceptable to include a random effect term in which some levels contain only one observation, provided that only a minority of the data is like this. This is explained well in the post "How will random effects with only 1 observation affect a generalized linear mixed model?"

While I understand that levels with only one observation affect the residual error, I am having trouble extending this concept to the case where many of the levels of a random effect (say 1/3 to 1/2), each with more than one observation, contain only a single outcome of a binary response. For example, in a presence/absence logistic regression model, is it appropriate to use "region" as a random effect when certain regions contain only absences (or only presences)? If so, why, and how does this affect the model output?

Charlie_J
  • the model won't blow up, but it might not work very well. If there is enough data overall then the absence-only or presence-only regions will be shrunk properly toward the mean (especially if they're absence-only or presence-only because they have a small number of observations). – Ben Bolker Oct 15 '21 at 00:57
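
To make the shrinkage point in the comment concrete, here is a minimal sketch in plain Python. It is not a fitted mixed model: it assumes the overall intercept `mu` and the random-effect standard deviation `sigma` are already known (a real GLMM estimates them jointly), and the function names `sigmoid` and `conditional_mode` are hypothetical. For a region with `y` presences out of `n` observations, it finds the random intercept `b` that maximizes the penalized log-likelihood `y*(mu+b) - n*log(1+exp(mu+b)) - b^2/(2*sigma^2)`. Without the penalty term, an absence-only region would push `b` toward minus infinity; with it, the estimate stays finite.

```python
import math


def sigmoid(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def conditional_mode(y, n, mu, sigma, lo=-50.0, hi=50.0, iters=200):
    """Conditional mode of a region's random intercept b.

    Assumes b ~ Normal(0, sigma^2) and y ~ Binomial(n, sigmoid(mu + b)),
    with mu and sigma treated as known (an assumption for illustration).
    The score y - n*sigmoid(mu + b) - b/sigma^2 is strictly decreasing
    in b, so bisection on [lo, hi] finds its unique root.
    """
    def score(b):
        return y - n * sigmoid(mu + b) - b / sigma ** 2

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    mu, sigma = 0.0, 1.0  # assumed values, not estimates from real data
    for n in (2, 5, 20):
        b = conditional_mode(0, n, mu, sigma)  # absence-only region
        print(f"n={n:2d}  b={b:.3f}  fitted p={sigmoid(mu + b):.3f}")
```

Under these assumptions, an absence-only region with few observations gets a random intercept close to zero (strong shrinkage toward the overall mean), while one with many observations is pulled further away, but always to a finite value.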

0 Answers