According to my research, it is OK to use a random effect term in which some levels contain only one observation (as long as only a minority of the data is like this), as explained well in this post: How will random effects with only 1 observation affect a generalized linear mixed model?
While I understand that including levels with only one observation affects the residual error, I am having trouble extending this idea to the case where many of the levels of a random effect (say 1/3 to 1/2), each with more than one observation, contain only a single outcome of a binary response. For example, in a presence/absence logistic regression model, is it appropriate to use "region" as a random effect when certain regions contain only absences (or only presences)? If so, why, and how does this affect the model output?
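To make the situation concrete, here is a minimal sketch of how the affected regions could be counted before fitting anything. The data, the region labels, and the helper name `single_outcome_levels` are all made up for illustration; this only describes the data layout and does not fit a mixed model.

```python
from collections import defaultdict

def single_outcome_levels(levels, outcomes):
    """Return {level: outcome} for levels that have more than one
    observation but only ever show a single value of the binary response."""
    seen = defaultdict(list)
    for level, y in zip(levels, outcomes):
        seen[level].append(y)
    return {
        level: ys[0]
        for level, ys in seen.items()
        if len(ys) > 1 and len(set(ys)) == 1
    }

# Hypothetical toy data: region labels and presence (1) / absence (0).
region = ["A", "A", "A", "B", "B", "B", "C", "C",
          "D", "D", "D", "E", "E", "F", "F", "F"]
presence = [0, 0, 0, 1, 0, 1, 1, 1,
            0, 1, 0, 0, 0, 1, 0, 0]

flagged = single_outcome_levels(region, presence)
share = len(flagged) / len(set(region))

print(flagged)        # regions with only absences or only presences
print(f"{share:.2f}")  # fraction of regions that are single-outcome
```

In this toy example, regions A and E contain only absences and region C only presences, so half of the regions are in the situation I am asking about.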