Using glmer to analyse ratio without cbind

Question

Say I have a stimulus (sound from a speaker) which can either be present (Y) or not (N). I have observed arrivals of individuals and noted whether the stimulus was Y or N during each arrival. I now wish to see which factors affect whether individuals choose to arrive. My variables are:

Pb_on = Is there sound Y/N (Factor, 2 levels)
Pb_type = What sound is being played (Factor). NOTE: Initially this factor had 4 levels, one being Silence. Silence is excluded because the playback for Silence was always off, which causes problems with model estimation.
NR_Tot = How many individuals are already present (Numerical)
Site = The experiment was conducted at several sites (Factor, 12-13 levels).

I now wish to determine whether more individuals arrived during sound playback or not. I do not wish to use a glmer of the form glmer(cbind(Y,N) ~ factors), as this would mean summing my observations per site, which loses much of the information on individual observations (such as NR_Tot). It was suggested to me to use something resembling the following formula:

glmer(Pb_on ~ Pb_type + NR_Tot + (1|Site), data = df, family = binomial(link = "logit"))

The idea being that due to specifying the binomial family, the model will 'understand' that I wish to determine which factors affect the ratio between Pb_on being Y or N.
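To illustrate the idea, here is a minimal sketch on simulated data (the data frame and the factor levels A, B, C are made up for illustration, not my real data). With a two-level factor response, the binomial family treats the first level (N) as failure and the second (Y) as success, so the model describes the log-odds of Y. When the only predictor is constant within each group, the one-row-per-arrival form and the summed cbind(Y, N) form should give the same coefficients. A per-arrival covariate such as NR_Tot cannot be carried into the summed form, which is why I prefer the first one.

```r
set.seed(1)
n <- 200

# Simulated stand-in for the real data: one row per observed arrival
df <- data.frame(
  Pb_on   = factor(sample(c("N", "Y"), n, replace = TRUE), levels = c("N", "Y")),
  Pb_type = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)

# Bernoulli form: first factor level (N) is failure, second (Y) is success
fit_rows <- glm(Pb_on ~ Pb_type, data = df, family = binomial(link = "logit"))

# Aggregated form: count Y and N per Pb_type, then use cbind(successes, failures)
counts <- as.data.frame.matrix(table(df$Pb_type, df$Pb_on))
counts$Pb_type <- factor(rownames(counts))
fit_counts <- glm(cbind(Y, N) ~ Pb_type, data = counts,
                  family = binomial(link = "logit"))

# Both fits estimate the same log-odds of Y per sound type
same_coefs <- all.equal(coef(fit_rows), coef(fit_counts), tolerance = 1e-4)
print(same_coefs)
```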

Now I can get this model running. But I have two questions:

1. The model ends up being singular. The summary shows that the variance/std. dev. of the random effect is 0, which I assume is the issue. Comparing the model with a glm (without the random effect but otherwise identical) using the anova() function seems to suggest that a random effect is not necessary, though I know that from a philosophical point of view you would prefer to include it, as the experiment was conducted at different sites. Is this approach acceptable? (It was based on information found here). Or should I instead treat Site as a fixed effect (though then I am somewhat lost as to how to interpret my results)?
2. Is this actually how the model works, i.e. does it analyse the ratio between Pb_on being Y or N, or is it now analysing something else? The model (without random effect) runs fine, though none of the variables turn out to be significant. Ultimately, the 'best' model would be a null model, suggesting none of my factors are of great importance.
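For reference, the check described in the first question can be sketched roughly as follows. This assumes the lme4 package and uses simulated data with no true site effect (all values here are made up). isSingular() and VarCorr() are lme4 functions. The likelihood-ratio comparison is written out by hand, on the assumption that the glmer and glm log-likelihoods are on the same scale. Halving the p-value is a common approximation when the variance being tested sits on the boundary (zero), not an exact result.

```r
library(lme4)

set.seed(2)
n <- 300

# Simulated stand-in for the real data, generated without any site effect
df <- data.frame(
  Pb_on   = factor(sample(c("N", "Y"), n, replace = TRUE), levels = c("N", "Y")),
  Pb_type = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  NR_Tot  = rpois(n, 3),
  Site    = factor(sample(paste0("S", 1:12), n, replace = TRUE))
)

fit_glmm <- glmer(Pb_on ~ Pb_type + NR_Tot + (1 | Site),
                  data = df, family = binomial(link = "logit"))
fit_glm  <- glm(Pb_on ~ Pb_type + NR_Tot,
                data = df, family = binomial(link = "logit"))

# Is the random-effect variance estimated at (or very near) zero?
singular <- isSingular(fit_glmm)
print(singular)
print(VarCorr(fit_glmm))

# Likelihood-ratio statistic for dropping (1 | Site); clamp tiny negative values
lrt <- max(0, as.numeric(2 * (logLik(fit_glmm) - logLik(fit_glm))))
p_naive    <- pchisq(lrt, df = 1, lower.tail = FALSE)
p_boundary <- p_naive / 2  # boundary approximation, see note above
print(c(lrt = lrt, p_naive = p_naive, p_boundary = p_boundary))
```

If the estimated Site variance is exactly zero, the two log-likelihoods should coincide and the fixed-effect estimates of the two fits should be essentially the same.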

In this case, because you have a lot of sites, having a coefficient for each of them might be too many parameters for your model. I think it makes sense to have site as a random effect. Most likely the effect per site is small, so you can do without it. You can also check out this link https://stats.stackexchange.com/questions/238005/what-is-the-intuition-on-fixed-and-random-effects-models — StupidWolf, Apr 14 '20 at 21:19
See http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#zero-variance and https://stats.stackexchange.com/questions/112423/random-effect-with-zero-sd-in-lmm?rq=1 — kjetil b halvorsen, Apr 15 '20 at 16:56
What I understand from Ben Bolker's page: Singularity seems to be present as is evident from the 0 variance. This does not necessarily mean there is no variation between groups. It does seem to imply that my model is degenerate (evident from the other stack thread posted by StupidWolf). I wish to avoid Bayesian statistics for now (as I don't understand them fully) but that would be an option. Circumstances are such that for now I will leave this model be. However, based on the other thread posted by kjetil I understand that while degenerate, model results will be similar if site is excluded. — R. Iersel, Apr 16 '20 at 14:42

