I'm very new to statistical modelling and have found many of the existing questions and answers from this community hugely helpful. I'm now running into some issues of my own that I can't find answers to online, so I'd be very grateful for any advice you might have.
I’m trying to model the number of feeding attempts a butterfly makes depending on its species (2 species measured), age (3 different age groups), and test (consecutive memory tests). For reference, my dataframe looks like this:
> str(newcomparattemptdata3)
'data.frame': 1138 obs. of 12 variables:
$ ID : Factor w/ 241 levels "D100","D101",..: 107 108 109 110 134 135 138 139 107 142 ...
$ Test : Factor w/ 4 levels "Naive","Initial recall",..: 1 1 1 1 1 1 1 1 2 1 ...
$ Date : Date, format: "2019-02-04" "2019-02-05" "2019-02-06" ...
$ nAttempts : int 21 25 39 10 17 15 22 27 7 13 ...
$ Colour : Factor w/ 3 levels "","P","Y": 2 2 2 3 2 2 3 2 2 3 ...
$ Species : Factor w/ 2 levels "D","H": 2 2 2 2 2 2 2 2 2 2 ...
$ Sex : Factor w/ 3 levels "","f","m": 2 3 3 3 2 2 3 2 2 3 ...
$ Age : Ord.factor w/ 3 levels "0"<"1"<"2": 1 1 1 1 1 1 1 1 1 1 ...
Individuals were measured multiple times for each consecutive test (repeated measures) and so I’ve included individual ID as a random effect. I’m considering the possibility of interactions between the fixed effects and so my initial model (using glmmTMB) looked like this:
glmm6.0 <- glmmTMB(nAttempts ~ Age*Test*Species + (1|ID),
data=newcomparattemptdata3,
family = poisson())
I did some model checking with DHARMa and, as I expected from initial data exploration, the model was zero-inflated. I therefore adjusted it to the new model below, including the same effects in the zero-inflation formula because I think the same factors might be contributing to the zeroes (i.e. when an individual doesn’t feed at all):
glmm7.0 <- glmmTMB(nAttempts ~ Age*Test*Species + (1|ID),
data=newcomparattemptdata3,
ziformula=~Age*Test*Species + (1|ID),
family = poisson())
However, when I checked this model using DHARMa, the residuals showed odd patterns and some significant deviations from uniformity, including when plotting residuals against the levels of each individual fixed effect (DHARMa results shown below).
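For readers who want to reproduce this kind of check, a minimal sketch might look like the following. It is not the real analysis: the data are simulated stand-ins, the ID labels and the test labels `T1` to `T4` are hypothetical, and the model is deliberately simplified (additive fixed effects, intercept-only zero inflation) so that it fits quickly.

```r
library(glmmTMB)
library(DHARMa)

set.seed(1)

# Simulated stand-in data (assumption: NOT the real butterfly dataset)
n_id <- 60
ids <- data.frame(
  ID      = factor(sprintf("B%03d", seq_len(n_id))),      # hypothetical IDs
  Species = factor(rep(c("D", "H"), length.out = n_id)),
  Age     = factor(rep(0:2, each = 2, length.out = n_id))
)
# No shared columns, so merge() returns every ID x Test combination
sim <- merge(ids, data.frame(Test = factor(paste0("T", 1:4))))

# Zero-inflated Poisson counts with a random intercept per individual
re  <- rnorm(n_id, sd = 0.3)
mu  <- exp(2.5 + re[as.integer(sim$ID)])
structural_zero <- rbinom(nrow(sim), 1, 0.2) == 1
sim$nAttempts <- ifelse(structural_zero, 0L, rpois(nrow(sim), mu))

# Simplified zero-inflated Poisson GLMM (assumption: additive effects only)
fit <- glmmTMB(nAttempts ~ Age + Test + Species + (1 | ID),
               ziformula = ~1,
               family = poisson(),
               data = sim)

# Simulation-based scaled residuals and the standard DHARMa tests
res <- simulateResiduals(fit, n = 250)
print(testUniformity(res, plot = FALSE))
print(testDispersion(res, plot = FALSE))
print(testZeroInflation(res, plot = FALSE))

# Residuals against one predictor at a time
plotResiduals(res, form = sim$Age)
```

Calling `plot(res)` on the same object gives the usual two-panel DHARMa summary (QQ plot and residuals against predicted values).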
There apparently isn’t any overdispersion, but I tried a negative binomial distribution anyway (nbinom2, chosen because the mean-variance relationship looks more quadratic) in case it cleared up some of the issues, and the model wouldn’t converge. I presumed it was overparameterised, so I dropped the 3-way interaction and fitted a series of models with various combinations of predictors and interactions in both the conditional formula and the zero-inflation formula, using both the Poisson and the negative binomial distributions. Across all of these, the negative binomial models seem to fit better (judging from AIC, etc.), but even for the best of them I have the same problem when checking the model with DHARMa: the residuals deviate from uniformity.
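The comparison described above can be sketched roughly as follows. Again, this is only an illustration under stated assumptions: the data are simulated (with negative binomial counts, so nbinom2 is expected to win here), the ID and test labels are hypothetical, and the particular formulas (all two-way interactions, zero inflation depending on `Test` only) are just one example of a reduced model, not the ones actually fitted.

```r
library(glmmTMB)

set.seed(42)

# Simulated stand-in data (assumption: NOT the real butterfly dataset)
n_id <- 120
ids <- data.frame(
  ID      = factor(sprintf("B%03d", seq_len(n_id))),      # hypothetical IDs
  Species = factor(rep(c("D", "H"), length.out = n_id)),
  Age     = factor(rep(0:2, each = 2, length.out = n_id))
)
sim <- merge(ids, data.frame(Test = factor(paste0("T", 1:4))))

# Zero-inflated, overdispersed counts with a random intercept per individual
re  <- rnorm(n_id, sd = 0.3)
mu  <- exp(2.5 + re[as.integer(sim$ID)])
structural_zero <- rbinom(nrow(sim), 1, 0.2) == 1
sim$nAttempts <- ifelse(structural_zero, 0L,
                        rnbinom(nrow(sim), mu = mu, size = 2))

# Main effects plus all two-way interactions, i.e. no 3-way term
form <- nAttempts ~ (Age + Test + Species)^2 + (1 | ID)

fit_pois <- glmmTMB(form, ziformula = ~Test, family = poisson(), data = sim)
fit_nb   <- glmmTMB(form, ziformula = ~Test, family = nbinom2(), data = sim)

# Quick convergence indicator: is the Hessian positive definite?
print(c(poisson = fit_pois$sdr$pdHess, nbinom2 = fit_nb$sdr$pdHess))

# Lower AIC indicates the better-supported model among those compared
aic_tab <- AIC(fit_pois, fit_nb)
print(aic_tab)
```

A lower AIC only ranks the candidate models against each other; it does not show that the best of them fits adequately, which is why the DHARMa checks are still needed afterwards.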
I’m at a bit of a loss: most sources I can find on problems with model checking suggest solutions I’ve already tried (adjusting for zero-inflation, etc.), and I’m struggling to see how the model might be mis-specified.
I’d be happy to provide any further information that might be missing, and would be hugely appreciative of any advice on what I’m doing wrong. Thanks!