Mixed model residuals are not normal

Question

I am modeling data from an experiment with a mixed model. The outcome variable is a percentage. There are three fixed effects: Condition (diseased, healthy), Time point (1, 2, 3) and Drug (A, B, C, D, E). Subject is the random effect. I need to perform three sets of tests:

Check for a significant difference between diseased and healthy at time point 1 for drug A, and so on for every combination of time point and drug.
Check for a significant difference between time points 1 and 2 for every combination of drug and condition.
Check for a significant difference between drugs A and B for every combination of time point and condition.

The data is unbalanced

This is what I did:
1. Build a linear mixed model

fit_1 <- lmer(y~Condition*Drug*Timepoint+(1|Subject))

2. Use lsmeans to perform the tests

lsmeans(fit_1,pairwise~Condition | Drug * Timepoint,adjust="none")
lsmeans(fit_1,pairwise~Timepoint | Drug * Condition,adjust="none")
lsmeans(fit_1,pairwise~Drug | Condition * Timepoint,adjust="none")

However, none of the p-values fell below the nominal 0.05 alpha level, whereas the same comparisons were significant when I used the Wilcoxon test. So I went back to check the residuals, and they appear to violate the assumptions of normality and homoscedasticity. Should I use a GLMM instead? If so, which family is appropriate when $y$ is a percentage based on count data?
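The residual check mentioned above can be sketched roughly as follows. This is only an illustration: the data frame `dat` is simulated (the real data are not shown in the post), and treating Condition as a between-subject factor is an assumption.

```r
library(lme4)

set.seed(42)
# Hypothetical stand-in data; values are simulated, not the experiment's
dat <- expand.grid(Subject   = factor(1:20),
                   Timepoint = factor(1:3),
                   Drug      = factor(c("A", "B", "C", "D", "E")))
# Assumption: each subject is either diseased or healthy
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 10,
                               "diseased", "healthy"))
subj_eff <- rnorm(20, sd = 5)
dat$y <- 50 + subj_eff[as.integer(dat$Subject)] + rnorm(nrow(dat), sd = 8)

fit_1 <- lmer(y ~ Condition * Drug * Timepoint + (1 | Subject), data = dat)

res <- resid(fit_1)

# Residuals vs fitted values: a funnel shape suggests heteroscedasticity
plot(fitted(fit_1), res, xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)

# Normal Q-Q plot: systematic curvature suggests non-normal residuals
qqnorm(res)
qqline(res)
```

With percentage outcomes, values piling up near 0 or 100 often show up in these plots as a bounded, fan-shaped residual cloud.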

It's hard to interpret the results of a Wilcoxon test when it is applied to dependent data. The Wilcoxon test doesn't allow for adjustment the way the mixed model does. Even if the data were independent, the Wilcoxon test has a different interpretation than a bivariate least squares model. It's good to look at the distributions of the data. Producing some `xyplot`s can help visualize the experimental conditions better. The normality and homoscedasticity assumptions are rarely met, but the inference is usually valid, as many posts on this site will tell you! – AdamO, Apr 09 '18 at 17:29

Isabella Ghement · Accepted Answer · 2018-04-06T23:46:59.640


You can convert your dependent variable from percentages to proportions and then check whether all the proportions lie in the open interval (0,1). If they do, you can use a GLMM with a beta distribution, which can be fitted in R with the glmmTMB package. If some proportions are equal to 0 but the rest lie in (0,1), you can use a GLMM with a zero-inflated distribution. See "How to fit a mixed model with response variable between 0 and 1?" for more ideas. There is a way to get lsmeans to work with this type of model.
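A minimal sketch of this approach is given below, under several assumptions: the data frame `dat` is simulated purely for illustration, Condition is treated as a between-subject factor, and the post-hoc step uses the emmeans package (the successor to lsmeans) with a recent glmmTMB release that supplies emmeans support. It is one possible way to set this up, not necessarily what was done for the original data.

```r
library(glmmTMB)
library(emmeans)

set.seed(1)
# Hypothetical stand-in data; values are simulated, not the experiment's
dat <- expand.grid(Subject   = factor(1:30),
                   Timepoint = factor(1:3),
                   Drug      = factor(c("A", "B", "C", "D", "E")))
# Assumption: each subject is either diseased or healthy
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 15,
                               "diseased", "healthy"))
subj_eff <- rnorm(30, sd = 0.3)
mu  <- plogis(-0.5 + 0.4 * (dat$Condition == "healthy") +
              subj_eff[as.integer(dat$Subject)])
phi <- 20
dat$y <- rbeta(nrow(dat), mu * phi, (1 - mu) * phi)  # proportions, not percentages

# The beta family requires every response to lie strictly inside (0, 1)
stopifnot(all(dat$y > 0 & dat$y < 1))

# Beta GLMM with a logit link and a random intercept per subject
fit_beta <- glmmTMB(y ~ Condition * Drug * Timepoint + (1 | Subject),
                    data = dat, family = beta_family(link = "logit"))

# Same three families of comparisons as the lsmeans calls in the question
emm_cond <- emmeans(fit_beta, pairwise ~ Condition | Drug * Timepoint,
                    adjust = "none", type = "response")
emm_time <- emmeans(fit_beta, pairwise ~ Timepoint | Drug * Condition,
                    adjust = "none", type = "response")
emm_drug <- emmeans(fit_beta, pairwise ~ Drug | Condition * Timepoint,
                    adjust = "none", type = "response")

emm_cond$contrasts
```

Because the link is logit, `type = "response"` reports the contrasts as odds ratios rather than differences in proportions. If some proportions are exactly 0, glmmTMB's `ziformula` argument (for example `ziformula = ~ 1`) adds a zero-inflation component to the same call.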

edited Apr 06 '18 at 23:46

answered Apr 06 '18 at 23:32

Isabella Ghement



Thank you, Isabella. I tried using the glmmTMB function in R, but lsmeans doesn't support objects from this function. Can you suggest how I can do the post-hoc analysis at each level of the factors after fitting the model? – AMC Apr 12 '18 at 18:05
