I am modeling data from an experiment with a mixed model. The outcome variable is a percentage. There are three fixed effects: Condition (diseased, healthy), Timepoint (1, 2, 3) and Drug (A, B, C, D, E). Subject is the random effect. I need to perform three sets of tests:

  1. Check for a significant difference between diseased and healthy at time point 1 for drug A, and so on for all combinations of time points and drugs.
  2. Check for a significant difference between time points 1 and 2 for all combinations of drugs and conditions.
  3. Check for a significant difference between drugs A and B for all combinations of time points and conditions.

The data are unbalanced.

This is what I did:

1. Build a linear mixed model

    library(lme4)
    fit_1 <- lmer(y ~ Condition * Drug * Timepoint + (1 | Subject))

2. Use lsmeans to perform the tests

    library(lsmeans)
    # Unadjusted pairwise comparisons of one factor within each combination of the other two
    lsmeans(fit_1, pairwise ~ Condition | Drug * Timepoint, adjust = "none")
    lsmeans(fit_1, pairwise ~ Timepoint | Drug * Condition, adjust = "none")
    lsmeans(fit_1, pairwise ~ Drug | Condition * Timepoint, adjust = "none")

However, none of the p-values were below the nominal 0.05 alpha level, whereas the same comparisons were significant when a Wilcoxon test was used. So I went back to check the residuals, and they seemed to violate the assumptions of normality and homoscedasticity. Should I use a GLMM instead? If so, which family is applicable when the $y$ variable is a percentage derived from count data?
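
For readers who want to reproduce the residual check mentioned above, here is a minimal sketch. The data are simulated and every name in it (`dat`, `subj_eff`, the group sizes) is an assumption for illustration, not the asker's data. It fits the same model formula and draws the two usual diagnostic plots.

```r
library(lme4)

set.seed(42)
# Hypothetical stand-in for the experiment: 20 subjects, fully crossed design
dat <- expand.grid(Subject   = factor(1:20),
                   Timepoint = factor(1:3),
                   Drug      = factor(c("A", "B", "C", "D", "E")))
# Assumption: Condition is a between-subject factor
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 10, "diseased", "healthy"))

# Simulate a percentage outcome with a subject-level random intercept
subj_eff <- rnorm(20, sd = 0.5)
mu <- plogis(-1 + subj_eff[as.integer(dat$Subject)])
dat$y <- 100 * rbeta(nrow(dat), mu * 10, (1 - mu) * 10)

fit_1 <- lmer(y ~ Condition * Drug * Timepoint + (1 | Subject), data = dat)

res <- resid(fit_1)
fv  <- fitted(fit_1)

# Residuals vs fitted values: a funnel shape suggests heteroscedasticity
plot(fv, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot: strong curvature suggests non-normal residuals
qqnorm(res)
qqline(res)
```

With percentage outcomes, the variance typically shrinks near 0 and 100, so some funnel shape in the first plot would not be surprising.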

AdamO
AMC
    It's hard to interpret the results of a Wilcoxon when applied on dependent data. The Wilcoxon doesn't allow for adjustment as in the mixed model. Even if the data were independent, the Wilcoxon has a different interpretation than a bivariate least squares model. It's good to look at the distributions of data. Producing some `xyplot`s can help visualize the experimental conditions better. Those assumptions are rarely met, but the inference is usually valid, as many posts on this site will tell you! – AdamO Apr 09 '18 at 17:29
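
A possible way to draw the `xyplot`s suggested in this comment, sketched on simulated data. The data frame `dat` and its values are hypothetical; only the variable names follow the question. Each panel is one Drug by Condition combination, with one line per subject across time points.

```r
library(lattice)

set.seed(7)
# Hypothetical data with the same factor structure as in the question
dat <- expand.grid(Subject   = factor(1:20),
                   Timepoint = factor(1:3),
                   Drug      = factor(c("A", "B", "C", "D", "E")))
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 10, "diseased", "healthy"))
dat$y <- 100 * rbeta(nrow(dat), 2, 5)  # made-up percentages

# One panel per Drug x Condition cell, one trajectory per subject
p <- xyplot(y ~ Timepoint | Drug * Condition, data = dat,
            groups = Subject, type = "b",
            xlab = "Time point", ylab = "Outcome (%)")
print(p)
```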

1 Answer

You can convert your dependent variable from percentages to proportions and then check whether all the proportions lie strictly inside the interval (0,1). If they do, you can use a GLMM with a beta distribution, which can be fitted in R with the glmmTMB package. If some proportions are exactly 0 but the rest lie in (0,1), you can use a GLMM with a zero-inflated distribution (for example, a zero-inflated beta). See "How to fit a mixed model with response variable between 0 and 1?" for more ideas. There is also a way to get lsmeans to work with this type of model.
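
As a rough illustration of the beta GLMM route, here is a sketch on simulated data. The data frame `dat`, the column `pct` and the simulation settings are assumptions made up for the example; `glmmTMB()` and `beta_family()` are the package's own functions.

```r
library(glmmTMB)

set.seed(1)
# Hypothetical data with the same design as in the question
dat <- expand.grid(Subject   = factor(1:20),
                   Timepoint = factor(1:3),
                   Drug      = factor(c("A", "B", "C", "D", "E")))
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 10, "diseased", "healthy"))
subj_eff <- rnorm(20, sd = 0.5)
mu <- plogis(-1 + subj_eff[as.integer(dat$Subject)])
dat$pct <- 100 * rbeta(nrow(dat), mu * 10, (1 - mu) * 10)  # made-up percentages

# Step 1: convert percentages to proportions
dat$prop <- dat$pct / 100

# Step 2: the beta family needs values strictly between 0 and 1
stopifnot(all(dat$prop > 0 & dat$prop < 1))

# Step 3: beta GLMM with a logit link and a random intercept per subject
fit_beta <- glmmTMB(prop ~ Condition * Drug * Timepoint + (1 | Subject),
                    family = beta_family(link = "logit"),
                    data = dat)
summary(fit_beta)

# If some proportions were exactly 0, one option (not run here) would be to
# add a zero-inflation component, e.g. ziformula = ~ 1, to the call above.
```

The coefficients are on the logit scale, so differences between groups read as log odds ratios rather than differences in percentage points.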

Isabella Ghement
    Thank you, Isabella. I tried the glmmTMB function in R, but lsmeans doesn't support objects returned by it. Can you suggest how I can do the post-hoc analysis at each level of the factors after fitting the model? – AMC Apr 12 '18 at 18:05
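
One option for the post-hoc step, assuming reasonably recent versions of glmmTMB and emmeans (the successor to lsmeans), which as far as I know work together: the sketch below fits the beta GLMM on simulated data and runs the same three families of comparisons as in the question. The data and the object names (`dat`, `fit_beta`, `emm_*`) are hypothetical.

```r
library(glmmTMB)
library(emmeans)

set.seed(1)
# Hypothetical data with the same design as in the question
dat <- expand.grid(Subject   = factor(1:20),
                   Timepoint = factor(1:3),
                   Drug      = factor(c("A", "B", "C", "D", "E")))
dat$Condition <- factor(ifelse(as.integer(dat$Subject) <= 10, "diseased", "healthy"))
subj_eff <- rnorm(20, sd = 0.5)
mu <- plogis(-1 + subj_eff[as.integer(dat$Subject)])
dat$prop <- rbeta(nrow(dat), mu * 10, (1 - mu) * 10)  # made-up proportions in (0, 1)

fit_beta <- glmmTMB(prop ~ Condition * Drug * Timepoint + (1 | Subject),
                    family = beta_family(link = "logit"),
                    data = dat)

# Unadjusted pairwise comparisons of one factor within each cell of the other two
emm_cond <- emmeans(fit_beta, pairwise ~ Condition | Drug * Timepoint, adjust = "none")
emm_time <- emmeans(fit_beta, pairwise ~ Timepoint | Drug * Condition, adjust = "none")
emm_drug <- emmeans(fit_beta, pairwise ~ Drug | Condition * Timepoint, adjust = "none")

# Contrasts are reported on the logit (link) scale by default
emm_cond$contrasts
```

With `adjust = "none"` the p-values are not corrected for the number of comparisons, as in the original lsmeans calls; passing `type = "response"` to `emmeans()` should report the contrasts as odds ratios instead.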