
Are the residuals close enough to normality after a Box-Cox transformation using the MASS package?

    max.intercept.BCmodel <- lmer(RTtrans ~ Var1*Var2*Var3 + (1|Participant), data = Data, REML = FALSE)
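The transformation step itself is not shown above, so here is a minimal sketch of how a column like `RTtrans` might be produced with `MASS::boxcox()`. The simulated data, the raw column name `RT`, the single predictor, and the lambda grid are all assumptions for illustration, not the actual data or procedure behind the model.

```r
library(MASS)  # provides boxcox()

set.seed(1)
# Hypothetical stand-in data: positive, right-skewed reaction times
Data <- data.frame(
  Participant = factor(rep(1:20, each = 50)),
  Var1 = factor(rep(c("a", "b"), times = 500))
)
Data$RT <- exp(6 + 0.1 * (Data$Var1 == "b") + rnorm(1000, sd = 0.5))

# Profile the Box-Cox log-likelihood over a grid of lambda values
bc <- boxcox(RT ~ Var1, data = Data,
             lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]

# Apply the transformation (the log is the limiting case as lambda -> 0)
Data$RTtrans <- if (abs(lambda) < 1e-8) {
  log(Data$RT)
} else {
  (Data$RT^lambda - 1) / lambda
}
```

Note that `boxcox()` profiles the likelihood of an ordinary linear model here, so the chosen lambda ignores the random intercept; whether that matters for the mixed model is a separate question.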

[Image: quantile plot of the model residuals]

Skewness and kurtosis seem decent:

    > skewness(residuals(max.intercept.BCmodel))
    [1] 0.07592625

    > kurtosis(residuals(max.intercept.BCmodel))
    [1] 0.863893
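How to read the kurtosis value depends on which package supplied `kurtosis()`, and that is not stated here. As far as I know, `moments::kurtosis()` reports the raw fourth standardized moment (about 3 for a normal sample), while `e1071::kurtosis()` reports excess kurtosis (about 0 for a normal sample). A base-R sketch of both conventions, using simulated normal draws as a stand-in for the residuals:

```r
set.seed(1)
res <- rnorm(1e5)  # hypothetical stand-in for residuals(max.intercept.BCmodel)

z  <- res - mean(res)
m2 <- mean(z^2)  # second central moment
m3 <- mean(z^3)  # third central moment
m4 <- mean(z^4)  # fourth central moment

skew        <- m3 / m2^1.5   # about 0 for a symmetric distribution
kurt_raw    <- m4 / m2^2     # about 3 for a normal distribution
kurt_excess <- kurt_raw - 3  # about 0 for a normal distribution

round(c(skew = skew, kurt_raw = kurt_raw, kurt_excess = kurt_excess), 3)
```

Under the raw convention a value near 0.86 would mean tails much shorter than normal; under the excess convention it would mean somewhat heavier tails than normal.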

I also tested for overdispersion, and the result suggests underdispersion:

    > overdisp_fun(m1)
           chisq        ratio          rdf            p
    1.690057e+00 1.326138e-05 1.274420e+05 1.000000e+00
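The definition of `overdisp_fun` is not shown. The output names match the helper of that name in the GLMM FAQ, so the sketch below assumes that definition and applies it to a simulated ordinary linear model (hypothetical data, `lm()` instead of `lmer()` to keep it self-contained). For a Gaussian fit the Pearson residuals are just the raw residuals, so the ratio is simply the estimated residual variance on the scale of the response.

```r
# Assumed definition, modelled on the GLMM FAQ helper of the same name
overdisp_fun <- function(model) {
  rdf   <- df.residual(model)
  rp    <- residuals(model, type = "pearson")
  chisq <- sum(rp^2)
  c(chisq = chisq,
    ratio = chisq / rdf,
    rdf   = rdf,
    p     = pchisq(chisq, df = rdf, lower.tail = FALSE))
}

set.seed(1)
# Hypothetical response on a very small scale, like a transformed RT
x   <- rnorm(500)
y   <- 0.01 * x + rnorm(500, sd = 0.004)
fit <- lm(y ~ x)

out <- overdisp_fun(fit)
out
sigma(fit)^2  # identical to the "ratio" entry
```

If the same holds for the mixed model, a ratio of about 1.3e-05 may say more about the units of the transformed response than about underdispersion; this check is mainly meant for Poisson or binomial models, where the variance is tied to the mean.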
CatM
  • "Close enough" for what purpose? – whuber Aug 13 '20 at 16:06
  • to be able to use a linear mixed effects model – CatM Aug 13 '20 at 16:07
  • To use it for what purpose?? For prediction you might be concerned, but for estimation of parameters there's unlikely to be any problem at all. That's why we care about knowing your objectives. – whuber Aug 13 '20 at 16:10
  • For hypothesis testing, i.e. to determine whether a congruency effect was present and whether it increased with practice. – CatM Aug 13 '20 at 16:12
  • It must be a very subtle hypothesis indeed to require testing in the presence of so much data! Your quantile plot suggests around $10^5$ observations. Normality of residuals is not a necessary assumption for these tests, anyway. You can find extensive discussions of this by searching our site. – whuber Aug 13 '20 at 16:15
  • It is actually a very strong effect, it is just that we have 1000 trials per participant. Would it be fine then? I see everywhere that normality of the residuals is required for a linear mixed effects model. – CatM Aug 13 '20 at 16:22
  • No, normality of the residuals is not required except, perhaps, for some tests you might want to conduct on the residuals -- and even then, many tests are robust to departures from normality. We often assume the random effects are normal, but there has been research on other distributions for random effects. See more here: https://stats.stackexchange.com/questions/217774/normal-distribution-necessary-for-linear-mixed-effects-r/217828 – kurtosis Aug 13 '20 at 18:21
  • If those low observations bother you, maybe do a separate mini-study of the lowest observation per participant and see if there are some commonalities (*e.g.* an initial or final measurement, a certain time of year, etc.). Getting to know your data better is always a good first step. – kurtosis Aug 13 '20 at 18:22
  • Is this a good source? https://www.biorxiv.org/content/10.1101/498931v2.full.pdf+html – CatM Aug 13 '20 at 18:41
  • Then should I use the transformed or untransformed data? – CatM Aug 13 '20 at 18:45
  • @kurtosis From their other posts, I think their main concern is that the `lmer` model residuals have extremely short tails, so much so that the standard errors are several times smaller than they would be if the residuals were approximately normal, and unfortunately they need to produce p-values for their target journal. – Robert Long Aug 14 '20 at 06:17
  • @RobertLong Ah. Well, they have a lot of data; however, I suppose pseudoreplication could still be an issue. Maybe if there are other factors which could explain common variance and be the source for other random effects? – kurtosis Aug 14 '20 at 06:58
  • @kurtosis tbh I think the linear model can be used here. I mean, the estimates are unbiased and consistent, and the standard errors *should* be smaller due to less spread. I haven't encountered this before. Maybe this question could be reframed as a one-sample t-test on a sample with very short tails? – Robert Long Aug 14 '20 at 07:16
  • No, you definitely need a random effect at least on Participant -- because you know there is a common random factor in observations on the same individual. If some of these people are related, you would also want a family random effect (so participant nested inside family). There might also be a location effect. However, treating all of the observations as though they are independent seems (1) an even worse choice, and (2) inviting a method-savvy reviewer to rip apart any conclusions of significance. – kurtosis Aug 14 '20 at 07:21
  • @kurtosis I think you missed my point. I'm not suggesting doing away with random effects (they are absolutely needed); I am only suggesting thinking about the problem of inference with very short tails as a one-sample problem and then applying the idea to the mixed model. – Robert Long Aug 14 '20 at 08:12
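
The one-sample framing suggested in the comments can be tried out by simulation. The sketch below is only an illustration under assumed settings (uniform draws as a short-tailed distribution, 30 observations per sample, 2000 replications), not an analysis of the data in the question. It estimates how often a one-sample t-test rejects a true null hypothesis at the 5% level.

```r
set.seed(1)
n_sims <- 2000  # number of simulated samples (assumed)
n      <- 30    # observations per sample (assumed)

# Uniform(-1, 1) has mean 0 and much shorter tails than a normal distribution
p_vals <- replicate(
  n_sims,
  t.test(runif(n, min = -1, max = 1), mu = 0)$p.value
)

# Share of false rejections; should sit near the nominal 0.05
rejection_rate <- mean(p_vals < 0.05)
rejection_rate
```

A rejection rate close to 0.05 would suggest that short tails on their own do not distort the test's error rate in this simple setting; carrying the idea over to the mixed model would still need its own check.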
