
(After re-reading my question here, I realize my notation is a mess... apologies. I hope the question is clear enough.)

There is an examination that students (indexed by $i$) can take once annually in a three-year program. Students are not obligated to take this examination each year, but are required to take it at least once over the three years.

A student may (but is not obligated to) take an exam-prep class either before or after an exam at time $t$; whether a prep class was taken before the exam at time $t$ is indicated by $\alpha_{it}$. The student may be in year 1, 2, or 3 of the program, indicated by $\beta_{it}$. For our purposes, if student $i$ never takes the prep class, $\alpha_{it} = 0$ for all $t$. The value $t$ indicates the calendar year in which the exam was taken, distinct from the year the student is in the program.

Suppose I have a linear mixed-effects model $$\log\left(\dfrac{y_{it}}{1 - y_{it}} \right) = \mu + b_1\alpha_{it} + b_2\beta_{it} + \gamma_i+\epsilon_{it}$$ where $y_{it} \in (0, 1)$ is the exam score of the $i$th student at time $t$, expressed as a proportion correct (truncated from above at 0.999), $\mu$ is an intercept, $\alpha_{it} \in \{0, 1\}$ is a binary indicator equalling $1$ if an exam-prep class was taken before the exam at time $t$, $\beta_{it} \in \{1, 2, 3\}$ is the program year, $\gamma_i \sim \mathcal{N}(0, \sigma^2_{\gamma})$ is a random effect capturing student-to-student variability, and $\epsilon_{it} \sim \mathcal{N}(0, \sigma^2_{\epsilon})$ is the usual noise term. The coefficients $b_1$ and $b_2$ are estimated by maximum likelihood (or REML), taking the random effect into account.
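For concreteness, here is a minimal sketch of how I might fit this in R with lme4, assuming a hypothetical long-format data frame `exams` with columns `score` ($y_{it}$), `prep` ($\alpha_{it}$), `prog_year` ($\beta_{it}$), and `student` (the index $i$); all of these names are illustrative:

```r
library(lme4)

# Hypothetical long-format data: one row per student-exam.
# score: proportion correct in (0, 1); prep: 0/1; prog_year: 1/2/3.
exams$score_trunc <- pmin(exams$score, 0.999)   # truncate from above at 0.999
exams$logit_score <- qlogis(exams$score_trunc)  # log(y / (1 - y))

# Random intercept gamma_i for each student.
fit_full <- lmer(logit_score ~ prep + prog_year + (1 | student),
                 data = exams, REML = TRUE)
summary(fit_full)
```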

As a simplified version of my question, suppose I wanted to compare the fit of the model with terms $(\mu, \alpha_{it}, \gamma_i, \epsilon_{it})$ against the fuller model with terms $(\mu, \alpha_{it}, \beta_{it}, \gamma_i, \epsilon_{it})$.

What test is appropriate for this? Deviance testing comes to mind, but what I'm dealing with here isn't a generalized linear model; it is just a linear model on a transformed response. References to journal articles and textbooks are very much appreciated.
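For reference, here is what the comparison might look like in lme4, if a likelihood ratio test were appropriate. This is a sketch using the same hypothetical `exams` data frame as above; both models are fit with ML rather than REML, since the fixed effects differ between them:

```r
# Nested models: the reduced model drops prog_year (i.e., tests b_2 = 0).
fit_reduced <- lmer(logit_score ~ prep + (1 | student),
                    data = exams, REML = FALSE)
fit_full    <- lmer(logit_score ~ prep + prog_year + (1 | student),
                    data = exams, REML = FALSE)

# Chi-squared likelihood ratio test with 1 degree of freedom.
anova(fit_reduced, fit_full)
```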

Clarinetist
  • Can you say more about the exam score $y_{it}$ in its raw metric and why you are treating it as described? – Erik Ruzek Jun 18 '20 at 18:33
  • @ErikRuzek The only thing I have available are the percentage correct that a student gets on an exam. I thought a logit transformation would be appropriate. If you have other ideas, I'd be willing to entertain them. – Clarinetist Jun 18 '20 at 18:34
  • Have you tried modeling the % correct directly? The main thing we are worried about is the residuals $\epsilon_{it}$ having a normal distribution (along with the random intercepts $\gamma_i$). – Erik Ruzek Jun 18 '20 at 18:43
  • @ErikRuzek Yes, I struggled with this when coming up with this model. It's been a while since I examined these, but I recall it's not as close to normality as I would like. I see that you're an education researcher: are there standard tools (perhaps nonparametric ones) that one can use to test for association between $y_{it}$ and $\alpha_{it}$ if normality does not hold? All of the random-effects modeling I've learned about in school assumes normality of the residuals and random effects. – Clarinetist Jun 18 '20 at 18:49
  • @ErikRuzek And of course, at the end of the day, I need to arrive at a $p$-value or CI of some sort. Not that I agree with that, but that's easiest with normality. – Clarinetist Jun 18 '20 at 18:52
  • The log could work; it's just harder to interpret and requires some back-translation for non-technical audiences. You could also consider a GLMM with a Gamma family (e.g., log link): https://stats.stackexchange.com/questions/67547/when-to-use-gamma-glms?noredirect=1&lq=1. The other (band-aid) approach is to use "robust" estimation, which protects you against deviations from normality in the residuals; see https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02104/full and https://cran.r-project.org/web/packages/robustlmm/vignettes/rlmer.pdf. A rough sketch of both alternatives follows this thread. – Erik Ruzek Jun 18 '20 at 19:04
  • @ErikRuzek Thank you for the links. Yes, I completely agree that it's harder to interpret, which is why I found myself having to deal with inverse logits to interpret the parameter estimates. – Clarinetist Jun 18 '20 at 19:06
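A rough sketch of the two alternatives Erik Ruzek mentions, using the same hypothetical `exams` data frame as above; both calls are illustrative, not prescriptive:

```r
library(lme4)
library(robustlmm)

# Alternative 1: Gamma GLMM with a log link on the raw proportions.
# (Requires strictly positive responses, which holds here since y is in (0, 1).)
fit_gamma <- glmer(score ~ prep + prog_year + (1 | student),
                   data = exams, family = Gamma(link = "log"))

# Alternative 2: robust linear mixed model on the logit scale,
# down-weighting outlying residuals (see the robustlmm vignette).
fit_robust <- rlmer(logit_score ~ prep + prog_year + (1 | student),
                    data = exams)
```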

1 Answer


To check the goodness of fit of mixed-effects models, you could use the simulated residuals provided by the DHARMa package. If the model fits the data well, you expect these residuals to follow a uniform distribution. They can also be used to identify potential over-dispersion and zero-inflation problems.
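A minimal sketch of that workflow, assuming an lme4 fit like the hypothetical `fit_full` above:

```r
library(DHARMa)

# Simulate scaled (quantile) residuals from the fitted model; under a
# well-fitting model these are approximately uniform on (0, 1).
sim_res <- simulateResiduals(fittedModel = fit_full, n = 1000)

# QQ plot against the uniform distribution, plus residuals vs. predicted.
plot(sim_res)

# Formal checks for uniformity, over-dispersion, and zero inflation.
testUniformity(sim_res)
testDispersion(sim_res)
testZeroInflation(sim_res)
```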

Dimitris Rizopoulos