
I am referring to the question linked above. When estimating random-effect (RE) variances or correlations, the estimates from the VarCorr(mod) function differ from those obtained by computing the variance or correlation of the REs directly with cor(ranef(mod)). The question above explains why this is so.

I am interested in which values are better for interpreting the results, the best linear unbiased predictions (BLUPs) or VarCorr, and why. What do we get with each option? The conclusions can obviously differ.

Finally, is there any difference in the interpretation of BLUPs between lmer and glmer?

EDIT: providing a concrete example

Let's say we have the following model:

Random effects:
 Groups  Name        Variance Std.Dev. Corr 
 subject (Intercept) 0.24513  0.4951        
         X           0.03209  0.1791   -0.83
Number of obs: 13037, groups:  subject, 218

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.49331    0.04184   35.69   <2e-16 ***
X           -0.32162    0.02815  -11.42   <2e-16 ***

We are predicting accuracy (0 or 1) from intelligence (X, on a continuous scale), with by-subject random intercepts and random slopes for X.

These are the manually calculated correlations and variances of the REs.

First, the correlation matrix, cor(ranef(mod)):

            (Intercept)      C.RT
(Intercept)    1.000000 -0.977063
C.RT          -0.977063  1.000000

Then the variance-covariance matrix, cov(ranef(mod)):

            (Intercept)        C.RT
(Intercept)  0.16267598 -0.04952407
C.RT        -0.04952407  0.01579298
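
A minimal sketch of how these matrices can be computed with lme4 (assuming the fitted model mod and grouping factor subject; ranef() returns a list of data frames, one per grouping factor, so that component has to be extracted first):

library("lme4")
re <- ranef(mod)$subject  # empirical Bayes modes, one row per subject
cor(re)                   # correlation matrix of the estimated REs
cov(re)                   # sample variance-covariance matrix of the estimated REs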

Now, what does the concrete interpretation of the marginal and conditional effects look like?

My general interpretation would be: the variance of the normally distributed individual subjects' logit accuracy values around the mean logit accuracy of all participants (1.49) is 0.24.

Also, the correlation between participants' individual logit accuracy values and their individual "intelligence" slopes is -.83: the higher the accuracy for an individual participant, the lower (more negative) the intelligence slope for that participant.

Are these interpretations of the marginal or the conditional model? How would the interpretation under the other model look?

EDIT 2: Other questions related to this issue. Here I list other questions on Cross Validated, but in none of them did I find the precise answer I was looking for.

Subject specific vs population average predictions

Marginal model versus random-effects model – how to choose between them? An advice for a layman

Conditional vs. Marginal models

Difference between marginal and conditional models

User33268

1 Answer


As explained in the answer you cited above, the two covariance matrices refer to two different models: one to the marginal model (in which the random effects are integrated out), and the other to the model conditional on the random effects.

It is not that one is better than the other; they simply do not refer to the same model. Which one you select depends on your question of interest.

Regarding your second question, you have to be a bit more specific about what exactly you mean by "BLUPs" under the two models. For example, the empirical Bayes estimates of the random effects are derived using the same idea under both approaches, i.e., as the mode of the conditional distribution of the random effects given the observed outcomes. You can add these to the fixed-effects part to obtain subject-specific predictions, taking into account that in the case of glmer() there is also a link function.
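
To make the last point concrete, here is a minimal sketch of subject-specific versus fixed-effects-only predictions using lme4's predict() method (assuming a fitted glmer() model mod as in the question's example):

library("lme4")
# conditional (subject-specific) predictions: fixed effects plus the
# empirical Bayes modes, passed through the inverse link (here the logistic)
p_subject <- predict(mod, type = "response")
# fixed-effects-only predictions (random effects set to zero); with a
# nonlinear link these are not the marginal means
p_fixed <- predict(mod, re.form = NA, type = "response")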


EDIT: Regarding the two models mentioned above; say, $y$ is your outcome variable and $b$ are the random effects. A general definition of a mixed model is: $$\left\{ \begin{array}{l} y \mid b \sim \mathcal F_y(\theta_y),\\\\ b \sim \mathcal N(0, D), \end{array} \right.$$ where $\mathcal F_y(\theta_y)$ is an appropriate distribution for the outcome $y$, e.g., it could be Gaussian (in which case you obtain a linear mixed model), Binomial (and you obtain a mixed effects logistic regression), Poisson (mixed effects Poisson regression), etc.

The random effects are estimated as the modes of the posterior distribution $$ p(b \mid y) = \frac{p(y \mid b) \; p(b)}{p(y)}, $$ where $p(y \mid b)$ is the probability density or probability mass function behind $\mathcal F_y$, and $p(b)$ the probability density function of the multivariate normal distribution for the random effects.

With regard to your question, the covariance matrix of the empirical Bayes estimates obtained from ranef() is related to the covariance matrix of this posterior distribution, whereas VarCorr() gives the $D$ matrix, which is the covariance matrix of the prior distribution of the random effects. These are not the same.


EDIT 2: A relevant feature of the estimation of the random effects is shrinkage. That is, the estimates of the random effects are shrunk towards the overall mean of the model. The degree of shrinkage depends on

  1. The relative magnitude of the variance of the random effects versus the variance of the error terms. I.e., the larger the variance of the random effects with respect to the error variance, the smaller the degree of shrinkage.
  2. The number of repeated measurements. The more repeated measurements, the smaller the degree of shrinkage.
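
For the simple random-intercepts model $y_{ij} = \beta + b_i + \varepsilon_{ij}$, with $b_i \sim \mathcal N(0, \sigma_b^2)$ and $\varepsilon_{ij} \sim \mathcal N(0, \sigma^2)$, this can be made explicit: the mode (equal here to the mean) of the posterior distribution of $b_i$ given the $n_i$ measurements of subject $i$ is $$\hat b_i = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2 / n_i} \, (\bar y_i - \beta),$$ i.e., the subject's deviation from the overall mean multiplied by a shrinkage factor that approaches 1 (no shrinkage) as either $\sigma_b^2 / \sigma^2$ or $n_i$ increases.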

The following code illustrates this in the simple random-intercepts model:

prior_vs_post <- function (prior_variance = 1, error_variance = 1, 
                           repeated_meas_per_id = 5) {
    require("lme4")
    n <- 1000 # number of subjects
    beta <- 10 # fixed effect intercept
    b <- rnorm(n, 0, sqrt(prior_variance)) # random intercepts
    DF <- data.frame(id = rep(seq_len(n), each = repeated_meas_per_id))
    # simulate normal data conditional on the random intercepts
    DF$y <- rnorm(n * repeated_meas_per_id, beta + b[DF$id], sqrt(error_variance))
    ###############
    # Fit the model
    fm <- lmer(y ~ 1 + (1 | id), data = DF)
    c(estimated_prior_variance = VarCorr(fm)$id[1L],
      BLUPs_variance = var(ranef(fm)$id[[1L]]))
}

# high variance of REs, low variance error terms
# 2 repeated measurements => low shrinkage
prior_vs_post(prior_variance = 10, error_variance = 1, 
              repeated_meas_per_id = 2)
#> estimated_prior_variance           BLUPs_variance 
#>                 11.05215                 10.58501

# high variance of REs, low variance error terms
# 20 repeated measurements => almost no shrinkage
prior_vs_post(prior_variance = 10, error_variance = 1, 
              repeated_meas_per_id = 20)
#> estimated_prior_variance           BLUPs_variance 
#>                 10.07539                 10.02580

# low variance REs, high variance error terms,
# 20 repeated measurements => considerable shrinkage
prior_vs_post(prior_variance = 1, error_variance = 10, 
              repeated_meas_per_id = 20)
#> estimated_prior_variance           BLUPs_variance 
#>                1.0002202                0.6666536

# low variance REs, high variance error terms,
# 2 repeated measurements => extreme shrinkage
prior_vs_post(prior_variance = 1, error_variance = 10, 
              repeated_meas_per_id = 2)
#> estimated_prior_variance           BLUPs_variance 
#>                0.9479291                0.1574824
Dimitris Rizopoulos
  • I added a concrete example where you could help me understand the interpretation of the two models, because it's still too abstract for me... Regarding the second question - can `ranef(mod)` coefficients be interpreted in the same manner for `lmer` and `glmer`, aside from taking care of the link function? – User33268 Feb 13 '19 at 22:03
  • I don't understand what two models you are referring to. OP basically asks the following: for a model like `y~x+(x|A)`, lmer fits a 2x2 random effect covariance matrix. It also outputs the BLUPs. The sample covariance matrix of BLUPs will be close but not identical to the covariance matrix output by the model. So which one should one rather interpret? CC @User33268. – amoeba Feb 13 '19 at 22:18
  • Yes, this is what I mean. Thanks @amoeba for pointing that out more clearly! And if I can interpret both, what is the concrete difference in interpretation between `VarCorr(mod)` and `cor(ranef(mod))`? – User33268 Feb 13 '19 at 22:37
  • @amoeba check my edit above. – Dimitris Rizopoulos Feb 13 '19 at 22:46
  • Thanks for the update, but could you translate your edit into more layman terms - concretely for my edit in the original post. Concretely, in my example, what is correlated -.83 and what is correlated -.98? – User33268 Feb 13 '19 at 23:08
  • Interesting. Can't say I understood it really. BTW, we have basically the same question asked previously: https://stats.stackexchange.com/questions/275620, but I don't feel the answer there really answers it (at least not for me). It might be instructive to take some very simple case (e.g. `y~1+(1|A)`, a random intercept model without any predictors at all), and work out in detail why the variance of BLUPs is not equal to the random intercept variance, and *how* will it differ from it (larger? smaller? why? what does it depend on?). CC @User33268. – amoeba Feb 14 '19 at 09:59
  • I agree, I've also seen this question but have not understood it well. Should we post the simple version of the question? `y ~ 1+(1|A)` Or edit the present one? – User33268 Feb 14 '19 at 13:03
  • The difference between the interpretation/predictions from marginal versus conditional models is related but is not the same as your first question regarding the difference between the estimated covariance matrix of the random effects and the covariance matrix of the empirical Bayes estimates of the random effects. With regard to the former, have a look at this question and answer: https://stats.stackexchange.com/questions/365907/interpretation-of-fixed-effects-from-mixed-effect-logistic-regression – Dimitris Rizopoulos Feb 14 '19 at 13:25
  • @amoeba I have added a new edit illustrating the shrinkage effect. – Dimitris Rizopoulos Feb 14 '19 at 13:52
  • Thanks! (+1) I was aware of the shrinkage effect in mixed models but I did not quite realize that it means that variance of BLUPs is generally smaller than the estimate of the random variance. One thing that is still unclear to me in your answer is the reference to "marginal model" in the beginning. Do you use this term in the same sense as here https://stats.stackexchange.com/questions/86309 or here https://stats.stackexchange.com/questions/21760? Because there it is claimed that for linear models, the estimates coincide... – amoeba Feb 19 '19 at 22:22
  • @amoeba by marginal model I mean the model in which the random effects are integrated out, i.e., $p(y)$ in my notation above. When $\mathcal F$ is the normal distribution, the marginal distribution of $y$ is still normal with mean described by the fixed-effects part. With non-gaussian models or nonlinear link functions, the fixed-effects part no longer corresponds to the mean of the marginal model. In this case, there is an issue with the interpretation of these coefficients: https://stats.stackexchange.com/questions/365907/interpretation-of-fixed-effects-from-mixed-effect-logistic-regression – Dimitris Rizopoulos Feb 21 '19 at 07:57
  • Dimitris, in your `prior_vs_post` illustration, if `estimated_prior_variance` turns out to be larger than the estimated `BLUPs_variance`, is that an indication that the random effects have shrunk a lot? Why so? – rnorouzian Sep 10 '20 at 00:48
  • @rnorouzian because shrinkage means that the estimates of the random effects are pulled towards the overall mean of the model, which makes their sample variance smaller than the estimated prior variance. – Dimitris Rizopoulos Sep 10 '20 at 07:36