
If I fit a mixed model that includes a level-2 predictor as a random slope, it is sometimes possible to obtain an estimate of the slope variance. For example:

library(lmerTest)
test <- lmer(Sweetness ~ Income + (Income | Consumer) + (Income | product), data = carrots)
summary(test)

Here, I am including Income, which has no variance within each consumer, as a random slope. The output nevertheless suggests that the consumer-level slope variance is not zero (SD = 0.01087).
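To make "no variance within each consumer" concrete, the within-group variance of the predictor can be tabulated directly. The sketch below uses a small toy data frame (hypothetical values) with the same structure as the real data, i.e. one Income value per consumer repeated across products, since the point does not depend on the carrots data itself:

```r
# Toy frame mirroring the structure: one Income value per consumer
d <- data.frame(
  Consumer = factor(rep(1:3, each = 4)),
  Income   = rep(c(10, 20, 30), each = 4)
)
# Within-consumer variance of Income is 0 for every consumer,
# so there is no within-group information from which to estimate a slope
tapply(d$Income, d$Consumer, var)
#> 1 2 3
#> 0 0 0
```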

Here is the full output:

Linear mixed model fit by REML t-tests use Satterthwaite approximations to degrees
  of freedom [lmerMod]
Formula: Sweetness ~ Income + (Income | Consumer) + (Income | product)
   Data: carrots

REML criterion at convergence: 3828.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.58301 -0.68129 -0.07647  0.69873  2.59386 

Random effects:
 Groups   Name        Variance  Std.Dev. Corr 
 Consumer (Intercept) 0.4154951 0.64459       
          Income      0.0001182 0.01087  -1.00
 product  (Intercept) 0.2850742 0.53392       
          Income      0.0003368 0.01835  -1.00
 Residual             1.3572890 1.16503       
Number of obs: 1161, groups:  Consumer, 97; product, 12

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)  3.84828    0.27317 38.83000  14.088   <2e-16 ***
Income      -0.18368    0.08892 93.46000  -2.066   0.0416 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
       (Intr)
Income -0.815

Why is this term in the output, and how should it be interpreted?

EDIT:

Here is another example, in which the intercept-slope correlation is not -1 even though the Income value within each Consumer is constant, so the random slope should be zero or non-estimable. This is a subset of a real dataset I am working with; the full dataset gives an intercept-slope correlation of -0.47.

datTest = structure(list(Sweetness = c(6.26, 0.49, 1.65, 2.08, 7.37, 10, 
9.65, 5.95, 0.74, 7.86, 0.19, 3.66, 10, 7.37, 2.86, 0.43, 7.76, 
5.02, 7.08, 10, 8.21, 6.05, 0.39, 9.61, 2.08, 7.12, 10, 8.99, 
4.47, 0.39, 2.08, 0.88, 5.33, 7.96, 9.86, 3.7, 0.56, 9.73, 1.83, 
7.67, 10, 8.29, 5.95, 0.43, 5.86, 2.35, 7.65, 10, 5.04), Income  = structure(c(0.085, 
-27.915, 37.085, -15.915, -7.915, -22.915, 2.085, 0.085, -27.915, 
37.085, -15.915, -7.915, -22.915, 2.085, 0.085, -27.915, 37.085, 
-15.915, -7.915, -22.915, 2.085, 0.085, -27.915, 37.085, -15.915, 
-7.915, -22.915, 2.085, 0.085, -27.915, 37.085, -15.915, -7.915, 
-22.915, 2.085, 0.085, -27.915, 37.085, -15.915, -7.915, -22.915, 
2.085, 0.085, -27.915, 37.085, -15.915, -7.915, -22.915, 2.085
), .Dim = c(49L, 1L)), Consumer = c(2, 3, 7, 16, 18, 20, 23, 2, 3, 
7, 16, 18, 20, 23, 2, 3, 7, 16, 18, 20, 23, 2, 3, 7, 16, 18, 
20, 23, 2, 3, 7, 16, 18, 20, 23, 2, 3, 7, 16, 18, 20, 23, 2, 
3, 7, 16, 18, 20, 23), product = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 
2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 
5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7)), row.names = c(2L, 
3L, 7L, 16L, 18L, 20L, 23L, 203L, 204L, 208L, 217L, 219L, 221L, 
224L, 404L, 405L, 409L, 418L, 420L, 422L, 425L, 605L, 606L, 610L, 
619L, 621L, 623L, 626L, 806L, 807L, 811L, 820L, 822L, 824L, 827L, 
1007L, 1008L, 1012L, 1021L, 1023L, 1025L, 1028L, 1208L, 1209L, 
1213L, 1222L, 1224L, 1226L, 1229L), .Names = c("Sweetness", "Income", 
"Consumer", "product"), class = "data.frame")

lmList(Sweetness ~  Income|Consumer, data = datTest)

Call: lmList(formula = Sweetness ~ Income | Consumer, data = datTest) 
Coefficients:
   (Intercept) Income
2     5.034286     NA
3     0.490000     NA
7     6.364286     NA
16    2.061429     NA
18    6.554286     NA
20    9.708571     NA
23    8.201429     NA

Degrees of freedom: 49 total; 35 residual
Residual standard error: 1.890709
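The NA coefficients above follow from rank deficiency: within a consumer, Income takes a single value, so in a per-consumer regression the Income column is collinear with the intercept. A base-R sketch with a toy frame (hypothetical values, same one-Income-per-consumer structure as datTest) reproduces the same NA:

```r
set.seed(1)
# One Income value per consumer, replicated across 7 products
d <- data.frame(
  Consumer = factor(rep(1:3, times = 7)),
  Income   = rep(c(0.1, -27.9, 37.1), times = 7)
)
d$Sweetness <- 5 + rnorm(nrow(d))
# Per-consumer fit: Income is constant within the group, so its
# coefficient is inestimable and lm() reports NA
coef(lm(Sweetness ~ Income, data = subset(d, Consumer == 1)))
#> Income coefficient is NA; only the intercept is estimated
```

By contrast, in the mixed model each consumer's intercept and slope enter the fit only through the single constant b0i + b1i * x_i, so the two variance components are separated only by how the implied between-consumer variance changes with x_i; presumably this leaves the slope variance weakly identified, which would also explain the correlations being pushed toward -1.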

summary(lmer(Sweetness ~  Income + (Income|Consumer) + (Income|product), data = datTest))

Linear mixed model fit by REML t-tests use Satterthwaite approximations to degrees of freedom [
lmerMod]
Formula: Sweetness ~ Income + (Income | Consumer) + (Income | product)
   Data: datTest

REML criterion at convergence: 208.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.9550 -0.3096  0.1659  0.4416  1.8956 

Random effects:
 Groups   Name        Variance Std.Dev. Corr 
 Consumer (Intercept) 3.336092 1.82650       
          Income      0.011874 0.10897  -0.81
 product  (Intercept) 0.376941 0.61396       
          Income      0.001311 0.03621  1.00 
 Residual             2.264987 1.50499       
Number of obs: 49, groups:  Consumer, 7; product, 7

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)   
(Intercept)  6.19235    0.99829 3.32300   6.203  0.00621 **
Income       0.03679    0.06247 0.09200   0.589  0.88019   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
       (Intr)
Income 0.037 

summary(lmer(Sweetness ~  Income + (Income||Consumer) + (Income||product), data = datTest))

Linear mixed model fit by REML t-tests use Satterthwaite approximations to degrees of freedom [
lmerMod]
Formula: Sweetness ~ Income + ((1 | Consumer) + (0 + Income | Consumer)) +  
    ((1 | product) + (0 + Income | product))
   Data: datTest

REML criterion at convergence: 212.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.28024 -0.29821  0.08234  0.47653  1.69613 

Random effects:
 Groups     Name        Variance  Std.Dev.
 Consumer   (Intercept) 2.5399675 1.59373 
 Consumer.1 Income      0.0266043 0.16311 
 product    (Intercept) 0.1078963 0.32848 
 product.1  Income      0.0006577 0.02565 
 Residual               2.6104516 1.61569 
Number of obs: 49, groups:  Consumer, 7; product, 7

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)   
(Intercept)  6.37522    1.01119 2.81400   6.305  0.00974 **
Income       0.07441    0.09030 4.20200   0.824  0.45409   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
       (Intr)
Income 0.335 
user137451
  • How do you know there is no variance within each consumer? Have you run `lmList` and verified that each consumer-specific regression model has an Income coefficient equal to the fixed effect of the mixed model? – AdamO Jun 19 '17 at 16:33
  • I just ran lmList as such: `lmList(Sweetness ~ Income | Consumer, data = carrots)` and the income coefficients are all NA. Each consumer has the same income across observations, which is the reason why I am saying there is no income variance within each consumer - and thus shouldn't be able to estimate coefficients for income within consumer. Unless I am misunderstanding something about the meaning of the random slope. – user137451 Jun 19 '17 at 16:42
  • Does the `lmer` fitter drop the intercept for the Consumer random effect? – AdamO Jun 19 '17 at 17:11
  • I still get the Consumer random effect intercept. I added the full output. – user137451 Jun 19 '17 at 17:22
  • 1
    I don't know but note that you get correlation -1 between intercept and slope for both grouping factors (`consumer` and `product`), possibly indicating problems with convergence. What happens if you drop both correlation parameters by using `||` instead of `|`? – amoeba Jun 19 '17 at 19:55
  • If I use `||` on this model, the slope variance for both `consumer` and `product` goes to 0. However, I will note that there is variance in the slopes of `Income` across `product` (if I run lmList I get different coefficients), yet this term also goes to 0. Additionally, I tried a similar scenario in a different dataset (including a `(Variable | group)` term where `Variable` has the same value within each `group`; lmList returns NAs), and the random slope is still estimated, has a correlation different from -1 with the intercept, and does not go to 0 when I use `||`. – user137451 Jun 19 '17 at 20:56
  • 1
    @user137451 (1) Variance of a random effect going to exactly zero even though there is some non-zero (but small) variation in the dataset is a well-known effect. It happens, even in simulations when we know that the *true* variance is non-zero. See https://stats.stackexchange.com/a/115180 and links therein. (2) As I said, obtaining correlation `-1` is a red flag, and removing correlation parameters from the model is a sensible thing to do. Good that the slope variance goes to zero with `carrots`. (3) What you describe in a different dataset is a bit weird. Could you share it + code? – amoeba Jun 21 '17 at 07:50
  • 1
    @amoeba thanks for pointing that out and for the help and edit. I have added the requested subset of my data with the code I'm running, and illustrating an example where the random slope is still estimated when removing the correlation parameter. Based on my limited knowledge, I wouldn't assume anything is wrong with the convergence from this output. – user137451 Jun 21 '17 at 12:38
  • Does fitting with `REML = FALSE` throw any light on the problem? – mdewey Jun 21 '17 at 13:11
  • @mdewey I just reran this code with `REML = FALSE`, and it made the correlation -1, but using `||` did not make the random slope 0 (so it continues to be estimated somehow). Additionally, running `REML = FALSE` on the full dataset gave an intercept-slope correlation of -0.51, so it made little difference there. – user137451 Jun 21 '17 at 13:22

0 Answers