Consider the following lme4 model:

mod <- glmer(Y ~ X*Condition + (X*Condition|subject))

# Y = logit variable  
# X = continuous variable  
# Condition = values A and B, dummy coded; repeated-measures design,  
#             so all participants go through both conditions  
# subject = grouping factor for the random effects 

summary(mod)
Random effects:
 Groups  Name             Variance Std.Dev. Corr             
 subject (Intercept)      0.85052  0.9222                    
         X                0.08427  0.2903   -1.00            
         ConditionB       0.54367  0.7373   -0.37  0.37      
         X:ConditionB     0.14812  0.3849    0.26 -0.26 -0.56
Number of obs: 39401, groups:  subject, 219

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       2.49686    0.06909   36.14  < 2e-16 ***
X                -1.03854    0.03812  -27.24  < 2e-16 ***
ConditionB       -0.19707    0.06382   -3.09  0.00202 ** 
X:ConditionB      0.22809    0.05356    4.26 2.06e-05 ***

In "Different variance-covariance matrices of random effects per fixed-effect group in lme4" we discussed using the dummy() function to model random effects.

I looked into the function and tried to reproduce the results I got with the classic model notation. I tested the following models, which by my reasoning should yield the same results as mod:

  1. mod1 <- glmer(Y ~ X*Condition + (0 + dummy(Condition, "A") + X:dummy(Condition, "A") + dummy(Condition, "B") + X:dummy(Condition, "B")|subject))

  2. mod2 <- glmer(Y ~ X*Condition + (1 + X + dummy(Condition, "B") + X:dummy(Condition, "B")|subject))


mod1 was not identical to mod
summary(mod1)

Random effects:
 Groups  Name                        Variance Std.Dev. Corr             
 subject dummy(Condition, "A")        0.8508   0.9224                    
         dummy(Condition, "B")        0.8854   0.9409    0.69            
         dummy(Condition, "A"):X      0.1120   0.3347   -0.83 -0.73      
         X:dummy(Condition, "B")      0.1777   0.4215   -0.46 -0.64  0.75
Number of obs: 39401, groups:  subject, 219

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       2.49297    0.06911   36.07  < 2e-16 ***
X                -1.04807    0.04032  -25.99  < 2e-16 ***
ConditionB       -0.19657    0.06299   -3.12  0.00181 ** 
X:ConditionB      0.23894    0.05060    4.72 2.33e-06 ***

The random-effects estimates differ. Why is
Intercept != dummy(Condition, "A")
X != X:dummy(Condition, "A")
ConditionB != dummy(Condition, "B")
X:ConditionB != X:dummy(Condition, "B")?
What do the dummy variables mean?


mod2 was equal to mod

Random effects:
 Groups  Name                        Variance Std.Dev. Corr             
 subject (Intercept)                 0.85052  0.9222                    
         X                           0.08427  0.2903   -1.00            
         dummy(Condition, "B")       0.54367  0.7373   -0.37  0.37      
         X:dummy(Condition, "B")     0.14812  0.3849    0.26 -0.26 -0.56
Number of obs: 39401, groups:  subject, 219

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       2.49686    0.06909   36.14  < 2e-16 ***
X                -1.03854    0.03812  -27.24  < 2e-16 ***
ConditionB       -0.19707    0.06382   -3.09  0.00202 ** 
X:ConditionB      0.22809    0.05356    4.26 2.06e-05 ***

This means that dummy(Condition, "B") in mod2 is the same as ConditionB in mod. But then what does dummy(Condition, "A") mean in mod1?


Update 1, according to @amoeba's answer

So, if I got this right, the imaginary data matrix would look like this:

    Subject  Condition  Intercept(X)  Intercept  ConditionBX  ConditionB  dummyA(X)  dummyA  dummyB(X)  dummyB
    S1       A          1             5          0            -2          1          5       0          0
    S1       B          1             5          1            -2          0          0       1          3
    S2       A          1             2          0            -1          1          2       0          0
    S2       B          1             2          1            -1          0          0       1          1

In the first case: $Y_1 = \text{Intercept} \cdot \text{Intercept(X)} + \text{ConditionB} \cdot \text{ConditionBX}$.
In the second case: $Y_2 = \text{dummyA(X)} \cdot \text{dummyA} + \text{dummyB(X)} \cdot \text{dummyB}$.

Is this right?

User33268
  • As I explained elsewhere, this is the issue of categorical encoding. In X*Condition, Intercept (all 1s) is the value of intercept in A and ConditionB (which is 0 in A and 1 in B) is the difference of intercepts between A and B. In your mod1, one dummy (1 in A, 0 in B) is intercept in A and another dummy (1 in B, 0 in A) is intercept in B. The models are equivalent but they are coded differently. – amoeba Jan 19 '18 at 20:49
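
The re-coding described in this comment can be checked with a small base-R sketch. It is only an illustration under stated assumptions: the toy data frame `d` and its X values are made up, the hand-built indicators `isA` and `isB` stand in for dummy(Condition, "A") and dummy(Condition, "B") so that lme4 is not needed, and the matrix `M` is a name introduced here. The covariance part uses the rounded standard deviations and correlations printed for mod above.

```r
# Toy data: 2 subjects x 2 conditions; the X values are made up for illustration.
d <- data.frame(
  subject   = factor(c("S1", "S1", "S2", "S2")),
  Condition = factor(c("A", "B", "A", "B")),
  X         = c(0.5, -1.0, 2.0, 1.5)
)

# Treatment coding, as used by (X*Condition | subject) in mod.
Z_treat <- model.matrix(~ X * Condition, data = d)

# Cell-wise indicators, standing in for dummy(Condition, "A") and
# dummy(Condition, "B") in mod1 (built by hand, an assumption of this sketch).
isA <- as.numeric(d$Condition == "A")
isB <- as.numeric(d$Condition == "B")
Z_cell <- cbind(A = isA, B = isB, "A:X" = isA * d$X, "B:X" = isB * d$X)

# M maps treatment-coded effects to cell-wise effects: b_cell = M %*% b_treat.
# Columns follow Z_treat: (Intercept), X, ConditionB, X:ConditionB.
M <- rbind(
  A     = c(1, 0, 0, 0),  # intercept in A = Intercept
  B     = c(1, 0, 1, 0),  # intercept in B = Intercept + ConditionB
  "A:X" = c(0, 1, 0, 0),  # slope in A     = X
  "B:X" = c(0, 1, 0, 1)   # slope in B     = X + X:ConditionB
)

# Both codings describe the same design: Z_treat equals Z_cell %*% M.
max_diff <- max(abs(Z_cell %*% M - Z_treat))

# Random-effect covariance of mod, rebuilt from the rounded summary values.
sds <- c(0.9222, 0.2903, 0.7373, 0.3849)
R <- matrix(c( 1.00, -1.00, -0.37,  0.26,
              -1.00,  1.00,  0.37, -0.26,
              -0.37,  0.37,  1.00, -0.56,
               0.26, -0.26, -0.56,  1.00), nrow = 4, byrow = TRUE)
Sigma_treat <- diag(sds) %*% R %*% diag(sds)

# Covariance implied for the cell-wise parametrisation of mod1.
Sigma_cell  <- M %*% Sigma_treat %*% t(M)
implied_var <- diag(Sigma_cell)
print(round(implied_var, 4))
```

With the rounded inputs, the implied variance for the intercept in B comes out at roughly 0.89 (mod1 reports 0.8854) and for the slope in B at roughly 0.17 (mod1 reports 0.1777), which fits the reading that dummy(Condition, "B") in mod1 is the intercept in B rather than the A-to-B difference. The implied slope variance in A (about 0.084) is further from the 0.1120 that mod1 reports; one possible reason, which this sketch cannot confirm, is that the correlation of -1.00 in mod points to a singular fit, so the two fits may not have reached the same optimum.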
