
I'm quite new to analyzing binomial data, and I'm getting confused doing the analysis in R with glmer and lmer.

I am analyzing an experiment with a 2×2×2 mixed design.

There are three independent variables: one is a between-subjects factor and the other two are within-subjects. All three independent variables are binary.

The experiment has two groups (of unequal size). All participants read 8 vignettes, which come in two types; for some vignettes the content has been communicated to the participants beforehand and for others it has not.

So the three independent variables are: group, type, and communicated.

Now I need to test the responses collected after each reading. One of the dependent variables (break) is also binary (yes or no); the other dependent variables are ordinal or continuous.

For example, one hypothesis I want to verify is that the first group is more sensitive to 'break'. Here are my code and results.
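
The model was fit roughly like this (the exact call isn't shown below, so this is a sketch pieced together from the Formula, Data, and family lines of the output; model.H1 is just a placeholder name, and if the outcome column really is called break it has to be backtick-quoted because break is a reserved word in R):

    library(lme4)

    # Binary outcome regressed on the between-subjects factor 'group',
    # with a random intercept for each participant ('id').
    # 'break' is backtick-quoted because it is a reserved word in R;
    # substitute the actual column name if it is different.
    model.H1 <- glmer(`break` ~ group + (1 | id),
                      data = df, family = binomial)
    summary(model.H1)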

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: break ~ group + (1 | id)
   Data: df

     AIC      BIC   logLik deviance df.resid 
  3639.6   3657.8  -1816.8   3633.6     3213 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.8274 -1.4997  0.5653  0.5838  0.6668 

Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 0.07157  0.2675  
Number of obs: 3216, groups:  id, 402

Fixed effects:
                       Estimate Std. Error z value Pr(>|z|)    
(Intercept)             1.14543    0.06491  17.647   <2e-16 ***
groupI                  -0.08177    0.08625  -0.948    0.343    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr)
groupI    -0.721

For another example, my hypothesis is that the first group would have stronger emotional reactions to a type 1 break. Here are my code and results.
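
The models were fit and compared roughly like this (again a sketch reconstructed from the model names and formulas in the output below):

    library(lme4)

    # Additive model vs. model adding the group-by-type interaction;
    # 'emo' is the emotion rating, with a random intercept per participant.
    model.H3a <- lmer(emo ~ group + type + (1 | id), data = df)
    model.H3b <- lmer(emo ~ group * type + (1 | id), data = df)

    anova(model.H3a, model.H3b)  # likelihood-ratio comparison (anova() refits with ML)
    summary(model.H3a)
    summary(model.H3b)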

Models:
model.H3a: emo ~ group + type + (1 | id)
model.H3b: emo ~ group * type + (1 | id)
          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
model.H3a    5 9589.1 9619.5 -4789.6   9579.1                     
model.H3b    6 9589.5 9626.0 -4788.8   9577.5 1.5838  1     0.2082
> summary(model.H3a)

Linear mixed model fit by REML ['lmerMod']
Formula: emo ~ group + type + (1 | id)

REML criterion at convergence: 9594.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.8125 -0.7602  0.1097  0.9639  1.2246 

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 0.02211  0.1487  
 Residual             1.13143  1.0637  
Number of obs: 3216, groups:  id, 402

Fixed effects:
                       Estimate Std. Error t value
(Intercept)             3.97730    0.03495 113.790
groupI                  -0.03096    0.04042  -0.766
typeI                   -0.16604    0.03751  -4.426

Correlation of Fixed Effects:
            (Intr) grou
groupI     -0.616       
typI       -0.537  0.000
> summary(model.H3b)
Linear mixed model fit by REML ['lmerMod']
Formula: emo ~ group * type + (1 | id)
   Data: df

REML criterion at convergence: 9596.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.8084 -0.7681  0.1077  0.9684  1.2457 

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 0.02214  0.1488  
 Residual             1.13119  1.0636  
Number of obs: 3216, groups:  id, 402

Fixed effects:
                                            Estimate Std. Error t value
(Intercept)                                  3.95213    0.04027  98.130
groupI                                       0.01633    0.05520   0.296
typeI                                       -0.11569    0.05485  -2.109
groupI:typeI                                -0.09459    0.07518  -1.258

Correlation of Fixed Effects:
            (Intr) grou typ
groupI        -0.730              
typI         -0.681  0.497       
groupI:typ  0.497 -0.681 -0.730
table_model

                              emo (model.H3a)                   emo (model.H3b)
Predictors                    Estimates  CI             p       Estimates  CI             p
(Intercept)                   3.98       3.91 – 4.05    <0.001  3.95       3.87 – 4.03    <0.001
group [I]                     -0.03      -0.11 – 0.05   0.444   0.02       -0.09 – 0.12   0.767
type [I]                      -0.17      -0.24 – -0.09  <0.001  -0.12      -0.22 – -0.01  0.035
group [I] * type [I]                                            -0.09      -0.24 – 0.05   0.208

Random Effects
σ²                            1.13                              1.13
τ00                           0.02 id                           0.02 id
ICC                           0.02                              0.02
N                             402 id                            402 id
Observations                  3216                              3216
Marginal R² / Conditional R²  0.006 / 0.025                     0.007 / 0.026

When I run the code like this, all the variables are nonsignificant (except the intercept, and the type variable in the second hypothesis). I'm not sure whether I made some mistakes, or how to interpret the outcome. I also changed the random intercept, but the results are still nonsignificant.

Can someone help me check this? And are there any articles or books you would recommend that could help me figure it out or learn this systematically?

Or is there a more appropriate test for my case?

I appreciate any feedback on this.

Jay

1 Answer


A "significant" intercept is one whose estimated value is "significantly" different from 0. In a logistic regression, that means different from equal outcome group probabilities when the predictors are at reference levels (categorical) or at 0 (continuous). So just centering a continuous predictor or changing the reference level of a categorical predictor can change the "significance" of an intercept. Don't put too much importance onto the "significance" of an intercept in this case.

With 3 pre-defined predictors you shouldn't be doing these analyses based on just one or two of them. In logistic regression, omitting any predictor related to the outcome, whether or not it is correlated with the other predictors, will tend to bias the results, often in a way that makes it harder to find true associations with the outcome. So you would be better off working with the full model (also including communicated as a predictor, potentially with interaction terms if that makes sense based on the subject matter) and testing your specific hypotheses based on that full model.
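
A sketch of what that might look like for the binary outcome (variable names as they appear in the question; which interaction terms to keep should come from the subject matter and your specific hypotheses):

    library(lme4)

    # Full model with all three design factors and their interactions;
    # prune interaction terms that aren't meaningful for the subject matter.
    full.model <- glmer(`break` ~ group * type * communicated + (1 | id),
                        data = df, family = binomial)
    summary(full.model)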

When you have multiple outcomes, things get even more complicated. I'd recommend finding some local statistical expertise to help with the issues that arise, as (1) you might want to take associations among the outcomes into account, and (2) with separate models for each of the outcomes you need to correct for multiple comparisons.
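
For the multiple-comparisons part, if you do end up testing the same hypothesis in separate models for each outcome, the resulting p-values can be adjusted along these lines (purely illustrative; the two values are just the group p-value and the interaction p-value reported in the question):

    # Holm adjustment across the separate per-outcome models
    p.vals <- c(break_model = 0.343, emo_model = 0.208)  # illustrative values from the question
    p.adjust(p.vals, method = "holm")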

EdM
  • Thank you so much EdM. I also tried a model including all three independent variables, like `break ~ group+types+communicated+(1|vignette)+(1|id)`, but the results are nonsignificant as well. I have no idea how to deal with it. Do you have any recommended articles related to my case that I could learn from? Thanks again. – Jay Jun 01 '20 at 10:04
  • @Jay if your hypothesis is that "the first group would have stronger emotional reactions to a type 1 break", then you need to add a `group:types` interaction term. The additive model in your comment would not be appropriate if that's your hypothesis. If that interaction coefficient isn't significant, then there isn't support in your data/model for that hypothesis. The issues here seem to be about principles of linear modeling rather than specifics of generalized linear or mixed modeling. See [this page](https://stats.stackexchange.com/q/25632/28500) for links to more reading. – EdM Jun 01 '20 at 12:46