I am running a Poisson regression to look for differences between universities in 5 regions in the number of crimes that occurred there.

The output:

Call:
glm(formula = nv ~ region, family = poisson, data = my_data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-4.0661  -2.6186  -1.0888   0.7035   6.6723  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   1.3412     0.1240  10.813  < 2e-16 ***
regionMW      0.4834     0.1775   2.723  0.00647 ** 
regionNE      0.4426     0.1529   2.894  0.00380 ** 
regionSE      0.7711     0.1531   5.035 4.77e-07 ***
regionSW      0.3265     0.1851   1.764  0.07768 .  
regionW       0.5306     0.1861   2.852  0.00434 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 649.34  on 80  degrees of freedom
Residual deviance: 621.24  on 75  degrees of freedom
AIC: 851.12

Number of Fisher Scoring iterations: 6

I have a hypothesis that compares regionSE and regionSW. Can I just calculate the confidence intervals of their estimates (e.g., regionSE lower limit: 0.77 - 1.96 × 0.15; upper limit: 0.77 + 1.96 × 0.15) and conclude that the two regions differ significantly if their confidence intervals do not overlap?

Or must I use the "emmeans" package to compare the groups?
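
The interval arithmetic described in the question can be written out as a short R sketch that uses only the rounded estimates and standard errors from the printed summary. The object names (`est`, `se`, `ci`, `overlap`) are illustrative and not part of the original analysis. Assuming R's default treatment coding, these are intervals for each region's difference from the reference region on the log scale, not for the regions themselves.

```r
# Estimates and standard errors copied (rounded) from the printed summary
est <- c(SE = 0.7711, SW = 0.3265)
se  <- c(SE = 0.1531, SW = 0.1851)

# Wald 95% limits, using 1.96 as in the question
ci <- cbind(lower = est - 1.96 * se, upper = est + 1.96 * se)
print(round(ci, 3))

# Do the two intervals share any values?
overlap <- ci["SE", "lower"] <= ci["SW", "upper"] &&
           ci["SW", "lower"] <= ci["SE", "upper"]
print(overlap)
```

With these numbers the two intervals do overlap, so the overlap rule by itself would not flag a difference between regionSE and regionSW.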

kjetil b halvorsen
harico287
Here are two reasons: (1) It is *incorrect* to test comparisons based on overlapping CIs. (2) You are misinterpreting the regression estimates. The intercept is actually the prediction for the first region, and the remaining coefficients are differences between the respective regions and the first one. Thus, you already have tests of some of the pairwise comparisons, but not all of them. – Russ Lenth Oct 24 '20 at 23:23

And by the way, you have six regions, not five. – Russ Lenth Oct 25 '20 at 02:43
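
Following the comments, a direct test of regionSE against regionSW works on the difference of the two coefficients rather than on their separate intervals. The sketch below does this in two ways. The first uses only the rounded numbers from the printed summary and relies on an assumption: because region is the only predictor, the covariance between two non-reference coefficients equals the squared standard error of the intercept. The second is a small helper, `wald_contrast()` (a hypothetical name, not an existing function), that reads the covariance from `vcov()` and so does not need that assumption.

```r
# --- Way 1: from the printed summary (rounded values) ---
est <- c(Intercept = 1.3412, SE = 0.7711, SW = 0.3265)
se  <- c(Intercept = 0.1240, SE = 0.1531, SW = 0.1851)

# Assumption: region is the only predictor, so
# Cov(b_SE, b_SW) = Var(Intercept)
cov_se_sw <- se[["Intercept"]]^2

diff_est <- est[["SE"]] - est[["SW"]]   # log rate ratio, SE vs SW
diff_se  <- sqrt(se[["SE"]]^2 + se[["SW"]]^2 - 2 * cov_se_sw)
z <- diff_est / diff_se
p <- 2 * pnorm(-abs(z))                 # two-sided Wald p-value
print(c(estimate = diff_est, se = diff_se, z = z, p = p))

# --- Way 2: from a fitted model object ---
# Hypothetical helper: Wald test of coefficient a minus coefficient b
wald_contrast <- function(fit, a, b) {
  V <- vcov(fit)
  d <- coef(fit)[[a]] - coef(fit)[[b]]
  s <- sqrt(V[a, a] + V[b, b] - 2 * V[a, b])
  c(estimate = d, se = s, z = d / s, p = 2 * pnorm(-abs(d / s)))
}

# Usage, assuming the model above was stored in an object called `fit`:
# wald_contrast(fit, "regionSE", "regionSW")
```

From the rounded summary values this gives a z statistic of about 2.7 (p of about 0.007), even though the two separate intervals overlap. Refitting with `relevel(region, ref = "SW")` and reading off the `regionSE` row should give the same estimate and standard error. With the emmeans package installed, `pairs(emmeans(fit, ~ region))` should report this comparison together with all other pairs, with a multiplicity adjustment applied by default.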
