It may help you to read my answer here: Why do my p-values differ between logistic regression output, chi-squared test, and the confidence interval for the OR? Your question here is nearly a duplicate of that, but there are a couple of additional elements in your question that can be addressed.
As @CliffAB notes, the p-values in the summary.glm() output are from Wald tests. These are analogous to $t$-tests of coefficients in a linear model in that each is the difference between the fitted coefficient and the reference value (taken to be $0$), divided by the standard error. The difference is that these are taken to be distributed as a standard normal instead of $t$. On the other hand, Wald tests are only valid for large samples, and we don't necessarily know what constitutes a 'large sample' in any given case.
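If it helps to see where those numbers come from, here is a minimal sketch that reproduces the Wald z values and p-values by hand (m is just a placeholder for any fitted logistic regression, e.g., m2 further down):

```r
# sketch: reproduce summary.glm()'s Wald tests by hand
# 'm' is a placeholder for any fitted glm, e.g., m2 below
b  <- coef(summary(m))[, "Estimate"]
se <- coef(summary(m))[, "Std. Error"]
z  <- b / se                 # Wald statistic: estimate / standard error
p  <- 2 * pnorm(-abs(z))     # two-sided p-value from the standard normal
cbind(z, p)                  # matches the 'z value' and 'Pr(>|z|)' columns
```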
Using anova.glm() gives you access to different tests. When you set test="Rao", it gives you the p-value from a score test. When you set either test="Chisq" or test="LRT" (they are the same), it gives you the p-value from a likelihood ratio test.
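Concretely, with m again standing in for any fitted glm, the calls look like this:

```r
# sketch of the tests available via anova.glm() for a fitted glm 'm'
anova(m, test = "Rao")    # score (Rao) test
anova(m, test = "LRT")    # likelihood ratio test
anova(m, test = "Chisq")  # identical to test="LRT" for a glm
```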
The anova.glm() function does test the same null hypothesis as the Wald test in the summary() output in this case, but that is only because your model has just one variable. The anova.glm() function performs sequential tests, which are analogous to 'type I SS' in a linear setting, whereas the Wald tests from summary() are analogous to 'type III SS' in a linear setting (see my answer here: How to interpret type I, type II, and type III ANOVA and MANOVA?). Consider:
# (n, x, and y come from an earlier data-generating step, not shown here)
x2 = rnorm(n)
m2 = glm(y~x+x2, family="binomial")
summary(m2)$coefficients
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.05906436 0.186876339 -0.3160612 7.519561e-01
# x -0.01567551 0.003537183 -4.4316372 9.352029e-06
# x2 -0.05967796 0.099093504 -0.6022388 5.470152e-01
anova(m2, test="LRT")
# Terms added sequentially (first to last)
#
# Df Deviance Resid. Df Resid. Dev Pr(>Chi)
# NULL 499 619.10
# x 1 20.3841 498 598.72 6.335e-06 ***
# x2 1 0.3627 497 598.35 0.547
m3 = glm(y~x2+x, family="binomial") # I just switched the order of x & x2 here
summary(m3)$coefficients
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.05906436 0.186876339 -0.3160612 7.519561e-01
# x2 -0.05967796 0.099093504 -0.6022388 5.470152e-01 # these are the same
# x -0.01567551 0.003537183 -4.4316372 9.352029e-06 # as above
anova(m3, test="LRT")
# Terms added sequentially (first to last)
#
# Df Deviance Resid. Df Resid. Dev Pr(>Chi)
# NULL 499 619.10
# x2 1 0.1585 498 618.94 0.6906 # these differ from the
# x 1 20.5883 497 598.35 5.694e-06 *** # anova output above
You can shoehorn the anova.glm() function into giving you score and likelihood ratio tests of individual variables in a multiple logistic regression model that are analogous to 'type III SS', but it is tedious. You would need to keep refitting your model so that each variable in turn is listed last in the formula supplied to the glm() call; the last p-value in the anova.glm() output is then the one analogous to 'type III SS' for that variable.
To get the score or likelihood ratio tests of individual variables more conveniently, use drop1() instead. Consider:
drop1(m3, test="LRT")
# Single term deletions
#
# Model:
# y ~ x2 + x
# Df Deviance AIC LRT Pr(>Chi)
# <none> 598.35 604.35
# x2 1 598.72 602.72 0.3627 0.547 # the same as when x2 is last above
# x 1 618.94 622.94 20.5883 5.694e-06 *** # the same as when x is last above
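If you want score tests adjusted for the other variables instead, drop1() also accepts test="Rao"; the output has the same layout, with the score statistic in place of the LRT column:

```r
drop1(m3, test = "Rao")  # score test of each variable, fitted after all the others
```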