16

I am interested in changing the null hypothesis when using glm() in R.

For example:

x = rbinom(100, 1, .7)  
summary(glm(x ~ 1, family = "binomial"))

tests the hypothesis that $p = 0.5$. What if I want to change the null to $p$ = some arbitrary value, within glm()?

I know this can also be done with prop.test() and chisq.test(), but I'd like to explore the idea of using glm() to test all hypotheses relating to categorical data.

amoeba
  • 7
    +1. $p$ evidently refers to the Binomial parameter *expressed as a probability.* Since the natural link (and the one used by `glm` by default) is the logit, to avoid confusion it's important to distinguish $p$ from its logit, which is the log odds $\log(p/(1-p))$. – whuber Dec 29 '17 at 13:57

3 Answers

20

You can use an offset: glm with family="binomial" estimates parameters on the log-odds or logit scale, so $\beta_0=0$ corresponds to log-odds of 0 or a probability of 0.5. If you want to compare against a probability of $p$, you want the baseline value to be $q = \textrm{logit}(p)=\log(p/(1-p))$. The statistical model is now

$$\begin{split} Y & \sim \textrm{Binom}(\mu) \\ \mu & =1/(1+\exp(-\eta)) \\ \eta & = \beta_0 + q \end{split}$$

where only the last line has changed from the standard setup. In R code:

  • use offset(q) in the formula
  • the logit/log-odds function is qlogis(p)
  • slightly annoyingly, you have to provide an offset value for each element in the response variable - R won't automatically replicate a constant value for you. This is done below by setting up a data frame, but you could just use rep(q,100).
x = rbinom(100, 1, .7)
dd <- data.frame(x, q = qlogis(0.7)) 
summary(glm(x ~ 1 + offset(q), data=dd, family = "binomial"))
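
With this offset in place, the intercept is the deviation of the log-odds from $q=\textrm{logit}(0.7)$, so the reported Wald test of $\beta_0=0$ is a test of $H_0: p=0.7$. A minimal sketch of how to read the fit (reusing x and dd from above; the name fit is just illustrative):

fit <- glm(x ~ 1 + offset(q), data=dd, family = "binomial")
coef(fit)                           # estimated deviation of the log-odds from qlogis(0.7)
plogis(qlogis(0.7) + coef(fit)[1])  # back-transform to recover the estimated probability
summary(fit)$coefficients[1, 4]     # Wald p-value for H0: p = 0.7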
Ben Bolker
  • 2
    (+1) this will give you the Wald test. A LRT can be done fitting the null model `glm(y ~ offset(q)-1, family=binomial, data=dd)` and using `lrtest` from the `lmtest` package. Pearson's chi-square test is the score test for the GLM model. Wald/LRT/Score are all consistent tests and should provide equivalent inference in reasonably large sample sizes. – AdamO Dec 29 '17 at 17:55
  • 1
    I think you can also use `anova()` from base R on the glm to get a LR test – Ben Bolker Dec 29 '17 at 18:40
  • Interesting, I've lost the habit of using ANOVA. However, I observe that `anova` refuses to print the p-value for the test whereas `lrtest` does. – AdamO Dec 29 '17 at 18:59
  • 2
    maybe `anova(.,test="Chisq")` ? – Ben Bolker Dec 29 '17 at 19:05
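
Putting the comment thread above together, a minimal sketch of the LRT route (reusing dd from the answer; anova() is base R, lrtest() is from the lmtest package):

fit  <- glm(x ~ 1 + offset(q), data=dd, family = "binomial")   # alternative: free intercept
null <- glm(x ~ offset(q) - 1, data=dd, family = "binomial")   # null: log-odds fixed at q
anova(null, fit, test = "Chisq")                               # likelihood-ratio test in base R
## or equivalently: library(lmtest); lrtest(null, fit)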
7

Look at the confidence interval for the parameter of your GLM:

> set.seed(1)
> x = rbinom(100, 1, .7)
> model<-glm(x ~ 1, family = "binomial")
> confint(model)
Waiting for profiling to be done...
    2.5 %    97.5 % 
0.3426412 1.1862042 

This is a confidence interval for the log-odds.

For $p=0.5$ we have $\log(\text{odds}) = \log \frac{p}{1-p} = \log 1 = 0$. So testing the hypothesis that $p=0.5$ is equivalent to checking whether the confidence interval contains 0. This one does not, so the hypothesis is rejected.

Now, for any arbitrary $p$, you can compute its log-odds and check whether it lies inside the confidence interval.
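
For example, a minimal check of an arbitrary null value, say $p_0=0.6$ (reusing model and the interval printed above; p0 and q0 are just illustrative names):

p0 <- 0.6                   # arbitrary null value
q0 <- qlogis(p0)            # its log-odds, about 0.405
ci <- confint(model)        # profile confidence interval on the log-odds scale
ci[1] <= q0 & q0 <= ci[2]   # TRUE with the seed above, so p = 0.6 is not rejected at the 5% level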

Łukasz Deryło
  • 1
    this is useful, but only works for checking whether the p-value is below 0.05. If you want the actual p-value my answer will be more useful. – Ben Bolker Dec 29 '17 at 14:12
  • 2
    You can ask for any level of confidence in `confint`, so it is not only for $p<0.05$. Of course your solution is much better when it comes to calculating the p-value – Łukasz Deryło Dec 29 '17 at 17:18
3

It is not (entirely) correct/accurate to use the p-values based on the z-/t-values from the `summary.glm` function as a hypothesis test.

  1. It is confusing language: the reported values are named z-values, but they use the estimated standard error in place of the true standard deviation, so in reality they are closer to t-values. Compare the following three outputs:
    1) summary.glm
    2) t-test
    3) z-test

    > set.seed(1)
    > x = rbinom(100, 1, .7)
    
    > coef1 <- summary(glm(x ~ 1, offset=rep(qlogis(0.7),length(x)), family = "binomial"))$coefficients
    > coef2 <- summary(glm(x ~ 1, family = "binomial"))$coefficients
    
    > coef1[4]  # output from summary.glm
    [1] 0.6626359
    > 2*pt(-abs((qlogis(0.7)-coef2[1])/coef2[2]),99,ncp=0) # manual t-test
    [1] 0.6635858
    > 2*pnorm(-abs((qlogis(0.7)-coef2[1])/coef2[2]),0,1) # manual z-test
    [1] 0.6626359
    
  2. They are not exact p-values. An exact computation of the p-value using the binomial distribution would work better (with the computing power nowadays, this is not a problem). The t-distribution, which assumes a Gaussian distribution of the error, is not exact (it overestimates the p-value, so exceeding the alpha level occurs less often in "reality"). See the following comparison:

    # trying all 101 possible outcomes (0 to 100 successes) if the true value is p=0.7
    px <- dbinom(0:100,100,0.7)
    p_model = rep(0,101)
    for (i in 0:100) {
      xi = c(rep(1,i),rep(0,100-i))
      model = glm(xi ~ 1, offset=rep(qlogis(0.7),100), family="binomial")
      p_model[i+1] = 1-summary(model)$coefficients[4]
    }
    
    
    # plotting cumulative distribution of outcomes
    outcomes <- p_model[order(p_model)]
    cdf <- cumsum(px[order(p_model)])
    plot(1-outcomes,1-cdf, 
         ylab="cumulative probability", 
         xlab= "calculated glm p-value",
         xlim=c(10^-4,1),ylim=c(10^-4,1),col=2,cex=0.5,log="xy")
    lines(c(0.00001,1),c(0.00001,1))
    for (i in 1:100) {
      lines(1-c(outcomes[i],outcomes[i+1]),1-c(cdf[i+1],cdf[i+1]),col=2)
    #  lines(1-c(outcomes[i],outcomes[i]),1-c(cdf[i],cdf[i+1]),col=2)
    }
    
    title("probability for rejection as function of set alpha level")
    

    (figure: CDF of rejection by alpha)

    The black curve represents equality. The red curve is below it. That means that, for a given p-value calculated by the glm summary function, we find this situation (or a larger difference) less often in reality than the p-value indicates.
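
    For comparison, one way to get the exact p-value mentioned above is base R's binom.test (a minimal sketch):

    set.seed(1)
    x = rbinom(100, 1, 0.7)
    binom.test(sum(x), n = length(x), p = 0.7)   # exact two-sided test of H0: p = 0.7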

Sextus Empiricus
  • Hmm.. I may be confused about the rationale for using the T distribution for a GLM. Can you take a peek at a related question I just asked [here](https://stats.stackexchange.com/questions/320812/should-degrees-of-freedom-corrections-be-used-for-inference-on-glm-parameters)? – AdamO Dec 29 '17 at 18:28
  • 2
    This answer is interesting but problematic. (1) the OP didn't actually ask about the difference between score, chi-squared, "exact", or GLM-based approaches to testing hypotheses about binomial responses (they *might* actually know all this stuff already), so this doesn't answer the question that was asked; (2) the estimates of residual variance etc. have a different set of assumptions and sampling distributions from linear models (as in @AdamO's question), so use of a t-test is debatable; ... – Ben Bolker Dec 29 '17 at 19:10
  • 2
    (3) 'exact' confidence intervals for binomial responses are actually tricky ('exact' [Clopper-Pearson] intervals are conservative; score tests may perform better over some ranges) – Ben Bolker Dec 29 '17 at 19:11
  • @Ben You are right that the z-test is actually better than the t-test. The graph displayed in the answer is for the z-test. It uses the output of the GLM function. The bottom line of my answer was that the "p-value" is a tricky thing. Therefore, I find it better to compute it explicitly, e.g. using the normal distribution, rather than extracting the p-value from a glm function, which has very conveniently been shifted with the offset but hides the origins of the calculations for the p-value. – Sextus Empiricus Dec 29 '17 at 19:25
  • 1
    @BenBolker, I believe the exact test is indeed conservative, but... only because in reality we are not sampling from perfect binomial distributions. The alternative z-test is only better from an *empirical* point of view: the two "errors" cancel each other out, 1) the binomial distribution not being the real distribution of residuals in practical situations, 2) the z-distribution not being an exact expression for a binomial distribution. It is questionable whether we should prefer the *wrong* distribution for the *wrong* model, just because in practice it turns out 'ok'. – Sextus Empiricus Dec 29 '17 at 19:46