16

I am interested in changing the null hypothesis when using glm() in R.

For example:

x = rbinom(100, 1, .7)  
summary(glm(x ~ 1, family = "binomial"))

tests the hypothesis that $p = 0.5$. What if I want to change the null to $p$ = some arbitrary value, within glm()?

I know this can also be done with prop.test() and chisq.test(), but I'd like to explore the idea of using glm() to test all hypotheses relating to categorical data.

amoeba
  • 7
    +1. $p$ evidently refers to the Binomial parameter *expressed as a probability.* Since the natural link (and the one used by `glm` by default) is the logit, to avoid confusion it's important to distinguish $p$ from its logit, which is the log odds $\log(p/(1-p))$. – whuber Dec 29 '17 at 13:57

3 Answers

20

You can use an offset: glm with family="binomial" estimates parameters on the log-odds or logit scale, so $\beta_0=0$ corresponds to log-odds of 0 or a probability of 0.5. If you want to compare against a probability of $p$, you want the baseline value to be $q = \textrm{logit}(p)=\log(p/(1-p))$. The statistical model is now

$$\begin{split} Y & \sim \textrm{Binom}(\mu) \\ \mu & =1/(1+\exp(-\eta)) \\ \eta & = \beta_0 + q \end{split}$$

where only the last line has changed from the standard setup. In R code:

  • use offset(q) in the formula
  • the logit/log-odds function is qlogis(p)
  • slightly annoyingly, you have to provide an offset value for each element in the response variable - R won't automatically replicate a constant value for you. This is done below by setting up a data frame, but you could just use rep(q,100).
x = rbinom(100, 1, .7)
dd <- data.frame(x, q = qlogis(0.7)) 
summary(glm(x ~ 1 + offset(q), data=dd, family = "binomial"))
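
With this offset in place, the intercept is the deviation of the log-odds from $q=\textrm{logit}(0.7)$, so the reported Wald test of $\beta_0=0$ is a test of $H_0: p=0.7$. A minimal sketch of how to read the fit (reusing x and dd from above; the name fit is just illustrative):

fit <- glm(x ~ 1 + offset(q), data=dd, family = "binomial")
coef(fit)                           # estimated deviation of the log-odds from qlogis(0.7)
plogis(qlogis(0.7) + coef(fit)[1])  # back-transform to recover the estimated probability
summary(fit)$coefficients[1, 4]     # Wald p-value for H0: p = 0.7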
Ben Bolker
  • 2
    (+1) this will give you the Wald test. A LRT can be done fitting the null model `glm(y ~ offset(q)-1, family=binomial, data=dd)` and using `lrtest` from the `lmtest` package. Pearson's chi-square test is the score test for the GLM model. Wald/LRT/Score are all consistent tests and should provide equivalent inference in reasonably large sample sizes. – AdamO Dec 29 '17 at 17:55
  • 1
    I think you can also use `anova()` from base R on the glm to get a LR test – Ben Bolker Dec 29 '17 at 18:40
  • Interesting, I've lost the habit of using ANOVA. However, I observe that `anova` refuses to print the p-value for the test whereas `lrtest` does. – AdamO Dec 29 '17 at 18:59
  • 2
    maybe `anova(.,test="Chisq")` ? – Ben Bolker Dec 29 '17 at 19:05
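
Putting the comment thread above together, a minimal sketch of the LRT route (reusing dd from the answer; anova() is base R, lrtest() is from the lmtest package):

fit  <- glm(x ~ 1 + offset(q), data=dd, family = "binomial")   # alternative: free intercept
null <- glm(x ~ offset(q) - 1, data=dd, family = "binomial")   # null: log-odds fixed at q
anova(null, fit, test = "Chisq")                               # likelihood-ratio test in base R
## or equivalently: library(lmtest); lrtest(null, fit)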
7

Look at the confidence interval for the parameter of your GLM:

> set.seed(1)
> x = rbinom(100, 1, .7)
> model<-glm(x ~ 1, family = "binomial")
> confint(model)
Waiting for profiling to be done...
    2.5 %    97.5 % 
0.3426412 1.1862042 

This is a confidence interval for the log-odds.

For $p=0.5$ we have $\log(\text{odds}) = \log \frac{p}{1-p} = \log 1 = 0$. So testing the hypothesis that $p=0.5$ is equivalent to checking whether the confidence interval contains 0. This one does not, so the hypothesis is rejected.

Now, for any arbitrary $p$, you can compute its log-odds and check whether it lies inside the confidence interval.
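
For example, a minimal check of an arbitrary null value, say $p_0=0.6$ (reusing model and the interval printed above; p0 and q0 are just illustrative names):

p0 <- 0.6                   # arbitrary null value
q0 <- qlogis(p0)            # its log-odds, about 0.405
ci <- confint(model)        # profile confidence interval on the log-odds scale
ci[1] <= q0 & q0 <= ci[2]   # TRUE with the seed above, so p = 0.6 is not rejected at the 5% level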

Łukasz Deryło
  • 1
    this is useful, but only works for checking whether the p-value is below 0.05. If you want the actual p-value my answer will be more useful. – Ben Bolker Dec 29 '17 at 14:12
  • 2
    You can ask for any level of confidence in `confint`, so it is not only for $p<0.05$. Of course your solution is much better when it comes to calculating the p-value – Łukasz Deryło Dec 29 '17 at 17:18
3

It is not (entirely) correct/accurate to use the p-values based on the z-/t-values from the `summary.glm` function as a hypothesis test.

  1. It is confusing language: the reported values are named z-values, but they use the estimated standard error in place of the true standard deviation, so in reality they are closer to t-values. Compare the following three outputs:
    1) summary.glm
    2) t-test
    3) z-test

    > set.seed(1)
    > x = rbinom(100, 1, .7)
    
    > coef1 <- summary(glm(x ~ 1, offset=rep(qlogis(0.7),length(x)), family = "binomial"))$coefficients
    > coef2 <- summary(glm(x ~ 1, family = "binomial"))$coefficients
    
    > coef1[4]  # output from summary.glm
    [1] 0.6626359
    > 2*pt(-abs((qlogis(0.7)-coef2[1])/coef2[2]),99,ncp=0) # manual t-test
    [1] 0.6635858
    > 2*pnorm(-abs((qlogis(0.7)-coef2[1])/coef2[2]),0,1) # manual z-test
    [1] 0.6626359
    
  2. They are not exact p-values. An exact computation of the p-value using the binomial distribution would work better (with the computing power nowadays, this is not a problem). The t-distribution, which assumes a Gaussian distribution of the error, is not exact (it overestimates the p-value, so exceeding the alpha level occurs less often in "reality"). See the following comparison:

    # trying all 101 possible outcomes (0 to 100 successes) if the true value is p=0.7
    px <- dbinom(0:100,100,0.7)
    p_model = rep(0,101)
    for (i in 0:100) {
      xi = c(rep(1,i),rep(0,100-i))
      model = glm(xi ~ 1, offset=rep(qlogis(0.7),100), family="binomial")
      p_model[i+1] = 1-summary(model)$coefficients[4]
    }
    
    
    # plotting cumulative distribution of outcomes
    outcomes <- p_model[order(p_model)]
    cdf <- cumsum(px[order(p_model)])
    plot(1-outcomes,1-cdf, 
         ylab="cumulative probability", 
         xlab= "calculated glm p-value",
         xlim=c(10^-4,1),ylim=c(10^-4,1),col=2,cex=0.5,log="xy")
    lines(c(0.00001,1),c(0.00001,1))
    for (i in 1:100) {
      lines(1-c(outcomes[i],outcomes[i+1]),1-c(cdf[i+1],cdf[i+1]),col=2)
    #  lines(1-c(outcomes[i],outcomes[i]),1-c(cdf[i],cdf[i+1]),col=2)
    }
    
    title("probability for rejection as function of set alpha level")
    

    (figure: CDF of rejection by alpha)

    The black curve represents equality. The red curve is below it. That means that, for a given p-value calculated by the glm summary function, we find this situation (or a larger difference) less often in reality than the p-value indicates.
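
    For comparison, one way to get the exact p-value mentioned above is base R's binom.test (a minimal sketch):

    set.seed(1)
    x = rbinom(100, 1, 0.7)
    binom.test(sum(x), n = length(x), p = 0.7)   # exact two-sided test of H0: p = 0.7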

Sextus Empiricus
  • Hmm.. I may be confused about the rationale for using the T distribution for a GLM. Can you take a peek at a related question I just asked [here](https://stats.stackexchange.com/questions/320812/should-degrees-of-freedom-corrections-be-used-for-inference-on-glm-parameters)? – AdamO Dec 29 '17 at 18:28
  • 2
    This answer is interesting but problematic. (1) the OP didn't actually ask about the difference between score, chi-squared, "exact", or GLM-based approaches to testing hypotheses about binomial responses (they *might* actually know all this stuff already), so this doesn't answer the question that was asked; (2) the estimates of residual variance etc. have a different set of assumptions and sampling distributions from linear models (as in @AdamO's question), so use of a t-test is debatable; ... – Ben Bolker Dec 29 '17 at 19:10
  • 2
    (3) 'exact' confidence intervals for binomial responses are actually tricky ('exact' [Clopper-Pearson] intervals are conservative; score tests may perform better over some ranges) – Ben Bolker Dec 29 '17 at 19:11
  • @Ben You are right that the z-test is actually better than the t-test. The graph displayed in the answer is for the z-test. It uses the output of the GLM function. The bottom line of my answer was that the "p-value" is a tricky thing. Therefore, I find it better to compute it explicitly, e.g. using the normal distribution, rather than extracting the p-value from a glm function, which has very conveniently been shifted with the offset but hides the origins of the calculations for the p-value. – Sextus Empiricus Dec 29 '17 at 19:25
  • 1
    @BenBolker, I believe the exact test is indeed conservative, but... only because in reality we are not sampling from perfect binomial distributions. The alternative z-test is only better from an *empirical* point of view: the two "errors" cancel each other out, 1) the binomial distribution not being the real distribution of residuals in practical situations, 2) the z-distribution not being an exact expression for a binomial distribution. It is questionable whether we should prefer the *wrong* distribution for the *wrong* model, just because in practice it turns out 'ok'. – Sextus Empiricus Dec 29 '17 at 19:46