Can I use binary variables in R's glm function with a binomial outcome (logistic regression)?

Ansjovis86

1 Answer

The short answer is yes, you can.

Here is a minimal working example of a logistic regression with one binary predictor variable.

set.seed(4)

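### Create some pseudo data
n = 100
x = rbinom(n,1,0.5)   # binary predictor
y = x                 # outcome starts as an exact copy of x
# Add noise: force 10 random outcomes to 1, then 10 random outcomes to 0
y[sample(1:n,10,replace=FALSE)] = 1
y[sample(1:n,10,replace=FALSE)] = 0

# Logistic regression of the binary outcome on the binary predictor
model = glm(y~x,family="binomial")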

Here y is the binary outcome and x is the binary predictor variable. The code runs without error, so a binary predictor is clearly allowed. Printing the fitted model gives:

> model

Call:  glm(formula = y ~ x, family = "binomial")

Coefficients:
(Intercept)            x  
      -3.02         5.16  

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      138.3 
Residual Deviance: 54.54        AIC: 58.54
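
To read the coefficients, it can help to convert them back to odds and probabilities. The following is only a sketch that refits the same pseudo data; the names odds_ratio, p_hat and observed are introduced here for illustration and are not part of the original answer. With a single binary predictor, the fitted probabilities should simply reproduce the observed proportion of y = 1 within each level of x.

```r
set.seed(4)

# Recreate the pseudo data from the answer
n <- 100
x <- rbinom(n, 1, 0.5)
y <- x
y[sample(1:n, 10, replace = FALSE)] <- 1
y[sample(1:n, 10, replace = FALSE)] <- 0

model <- glm(y ~ x, family = "binomial")

# exp() of the slope is the odds ratio for x = 1 versus x = 0
odds_ratio <- exp(coef(model)[["x"]])

# Predicted probability of y = 1 at each level of x
p_hat <- predict(model, newdata = data.frame(x = c(0, 1)), type = "response")
p_hat <- setNames(as.numeric(p_hat), c("x=0", "x=1"))

# Observed proportion of y = 1 within each level of x, for comparison
observed <- c("x=0" = mean(y[x == 0]), "x=1" = mean(y[x == 1]))

print(odds_ratio)
print(rbind(p_hat, observed))
```

The intercept is the log-odds of y = 1 when x = 0, and the coefficient on x is the change in log-odds when x goes from 0 to 1, so plogis() of the intercept, and of the intercept plus the slope, gives the same two probabilities.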
  • What if you skip the sample lines and leave y = x, so that x is a perfectly separating predictor? Why does the model become bad in that case and give a p-value of 1? – Ansjovis86 Nov 24 '15 at 16:52
  • @Ansjovis86 you can read about it here: http://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression –  Nov 24 '15 at 16:54
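
For reference, the situation described in the comments can be reproduced with a small sketch like the one below (the names sep_model, coefs and p_value_x are made up for illustration). When x separates y perfectly, the likelihood keeps increasing as the coefficient grows, so no finite maximum likelihood estimate exists. glm() stops at a large coefficient with a very large standard error, which is why the Wald p-value ends up close to 1.

```r
set.seed(4)

n <- 100
x <- rbinom(n, 1, 0.5)
y <- x  # no noise added, so x predicts y perfectly

# glm() is expected to emit warnings here (for example about fitted
# probabilities numerically 0 or 1); they are suppressed to keep output short
sep_model <- suppressWarnings(glm(y ~ x, family = "binomial"))

# Coefficient table: estimate, standard error, z value, Wald p-value
coefs <- summary(sep_model)$coefficients
print(coefs)

p_value_x <- coefs["x", "Pr(>|z|)"]
print(p_value_x)
```

Running the fit without suppressWarnings() shows the warnings themselves, which are usually the first sign of separation.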