Can I use binary variables in R's glm function with a binomial outcome (logistic regression)?
You mean as predictors? Yes. – Jeremy Miles Nov 23 '15 at 14:42
Yes, you definitely can. – Nov 23 '15 at 16:30
1 Answer
The short answer is yes, you can.
Here is a minimal working example of a logistic regression with one binary predictor variable.
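set.seed(4)
### Create some pseudo data
n = 100
x = rbinom(n,1,0.5)                    # binary predictor
y = x                                  # outcome starts out identical to the predictor
y[sample(1:n,10,replace=FALSE)] = 1    # add noise: force 10 random outcomes to 1
y[sample(1:n,10,replace=FALSE)] = 0    # add noise: force 10 random outcomes to 0
### Fit the logistic regression
model = glm(y~x,family="binomial")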
Here y is my binary outcome and x is my binary predictor variable. The code runs with no error (so clearly you can include a binary predictor variable), and the output from running it would be:
> model
Call: glm(formula = y ~ x, family = "binomial")
Coefficients:
(Intercept)            x
      -3.02         5.16
Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
Null Deviance: 138.3
Residual Deviance: 54.54 AIC: 58.54
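To make the coefficients above easier to read, here is a minimal sketch that converts them to an odds ratio and to fitted probabilities. The names odds_ratio, p_hat and p_obs are my own, not part of the original example. With a single binary predictor the model is saturated, so the fitted probabilities should match the observed proportion of y = 1 within each level of x. Exact numbers may differ from the output shown above, since sample() changed its default algorithm in R 3.6.0.

```r
set.seed(4)

# Same pseudo data as in the example above
n <- 100
x <- rbinom(n, 1, 0.5)
y <- x
y[sample(1:n, 10, replace = FALSE)] <- 1
y[sample(1:n, 10, replace = FALSE)] <- 0

model <- glm(y ~ x, family = "binomial")

# Odds ratio for x = 1 versus x = 0 (exponentiated slope)
odds_ratio <- exp(coef(model)[["x"]])

# Fitted P(y = 1) for each level of x
p_hat <- predict(model, newdata = data.frame(x = c(0, 1)), type = "response")

# Observed proportion of y = 1 within each level of x
p_obs <- tapply(y, x, mean)

print(odds_ratio)
print(cbind(x = c(0, 1), fitted = as.numeric(p_hat), observed = as.numeric(p_obs)))
```

The intercept is the log-odds of y = 1 when x = 0, and the coefficient on x is the change in log-odds when moving from x = 0 to x = 1.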
What if you skip the sample lines and leave y = x, so that x is a perfectly separating predictor? Why does the model break down in that case and give a p-value of 1? – Ansjovis86 Nov 24 '15 at 16:52
@Ansjovis86 you can read about it here: http://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression – Nov 24 '15 at 16:54
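For anyone who wants to reproduce the separation case from the comment above, here is a small sketch. It assumes the same setup as the answer with the two sample lines dropped; the names sep_model and sep_coefs are mine. Under perfect separation no finite maximum likelihood estimate exists, so the fitting algorithm pushes the slope toward a very large value, the standard error blows up, and the Wald z statistic ends up near 0, which is why the reported p-value is close to 1.

```r
set.seed(4)

n <- 100
x <- rbinom(n, 1, 0.5)
y <- x  # no noise added: x predicts y perfectly

# glm() normally emits warnings here (for example about fitted
# probabilities numerically 0 or 1); they are suppressed only to keep
# the output short
sep_model <- suppressWarnings(glm(y ~ x, family = "binomial"))

# Coefficient table: estimate, standard error, z value, p-value
sep_coefs <- summary(sep_model)$coefficients
print(sep_coefs)
```

The p-value near 1 does not mean x is a poor predictor; it means the Wald test is unreliable in this situation. The linked question covers ways to deal with it.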