I have been running logistic regression in R, and I have been having an issue where, as I include more predictors, the z-values and their p-values approach 0 and 1, respectively. For example, with a few predictors:
> model1
b17 ~ i74 + i73 + i72 + i71
> step1<-glm(model1,data=newdat1,family="binomial")
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.9461 1.8953 -3.665 0.000247 ***
i74 0.6842 0.9543 0.717 0.473384
i73 1.7691 4.8008 0.368 0.712502
i72 0.5134 2.0142 0.255 0.798812
i71 -0.6753 4.9173 -0.137 0.890771
The results appear to be fairly reasonable; however, if I have more predictors:
> model1
b17 ~ i90 + i89 + i88 + i87 + i86 + i85 + i84 + i83 + i82 + i81 +
i80 + i79 + i78 + i77 + i76 + i74 + i73 + i72 + i71
> step1<-glm(model1,data=newdat1,family="binomial")
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.887e+02 3.503e+05 -0.001 0.999
i90 1.431e-01 1.009e+04 0.000 1.000
i89 8.062e+01 1.027e+05 0.001 0.999
i88 9.738e+01 7.398e+04 0.001 0.999
i87 -1.980e+01 9.469e+03 -0.002 0.998
i86 9.829e+00 1.098e+05 0.000 1.000
i85 5.917e+01 3.074e+04 0.002 0.998
i84 -2.373e+01 1.378e+05 0.000 1.000
i83 7.257e+00 2.173e+05 0.000 1.000
i82 -1.397e+01 1.894e+05 0.000 1.000
i81 6.503e+01 1.373e+05 0.000 1.000
i80 3.728e+01 4.904e+04 0.001 0.999
i79 1.010e+02 5.556e+04 0.002 0.999
i78 -2.628e+01 1.546e+05 0.000 1.000
i77 4.725e+01 3.027e+05 0.000 1.000
i76 -6.517e+01 1.509e+05 0.000 1.000
i74 1.267e+01 1.175e+05 0.000 1.000
i73 2.796e+02 5.280e+05 0.001 1.000
i72 -2.533e+02 4.412e+05 -0.001 1.000
i71 -1.240e+02 4.387e+05 0.000 1.000
I know it is hard to say exactly what is going on without seeing the data, but the predictors are all 5-point Likert scale items. Still, are there any thoughts on what is occurring here? I don't have much experience with logistic regression, so I apologize if the question seems naive, but is there a certain threshold at which logistic regression falls apart because a large number of predictors is being fit to what is ultimately a very small amount of variance? Is this potentially a multicollinearity issue? Finally, when I run OLS regression on the data I get results that make more sense (or at least appear to); is it okay to run OLS regression on a binary outcome, and what are the consequences? Thank you!
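In case it helps with diagnosis, here is a minimal sketch of the checks I was considering running myself (the car package and the rough VIF cutoff in the comments are my own assumptions, not anything taken from the fits above):

# Check for (quasi-)separation: cross-tabulate predictors against the outcome.
# A cell of zero counts for one outcome category at some Likert level would
# suggest that predictor perfectly predicts the response there.
for (v in c("i90", "i89", "i88")) {
    print(table(newdat1[[v]], newdat1$b17))
}

# Check multicollinearity via variance inflation factors (requires the car
# package; VIFs only depend on the predictors, so an lm() fit is enough).
library(car)
lm_fit <- lm(b17 ~ i90 + i89 + i88 + i87 + i86 + i85 + i84 + i83 + i82 +
                 i81 + i80 + i79 + i78 + i77 + i76 + i74 + i73 + i72 + i71,
             data = newdat1)
vif(lm_fit)  # values much larger than ~10 are a common rule-of-thumb red flag

Would checks along these lines distinguish between a separation problem and a multicollinearity problem, or is there a better way to tell them apart?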