
I am working on a problem where the response variable is binary and my features are all dummy variables. I observed that when I include an intercept in the model, every dummy variable's p-value equals 1; when I remove the constant, the p-values look fine. My question: should we not include an intercept when dummy variables are our only features? If so, what is the reason?
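For concreteness, here is a minimal simulated sketch of the usual parameterization: keep the intercept and drop one dummy level as the reference category. This is not my actual data; it assumes pandas and statsmodels, and all names are illustrative.

```python
# Minimal simulated example: intercept + dummies with one level dropped.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
cat = pd.Series(rng.choice(["a", "b", "c"], size=500), name="cat")

# Binary response whose log-odds depend on the category.
log_odds = cat.map({"a": -1.0, "b": 0.0, "c": 1.0}).to_numpy()
y = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

# drop_first=True removes the reference level "a", so the intercept and
# the remaining dummies are not perfectly collinear.
X = sm.add_constant(pd.get_dummies(cat, drop_first=True, dtype=float))

print(sm.Logit(y, X).fit(disp=0).summary())
```

With this parameterization the intercept is the log-odds of the reference level, and each dummy coefficient is the log-odds ratio of its level against that reference.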

My follow-up question is about perfect multicollinearity. I remember from linear regression that if we do not drop one of the dummy variables, we cannot invert the design matrix because of the multicollinearity, and thus cannot obtain the coefficients. I know logistic regression coefficients are found through MLE, but I wonder whether multicollinearity still causes problems (perhaps different ones) in that case?
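To make the linear-algebra point concrete (my illustration, not from the original post): with an intercept plus a dummy for every level, the dummy columns sum to the intercept column, so the design matrix is rank-deficient, exactly as in linear regression.

```python
# Rank check: intercept + a dummy for *every* level is rank-deficient.
import numpy as np
import pandas as pd

cat = pd.Series(["a", "b", "c", "a", "b", "c"], name="cat")
X_full = pd.get_dummies(cat, dtype=float)  # one column per level
X_full.insert(0, "const", 1.0)

# The dummy columns sum to the constant column, so this 4-column matrix
# only has rank 3 and X'X cannot be inverted.
print(X_full.shape[1], np.linalg.matrix_rank(X_full.to_numpy()))  # 4 3
```

In logistic regression the same redundancy makes the log-likelihood flat along one direction of the parameter space: adding a constant to the intercept and subtracting it from every dummy coefficient leaves the fitted probabilities unchanged, so the MLE is not unique and the observed information matrix (the Hessian) is singular.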

  • Logistic regression with dummy variables is equivalent to linear discriminant analysis. Inversion of the cross-products matrix with an intercept and dummies spanning all of the possible combinations is not about *multicollinearity*. It's about complete or quasi-complete separation, a data problem that will cause maximum likelihood estimation convergence to fail. Paul Allison explains this well here: https://www.researchgate.net/publication/228813245_ – Mike Hunter Oct 03 '20 at 16:38 (see the sketch after these comments)
  • https://stats.stackexchange.com/questions/131456/confused-about-0-intercept-in-logistic-regression-in-r, https://stats.stackexchange.com/questions/430906/interpreting-intercept-in-logistic-regression-with-binary-variable-in-r, https://stats.stackexchange.com/questions/154404/fitting-a-logistic-regression-without-an-intercept, https://stats.stackexchange.com/questions/230839/interpreting-intercept-in-logistic-regression-when-there-is-more-than-one-catego, https://stats.stackexchange.com/questions/215779/removing-intercept-from-glm-for-multiple-factorial-predictors-only-works-for-fir/218034#218034 – kjetil b halvorsen Oct 03 '20 at 17:11
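A toy illustration of the separation issue raised in the first comment (simulated data, mine rather than the commenter's; the exact behavior at perfect separation depends on the statsmodels version):

```python
# Complete separation: the dummy x predicts y perfectly, so the MLE
# does not exist and the coefficient estimate diverges.
import numpy as np
import statsmodels.api as sm

x = np.array([0, 0, 0, 1, 1, 1], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])           # y == x: complete separation
X = sm.add_constant(x)

try:
    res = sm.Logit(y, X).fit(disp=0)
    print(res.params)                      # estimates blow up, huge SEs
except Exception as err:                   # older versions raise
    print(type(err).__name__, err)         # PerfectSeparationError
```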

0 Answers