My model has a binary Response variable (0s and 1s).
I have 15 categorical predictors, some of which have 150+ levels. Should I exclude those from my model?
When I fit the full model with
fullmodel <- glm(Response ~ Category1 + Category2 + ... + Category15 - 1, data = dataframe, family = "binomial")
I get:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
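
As far as I understand, the second warning often points to complete or quasi-complete separation, i.e. some factor levels contain only 0s or only 1s. Below is a minimal sketch of how I could check that per predictor. The toy data and the helper name `separated_levels` are made up for illustration; only `Response` and `Category1` follow the names used above.

```r
# Toy data standing in for `dataframe`; the values are invented.
dataframe <- data.frame(
  Response  = c(0, 1, 0, 1, 1, 1, 0, 0),
  Category1 = factor(c("a", "a", "b", "b", "c", "c", "d", "d"))
)

# Number of levels per factor column, to spot the very large ones.
sapply(dataframe[sapply(dataframe, is.factor)], nlevels)

# Hypothetical helper: return the levels of one factor whose observations
# all share the same Response value (only 0s or only 1s).
separated_levels <- function(data, factor_name, response_name = "Response") {
  tab <- table(data[[factor_name]], data[[response_name]])
  # A level is flagged when exactly one Response value has a nonzero count.
  rownames(tab)[rowSums(tab > 0) == 1]
}

separated_levels(dataframe, "Category1")
```

Levels flagged this way would be candidates for merging or for closer inspection, rather than proof that the whole variable has to go.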
Should I exclude the categories with many levels? Ideally, I would decide this by comparing the full model against a model with that category omitted, using a likelihood-ratio test:
anova(fullmodel, model_test, test="LRT")
Note that for model_test the GLM converges fine.
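
For reference, here is a minimal sketch of the comparison I have in mind, run on simulated data with two small factors so that both fits converge. The data, the seed, and the probabilities are assumptions for illustration, not my real dataset.

```r
# Simulated stand-in data: two factors with few levels and a binary response.
set.seed(42)
n <- 300
sim <- data.frame(
  Category1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  Category2 = factor(sample(c("w", "x", "y", "z"), n, replace = TRUE))
)
# Success probability depends on Category1 only, kept away from 0 and 1.
p <- c(a = 0.3, b = 0.5, c = 0.7)[as.character(sim$Category1)]
sim$Response <- rbinom(n, size = 1, prob = p)

# Full model and the nested model with Category2 omitted.
fullmodel  <- glm(Response ~ Category1 + Category2 - 1, data = sim, family = "binomial")
model_test <- glm(Response ~ Category1 - 1, data = sim, family = "binomial")

# Likelihood-ratio test between the two nested fits.
lrt <- anova(fullmodel, model_test, test = "LRT")
print(lrt)
```

My concern is that this test compares deviances, so I am not sure the result can be trusted when fullmodel itself did not converge.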