I am fitting a logistic regression with `glm` in R, and some of my variables are factors with many levels.

I ran the model and got the following result:

[image: model output]

My questions are:

  1. Is this variable significant, given that only Married and Separated are significant here (p-values < 0.05)? More generally, if a variable has 10 categories but only 3 of them are significant, should I drop it or keep it in the model?

  2. If a model in which all the variables are significant has a much higher AIC than a model that includes a non-significant variable, which one is more useful? What can cause this?
kjetil b halvorsen
user71812
  • You should employ some knowledge about the research matter. Depending on what is studied there, it might make sense to group `married` and `widow` together or maybe not. Use `anova` output to judge if the factor variable is significant. If you have many factor levels, you could try dimension reduction techniques. Or it might make sense to model this as a random effect. Not enough information provided and it's off-topic anyway. Voting to close. – Roland Aug 01 '19 at 07:52
  • @Roland Regarding random effects, I think it's generally recommended to have quite a bit more than 4 levels. – mkt Aug 02 '19 at 05:28
  • @mkt Yes, and the previous sentence mentions having many factor levels. – Roland Aug 02 '19 at 05:50
  • @Roland Ah, I initially didn't read that as applying to your random effect recommendation as well. – mkt Aug 02 '19 at 05:57
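
The `anova` suggestion in the comments can be sketched roughly as follows. This is only an illustration on simulated data: the data frame `dat`, the variables `y`, `age` and `marital`, and the effect sizes are all made up, since the original data and model formula are not shown. The idea is to test the factor as a whole by comparing the model with and without it (a likelihood-ratio test), rather than reading the per-level p-values, which only compare each level with the reference level. The same two fits can also be compared by AIC.

```r
# Simulated data; every name and value here is hypothetical.
set.seed(1)
n <- 500
marital <- factor(sample(c("Single", "Married", "Separated", "Widowed"),
                         n, replace = TRUE))
age <- rnorm(n, mean = 40, sd = 10)

# Assumed effects on the log-odds scale, for illustration only.
level_effect <- c(Single = 0, Married = 0.8, Separated = -0.7, Widowed = 0.1)
eta <- -0.5 + 0.02 * (age - 40) + unname(level_effect[as.character(marital)])
y <- rbinom(n, size = 1, prob = plogis(eta))
dat <- data.frame(y = y, age = age, marital = marital)

# Fit the model with and without the factor.
full    <- glm(y ~ age + marital, family = binomial, data = dat)
reduced <- glm(y ~ age,           family = binomial, data = dat)

# Likelihood-ratio test of the factor as a whole
# (4 levels, so 3 degrees of freedom).
lrt <- anova(reduced, full, test = "Chisq")
print(lrt)

# The same kind of test for each term in turn, via single-term deletion.
print(drop1(full, test = "Chisq"))

# AIC of both fits, for comparison on the same data.
print(AIC(reduced, full))
```

If the test for the whole factor is significant, that is an argument for keeping all of its levels in the model even when some individual level-versus-reference contrasts are not. Whether levels should be merged is a subject-matter question, as the first comment notes.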