
I have the following result of a logistic regression:

                                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)                        3.34260    0.41116   8.130 4.30e-16 ***
AgeGr18-22                        -0.61845    0.28974  -2.135 0.032799 *  
AgeGr28-30                        -0.46384    0.27474  -1.688 0.091361 .  
AgeGr31-35                         0.38351    0.28102   1.365 0.172352    
AgeGr36-40                        -0.24538    0.25113  -0.977 0.328525    
AgeGr41-50                         0.11316    0.23918   0.473 0.636140    
AgeGr51-high                       0.49277    0.29597   1.665 0.095924 .  
AutomobileGr1                      0.61832    0.17570   3.519 0.000433 ***
AutomobileGr2-high                -0.07095    0.37665  -0.188 0.850590    

The dummy "AutomobileGr2-high" (having 2 or more automobiles) has a p-value of 0.85, and I would like to drop it from the model.

Would it make a difference whether I just drop the dummy (i.e. create a regression formula without this group) or put it into the reference group? Our reference group in the example above is AutomobileGr0 (people who do not have a car), so the combined reference group would be AutomobileGr0 & AutomobileGr>=2, i.e. people either without a car or with at least 2.
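To make the two options concrete, here is a minimal sketch. It assumes that "dropping the dummy" means removing only the indicator column while keeping all observations in the data; the helper `dummies` and the toy `cars` vector are hypothetical and not taken from my data. Under that assumption, the predictor column that remains is identical to the one obtained by recoding "2-high" into the reference level, so both routes would feed the same design matrix to the regression.

```python
# Hypothetical toy data: number-of-automobiles group for six people.
LEVELS = ["0", "1", "2-high"]
cars = ["0", "1", "2-high", "1", "0", "2-high"]


def dummies(values, levels, reference):
    """Treatment (dummy) coding: one 0/1 column per non-reference level."""
    cols = [lv for lv in levels if lv != reference]
    rows = [[1 if v == lv else 0 for lv in cols] for v in values]
    return cols, rows


# Approach A: build all dummies, then drop the "2-high" column.
cols_full, X_full = dummies(cars, LEVELS, reference="0")
keep = [i for i, c in enumerate(cols_full) if c != "2-high"]
X_dropped = [[row[i] for i in keep] for row in X_full]

# Approach B: recode "2-high" into the reference level, then build dummies.
merged = ["0" if v == "2-high" else v for v in cars]
cols_merged, X_merged = dummies(merged, ["0", "1"], reference="0")

# Both approaches leave exactly the same predictor column.
same_design = X_dropped == X_merged
print(same_design)
```

If "dropping" instead meant excluding the observations with 2 or more automobiles from the data, the two approaches would no longer coincide, because the model would be fitted to a different sample.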

From what I have read so far (http://www.ats.ucla.edu/stat/mult_pkg/faq/general/dummy.htm), it should make a difference to the intercept, and hence to all of the resulting log odds.
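As a rough illustration of the intercept point, using the estimates in the table and holding the age group at its reference level (the symbols $\hat\beta_0$, $\hat\beta_{2+}$ and $\hat\beta_0'$ are my own labels, not part of the output), the current model gives:

$$\operatorname{logit} P(y = 1 \mid \text{0 automobiles}) = \hat\beta_0 = 3.34260$$

$$\operatorname{logit} P(y = 1 \mid \text{2 or more automobiles}) = \hat\beta_0 + \hat\beta_{2+} = 3.34260 - 0.07095 = 3.27165$$

Without the dummy, both groups would share a single refitted intercept $\hat\beta_0'$, so the two values above would be replaced by one. I have not refitted the model, so the value of $\hat\beta_0'$ is not known here; I would only expect it to be close to these two numbers, since they differ by so little.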

What approach is suggested as best practice (or what are the pros and cons of the two approaches)?

Bullzeye
  • See also [If a factor variable is to be dropped in model selection, should all levels be dropped simultaneously?](http://stats.stackexchange.com/q/18745/17230), [Regression with categorical predictors - use only some dummy variables](http://stats.stackexchange.com/q/146351/17230), [Is it advisable to drop certain levels of a categorical variable?](http://stats.stackexchange.com/q/141063/17230). – Scortchi - Reinstate Monica Jun 26 '15 at 08:45
  • And it looks like both of your predictors could well be considered ordinal: see [Logistic regression and ordinal independent variables](http://stats.stackexchange.com/q/101511/17230), [Coding for an ordered covariate](http://stats.stackexchange.com/q/77796/17230), [Continuous dependent variable with ordinal independent variable](http://stats.stackexchange.com/q/33413/17230), & [Logit with ordinal independent variables](http://stats.stackexchange.com/q/5387/17230). – Scortchi - Reinstate Monica Jun 26 '15 at 08:54
  • Thank you guys!!! Your help is much appreciated. I should improve my searching skills, obviously! ;) – Bullzeye Jun 26 '15 at 11:15
