What is the best way to choose a variable selection method for our logistic regression model?

  • Forward Selection
  • Backward Elimination
  • Step-wise Selection
  • Full Model.

I am running logistic regression in SAS EG.

I have tried running each of them on our data, but each method gives different results: some of the selected factors are common to all methods, but others are not. Basically, I just want to understand the logic behind choosing a method. Is there a way to tell, from the results we get, that one method is better than another for a specific data set? Please help.

Thanks

Gavin Simpson
  • 1
    This question has already been asked [here](http://stats.stackexchange.com/questions/18638/model-selection-logistic-regression) (look at the linked questions too, and try the `model-selection` tag; the same considerations apply for all regression methods). – Scortchi - Reinstate Monica Jan 10 '14 at 21:12
  • But that thread doesn't have any correct answer that I can follow? – learnlearn10 Jan 10 '14 at 21:14
  • 1
    It has one with seven up-votes, which should suggest it's worth reading to see what you think. (A tick next to an answer only means the person who asked it thought it useful, not that it's correct; the absence of a tick means nothing at all.) – Scortchi - Reinstate Monica Jan 10 '14 at 21:17
  • 3
    [This one](http://stats.stackexchange.com/questions/20836/algorithms-for-automatic-model-selection) should also be very useful. And for the question in your penultimate sentence, the clue's in the name of our site. – Scortchi - Reinstate Monica Jan 10 '14 at 21:24
  • 3
    None of those. The first three will inevitably bias the coefficient estimates (variables not selected are forced to have exactly 0 effect). What is the purpose of the model? The lasso or elastic net might be usefully applied in this case; these methods apply shrinkage to the size of the absolute (or absolute and squared, in the case of the elastic net) values of the coefficients. When a coefficient is shrunk to zero, it is removed from the model. I don't know if SAS has this, but the **glmnet** package for R will do this for you. – Gavin Simpson Jan 10 '14 at 21:55
  • @Gavin: Not necessarily anything wrong with the fourth. – Scortchi - Reinstate Monica Jan 10 '14 at 21:58
  • @Scortchi Not necessarily, though the implication of the OP's question is that the full model is over-fitted. – Gavin Simpson Jan 10 '14 at 22:10
  • @Gavin: I'd like to think so, but there's an odd idea amongst some that automatic model selection is just what you always do rather than a solution to a problem you've found you have with regard to the purpose of the model. – Scortchi - Reinstate Monica Jan 10 '14 at 22:19
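
To make the shrinkage idea from the comments above concrete, here is a minimal sketch of L1-penalised (lasso) logistic regression in plain Python. It is an illustration only, not what **glmnet** or SAS does internally: it uses simple proximal gradient descent, whereas glmnet uses coordinate descent. The function names (`fit_lasso_logistic`, `soft_threshold`, `make_toy_data`), the toy data, the step size, and the penalty values are all assumptions made up for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, penalty, step=0.5, iterations=500):
    """Proximal gradient descent for logistic regression with an L1 penalty.

    The intercept is not penalised. The step size is an assumption that
    suits roughly standardised predictors; it is not tuned automatically.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(iterations):
        # Fitted probability minus observed outcome for every row.
        residuals = [
            sigmoid(intercept + sum(c * x for c, x in zip(coefs, row))) - target
            for row, target in zip(X, y)
        ]
        # Gradient of the mean negative log-likelihood.
        grad_intercept = sum(residuals) / n
        grads = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        # Plain gradient step for the intercept, shrinkage step for the slopes.
        intercept -= step * grad_intercept
        coefs = [soft_threshold(c - step * g, step * penalty) for c, g in zip(coefs, grads)]
    return intercept, coefs


def make_toy_data(n=200, seed=0):
    """Hypothetical data: four predictors, only the first affects the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * row[0]) else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for penalty in (0.0, 0.1, 10.0):
        _, coefs = fit_lasso_logistic(X, y, penalty)
        print(penalty, [round(c, 3) for c in coefs])
```

With no penalty this is ordinary logistic regression (the "full model"). As the penalty grows, coefficients are pulled toward zero and some land on exactly zero, which is how the lasso drops variables. In practice the penalty would be chosen by cross-validation rather than fixed by hand as it is here.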
