I'm having a go at coding a logistic regression model-building algorithm and I'd appreciate some advice. I've read in several places (including here) that minimizing both AIC and BIC can be an effective model-selection strategy.
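For reference, with $\hat{L}$ the maximized likelihood, $k$ the number of estimated parameters, and $n$ the sample size, the two criteria are

$$\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln n - 2\ln\hat{L},$$

so BIC penalizes each additional parameter more heavily than AIC once $n > e^2 \approx 7.4$, which is why the two criteria can disagree about which model is best.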
My algorithm employs the following (undoubtedly simplistic) approach:
Fit a model with the intercept only and record the AIC and BIC of this null model.
Iterate through the list of features, creating test models that each consist of one candidate feature plus the intercept.
Evaluate the test models and select the one that minimizes both the AIC and BIC scores.
Set the AIC and BIC thresholds to the values of the current best model.
Remove the feature selected in the current round from the candidate list.
Repeat, iterating through the remaining features and creating a new set of test models (each consisting of the previously selected features, one candidate feature, and the intercept), until no further reduction in AIC and BIC occurs; then return the model with the current lowest scores (a rough sketch of this loop is given after the list).
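Here's a minimal sketch of that loop in Python, assuming the features live in a pandas DataFrame `X` and the binary outcome in a Series `y` (both names are just for illustration); `statsmodels` fit results expose `.aic` and `.bic` directly:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def forward_select(X: pd.DataFrame, y: pd.Series):
    """Greedy forward selection that only accepts a feature when it
    lowers both AIC and BIC relative to the current best model."""
    selected = []
    remaining = list(X.columns)

    # Intercept-only null model establishes the starting thresholds.
    null_fit = sm.Logit(y, np.ones((len(y), 1))).fit(disp=0)
    best_aic, best_bic = null_fit.aic, null_fit.bic

    while remaining:
        best_feature = None
        for feat in remaining:
            # Test model: previously selected features + candidate + intercept.
            design = sm.add_constant(X[selected + [feat]])
            fit = sm.Logit(y, design).fit(disp=0)
            # Accept only if BOTH criteria improve on the current best.
            if fit.aic < best_aic and fit.bic < best_bic:
                best_aic, best_bic = fit.aic, fit.bic
                best_feature = feat
        if best_feature is None:
            break  # no candidate lowered both scores; stop
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected
```

Note that requiring both criteria to fall makes the stopping rule more conservative than stepping on AIC alone, since BIC's heavier penalty will veto marginal features earlier.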
I'm sure this is a hopelessly naive approach, and I'd appreciate some feedback. I suppose my basic question is whether I'm completely misusing these criteria. Thus far, the algorithm returns a parsimonious model with very low p-values for all features, and examining those features in depth certainly provides some useful insights into the data I'm working with. It's definitely narrowed the field: given that my data set has 40 features (the final model contained only 6), it's obviously preferable to guessing!