Small dimensional classification (< 20 features), one (or two) dominant predictors

Question

I am trying to learn a logistic regression classifier using glm on a dataset of <20 features, but lots of samples. One of the features is a very strong predictor. As a result, the trained model is predicting extreme probabilities of 1.0 and 0.0 on majority of test data. Although the model converges, I get repeated warnings of such extreme predictions.

Q1. Are the posterior probabilities/predictions still valid?

Q2. How should I reduce the effect of the strong predictor so that the contributions from other variables are taken into account during inference? Right now, they are completely overpowered by the one predictor.

You can check whether predictions is valid or not by Cross Validation, for example. But I didn't get your point about Q2: why do you need to reduce the effect of a good predictor? In most practical cases that would mean that you reduce classification quality? — Dmitry Laptev, Jun 14 '12 at 13:31

score 0 · Answer 1 · answered Aug 17 '18 at 09:39

Partially answered in comments:

You can check whether predictions is valid or not by Cross Validation, for example. But I didn't get your point about Q2: why do you need to reduce the effect of a good predictor? In most practical cases that would mean that you reduce classification quality? – Dmitry Laptev

You should still check if the predictor is legitimate, could it be that it is almost a version of the $Y$ variable? Another possibility is that there is (quasi)separation, which isn't a problem per se if the strong predictor is correct, but might be a problem if the goal is inference and not (only) prediction, since it invalidates the usual approximations. See How to deal with perfect separation in logistic regression?

Small dimensional classification (< 20 features), one (or two) dominant predictors

1 Answers1