I am trying to fit a logistic regression classifier with `glm` on a dataset with fewer than 20 features but many samples. One of the features is a very strong predictor, and as a result the trained model predicts extreme probabilities of 1.0 and 0.0 on the majority of the test data. Although the model converges, I get repeated warnings about fitted probabilities being numerically 0 or 1. A minimal sketch of the setup follows.
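For concreteness, here is roughly what I am doing, with simulated data standing in for my real dataset (the variable names and coefficient values are made up; `x1` plays the role of the dominant predictor):

```r
# Hypothetical reproduction of the setup: <20 features (only 3 shown here),
# many rows, one very strong predictor.
set.seed(42)
n  <- 10000
x1 <- rnorm(n)   # the very strong predictor
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- rbinom(n, 1, plogis(20 * x1 + 0.3 * x2 + 0.2 * x3))

fit <- glm(y ~ x1 + x2 + x3, family = binomial)
# Warning message:
# glm.fit: fitted probabilities numerically 0 or 1 occurred

p <- predict(fit, type = "response")
mean(p < 1e-8 | p > 1 - 1e-8)   # a large share of predictions sit at ~0 or ~1
```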
Q1. Are the predicted posterior probabilities still valid despite these warnings?
Q2. How should I reduce the effect of the strong predictor so that the contributions of the other variables are taken into account at prediction time? Right now they are completely overpowered by that one predictor.
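The only concrete remedy I have sketched so far is shrinkage, e.g. ridge-penalized logistic regression via the `glmnet` package (again on the simulated data above; `glmnet` is not part of my current pipeline, just a candidate):

```r
# Candidate remedy: ridge penalty shrinks all coefficients, including the
# dominant one, toward zero, so the linear predictor is less extreme.
library(glmnet)

X   <- cbind(x1, x2, x3)
cvf <- cv.glmnet(X, y, family = "binomial", alpha = 0)  # alpha = 0 -> ridge

p_ridge <- predict(cvf, newx = X, s = "lambda.min", type = "response")
summary(as.vector(p_ridge))   # shrunken fit; probabilities less extreme
```

Is this kind of shrinkage the right way to "reduce the effect" of the strong predictor, or does it just bias an otherwise valid fit?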