Logistic regression - regularized model always predicts lower probabilities on average compared to unregularized model

Question

I have a model that uses L2 regularization. The non-regularized model has a few coefficients with high positive values, but the other features have very similar coefficients in both models. In the regularized model, the magnitude of those few positive coefficients has been reduced. This means that the total log odds are strictly lower in the regularized model than in the unregularized model (rather than the same total being distributed differently), so the regularized model should always produce a lower average prediction, which makes it a more conservative model overall.

In my case, two binary flags have their coefficients reduced, but this value is not redistributed elsewhere. So if L2 reduces the coefficient for a binary flag, the reference group (with value 0) will have the same predictions, while those with value 1 will have slightly lower predictions. It feels like something should happen to the reference group too, so that predictions are not just the same or lower. Am I thinking about this wrong?

I think you're on the right track. To anchor the idea more firmly, consider what might happen if you use $-1$ and $+1$ instead of $0$ and $1$ as your encoding. — Sycorax, Oct 27 '20 at 22:02
Maybe this answers your Qs: https://stats.stackexchange.com/questions/231285/dropping-one-of-the-columns-when-using-one-hot-encoding/329281#329281 — kjetil b halvorsen, Oct 28 '20 at 02:58

score 1 · Answer 1 · answered Oct 27 '20 at 22:00

This happens because you are applying L2 regularization to a particular model setup in which the coefficients represent a difference between group 1 and a reference group 0. If you set up the coefficients differently (e.g., relative to a different reference group, or by removing the intercept and encoding the group means directly as coefficients), the L2 regularization will penalize things differently.
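To make the dependence on parameterization concrete, here is a minimal sketch rather than a reproduction of the asker's model. It fits an L2-penalized logistic regression on a single binary flag under three encodings: 0/1 with an intercept, $-1$/$+1$ with an intercept (as suggested in the comments), and one indicator column per group with no intercept. The data, the penalty strength `lam`, the helper `fit_l2_logistic`, and the choice to leave the intercept unpenalized are all assumptions made for illustration; some solvers penalize the intercept as well, which would change the numbers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_l2_logistic(X, y, w, lam, penalized, n_iter=100, tol=1e-10):
    """Newton's method for: weighted log-loss + 0.5 * lam * sum(beta[penalized] ** 2)."""
    penalty = lam * np.diag(penalized.astype(float))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (w * (p - y)) + penalty @ beta
        hess = X.T @ (X * (w * p * (1.0 - p))[:, None]) + penalty
        step = np.linalg.solve(hess, grad)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Made-up aggregated data: 800 rows with flag 0 (20% positive),
# 200 rows with flag 1 (60% positive). w holds the row counts.
flag = np.array([0.0, 0.0, 1.0, 1.0])
y = np.array([0.0, 1.0, 0.0, 1.0])
w = np.array([640.0, 160.0, 80.0, 120.0])

ones = np.ones_like(flag)
# name -> (design matrix, mask of penalized columns); intercepts are unpenalized
designs = {
    "0/1 flag + intercept": (np.column_stack([ones, flag]), np.array([False, True])),
    "-1/+1 flag + intercept": (np.column_stack([ones, 2 * flag - 1]), np.array([False, True])),
    "one column per group, no intercept": (np.column_stack([1 - flag, flag]), np.array([True, True])),
}

lam = 50.0  # arbitrary penalty strength, chosen only to make the effect visible
results = {}
for name, (X, penalized) in designs.items():
    row = {}
    for label, strength in (("unregularized", 0.0), ("regularized", lam)):
        beta = fit_l2_logistic(X, y, w, strength, penalized)
        p = sigmoid(X @ beta)
        row[label] = (p[0], p[2])  # fitted P(y=1) for flag=0 and for flag=1
    results[name] = row
    print(name)
    for label, (p0, p1) in row.items():
        print(f"  {label:>13}: P(y=1 | flag=0) = {p0:.3f}, P(y=1 | flag=1) = {p1:.3f}")
```

Without a penalty, all three encodings recover the same group rates, because they describe the same model. With the penalty they differ. In the $-1$/$+1$ encoding the coefficient is half the log-odds difference between the groups, so the same `lam` shrinks that difference less than in the 0/1 encoding. In the no-intercept encoding each group's log odds is shrunk toward zero, so both groups are pulled toward a probability of 0.5. Which group's predictions move, and in which direction, is therefore a consequence of how the coefficients are defined and which of them are penalized.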
