When performing regularised regression, such as LASSO, ridge regression, or elastic net, I understand that it is important to scale variables before calculating and applying the penalty term. I have noticed that standardising is commonly used (and is the default behaviour of R's glmnet package); however, I wonder whether it is appropriate in all instances, such as the cases below (a small sketch of the setup I have in mind follows the list):
- If data are slightly skewed
- If you have one-hot/dummy-encoded variables
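For context, here is a minimal sketch of the kind of fit I have in mind; the data are simulated and the column names are just placeholders I made up, but it shows glmnet standardising internally by default via its `standardize` argument:

```r
# Minimal sketch with simulated data; column names are placeholders
library(glmnet)

set.seed(1)
n <- 100
x <- cbind(
  skewed    = rexp(n),           # slightly skewed continuous predictor
  symmetric = rnorm(n),          # roughly symmetric continuous predictor
  dummy     = rbinom(n, 1, 0.3)  # one-hot/dummy-encoded predictor
)
y <- x %*% c(1, 2, 0.5) + rnorm(n)

# glmnet standardises internally by default (standardize = TRUE)
# and reports coefficients back on the original scale
fit <- cv.glmnet(x, y, alpha = 1)  # alpha = 1 -> LASSO penalty
coef(fit, s = "lambda.min")
```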
Is there ever a situation where min-max scaling/normalising, i.e. (x - min(x)) / (max(x) - min(x)), is preferable?
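To be concrete, this is the transformation I mean (`minmax` is just a helper name I made up for illustration):

```r
# Hypothetical helper implementing the min-max scaling above
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

minmax(c(2, 5, 9))  # 0.0000000 0.4285714 1.0000000 -- each value mapped into [0, 1]
```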
For my own work, I am particularly interested in interpreting variable importance, so I would greatly appreciate answers that take this into account.
(This is my first post, so if you need any further clarification from me, please let me know!)