When performing regularised regression, such as LASSO, ridge regression, or elastic net, I understand that it is important to scale variables before calculating and applying the penalty term. I have noticed that standardising is commonly used (and is the default behaviour of R's glmnet package); however, I wonder whether it is appropriate in all instances, such as the cases below (a small sketch of the setup I have in mind follows the list):
- If data are slightly skewed
- If you have one-hot/dummy-encoded variables
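For context, here is a minimal sketch of the kind of fit I have in mind; the data are simulated and the column names are just placeholders I made up, but it shows glmnet standardising internally by default via its `standardize` argument:

```r
# Minimal sketch with simulated data; column names are placeholders
library(glmnet)

set.seed(1)
n <- 100
x <- cbind(
  skewed    = rexp(n),           # slightly skewed continuous predictor
  symmetric = rnorm(n),          # roughly symmetric continuous predictor
  dummy     = rbinom(n, 1, 0.3)  # one-hot/dummy-encoded predictor
)
y <- x %*% c(1, 2, 0.5) + rnorm(n)

# glmnet standardises internally by default (standardize = TRUE)
# and reports coefficients back on the original scale
fit <- cv.glmnet(x, y, alpha = 1)  # alpha = 1 -> LASSO penalty
coef(fit, s = "lambda.min")
```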
Is there ever a situation where min-max scaling/normalising, i.e. (x - min(x)) / (max(x) - min(x)), is preferable?
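To be concrete, this is the transformation I mean (`minmax` is just a helper name I made up for illustration):

```r
# Hypothetical helper implementing the min-max scaling above
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

minmax(c(2, 5, 9))  # 0.0000000 0.4285714 1.0000000 -- each value mapped into [0, 1]
```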
For my own work, I am particularly interested in interpreting variable importance, so I would greatly appreciate answers that take this into account.
(This is my first post, so if you need any further clarification from me, please let me know!)