Assumptions of Ridge and LASSO Regression

Question

What are the assumptions of Ridge and LASSO Regression? Which assumptions of Linear Regression can be done away with in Ridge and LASSO Regressions?

Assumptions for what? Consistency, asymptotic normality, ...? — Richard Hardy, Mar 15 '17 at 07:45
Possible duplicate of [How to interpret the results when both ridge and lasso separately perform well but produce different coefficients](http://stats.stackexchange.com/questions/267345/how-to-interpret-the-results-when-both-ridge-and-lasso-separately-perform-well-b) — Hugh Perkins, Mar 15 '17 at 08:33
Possible duplicate of [What are the assumptions of ridge regression and how to test them?](http://stats.stackexchange.com/questions/169664/what-are-the-assumptions-of-ridge-regression-and-how-to-test-them) — amoeba, Mar 15 '17 at 10:11
I don't think it's an exact duplicate, since this asks about ridge as well as LASSO. Maybe the question should be reworded to be just about LASSO? — Peter Flom, Mar 15 '17 at 13:45

JohnK · Answer 1 · 2017-03-15T07:47:08.503

This is a broad question, and I am sure a search on this site would return several results. Nevertheless, here are a couple of things to remember.

The basic thing to remember about Ridge and Lasso is that they are both parametric methods. What this means is that for them to be applicable, a specific model has to be postulated, usually a linear one:

$$\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}$$

The major advantage of these methods compared to OLS is that they can handle multicollinearity, i.e. a predictor matrix with rank less than the number of its columns.
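To make the rank-deficient case concrete, here is a minimal NumPy sketch. The simulated data, the variable names, and the penalty value `lam = 1.0` are illustrative assumptions, not part of the original answer. It covers ridge only: when one column is an exact linear combination of the others, $\mathbf{X}^\top\mathbf{X}$ is singular, while $\mathbf{X}^\top\mathbf{X} + \lambda \mathbf{I}$ is invertible for any $\lambda > 0$.

```python
import numpy as np

# Simulated design with an exactly collinear third column (illustrative data).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x1, x2, x1 + x2])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(scale=0.1, size=n)

# X has three columns but only rank two, so X'X cannot be inverted
# and the OLS normal equations have no unique solution.
rank = np.linalg.matrix_rank(X)
gram = X.T @ X

# Ridge adds lam to every eigenvalue of X'X, which makes the system solvable.
lam = 1.0  # assumed penalty, chosen arbitrarily for the illustration
penalized = gram + lam * np.eye(X.shape[1])
ridge_beta = np.linalg.solve(penalized, X.T @ y)

print("rank of X:", rank)
print("ridge coefficients:", ridge_beta)
```

The ridge solution is unique here, but it splits the effect across the collinear columns, so the individual coefficients should not be read as estimates of the values used to simulate the data.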

Another thing to remember is that neither Ridge nor Lasso actually responds well to outlying observations. This is easiest to see for an orthonormal predictor matrix, because both estimators can then be written as (unbounded) functions of the notoriously non-robust OLS estimator. Therefore, much like the OLS estimator, Ridge and Lasso should be used with caution on datasets that have not been cleaned.
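The orthonormal case can be sketched as follows. This assumes the objectives $\|\mathbf{y}-\mathbf{X}\boldsymbol{\beta}\|^2 + \lambda\|\boldsymbol{\beta}\|^2$ for ridge and $\tfrac12\|\mathbf{y}-\mathbf{X}\boldsymbol{\beta}\|^2 + \lambda\|\boldsymbol{\beta}\|_1$ for lasso; other scalings of the penalty change the constants but not the conclusion. With $\mathbf{X}^\top\mathbf{X} = \mathbf{I}$, ridge rescales the OLS estimate and lasso soft-thresholds it, so a single corrupted response moves all three estimators. The function names, the simulated data, and `LAM = 0.2` are hypothetical choices for the illustration.

```python
import numpy as np

LAM = 0.2  # assumed penalty, chosen arbitrarily for the illustration

def ridge_orthonormal(beta_ols, lam):
    # Ridge under X'X = I: proportional shrinkage of the OLS estimate.
    return beta_ols / (1.0 + lam)

def lasso_orthonormal(beta_ols, lam):
    # Lasso under X'X = I: soft-thresholding of the OLS estimate.
    return np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)

# Build a design with orthonormal columns from a QR decomposition.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(30, 3)))
y = Q @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=30)

def fit_all(response):
    beta_ols = Q.T @ response  # OLS estimate when X'X = I
    return (beta_ols,
            ridge_orthonormal(beta_ols, LAM),
            lasso_orthonormal(beta_ols, LAM))

clean = fit_all(y)

# Corrupt one observation and refit: the shift in OLS carries over to both.
y_bad = y.copy()
y_bad[0] += 100.0
contaminated = fit_all(y_bad)

for name, before, after in zip(("OLS", "ridge", "lasso"), clean, contaminated):
    print(name, "shift caused by one outlier:", after - before)
```

The ridge shift is exactly the OLS shift divided by $1+\lambda$, so it grows without bound as the outlier grows; penalization shrinks the estimate but does not make it robust.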

edited Mar 15 '17 at 07:47

answered Mar 14 '17 at 23:43

JohnK

What about the assumptions of homoscedasticity and normality? Could I drop them? – Shreyo Mallik Mar 14 '17 at 23:51
@ShreyoMallik Normality is not necessary for OLS regression; the Gauss-Markov theorem has no need for it. Homoscedasticity is a more persistent problem, however. – JohnK Mar 14 '17 at 23:53
Then, why do we assume normality and homoscedasticity in OLS Regression? Could you please elucidate? – Shreyo Mallik Mar 15 '17 at 00:08
@ShreyoMallik You assume normality for inference purposes, e.g. t-tests, and homoscedasticity to prove that OLS is BLUE, i.e. the Gauss-Markov theorem. Neither of these assumptions is crucial in OLS regression, and there are work-arounds if they are violated. – JohnK Mar 15 '17 at 00:12
How do I take care of the violations? – Shreyo Mallik Mar 15 '17 at 00:19
Linear regression is fine as a predictive method even if both of those assumptions are violated (though there may be better models for your data); so if you're after predictions, and want to use linear regression, there is nothing to do. – Matthew Drury Mar 15 '17 at 03:00
"The major advantage of these methods compared to OLS is that they can handle multicollinearity, i.e. a predictor matrix with rank less than the number of its columns." Is that really true? That's not how I use those methods; they appeal to me because they have a tuning parameter that continuously varies the model complexity. – Matthew Drury Mar 15 '17 at 03:02
@MatthewDrury Please read the question again. The OP is interested in the assumptions of OLS that LASSO and ridge dispense with. Multicollinearity is of course the first thing that comes to mind. I use them for the reason you give as well. – JohnK Mar 15 '17 at 07:12
@JohnK, your linking of parametric methods with linearity of the model can be misleading. Parametric nonlinear models exist, too. I would rephrase that part. – Richard Hardy Mar 15 '17 at 07:44
@RichardHardy Thanks for pointing that out, I have rephrased that part. – JohnK Mar 15 '17 at 07:47
@ShreyoMallik, as JohnK correctly said, normality is not assumed for OLS (assuming normality is a very common mistake). Its only use is in small-sample inference. In large-sample inference it is no longer needed, thanks to the central limit theorem. Also, consistency, asymptotic normality and the Gauss-Markov theorem do not rest on the assumption of normality. – Richard Hardy Mar 15 '17 at 07:47
