
I cannot see any difference between ridge regression and linear regression.

My understanding is that the point of ridge regression is that, based on the training data, we find the best line that fits the training data.

By "best line" I mean the line with minimum RMSE.

We then try to adjust the line's slope to get better results through n-fold cross-validation.

Isn't it easier and simpler to use the whole dataset (both training and test) to build this line and find the slope through the following?

$y = \beta_0 + \beta_1 x_1$

$\beta_1\ =\ \rho\frac{\sigma_y}{\sigma_x}$

$\beta_0\ =\ \mu_y\ -\ \mu_x\beta_1\ $

$\rho = \frac{\sum{(x-\mu_x)(y-\mu_y)}}{\sqrt{\sum{(x-\mu_x)^2}}\,\sqrt{\sum{(y-\mu_y)^2}}}$

$\sigma_x = \sqrt{\frac{\sum{(x-\mu_x)^2}}{n}}$

$\sigma_y = \sqrt{\frac{\sum{(y-\mu_y)^2}}{n}}$

This linear regression will give us the best fit.
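For concreteness, here is a minimal sketch of those closed-form formulas in Python/NumPy (the arrays `x` and `y` are just made-up example data):

```python
# A minimal sketch of the closed-form formulas above, assuming NumPy;
# x and y are hypothetical example data, not from any real dataset.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=100)

mu_x, mu_y = x.mean(), y.mean()
sigma_x = np.sqrt(np.mean((x - mu_x) ** 2))  # population standard deviation of x
sigma_y = np.sqrt(np.mean((y - mu_y) ** 2))  # population standard deviation of y
rho = np.sum((x - mu_x) * (y - mu_y)) / np.sqrt(
    np.sum((x - mu_x) ** 2) * np.sum((y - mu_y) ** 2)
)  # Pearson correlation

beta_1 = rho * sigma_y / sigma_x  # slope
beta_0 = mu_y - mu_x * beta_1     # intercept
print(beta_0, beta_1)
```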

If I misunderstood, please tell me what the difference between these models is before downvoting my question.

Thanks

  • Contemplate what "best" means, because different forms of regression often differ in how they measure how "good" the fit is. – whuber Jun 27 '19 at 12:07
  • @whuber Best line means minimum RMSE – asmgx Jun 27 '19 at 12:10
  • Then please review the definition of Ridge Regression, because it explicitly does not minimize the RMSE. – whuber Jun 27 '19 at 12:13
  • @whuber So when we do cross-validation to find the best fit, what are we looking for in that case? How can we find the best fit without minimizing the errors? – asmgx Jun 27 '19 at 12:15
  • Ridge regression is not just *similar* to linear regression... instead, it is *exactly* like linear regression. It is a *specific* type of linear regression, where the 'linear' refers to the model $y = \beta X$. How ridge regression differs from the most common type of linear regression, ordinary least squares regression, is in **the added penalty that makes one favor solutions with small effect sizes** (this is advantageous when you have lots of regressors for which you can reasonably expect that most of the associated effects should be equal to zero or close to it). – Sextus Empiricus Jun 27 '19 at 12:18
  • @MartijnWeterings Do you mean we can find a line going through the data we have that has the minimum variance, but whose equation is not y = β0+β1x? – asmgx Jun 27 '19 at 12:26
  • @asmgx I do not understand what you are saying. There are *a lot of ways* to fit a line $y = \beta_0 + \beta_1 x$ to the data (or you could generalize 'line' to 'hyperplane' $y_j = \sum \beta_i x_{ij}$ when there are more variables). Every technique that helps you find such a line/hyperplane is a form of linear regression. Note: while they differ, this does not mean that the equation is not y = β0+β1x (except possibly an extension to multiple dimensions instead of only x and y). – Sextus Empiricus Jun 27 '19 at 12:36
  • @MartijnWeterings Now I am really confused, because my understanding was that y=ΣBX with the factors mentioned in my question gives the best-fitting line through the data points, and I thought this equation was mathematically proven! Now you are telling me there could be another line that fits the data and it is not y=ΣBX based on the factors in the question. – asmgx Jun 27 '19 at 12:37
  • Note the double negation. I am *not saying* that the other line is not y=ΣBX. The other line will still be like y=ΣBX but with other coefficients B. It depends on what you consider the 'best' line. If your goal is to minimize the least squares residuals, then there is only a single unique solution. But **linear regression is *not* synonymous/equal to minimizing least squares residuals**. There are other ways to fit a line that do not minimize least squares residuals, but instead minimize something else. – Sextus Empiricus Jun 27 '19 at 12:40
  • Your intuition is working: the "best fit" of Ridge Regression *does not* minimize the errors. It minimizes a combination of two things: the typical size of the errors (expressed as the sum of their squares) and the typical size of the standardized coefficients (which aren't "errors" in any sense). So indeed it is the case that the sum of squared errors in a Ridge regression is *not* minimal. Nevertheless, Ridge regression looks *only* at linear functions of the form $y=\beta_0+\beta_1x,$ guaranteeing that whatever solution it produces is still a line. – whuber Jun 27 '19 at 12:44
  • @MartijnWeterings & whuber Thanks heaps, things are much clearer now. I really appreciate your help. I will re-read the topic; I think I will understand it better now. – asmgx Jun 27 '19 at 12:48
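
To make the distinction described in the comments above concrete, here is a rough sketch (Python/NumPy, with hypothetical helper names `ols_loss` and `ridge_loss`) of the two objectives: ordinary least squares minimizes only the sum of squared residuals, while ridge regression minimizes that same sum plus a penalty on the size of the coefficients.

```python
# A rough sketch of the two objectives discussed above, assuming NumPy.
# The function names ols_loss and ridge_loss are made up for illustration.
import numpy as np

def ols_loss(beta_0, beta_1, x, y):
    """Ordinary least squares: sum of squared residuals only."""
    residuals = y - (beta_0 + beta_1 * x)
    return np.sum(residuals ** 2)

def ridge_loss(beta_0, beta_1, x, y, lam):
    """Ridge: the same residual term plus a penalty on the slope's size.

    The intercept is conventionally not penalized; lam is the tuning
    parameter that cross-validation is typically used to choose.
    """
    residuals = y - (beta_0 + beta_1 * x)
    return np.sum(residuals ** 2) + lam * beta_1 ** 2
```

With `lam = 0` the two objectives coincide; as `lam` grows, the ridge solution shrinks the slope toward zero, so in general it does not minimize the RMSE on the training data.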

0 Answers