
My class started learning about ridge regression two weeks ago. Before that we learned about Lagrange multipliers and the connection between them and the ridge penalty/constraint formulation.

Ridge:

$$\hat\beta^{\text{ridge}} = \underset{\beta}{\arg\min}\ \sum_{i=1}^{N}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2$$

Lagrange:

$$\hat\beta^{\text{ridge}} = \underset{\beta}{\arg\min}\ \sum_{i=1}^{N}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 \quad \text{subject to}\quad \sum_{j=1}^{p}\beta_j^2 \le c$$

My question is: why is the constant $c$ not shown in ridge regression? Are we setting $c=0$? If so, is our constraint in ridge regression that the sum of the $\beta_j^2$ equals zero?

Also, since each $\beta_j^2$ is nonnegative, doesn't that mean we are requiring all the $\beta_j$ to equal zero?

gbd

1 Answer


Since the penalty multiplier $\lambda$ is finite, no, we're not requiring the $b_j$ to be zero. However, as $\lambda$ increases it pushes them closer to zero, since the sum $\sum_j b_j^2$ is penalized. The idea of ridge regression is to limit the magnitude of the parameters.

If you were to use the penalty $\lambda\sum_j(b_j-c)^2$, the implication is that the $b_j$ would be shrunk toward $c$ rather than toward zero, and $Xb$ would be more likely to overfit. By forcing the $b_j$ to be small we limit the ability of $Xb$ to overfit the data.
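As a minimal sketch of this shrinkage (using the closed-form ridge estimate $\hat b=(X^\top X+\lambda I)^{-1}X^\top y$ on made-up data; the values below are illustrative only), the coefficients shrink toward zero as $\lambda$ grows but stay nonzero for any finite $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 50 observations, 5 predictors
X = rng.normal(size=(50, 5))
b_true = np.array([3.0, -2.0, 1.5, 0.5, -1.0])
y = X @ b_true + rng.normal(scale=0.5, size=50)

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Increasing lambda shrinks every coefficient toward zero,
# but none of them is forced to be exactly zero.
for lam in [0.0, 1.0, 10.0, 100.0]:
    print(f"lambda={lam:6.1f}  b_hat={np.round(ridge(X, y, lam), 3)}")
```

At $\lambda=0$ this is ordinary least squares; as $\lambda\to\infty$ the estimates approach the all-zero vector, which is only reached in that limit rather than for any finite $\lambda$.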

Aksakal
  • I am confused as to why we don't write the penalty function as $\lambda(\sum_j(b_j)^2-c)$, since that is how Lagrange multipliers are usually written. Also, the solution must satisfy $\sum_j(b_j)^2=c$, because that is how we learned Lagrange multipliers. – gbd Apr 24 '19 at 03:10
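For a fixed multiplier $\lambda$, writing out the full Lagrangian shows why the $c$ term can be dropped:

$$
\mathcal{L}(b,\lambda) \;=\; \sum_i\Big(y_i - \sum_j x_{ij}b_j\Big)^2 + \lambda\Big(\sum_j b_j^2 - c\Big)
\;=\; \underbrace{\sum_i\Big(y_i - \sum_j x_{ij}b_j\Big)^2 + \lambda\sum_j b_j^2}_{\text{ridge objective}} \;-\; \lambda c.
$$

Since $\lambda c$ does not depend on $b$, minimizing over $b$ gives the same $\hat b$ with or without it, so the ridge objective is usually written without $c$; the constraint level $c$ matters only through the correspondence between $c$ and $\lambda$.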