Squaring the $\ell_2$-norm in $\ell_2$-regularization

Asked Aug 08 '16 at 20:44

Active Aug 08 '16 at 20:44

It's said that the $\ell_2$-penalty term is based on the $\ell_2$-norm. Indeed, the term is often written as $\lambda \|w\|^2_2$.

Notice, though, that the norm is squared, unlike the $\ell_1$-penalty, which is simply the $\ell_1$-norm. Squaring obviously makes the function easier to differentiate, but does it change the interpretation of the penalty term?
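To make the differentiation remark concrete, here is a short sketch of the two gradients, assuming $w \in \mathbb{R}^d$ and $\lambda > 0$ (neither assumption is spelled out above):

$$\nabla_w \left(\lambda \|w\|_2^2\right) = 2\lambda w, \qquad \nabla_w \left(\lambda \|w\|_2\right) = \lambda \frac{w}{\|w\|_2} \quad (w \neq 0).$$

The squared version is smooth everywhere and its gradient is linear in $w$. The unsquared version has a gradient of constant length $\lambda$ and is not differentiable at $w = 0$.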

Would a regularization term like $\lambda \|w\|_2$ lead to different results, or are they equivalent?
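As a rough way to probe this, here is a minimal sketch on a toy problem that I am assuming purely for illustration: minimize $\frac{1}{2}\|w - a\|_2^2$ plus the penalty, i.e. an identity design matrix, for which both minimizers have closed forms. The function names `prox_squared_l2` and `prox_l2` and the vector `a` are hypothetical and not taken from any library.

```python
import math


def prox_squared_l2(a, lam):
    """Minimizer of 0.5*||w - a||^2 + lam*||w||_2^2.

    Setting the gradient (w - a) + 2*lam*w to zero gives w = a / (1 + 2*lam).
    """
    return [ai / (1.0 + 2.0 * lam) for ai in a]


def prox_l2(a, lam):
    """Minimizer of 0.5*||w - a||^2 + lam*||w||_2 (block soft-thresholding)."""
    norm = math.sqrt(sum(ai * ai for ai in a))
    if norm <= lam:
        # The penalty dominates: the whole vector is set to zero.
        return [0.0 for _ in a]
    scale = 1.0 - lam / norm
    return [scale * ai for ai in a]


if __name__ == "__main__":
    a = [3.0, 4.0]  # illustrative data with ||a||_2 = 5
    for lam in (1.0, 5.0):
        print(lam, prox_squared_l2(a, lam), prox_l2(a, lam))
```

In this toy case the two penalties give different minimizers for the same $\lambda$: the squared penalty shrinks $a$ by a fixed factor and never reaches zero exactly, while the unsquared penalty sets the whole vector to zero once $\lambda \ge \|a\|_2$. This says nothing definitive about a general design matrix.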

How does this generalize to $\ell_p$ regularization with $1\lt p\lt2$? Should one take the $p^{th}$ root of the penalty term or not?

Firebug

@Ben That's a cool question, I hadn't seen it before, thanks for the suggestion! I actually closed this one as a duplicate because I was quite satisfied with [@bdeonovic's arguments](https://stats.stackexchange.com/a/120140/60613) about ease of derivation. Now I can also see that some arguments about priors on the coefficients would be just as effective. – Firebug Sep 12 '17 at 11:37
