Objective
To confirm whether my understanding of why the Lagrangian dual is used in SVMs is correct.
Background
In machine learning, gradient descent is used for regression, logistic regression, backpropagation, etc. However, for SVM the Lagrangian dual suddenly pops up instead, and I am trying to understand why.
It looks to me like gradient descent could solve the primal problem of the SVM, as shown on the left side of the image below (please correct me if this is wrong; a sketch of what I mean is below). However, I have never seen any article or book that mentions using gradient descent. Instead, the Lagrangian dual and the KKT conditions always pop up out of the blue, without a good explanation of why.
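To make the above concrete, here is a minimal sketch of what I mean by solving the primal directly with gradient descent, assuming the soft-margin, hinge-loss form of the primal; the function and parameter names (`hinge_svm_gd`, `C`, `lr`) are my own illustration, not from any library:

```python
import numpy as np

def hinge_svm_gd(X, y, C=1.0, lr=1e-3, n_iters=2000):
    """(Sub)gradient descent on the soft-margin primal:
       min_{w,b}  0.5*||w||^2 + C * sum_i max(0, 1 - y_i*(w @ x_i + b))
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iters):
        margins = y * (X @ w + b)
        active = margins < 1  # points on the wrong side of the margin
        # The hinge loss is non-smooth at margin == 1, so this is a subgradient
        grad_w = w - C * (y[active][:, None] * X[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy 2-D data: two Gaussian blobs with labels -1 / +1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)
w, b = hinge_svm_gd(X, y)
print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
```

This seems to converge on simple data, which is why I find the universal jump to the dual puzzling.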
Question
I suppose using the dual reduces the number of variables, e.g. from (x1, x2) to λ in the image, and yields a formula with a definite solution (the formulations I have in mind are written out after the questions below).
- Is it the only reason to use Lagrangian Dual, or are there any other reasons?
- Is there any reason why Gradient Descent cannot be used?
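For concreteness, these are the textbook primal and dual forms I am referring to (hard-margin case, written with the usual w, b, α notation rather than the symbols in the image):

```latex
% Primal: variables are w \in \mathbb{R}^d and b
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^2
\quad \text{s.t.}\quad y_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,n

% Lagrangian dual: variables are the n multipliers \alpha_i \ge 0
\max_{\alpha \ge 0}\ \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2}\sum_{i,j} \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j
\quad \text{s.t.}\quad \sum_{i=1}^{n} \alpha_i y_i = 0
```

Note that the primal optimizes over the d+1 variables (w, b), while the dual optimizes over one multiplier α_i per training point, which is what I mean by the change in dimensions above.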