In SVMs, is the solution to the minimization problem $$\textbf{w} = \sum_{i=1}^{n} \alpha_i y_i \textbf{x}_i,$$ and once we know $\textbf{w}$, can we then get $\textbf{b}$?
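To make the second part of that concrete, this is the relation I think is being used (assuming the hard-margin formulation, where every support vector $\textbf{x}_k$ satisfies $y_k(\textbf{w}^\top \textbf{x}_k + b) = 1$ and $y_k \in \{-1, +1\}$):
$$ b = y_k - \textbf{w}^\top \textbf{x}_k \quad \text{for any support vector } \textbf{x}_k. $$
Is that the right way to read it?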
In plain English, can somebody please describe the solution above? Basically, is the normal vector $\textbf{w}$ of the maximum-margin separating hyperplane a linear combination of the training examples?
And is the quantity $\textbf{b}$ the distance from the decision boundary to the closest training example?
I have not been able to find a simple explanation of the solution above in the books and notes I have looked at.
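For concreteness, here is a small numerical sketch of how I am reading the identity (this assumes scikit-learn: for a linear-kernel `SVC`, `dual_coef_` stores the products $\alpha_i y_i$ for the support vectors, so the sum above should reproduce `coef_`, and the offset is `intercept_`; the data set is just an illustrative toy example):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy, (nearly) linearly separable two-class data -- purely illustrative
X, y = make_blobs(n_samples=40, centers=2, random_state=0)

# A large C approximates the hard-margin problem on separable data
clf = SVC(kernel="linear", C=1e3).fit(X, y)

# dual_coef_ has shape (1, n_support_vectors) and holds alpha_i * y_i,
# so this computes  w = sum_i alpha_i y_i x_i  over the support vectors
w = clf.dual_coef_ @ clf.support_vectors_

print(np.allclose(w, clf.coef_))   # True: w is a combination of (support) training points
print(clf.intercept_)              # b, the offset of the hyperplane w^T x + b = 0
print(1.0 / np.linalg.norm(w))     # the margin (distance from the boundary to the closest
                                   # point) in the separable case -- note this is not b
```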