4

I am looking at the derivation of the Maximal Margin Classifier model, where they typically transform the following maximization problem: max_model

into this minimization problem:

min_model

Why is maximizing $\frac{1}{\left\|\mathbf{w}\right\|}$ equal to minimizing $\left\|\mathbf{w}\right\|^2$ ?

Drizzt
  • 83
  • 1
  • 5
  • 4
    Your question seems to be based entirely on a typographical error: the quotation minimizes $\frac{1}{2}||w||^2$ while you write $1/||w||^2$. Does that resolve the issue? – whuber Apr 16 '14 at 17:51
  • 1
    @whuber Many thanks, that was indeed a typo. The question still remains though. – Drizzt Apr 16 '14 at 18:54

1 Answers1

9

Note that $\| w \|$ is nonnegative. Thus $1/\| w \|$ is a monotone decreasing function of $\| w \|$. Minimizing $\| w \|$ is thus equivalent to maximizing $1/\| w \|$. Note here that by "equivalent", I don't mean that the optimal objective value will be the same, but rather that any optimal solution to the first problem will be optimal for the second problem and vice versa.

Since squaring a positive quantity is a monotone increasing transformation, minimizing $\| w \|^{2}$ is equivalent to to minimizing $\| w \|$. Finally, multiplying by $1/2$ has no effect on the set of optimal solutions. Thus minimizing $(1/2) \| x \|^{2}$ is equivalent to maximizing $1/\| w \|$.

The advantage of minimizing $ (1/2) \| w \|^{2}$ is that it's easy to compute the gradient of this objective function (it's $w$.)

Brian Borchers
  • 5,015
  • 1
  • 18
  • 27
  • 1
    Thank you for explaining this in small steps, I now see what is happening. – Drizzt Apr 16 '14 at 19:02
  • 1
    These are standard tricks that are frequently used in formulating optimization problems. I'd encourage you to read the textbook (and do the exercises) on convex optimization by Vandenberghe and Boyd if you want to learn how to do these kinds of things. – Brian Borchers Apr 16 '14 at 21:02
  • Hi Brian - this is a great answer - could you clarify what you mean by "Note here that by "equivalent", I don't mean that the optimal objective value will be the same, but rather that any optimal solution to the first problem will be optimal for the second problem and vice versa." please? – user1058210 Mar 25 '16 at 08:51