I'm asked to differentiate the following hinge loss term: $$ \dfrac{1}{n}\sum _{\left( x_{i},y_{i}\right) \in S}\sum _{j'=1}L\left( w^{j'};\left( x_{i},y_{i}\right) \right) $$

where $$ L\left( w^{j'};\left( x_{i},y_{i}\right) \right) = \left( \max \left\{ 0,\, 2-\langle w^{j'},x_{i}\rangle \, 1\{ y_{i}= j'\} \right\} \right) ^{2} $$

The indicator function $1\{ y_i = j'\}$ equals 1 when $y_i = j'$ and 0 otherwise. I need the partial derivative with respect to $w^j_k$. I have also been given the following assumption to help me: $$\dfrac{\partial }{\partial a}\max \left\{ 0,a\right\} =\begin{cases} 1 \quad \text{if} \quad a >0\\ 0 \quad \text{if} \quad a\leq 0\end{cases}$$ I know that this assumption is not really true, but please treat it as true in my case.

My problem is that I have no clue how to differentiate this and I need this to compute the gradient for my SVM model.

Firebug
WindBreeze
  • Your formula for the derivative is not fully correct: it is not defined for $a=0.$ That's more than a technical nitpick: this failure to be differentiable everywhere leads to failure of certain optimization algorithms, for instance. – whuber Oct 27 '20 at 20:44
  • I know that the assumption is not really true but I'm required to use it. – WindBreeze Oct 27 '20 at 20:44
  • Strictly speaking, in mathematics when you adopt a known false assumption you may immediately conclude literally anything you like ;-). It sounds like you are expected to conduct an exercise in applying the Chain Rule of Calculus. – whuber Oct 27 '20 at 20:45
  • I have trouble seeing how to differentiate with respect to the elements of a matrix. Do we get a sum of the $x$ terms because $w$ differentiates to 1 in the dot product between $w$ and $x$? – WindBreeze Oct 27 '20 at 20:48
  • Because derivatives of the $w^j$ with respect to the $w^{j^\prime}$ are either $0$ or $1,$ that makes your work easy. The multivariate Chain Rule is stated at https://en.wikipedia.org/wiki/Chain_rule#Multivariable_case. – whuber Oct 27 '20 at 20:50
  • So if $w^j = w^{j'}$ the derivative is 1, otherwise it's 0? But what is $w^{j'}$? Is it a specific value? – WindBreeze Oct 27 '20 at 20:55
