
I've read in some posts that the absolute value function is not differentiable. However, it is non-differentiable only at 0. Can't I just check the value at each training point to compute the derivative? Only if the error happens to be exactly 0 would this create a problem, in which case I could ignore that point. The simplex method is the preferred approach, presumably because it is more efficient, but I want to understand whether efficiency is the only reason not to use gradient descent, or whether there is some other issue as well.

racerX
    You didn't ask about that, but you can find the exact solution with linear programming methods. Some code at https://stats.stackexchange.com/questions/260504/how-to-solve-least-absolute-deviation-by-simplex-method/269966#269966 – kjetil b halvorsen Nov 11 '17 at 21:52

1 Answer


This difficulty can be addressed using subgradients. A subgradient of a convex function $f$ at some point $x$ is any value $v$ that satisfies $f(y)-f(x)\ge v\cdot (y-x)$ for all $y$ in the domain. Note that if $f$ is differentiable at $x$ then the subgradient is just the gradient.
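As a concrete check (not part of the original answer), consider $f(x)=|x|$ at $x=0$: any $v$ with $|v|\le 1$ satisfies the defining inequality, since for every $y$

$$f(y)-f(0)=|y|\;\ge\;|v|\,|y|\;\ge\;v\,y=v\cdot(y-0),$$

while no $v$ with $|v|>1$ works (take $y$ with the same sign as $v$).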

The subgradient method is very similar in spirit to gradient descent: update the current solution by stepping along the negative of a subgradient. That is, take $x^{(k+1)}=x^{(k)}-\alpha_k v^{(k)}$, where $v^{(k)}$ is a subgradient of $f$ at $x^{(k)}$ and $\alpha_k$ is the step size at iteration $k$.

The subgradient of $f(x)=|x|$ is $v(x)=\left\lbrace \begin{array}{cc} 1, & x>0 \\ -1, & x<0 \\ [-1,1], &x=0 \end{array}\right.$
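Not part of the original answer, but here is a minimal runnable sketch of this update for the least-absolute-deviation problem from the question, minimizing $\sum_i |y_i - x_i^\top \beta|$ over $\beta$. The diminishing step size, the choice of subgradient $0$ at a zero residual, and the toy data are all illustrative assumptions rather than anything prescribed above.

```python
import numpy as np

def lad_subgradient_descent(X, y, n_iter=2000, step0=1.0):
    """Minimize sum_i |y_i - x_i @ beta| with the subgradient method.

    Uses a diminishing step size alpha_k = step0 / sqrt(k + 1). At a
    residual of exactly 0 the subgradient value 0 is used (any value
    in [-1, 1] would be a valid choice).
    """
    n, p = X.shape
    beta = np.zeros(p)
    best_beta, best_loss = beta.copy(), np.inf
    for k in range(n_iter):
        r = y - X @ beta                  # residuals
        # Subgradient of sum_i |r_i| w.r.t. beta is -X^T sign(r);
        # np.sign returns 0 when r_i == 0, which lies in [-1, 1].
        g = -X.T @ np.sign(r)
        beta = beta - step0 / np.sqrt(k + 1) * g
        loss = np.abs(y - X @ beta).sum()
        if loss < best_loss:              # keep the best iterate seen
            best_loss, best_beta = loss, beta.copy()
    return best_beta

# Toy usage: recover a coefficient vector from Laplace-noise data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.laplace(scale=0.1, size=200)
print(lad_subgradient_descent(X, y))
```

Because individual subgradient steps need not decrease the objective, the sketch keeps the best iterate seen so far, which is a standard convention for the subgradient method.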

Kyle
  • Thanks! So basically it can be solved using the subgradient instead of the gradient. However, I see you mention the subgradient has two values at $x = 0$ in your example; what value would the algorithm pick in this case? – racerX Nov 14 '17 at 06:47
  • The subgradient of $|x|$ at zero is not two values, it's the interval from -1 to 1 (infinite values). The algorithm need only pick one (e.g. 0). – Kyle Nov 14 '17 at 16:32