
What is the derivative of the ReLU activation function defined as:

$$ \mathrm{ReLU}(x) = \max(0, x) $$

What about the special case at $x=0$, where the function is not differentiable (the function itself is continuous, but its slope jumps)?

Tom Hale

1 Answer


The derivative is:

$$ \mathrm{ReLU}'(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x > 0 \end{cases} $$

It is undefined at $x = 0$.

The reason it is undefined at $x = 0$ is that the left- and right-hand derivatives there are not equal: approaching from the left the slope is $0$, while approaching from the right it is $1$.
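
In code, this piecewise rule is usually implemented with the convention that the derivative at $x = 0$ is taken to be $0$ (see the comments below). A minimal NumPy sketch, with illustrative function names that are not from any particular library:

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def relu_derivative(x):
    """Piecewise derivative: 0 for x < 0, 1 for x > 0.
    The value 0 at x = 0 is a convention, not a true derivative."""
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))             # [0. 0. 3.]
print(relu_derivative(x))  # [0. 0. 1.]
```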

Jim
  • So in practice (implementation), one just picks either $0$ or $1$ for the $x=0$ case? – Tom Hale Mar 14 '18 at 09:51
  • The convention is that $\frac{dr}{dx} = \mathbf{1}(x > 0)$ – neuroguy123 Mar 14 '18 at 13:10
  • @TomHale by the way, see Nouroz Rahman's answer at https://www.quora.com/How-do-we-compute-the-gradient-of-a-ReLU-for-backpropagation: _"[...] In my view, in built-in library functions (for example: `tf.nn.relu()`) derivative at x = 0 is taken zero to ensure a sparser matrix..."_ – Jim Mar 29 '18 at 16:17
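
Following up on the comment about `tf.nn.relu()`: assuming TensorFlow 2.x is available, the convention can be checked directly with automatic differentiation. The sketch below should print a gradient of $0$ at $x = 0$ if the library defines it that way, as the quote above suggests.

```python
import tensorflow as tf

# Differentiate ReLU at x = -1, 0, and 1 via autodiff.
x = tf.constant([-1.0, 0.0, 1.0])
with tf.GradientTape() as tape:
    tape.watch(x)      # x is a constant tensor, so watch it explicitly
    y = tf.nn.relu(x)
print(tape.gradient(y, x).numpy())  # expected: [0. 0. 1.]
```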