# Why is ReLU not differentiable at x = 0 (zero)?

ReLU is one of the most widely used activation functions. For any $$x > 0$$ the output of ReLU is $$x$$, and it is $$0$$ otherwise. So,
$ReLU(x) = \left\{\begin{matrix} x & \textrm{if}\ x > 0, \\ 0 & \textrm{otherwise} \end{matrix}\right.$

We can also write it as $$ReLU(x) = max(0, x)$$. For the rest of the post let's write $$f(x) = ReLU(x)$$. So, at $$x = 0$$ the value of ReLU is $$f(0) = 0$$, and from the graph of the function we can clearly see that it is continuous at $$x = 0$$: approaching $$0$$ from either side, $$f(x)$$ approaches $$f(0) = 0$$.
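The definition above translates directly into code. Here is a minimal sketch of ReLU in Python (the function name `relu` is mine, not from the post):

```python
def relu(x):
    """ReLU as defined above: max(0, x)."""
    return max(0.0, x)

print(relu(3.2))   # 3.2
print(relu(-1.5))  # 0.0
print(relu(0.0))   # 0.0
```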

So, why is it not differentiable at $$x = 0$$? To be differentiable at a point, a function must meet the following criteria:

• The function is continuous at the given point
• The left-hand limit of the difference quotient exists at the given point
• The right-hand limit of the difference quotient exists at the given point
• The left-hand limit and the right-hand limit are equal at the given point

We have already seen that $$f(x)$$ is continuous at $$x = 0$$. Let's investigate the remaining criteria. But before that, let's have a quick recap of how we can obtain the derivative from first principles.

## Derivative of a function

Say we have a function $$f(x)$$. We can obtain its derivative $$f'(x)$$ from the definition of the derivative: $f'(x) = \lim_{\Delta x\rightarrow 0}\frac{f(x+\Delta x) - f(x)}{\Delta x}$

Here $$\Delta x$$ is an "infinitely small" change of the input $$x$$. The derivative is sometimes referred to as 'rise over run', i.e. how much the output changes over a tiny change of the input.
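This difference quotient can be approximated numerically by plugging in a small but finite $$\Delta x$$. A minimal sketch (the helper name `derivative` and the step size `1e-8` are my choices for illustration):

```python
def derivative(f, x, dx=1e-8):
    """Approximate f'(x) with the difference quotient (f(x+dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx

# For f(x) = x**2 we know f'(3) = 6; the quotient should be close to that.
print(derivative(lambda x: x**2, 3.0))  # ≈ 6.0
```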

## Derivative of ReLU

For $$x \neq 0$$ we can calculate the derivative of ReLU using the definition above. The derivative is $f'(x) = \lim_{\Delta x\rightarrow 0}\frac{max(0, x+\Delta x) - max(0,x)}{\Delta x}$ For $$x>0$$ (with $$\Delta x$$ small enough that $$x + \Delta x > 0$$) we obtain: $f'(x) = \frac{x+\Delta x - x}{\Delta x} = \frac{\Delta x}{\Delta x} = 1$ For $$x<0$$ both terms in the numerator are $$0$$, so: $f'(x) = \frac{0 - 0}{\Delta x} = 0$ So, what about $$x = 0$$? Let's figure it out.
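We can check both cases numerically with the same difference-quotient approximation as before (a sketch; `relu`, `derivative`, and the sample points are my choices):

```python
def relu(x):
    return max(0.0, x)

def derivative(f, x, dx=1e-8):
    # Finite-difference approximation of the derivative at x
    return (f(x + dx) - f(x)) / dx

print(derivative(relu, 2.0))   # ≈ 1.0, matching f'(x) = 1 for x > 0
print(derivative(relu, -2.0))  # 0.0, matching f'(x) = 0 for x < 0
```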

## Why the derivative does not exist

We already know that a function must satisfy a particular set of conditions for the derivative to exist. We also figured out that ReLU satisfies the first condition at $$x = 0$$. Let's investigate the other conditions.

#### The left-hand limit

To get the left-hand limit, let's approach $$0$$ from the left-hand side, i.e. $$\Delta x \rightarrow 0^-$$ at $$x = 0$$. So, $\lim_{\Delta x\rightarrow 0^-}\frac{max(0, 0+\Delta x) - max(0,0)}{\Delta x} = \lim_{\Delta x\rightarrow 0^-}\frac{0 - 0}{\Delta x} = 0$

#### The right-hand limit

And the right-hand limit, with $$\Delta x \rightarrow 0^+$$ at $$x = 0$$, is: $\lim_{\Delta x\rightarrow 0^+}\frac{max(0, 0+\Delta x) - max(0,0)}{\Delta x} = \lim_{\Delta x\rightarrow 0^+}\frac{\Delta x - 0}{\Delta x} = 1$

So ReLU clearly satisfies conditions 2 and 3 as well. But the left-hand limit and the right-hand limit are not equal ($$0 \neq 1$$), so it fails to satisfy the last condition.

And thus the derivative of ReLU Does Not Exist (DNE) at $$x=0$$. In machine learning practice it is rare to land exactly on $$x=0$$. If we do encounter $$x=0$$, the "derivative" there is generally set by convention to either $$0$$, $$1$$, or $$0.5$$.
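That convention can be sketched as a small gradient function (the name `relu_grad` and the `at_zero` parameter are mine, not a standard API):

```python
def relu_grad(x, at_zero=0.0):
    """'Derivative' of ReLU as used in practice.

    The value at exactly x == 0 is a convention, not a true derivative;
    commonly 0, sometimes 1 or 0.5 (assumption: default 0 here).
    """
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return at_zero

print(relu_grad(2.0))   # 1.0
print(relu_grad(-2.0))  # 0.0
print(relu_grad(0.0))   # 0.0 by the default convention
```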

I'm a Computer Science and Engineering student who loves to learn and to spread what I have learnt. Besides that, I like cooking and photography!