
I am training a convolutional neural network on image data, and inspecting the gradients at each step shows that they are exactly zero. At the same time, the network is not converging and the loss stays high.

What does this mean in terms of what I should do with the learning rate, momentum, decay, etc.?

Thanks!

user135237
    What does it mean when the gradient of any function is 0? – Sycorax Dec 04 '16 at 23:24
  • I'm using ReLU, which has f'(x)=0 for x<0 and f'(x)=1 for x>0. – user135237 Dec 04 '16 at 23:44
  • I'm familiar with the ReLU function. It's possible that all of the neurons have died. However, my question was more general. – Sycorax Dec 05 '16 at 00:06
  • I'm not sure of the more general implications; I just know that it means f'(x)=0, so that parameter would no longer update. Is there something else you're alluding to? – user135237 Dec 05 '16 at 02:17
  • Have a look at https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn/352037#352037 – kjetil b halvorsen Oct 19 '21 at 13:01
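The ReLU behavior discussed in the comments can be made concrete with a minimal NumPy sketch (the function names here are my own, for illustration): a unit whose pre-activations are always negative gets a gradient of exactly zero, so gradient descent can never revive it — a "dead" ReLU.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    # Derivative of ReLU: 1 for x > 0, 0 for x < 0 (taken as 0 at x = 0).
    return (x > 0).astype(float)

# A "dead" unit: every pre-activation is negative, so both the output
# and the gradient are exactly zero, and the weights stop updating.
pre_activations = np.array([-2.0, -0.5, -3.1, -0.01])
print(relu(pre_activations))       # all zeros
print(relu_grad(pre_activations))  # all zeros -> no weight update
```

If every unit in the network ends up in this regime (e.g. after a large learning-rate step pushes the biases very negative), all gradients are exactly zero and the loss stops moving, matching the symptoms in the question.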

1 Answer


I am assuming there are no bugs in your code.

It sounds like it might be the vanishing gradient problem.

Try decreasing the depth of your network. Regularization might also help.
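One way to check for vanishing gradients is to log the gradient norm at each layer and compare the first and last layers. The following is a minimal sketch, not the asker's actual network: a hypothetical deep stack of fully connected ReLU layers with deliberately small-variance weights, a setup prone to vanishing gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical network: 20 fully connected ReLU layers with a
# small-variance weight init -- a setup prone to vanishing gradients.
depth, width = 20, 64
weights = [rng.normal(0.0, 0.05, (width, width)) for _ in range(depth)]

# Forward pass, caching pre-activations for the backward pass.
h = rng.normal(size=width)
pre_acts = []
for W in weights:
    z = W @ h
    pre_acts.append(z)
    h = relu(z)

# Backward pass: the ReLU derivative is 1 where z > 0, else 0.
grad = np.ones(width)  # dL/dh at the output (dummy loss)
grad_norms = []
for W, z in zip(reversed(weights), reversed(pre_acts)):
    grad = W.T @ (grad * (z > 0))
    grad_norms.append(np.linalg.norm(grad))

print(f"grad norm at last layer:  {grad_norms[0]:.3e}")
print(f"grad norm at first layer: {grad_norms[-1]:.3e}")
```

Each backward step multiplies the gradient by a small-norm weight matrix and zeroes out roughly half its entries, so by the first layer the gradient norm has collapsed by many orders of magnitude. If your per-layer gradient norms show this pattern, reducing depth, rescaling the weight initialization, or switching to an activation with nonzero gradient for negative inputs are the usual remedies.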

Souradeep Nanda