
I am training a convolutional neural network on image data, and inspecting the gradients at each step shows that they are exactly zero. At the same time, the network is not converging and the loss stays high.

What does this mean in terms of what I should do with the learning rate, momentum, decay, etc.?

Thanks!

user135237
    What does it mean when the gradient of any function is 0? – Sycorax Dec 04 '16 at 23:24
  • I'm using ReLU, which has f'(x)=0 for x<0 and f'(x)=1 for x>0. – user135237 Dec 04 '16 at 23:44
  • I'm familiar with the ReLU function. It's possible that all of the neurons have died. However, my question was more general. – Sycorax Dec 05 '16 at 00:06
  • I'm not sure of the more general implications; I just know that it means f'(x)=0, so that parameter would no longer update. Is there something else you're alluding to? – user135237 Dec 05 '16 at 02:17
  • Have a look at https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn/352037#352037 – kjetil b halvorsen Oct 19 '21 at 13:01
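The ReLU behavior discussed in the comments can be made concrete with a minimal NumPy sketch (the function names here are my own, for illustration): a unit whose pre-activations are always negative gets a gradient of exactly zero, so gradient descent can never revive it — a "dead" ReLU.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    # Derivative of ReLU: 1 for x > 0, 0 for x < 0 (taken as 0 at x = 0).
    return (x > 0).astype(float)

# A "dead" unit: every pre-activation is negative, so both the output
# and the gradient are exactly zero, and the weights stop updating.
pre_activations = np.array([-2.0, -0.5, -3.1, -0.01])
print(relu(pre_activations))       # all zeros
print(relu_grad(pre_activations))  # all zeros -> no weight update
```

If every unit in the network ends up in this regime (e.g. after a large learning-rate step pushes the biases very negative), all gradients are exactly zero and the loss stops moving, matching the symptoms in the question.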

1 Answer


I am assuming there are no bugs in your code.

It sounds like it might be the vanishing gradient problem.

Try decreasing the depth of your network. Regularization might also help.
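One way to check for vanishing gradients is to log the gradient norm at each layer and compare the first and last layers. The following is a minimal sketch, not the asker's actual network: a hypothetical deep stack of fully connected ReLU layers with deliberately small-variance weights, a setup prone to vanishing gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical network: 20 fully connected ReLU layers with a
# small-variance weight init -- a setup prone to vanishing gradients.
depth, width = 20, 64
weights = [rng.normal(0.0, 0.05, (width, width)) for _ in range(depth)]

# Forward pass, caching pre-activations for the backward pass.
h = rng.normal(size=width)
pre_acts = []
for W in weights:
    z = W @ h
    pre_acts.append(z)
    h = relu(z)

# Backward pass: the ReLU derivative is 1 where z > 0, else 0.
grad = np.ones(width)  # dL/dh at the output (dummy loss)
grad_norms = []
for W, z in zip(reversed(weights), reversed(pre_acts)):
    grad = W.T @ (grad * (z > 0))
    grad_norms.append(np.linalg.norm(grad))

print(f"grad norm at last layer:  {grad_norms[0]:.3e}")
print(f"grad norm at first layer: {grad_norms[-1]:.3e}")
```

Each backward step multiplies the gradient by a small-norm weight matrix and zeroes out roughly half its entries, so by the first layer the gradient norm has collapsed by many orders of magnitude. If your per-layer gradient norms show this pattern, reducing depth, rescaling the weight initialization, or switching to an activation with nonzero gradient for negative inputs are the usual remedies.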

Souradeep Nanda