
What does it mean when my neural network always gets stuck at exactly 8.6791 when I use binary cross-entropy loss? Is it some strange local minimum?

It happens regardless of my learning rate, initialization, activation function, regularizer, or choice of optimizer.

The only things that change it are the architecture and the loss function, which I don't want to change...

John Smith
  • Is there any chance your architecture leads to a convex loss function for some reason? For instance, is it equivalent to a logistic regression? – RMurphy Aug 13 '18 at 22:50
  • Take a look at the predictions of your model. Most probably your model is stuck predicting one class, and the cost for that is 8.6791 (see the sketch after these comments). – Djib2011 Aug 14 '18 at 00:21
  • If you have $k$ classes and you're using cross-entropy loss, a random classification has expected loss $\log(k)$. For $k=2$ classes, the expected loss is approximately $0.69$, which implies that something about your model is causing the loss to *increase* from a random guess to a very high value. There are a number of suggestions for how to debug this in [What should I do when my neural network doesn't learn?](https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn) – Sycorax Aug 14 '18 at 00:37
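
As a quick way to act on the two suggestions above, here is a minimal sketch (plain NumPy with placeholder arrays; the `1e-7` clipping constant is an assumption matching Keras's default epsilon) that computes the binary cross-entropy of a model stuck predicting a single class and compares it to the $\log(2) \approx 0.69$ random-guess baseline:

```python
import numpy as np

# Placeholder data -- substitute your own labels and model outputs.
y_true = np.array([0, 1, 1, 0, 1, 1, 0, 1, 1, 1], dtype=float)  # ground-truth binary labels
y_pred = np.zeros_like(y_true)  # a model "stuck" predicting class 0 for every sample

# Common frameworks clip probabilities before taking logs; Keras uses eps = 1e-7 by default.
eps = 1e-7
p = np.clip(y_pred, eps, 1 - eps)

# Binary cross-entropy, averaged over samples.
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
print(f"loss for a constant, saturated prediction: {bce:.4f}")

# Baseline for comparison: a maximally uncertain prediction (p = 0.5 everywhere)
# gives log(2) ~= 0.69, the "random guess" loss mentioned above.
print(f"random-guess baseline: {np.log(2):.4f}")
```

If the predictions really are a saturated constant, the resulting loss depends only on the class balance and the clipping constant, which would explain why the value is insensitive to the learning rate, initialization, and optimizer.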

0 Answers