
I am trying to build this network, but I run into the following problems:

  1. With more than one hidden layer, the losses become NaNs.
  2. With a single hidden layer, the loss first increases and then decreases.

Code (the network starts at line 58; everything before that is data generation): https://github.com/abhigenie92/multiple_class_NN/blob/master/multiple_class_NN.py

Slides this is based on (23, 24, 25): http://www.deeplearningforcomputervision.com/uploads/9/6/6/6/96660590/lecture_7.pdf

I am using numerically stable computations. Any help would be great.

Abhishek Bhatia

1 Answer

  1. Have you tried regularizing the network by penalizing large weights (slides 30-40 in the linked PDF)? I didn't see anything like that when I glanced at your code, but it should be very straightforward to add: sum the squares of your weights, multiply the result by a small constant, and add it to your loss function (see the sketch after this list).

  2. If your loss increases and you don't have a bug somewhere, the most likely explanation is that your learning rate is too high (a toy demonstration is at the end of this answer).

This question addresses both regularization ("weight decay") and learning rates.

David J. Harris