
Is a neural network considered to be overfitting if, after N iterations, the validation error stops improving and stays more or less constant while the training error keeps decreasing?

spacemonkey

1 Answer


Quite probably, yes: you are starting to overfit your training dataset rather than learning generally applicable, informative features.
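As a sanity check, you can plot both curves over the epochs; overfitting shows up as the training loss continuing to fall while the validation loss flattens or rises. A minimal sketch (using Keras and synthetic data, both my choices rather than your setup):

```python
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense

# Synthetic stand-in data: 20 features, binary target.
X_train = np.random.rand(1000, 20)
y_train = (X_train.sum(axis=1) > 10).astype(int)
X_val = np.random.rand(200, 20)
y_val = (X_val.sum(axis=1) > 10).astype(int)

model = Sequential([Dense(64, activation='tanh', input_shape=(20,)),
                    Dense(1, activation='sigmoid')])
model.compile(optimizer='sgd', loss='binary_crossentropy')

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=100, batch_size=32, verbose=0)

# A widening gap between the two curves indicates overfitting.
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()
```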

To counter that, you could try using Dropout in your network. Dropout randomly deactivates units at each training iteration, which prevents the neurons from co-adapting and overfitting the data. Make sure it is disabled when you evaluate on either set.
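For instance, a minimal sketch in Keras (my choice of library; layer sizes are placeholders, not from your question). Note that Keras applies Dropout only during training, so `model.evaluate()` and `model.predict()` already run with it disabled:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential([
    Dense(256, activation='relu', input_shape=(784,)),
    Dropout(0.5),  # randomly zero 50% of these units at each training step
    Dense(10, activation='softmax'),
])
model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```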

Furthermore, I would like to see how your model behaves when you use the following (a combined sketch is given after the list):

1) Adam (or AdaGrad, AdaDelta) instead of plain SGD; the choice of optimizer can be crucial in such cases.

2) Batch Normalization, which can speed up training so that you can draw larger-scale conclusions faster, and

3) ReLU (or PReLU) units, as they tend to work much better than tanh or sigmoid (this is a personal preference; for more details I would refer you to this answer).
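Putting the three suggestions together, a minimal sketch in Keras (hyperparameters and layer sizes are illustrative, not tuned):

```python
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation, Dropout
from keras.optimizers import Adam

model = Sequential([
    Dense(256, input_shape=(784,)),
    BatchNormalization(),  # (2) normalize activations to speed up training
    Activation('relu'),    # (3) ReLU instead of tanh/sigmoid
    Dropout(0.5),
    Dense(10, activation='softmax'),
])
model.compile(optimizer=Adam(lr=1e-3),  # (1) Adam instead of plain SGD
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```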

Yannis Assael