Say we have an image classification problem and a neural network to train on it.

If you run too many iterations on a single image of a cat, the network will not generalize well to other images of cats. But if you run only one iteration per image of a cat, then move on to another picture of a cat with the same network weights, training would not converge fast enough, since you wouldn't be able to use RMSprop, etc.

So one way to prevent that is dropout regularization, but is there any proof that, even with many iterations per example, it makes the network "difficult" to overfit to each individual example?

Kevvy Kim
  • Also, when using batch normalization, do you normalize the batch after the dropout? I would assume so, in order to be able to do backpropagation – Kevvy Kim Oct 31 '18 at 23:43

1 Answer

Dropout prevents overfitting due to a layer's "over-reliance" on a few of its inputs. Because these inputs aren't always present during training (i.e. they are dropped at random), the layer learns to use all of its inputs, improving generalization.
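As a minimal sketch of this behavior (assuming PyTorch; the tensor and layer names are just for illustration), dropout zeroes a random subset of inputs during training and rescales the survivors, while at evaluation time it passes everything through unchanged:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)        # reproducible dropout mask

drop = nn.Dropout(p=0.5)    # each input zeroed with probability 0.5
x = torch.ones(1, 8)

drop.train()                # training mode: inputs dropped at random
print(drop(x))              # survivors scaled by 1/(1 - p) = 2.0

drop.eval()                 # evaluation mode: dropout is a no-op
print(drop(x))              # all ones again
```

Because a different subset of inputs survives on every iteration, no single input can dominate the layer's output.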

What you describe as "overfitting due to too many iterations" can be countered through early stopping.
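One common recipe is patience-based early stopping: monitor the loss on a held-out validation set and stop once it hasn't improved for a fixed number of epochs. Here is a minimal sketch (assuming PyTorch conventions; `model`, `train_one_epoch`, and `validation_loss` are hypothetical helpers):

```python
import copy

patience = 10                    # epochs to wait for an improvement
best_loss = float("inf")
best_weights = None
epochs_without_improvement = 0

for epoch in range(1000):
    train_one_epoch(model)                  # hypothetical training pass
    loss = validation_loss(model)           # loss on held-out data

    if loss < best_loss:                    # validation loss improved
        best_loss = loss
        best_weights = copy.deepcopy(model.state_dict())
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1

    if epochs_without_improvement >= patience:
        break                               # validation loss has stalled

model.load_state_dict(best_weights)         # roll back to the best epoch
```

Note that the stopping iteration isn't chosen in advance; it falls out of watching the validation loss.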

Djib2011
  • So "early stopping" as in, if it takes, say 1000 iterations until convergence for a given example, then you would do, say, 500 iterations instead? Then how would you be able to determine that number? – Kevvy Kim Nov 01 '18 at 00:41
  • I was wondering if dropout alone could help even without early stopping. Since overfitting due to too many iterations is mostly caused by fitting the "noise" in the data, wouldn't dropout, which keeps the network from over-relying on that noise, make early stopping unnecessary? Possibly incorporating dropout at the input layer as well? – Kevvy Kim Nov 01 '18 at 00:43
  • Dropout does help with overfitting in general and it's really effective, so it should be used if possible. By combining it with early stopping you could achieve even better results. I'd like to point you to an [answer](https://stats.stackexchange.com/questions/365778/what-should-i-do-when-my-neural-network-doesnt-generalize-well/365806#365806) of mine where I explain how to perform early stopping and how to combat overfitting in a neural network in general. – Djib2011 Nov 01 '18 at 23:18