
Say that I want to train a CNN model that consists of $\sim1.5M$ hyperparameters (i.e., the total number of filter weights and fully-connected layer coefficients), where the input layer is a $256\times256$ grayscale image.

So I am wondering: is there an exact minimum number of training images that I can use to claim that my model is not overfitting, regardless of whether I use dropout layers or not?

user2987

2 Answers


Say that I want to train a CNN model that consists of $\sim1.5M$ hyperparameters (i.e., the total number of filter weights and fully-connected layer coefficients), where the input layer is a $256\times256$ grayscale image.

Filter weights and fully-connected layer coefficients are parameters, not hyperparameters.
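
To make the distinction concrete, here is a minimal sketch (PyTorch assumed, hypothetical architecture) that counts the trainable parameters of a small CNN taking a $256\times256$ grayscale input. Design choices such as the number of filters, kernel size, or dropout rate are the hyperparameters; the counted weights and biases are the parameters.

```python
import torch.nn as nn

# Hypothetical small CNN for a 256x256 grayscale (1-channel) input.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # filter weights + biases are parameters
    nn.ReLU(),
    nn.MaxPool2d(4),                              # 256 -> 64
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(4),                              # 64 -> 16
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 128),                 # fully-connected coefficients are parameters
    nn.Linear(128, 10),
)

# Count every trainable weight and bias in the model.
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {n_params:,}")  # roughly 1M for this hypothetical model
```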

So I am wondering: is there an exact minimum number of training images that I can use to claim that my model is not overfitting, regardless of whether I use dropout layers or not?

The typical way to check that your model is not overfitting is to plot the performance of your network on the training and validation sets against the epoch number. See How to Identify Overfitting in Convolutional Neural network? and How few training examples is too few when training a neural network?
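
For example, a minimal sketch of such a plot (matplotlib assumed; the loss values below are placeholders, not real training results):

```python
import matplotlib.pyplot as plt

# Placeholder per-epoch losses; in practice, record these during training.
train_losses = [0.90, 0.55, 0.40, 0.31, 0.25, 0.21, 0.18, 0.16]
val_losses   = [0.92, 0.60, 0.48, 0.42, 0.41, 0.43, 0.47, 0.52]

epochs = range(1, len(train_losses) + 1)
plt.plot(epochs, train_losses, label="training loss")
plt.plot(epochs, val_losses, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.title("Training vs. validation loss")
plt.show()

# If the validation loss starts rising while the training loss keeps falling
# (as in the placeholder values above), the model is overfitting.
```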

Franck Dernoncourt
  • Please forgive that I am not knowledgeable enough. Is it possible to compare the loss instead of the accuracy/error? So if the training and testing data lead to the same loss, can we claim that the CNN model is not overfitting? – user2987 Nov 15 '16 at 17:07
  • @user2987 Yes, loss is OK. – Franck Dernoncourt Nov 15 '16 at 17:19
  • I am working on a small field in signal processing, and reviewers have rejected my work many times, saying that my CNN model is overfitting and that this is why I am getting very high accuracy, even though I tested it with unseen data from the wild and got pretty much the same performance. Is there a reference I can include in my manuscript to show that if the training and validation data lead to the same loss after a certain number of epochs, we can claim that the model is not overfitting? – user2987 Nov 16 '16 at 16:19

So I am wondering: is there an exact minimum number of training images that I can use to claim that my model is not overfitting, regardless of whether I use dropout layers or not?

There is no known way to compute the minimum number of training images. This is clear because the opposite question is an active area of research: how complex can a model be before it overfits a given amount of training data?

Hugh