
After reaching high accuracy, both the training accuracy and the validation accuracy started decreasing after some epochs, and then got stuck after a few more epochs. I don't understand why this happened. Does training for more epochs decrease performance at some point? What can I do to fix this? I am new at this. Thanks for the help.

Here is my output:

Epoch 1/25
4000/4000 [==============================] - 90s 22ms/step - loss: 0.7030 - acc: 0.7676 - val_loss: 0.3410 - val_acc: 0.8980
Epoch 2/25
4000/4000 [==============================] - 52s 13ms/step - loss: 0.1790 - acc: 0.9444 - val_loss: 0.2049 - val_acc: 0.9388
Epoch 3/25
4000/4000 [==============================] - 51s 13ms/step - loss: 0.0860 - acc: 0.9752 - val_loss: 0.1836 - val_acc: 0.9451
Epoch 4/25
4000/4000 [==============================] - 52s 13ms/step - loss: 0.0456 - acc: 0.9880 - val_loss: 0.1612 - val_acc: 0.9548
Epoch 5/25
4000/4000 [==============================] - 52s 13ms/step - loss: 0.0249 - acc: 0.9944 - val_loss: 0.1747 - val_acc: 0.9521
Epoch 6/25
4000/4000 [==============================] - 51s 13ms/step - loss: 0.0144 - acc: 0.9972 - val_loss: 0.1763 - val_acc: 0.9556
Epoch 7/25
4000/4000 [==============================] - 52s 13ms/step - loss: 0.0090 - acc: 0.9985 - val_loss: 0.1843 - val_acc: 0.9560
Epoch 8/25
4000/4000 [==============================] - 53s 13ms/step - loss: 0.0064 - acc: 0.9990 - val_loss: 0.1892 - val_acc: 0.9579
Epoch 9/25
4000/4000 [==============================] - 52s 13ms/step - loss: 0.0043 - acc: 0.9994 - val_loss: 0.2011 - val_acc: 0.9586
Epoch 10/25
4000/4000 [==============================] - 52s 13ms/step - loss: 0.0038 - acc: 0.9993 - val_loss: 0.2100 - val_acc: 0.9598
Epoch 11/25
4000/4000 [==============================] - 53s 13ms/step - loss: 2.2301 - acc: 0.1274 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 12/25
4000/4000 [==============================] - 53s 13ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 13/25
4000/4000 [==============================] - 61s 15ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 14/25
4000/4000 [==============================] - 56s 14ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 15/25
4000/4000 [==============================] - 54s 13ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 16/25
4000/4000 [==============================] - 57s 14ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 17/25
4000/4000 [==============================] - 54s 13ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 18/25
4000/4000 [==============================] - 56s 14ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 19/25
4000/4000 [==============================] - 52s 13ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 20/25
4000/4000 [==============================] - 53s 13ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 21/25
4000/4000 [==============================] - 55s 14ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 22/25
4000/4000 [==============================] - 63s 16ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 23/25
4000/4000 [==============================] - 54s 13ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 24/25
4000/4000 [==============================] - 57s 14ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040
Epoch 25/25
4000/4000 [==============================] - 55s 14ms/step - loss: 2.3026 - acc: 0.0990 - val_loss: 2.3026 - val_acc: 0.1040

And here is my code:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator

classifier = Sequential()

# two conv + max-pool blocks on 20x20 grayscale inputs
classifier.add(Conv2D(32, (3, 3), input_shape=(20, 20, 1), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

classifier.add(Flatten())

# dense head; note the sigmoid activation on the 10-unit output layer
classifier.add(Dense(128, activation='sigmoid'))
classifier.add(Dense(10, activation='sigmoid'))

classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# augment the training images; only rescale the validation images
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2)

test_datagen = ImageDataGenerator(rescale=1./255)

train_datagen.fit(X_train)
training_set = train_datagen.flow(X_train, y_train, batch_size=50)

test_datagen.fit(X_test)
test_set = train_datagen.flow(X_test, y_test, batch_size=50)

classifier.fit_generator(training_set,
                         steps_per_epoch=4000,
                         epochs=25,
                         validation_data=test_set,
                         validation_steps=1000)

Well, I found some Keras callbacks, i.e. EarlyStopping and ModelCheckpoint (https://keras.io/callbacks/), that can work around this, but I still don't get why it happened.
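
For reference, a minimal sketch of how those two callbacks could be wired into the fit_generator call above; the checkpoint filename and the patience value are placeholders I chose, not anything prescribed:

from keras.callbacks import EarlyStopping, ModelCheckpoint

# stop once val_loss hasn't improved for 3 epochs and keep the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)
# save the best-so-far model to disk whenever val_loss improves
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss',
                             save_best_only=True)

classifier.fit_generator(training_set,
                         steps_per_epoch=4000,
                         epochs=25,
                         validation_data=test_set,
                         validation_steps=1000,
                         callbacks=[early_stop, checkpoint])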

  • What happens if you add gradient clipping? – Sycorax Mar 22 '19 at 13:32
  • Accuracy and loss are fluctuating around the same values. I used learning rate = 0.01, clipnorm = 1 with Nesterov momentum (see the sketch after these comments). Values at the 1st epoch: loss: 2.3035 - acc: 0.0958 - val_loss: 2.3040 - val_acc: 0.0920. Values at the 25th epoch: loss: 2.3026 - acc: 0.0961 - val_loss: 2.3040 - val_acc: 0.0920. – Amit kumar Mar 22 '19 at 16:00
  • Since your model doesn't improve at all when the gradients are clipped, this says to me that the learning rate is too high. Try smaller values. Tuning a neural network requires lots of fiddling to get it working. https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn/352037#352037 – Sycorax Mar 22 '19 at 16:26
  • Fixed the issue! I think the model was getting misled by the output layer, as I used the sigmoid activation function there; I changed it to softmax and the issue is fixed now. Thanks for your time. – Amit kumar Mar 24 '19 at 20:35
  • Sounds like you could write that up as an answer. – Sycorax Mar 24 '19 at 22:24
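
For context, a minimal sketch of the gradient-clipping experiment described in the comments above; the learning rate and clipnorm follow the comment, while the momentum value is my own assumption since the comment only mentions Nesterov:

from keras.optimizers import SGD

# SGD with Nesterov momentum and gradient-norm clipping, as tried in the
# comments; momentum=0.9 is assumed, the comment does not state its value
sgd = SGD(lr=0.01, momentum=0.9, nesterov=True, clipnorm=1.)
classifier.compile(optimizer=sgd, loss='categorical_crossentropy',
                   metrics=['accuracy'])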

1 Answer


The issue has been resolved by using the softmax activation function in the output layer; the model was being misled by the sigmoid activation function there. With a categorical_crossentropy loss, the output layer should produce a probability distribution over the 10 classes, which is exactly what softmax provides; ten independent sigmoid units are not constrained to sum to 1, which can destabilize training. The stuck loss of 2.3026 in the log above is the telltale sign: it equals ln(10), the cross-entropy of a model that assigns probability 1/10 to each of 10 classes, meaning the network had collapsed to a uniform prediction.
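
A minimal sketch of the change, touching only the final layer of the model from the question:

classifier.add(Dense(128, activation='sigmoid'))
# was: Dense(10, activation='sigmoid'); softmax normalizes the 10 outputs
# into a probability distribution, matching categorical_crossentropy
classifier.add(Dense(10, activation='softmax'))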

    This is being automatically flagged as low quality, probably because it is so short. At present it is more of a comment than an answer by our standards. Can you expand on it? – gung - Reinstate Monica Mar 25 '19 at 20:29