
I'm designing an image classifier for Places365. The validation loss after the first epoch is lower than after any other epoch (9.65 cross-entropy loss after the 1st epoch vs. 11.07 after the 10th), but the validation accuracy is also lowest after the first epoch (0.09 accuracy after the 1st epoch vs. 0.28 after the 10th).

I understand that the loss increasing while the accuracy also increases suggests the predictions are becoming less confident but more correct... and for an image classifier, is that okay? Is it acceptable to settle for less confident soft predictions as long as the hard-prediction accuracy is improving?
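
To make the loss/accuracy divergence concrete, here is a toy example with made-up probabilities over three classes (none of these numbers come from my model). It illustrates one common mechanism for the pattern above: a handful of confidently wrong predictions can dominate the mean log-loss even as more predictions become correct.

```python
import numpy as np

def cross_entropy(probs, labels):
    # Mean negative log-probability assigned to the true class.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def accuracy(probs, labels):
    # Fraction of samples whose argmax matches the true class.
    return np.mean(probs.argmax(axis=1) == labels)

labels = np.array([0, 0, 0, 0])  # true class is 0 for every sample

# "Epoch A": only 1/4 correct, but the wrong predictions are not very
# confident, so the per-sample log-losses stay moderate.
probs_a = np.array([
    [0.6, 0.2, 0.2],  # correct
    [0.3, 0.4, 0.3],  # wrong, low confidence
    [0.3, 0.5, 0.2],  # wrong, low confidence
    [0.2, 0.4, 0.4],  # wrong, low confidence
])

# "Epoch B": 3/4 correct, but the one remaining mistake is made with
# near-total confidence, and its huge log-loss dominates the mean.
probs_b = np.array([
    [0.5, 0.3, 0.2],              # correct
    [0.5, 0.3, 0.2],              # correct
    [0.4, 0.3, 0.3],              # correct
    [1e-4, 0.9998, 1e-4],         # wrong, extremely confident
])

print(cross_entropy(probs_a, labels), accuracy(probs_a, labels))  # ~1.13, 0.25
print(cross_entropy(probs_b, labels), accuracy(probs_b, labels))  # ~2.88, 0.75
```

So the loss can go up while the accuracy goes up without the correct predictions becoming any less confident: the mean cross-entropy is unbounded per sample, while accuracy only counts argmaxes.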

Alternatively, would it be viable to do parallel early stopping: one instance that checkpoints the model whenever the validation loss decreases, and one that checkpoints whenever the validation accuracy increases? I could then test both against the test set and see which performs best. Would that count as data leakage, because I'm effectively adjusting the hyperparameter of "which early-stopping metric should I use" based on test data?
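
A minimal sketch of what I mean by the dual-checkpoint setup, framework-agnostic Python where `train_one_epoch` and `evaluate` are hypothetical callables (not from any particular library) and `evaluate(model)` is assumed to return `(val_loss, val_accuracy)`:

```python
import copy

def train_with_dual_checkpoints(model, train_one_epoch, evaluate, num_epochs):
    """Track two checkpoints in one training run: best-by-loss and best-by-accuracy."""
    best_loss, best_acc = float("inf"), 0.0
    best_by_loss = copy.deepcopy(model)
    best_by_acc = copy.deepcopy(model)

    for epoch in range(num_epochs):
        train_one_epoch(model)
        val_loss, val_acc = evaluate(model)

        # Checkpoint 1: snapshot whenever validation loss reaches a new low.
        if val_loss < best_loss:
            best_loss = val_loss
            best_by_loss = copy.deepcopy(model)

        # Checkpoint 2: snapshot whenever validation accuracy reaches a new high.
        if val_acc > best_acc:
            best_acc = val_acc
            best_by_acc = copy.deepcopy(model)

    return best_by_loss, best_by_acc
```

Note that picking between the two returned checkpoints by comparing them on the test set does make "which early-stopping metric" a choice tuned on test data; comparing them on the validation set (or a separate held-out split) and touching the test set only once would avoid that concern.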

• Of probable interest: https://stats.stackexchange.com/questions/464636/proper-scoring-rule-when-there-is-a-decision-to-make-e-g-spam-vs-ham-email, https://www.fharrell.com/post/class-damage/, and https://www.fharrell.com/post/classification/ – Dave Jun 04 '21 at 09:55
• Ah, I see, thanks for the links @Dave. Does the increasing loss then point to some potential issue in the model? Perhaps too much or too little regularization? – Avelina Jun 04 '21 at 10:05

0 Answers