
I have a dataset where I'm training a neural network to discriminate between 4 classes. The class distribution on the dataset is as follows:

Class 0: 0.516
Class 1: 0.159
Class 2: 0.235
Class 3: 0.088
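For concreteness, one standard way to compensate for a skewed distribution like the one above is to weight each class's loss contribution inversely to its frequency. This is an illustrative sketch using the exact proportions quoted, not something from the question itself:

```python
# Inverse-frequency class weights for the distribution quoted above.
# Illustrative sketch only; the variable names are hypothetical.
freqs = {0: 0.516, 1: 0.159, 2: 0.235, 3: 0.088}

n_classes = len(freqs)
# weight_c = 1 / (n_classes * freq_c): with these weights, a
# class-weighted cross-entropy treats each class as if it contributed
# equally to the objective, instead of letting class 0 dominate.
weights = {c: 1.0 / (n_classes * f) for c, f in freqs.items()}

for c, w in sorted(weights.items()):
    print(f"class {c}: weight {w:.3f}")
```

Most deep-learning frameworks accept such per-class weights directly in their cross-entropy loss (e.g. a `weight`/`class_weight` argument), so no change to the network itself is needed.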

Class 0 is clearly over-represented, while class 3 is heavily under-represented. These proportions are roughly preserved after the split into training, validation and test sets. The network's overall accuracy is poor, even though it achieves reasonably good per-class confusion matrix metrics (precision, recall and F1 score). I suspect the network is overfitting to class 0: the training cost trends downward, while validation and test errors remain high. L2 regularization does not seem to help much with the situation.
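Since L2 regularization penalizes weight magnitudes but does nothing about class balance, a commonly tried alternative is to resample the training set itself. A minimal sketch of random oversampling, using a hypothetical toy dataset with the same class counts per 1000 examples as the distribution above:

```python
import random

random.seed(0)

# Hypothetical toy dataset: (features, label) pairs mirroring the
# imbalance described in the question (516/159/235/88 per 1000).
data = ([((i,), 0) for i in range(516)]
        + [((i,), 1) for i in range(159)]
        + [((i,), 2) for i in range(235)]
        + [((i,), 3) for i in range(88)])

# Group examples by class label.
by_class = {}
for x, y in data:
    by_class.setdefault(y, []).append((x, y))

# Random oversampling: resample every class (with replacement) up to
# the majority-class count so each class appears equally often.
target = max(len(items) for items in by_class.values())
balanced = []
for items in by_class.values():
    balanced.extend(random.choices(items, k=target))

random.shuffle(balanced)
counts = {y: sum(1 for _, lab in balanced if lab == y) for y in by_class}
print(counts)
```

Undersampling the majority class is the mirror-image option when discarding data is acceptable; oversampling keeps all examples but can amplify overfitting to duplicated minority samples.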

Do you have any suggestions on how to tackle this situation?

    Are you shuffling your training data? How are you selecting your data? What's the size of your data like? Is it evenly distributed across the classes? Are you normalizing your data before it goes into the network? What kind of network is it? Hyperparameters? What kind of data? How many epochs/iterations? What parameters have you added/tried already? Any other regularization attempts? Just showing that your distribution is pretty much no info. – Araymer Feb 14 '17 at 16:59
  • This is a duplicate. – Michael R. Chernick Feb 15 '17 at 00:46

0 Answers