
I am using deep learning on a very small dataset of 20 examples for binary classification. 16 of the 20 examples are labeled "0", and the remaining 4 are labeled "1".

If I set my learning rate low (0.01 to 0.1), my model consistently achieves a training accuracy of 0.8 with a ROC-AUC of 0.5. If I set my learning rate high (a value of 1), the accuracy fluctuates (sometimes 0.9, most of the time 0.8, sometimes 0.5).
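For what it's worth, 0.8 accuracy with 0.5 ROC-AUC is exactly what a model that always predicts "0" would score on this 16/4 split, so the low-learning-rate runs seem to learn nothing. A quick sanity check of that arithmetic (sklearn is only used here for the metrics; this is an illustrative sketch, not my training code):

```python
# Sanity check: a constant majority-class predictor on a 16/4 split
# gives accuracy 16/20 = 0.8 and ROC-AUC 0.5 (no ranking information).
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = [0] * 16 + [1] * 4
y_pred = [0] * 20       # always predict the majority class "0"
y_score = [0.5] * 20    # constant scores carry no ranking signal

print(accuracy_score(y_true, y_pred))   # 0.8
print(roc_auc_score(y_true, y_score))   # 0.5
```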

So here's my question: what are some standard methods to try in this case? I was thinking of sampling the data labeled "1" more frequently; is this a good idea?
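To make the idea concrete, here is a minimal sketch of what I mean by sampling the "1"s more frequently (oversampling the minority class with replacement until the classes balance). The arrays `X` and `y` are placeholders standing in for my real data, and numpy is just an assumed choice:

```python
# Minimal sketch of minority-class oversampling with replacement.
# X and y are placeholders for the real 20-example dataset.
import numpy as np

rng = np.random.default_rng(0)

X = rng.normal(size=(20, 5))        # placeholder features
y = np.array([0] * 16 + [1] * 4)    # 16 labeled "0", 4 labeled "1"

# Resample the minority class with replacement until both classes have 16.
minority_idx = np.where(y == 1)[0]
extra = rng.choice(minority_idx, size=16 - 4, replace=True)

X_bal = np.concatenate([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

print(np.bincount(y_bal))           # [16 16]
```

An alternative with a similar effect on the loss would be to weight the positive class instead of duplicating rows (e.g., `pos_weight` in PyTorch's `BCEWithLogitsLoss`, if using PyTorch), which avoids exact copies of the same 4 points appearing many times.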

Ray
    Why use deep learning with just $20$ observations? Good news! Class imbalance is not a problem! https://stats.stackexchange.com/questions/357466/are-unbalanced-datasets-problematic-and-how-does-oversampling-purport-to-he https://www.fharrell.com/post/class-damage/ https://www.fharrell.com/post/classification/ https://stats.stackexchange.com/a/359936/247274 https://stats.stackexchange.com/questions/464636/proper-scoring-rule-when-there-is-a-decision-to-make-e-g-spam-vs-ham-email https://twitter.com/f2harrell/status/1062424969366462473?lang=en – Dave Jul 13 '21 at 00:19
    You should absolutely not be using deep learning with only 20 observations. Estimation of the marginal rate of the outcome is dubious at this point. – Demetri Pananos Jul 13 '21 at 01:00

0 Answers