I am using deep learning on a very small dataset of 20 examples for binary classification. 16 of the 20 examples are labeled "0", and the remaining 4 are labeled "1".
If I set my learning rate low (0.01 to 0.1), my model consistently achieves a training accuracy of 0.8, with an ROC AUC score of 0.5 — which I take to mean it is simply predicting "0" for everything. If I set my learning rate high (value of 1), then the accuracy fluctuates (sometimes 0.9, most of the time 0.8, sometimes 0.5).
So here's my question: what are some standard methods to try in this case? I was thinking of sampling the data labeled "1" more frequently — is this a good idea?
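To be concrete about what I mean by sampling the "1" examples more frequently, here is a rough sketch of the idea on toy data (the array shapes and random features are just placeholders, not my actual dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for my dataset: 20 examples, 16 labeled "0" and 4 labeled "1".
X = rng.normal(size=(20, 5))
y = np.array([0] * 16 + [1] * 4)

# Oversample the minority class ("1") with replacement until the classes balance.
minority_idx = np.where(y == 1)[0]
extra = rng.choice(minority_idx, size=16 - 4, replace=True)
X_bal = np.concatenate([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

print(np.bincount(y_bal))  # both classes now have 16 examples
```

The balanced arrays would then be what I feed to training, so each batch sees the "1" class as often as the "0" class.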