
I often struggle with ML (deep learning) problems where my accuracy on the first attempts is ridiculously low. I was asking myself whether it is a good idea to make a model big enough to overfit and then progressively decrease its size to get something that may learn, so that the subsequent iterations start from a known range of parameters.

It sounds logical to me; can you give me some hints?

nprime496
  • This paper may have some suggestions: *Underspecification Presents Challenges for Credibility in Modern Machine Learning*, https://arxiv.org/abs/2011.03395 – Dec 13 '20 at 17:29
  • See https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn, especially the answer by user Alex R. – kjetil b halvorsen Dec 14 '20 at 15:10

2 Answers


It's a good idea to overfit first just to see that the model can actually learn, i.e. that your training, optimization etc. are set up correctly. Once you have established this, you may then want to add regularization, e.g. dropout, and do a hyperparameter search on a validation set to find the best model. I don't believe the procedure of just making the model smaller and smaller is good, though -- instead, the depth and width of the layers should be among the hyperparameters you optimize.
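To make this concrete, here is a minimal sketch of the two steps, assuming PyTorch, synthetic data, and an ad-hoc `make_mlp` helper (none of which come from the answer itself): first drive the loss on a tiny batch close to zero as a sanity check, then treat depth, width and dropout as hyperparameters to search on a validation split.

```python
# A sketch, assuming PyTorch and synthetic data (hypothetical helper/values).
import torch
import torch.nn as nn

def make_mlp(in_dim, out_dim, width, depth, dropout):
    """Width, depth and dropout are the hyperparameters to search over."""
    layers, dim = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(dim, width), nn.ReLU(), nn.Dropout(dropout)]
        dim = width
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

# Synthetic stand-in for a real dataset.
X = torch.randn(512, 20)
y = torch.randint(0, 3, (512,))

# Step 1: sanity check -- a big, unregularized model should drive the loss on a
# handful of examples close to zero. If it cannot, the pipeline (data, labels,
# loss, optimizer) is likely broken.
tiny_X, tiny_y = X[:32], y[:32]
model = make_mlp(20, 3, width=256, depth=3, dropout=0.0)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(tiny_X), tiny_y)
    loss.backward()
    opt.step()
print(f"loss on tiny batch after overfitting: {loss.item():.4f}")

# Step 2: rather than only shrinking the model, include width/depth/dropout in
# the hyperparameter search. Each configuration below would be trained on the
# training split and scored on a held-out validation split (loop omitted).
search_space = [
    {"width": 64,  "depth": 2, "dropout": 0.1},
    {"width": 128, "depth": 3, "dropout": 0.3},
    {"width": 256, "depth": 4, "dropout": 0.5},
]
```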

Lulu

This is a standard approach for decision tree models, where the first trained tree is pruned back to avoid overfitting.

Check out the rpart documentation (https://cran.r-project.org/web/packages/rpart/rpart.pdf) to see how the built-in function uses a complexity parameter to determine how far to prune the tree back.
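As a hedged illustration in Python rather than rpart itself: scikit-learn's cost-complexity pruning (`ccp_alpha`) plays a role analogous to rpart's complexity parameter `cp`. The sketch below grows a full tree, then prunes it back and picks the complexity that does best on held-out data; the dataset and values are arbitrary.

```python
# Sketch of the grow-then-prune idea with scikit-learn (analogue of rpart's cp).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Grow a full (overfitted) tree, then get the sequence of effective alphas
# at which subtrees would be pruned away.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
path = full_tree.cost_complexity_pruning_path(X_tr, y_tr)

# Refit at each alpha and keep the pruned tree that scores best on held-out data.
best_alpha, best_score = 0.0, 0.0
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_tr, y_tr)
    score = tree.score(X_te, y_te)
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"best ccp_alpha: {best_alpha:.5f}, held-out accuracy: {best_score:.3f}")
```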