How much difference between train and test set errors indicates overfitting (for example, in logistic regression)?
I am trying to classify 11,746 comments by sentiment into three classes using logistic regression. My train accuracy is about 100%, but on the test set I get an accuracy of about 52%. How large does the difference between these two numbers have to be before it indicates overfitting?
1 Answer
train accuracy is about 100% but in test i get accuracy about 52%
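This is definitely overfitting. Ideally, performance on the training and test sets is almost the same; a gap of nearly 50 percentage points means the model has fit the training set rather than learned something that generalizes. In addition, in most cases we do not want training accuracy to be 100%, since the training data may contain noise and we want the model to be more "general" than the training data.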
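To make the train/test gap concrete, here is a minimal sketch using scikit-learn. The data is a synthetic stand-in (not the comment data from the question), and the helper name `accuracy_gap` and the values of `C` are assumptions chosen only for illustration. It fits logistic regression with weak and with strong L2 regularization and prints both accuracies so the gap can be compared.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 3 classes, more features than samples,
# which makes it easy for the model to fit the training set perfectly.
X, y = make_classification(
    n_samples=400, n_features=500, n_informative=10, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

def accuracy_gap(C):
    """Fit logistic regression with inverse regularization strength C
    and return (train accuracy, test accuracy)."""
    model = LogisticRegression(C=C, max_iter=2000)
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)

weak_train, weak_test = accuracy_gap(C=100.0)     # weak regularization
strong_train, strong_test = accuracy_gap(C=0.01)  # strong regularization

print(f"weak reg:   train={weak_train:.3f} test={weak_test:.3f} "
      f"gap={weak_train - weak_test:.3f}")
print(f"strong reg: train={strong_train:.3f} test={strong_test:.3f} "
      f"gap={strong_train - strong_test:.3f}")
```

A smaller `C` means stronger regularization in scikit-learn. Stronger regularization usually lowers training accuracy and often narrows the gap, though the exact numbers depend on the data.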
My answer to the following question gives guidance on using a learning curve to diagnose overfitting:
How to know if a learning curve from SVM model suffers from bias or variance?
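As a rough illustration of the learning-curve idea, the sketch below uses scikit-learn's `learning_curve` on synthetic three-class data (again a stand-in, not the original comments; the sample sizes and fold count are assumptions). It prints mean training and validation accuracy at increasing training-set sizes. A training score that stays high while the validation score stays well below it points to high variance, that is, overfitting.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic 3-class data standing in for the real problem.
X, y = make_classification(
    n_samples=600, n_features=200, n_informative=10, n_redundant=0,
    n_classes=3, random_state=0,
)

# Train on 20%..100% of each training fold, scoring with 5-fold cross-validation.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=2000), X, y,
    train_sizes=np.linspace(0.2, 1.0, 5), cv=5, scoring="accuracy",
    shuffle=True, random_state=0,
)

# Average over the cross-validation folds at each training size.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

for n, tr, va in zip(sizes, train_mean, val_mean):
    print(f"n={n:4d}  train={tr:.3f}  validation={va:.3f}  gap={tr - va:.3f}")
```

If the gap shrinks as the training size grows, more data is likely to help; if both curves flatten at a low score, the problem is more likely bias than variance.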

Haitao Du