Over-fitting diagnosis method

Question

How large a gap between training and test error indicates over-fitting, for example in logistic regression? I am trying to classify 11746 comments by sentiment into three classes using logistic regression. My train accuracy is about 100% but in test I get accuracy about 52%. How much difference between these two figures indicates over-fitting?

score 3 · Accepted Answer · answered May 10 '17 at 19:32

train accuracy is about 100% but in test I get accuracy about 52%

This is definitely overfitting. Usually we want performance on the training and test sets to be almost the same. In addition, in most cases we do not want training accuracy to be 100%, since the training data may contain noise and we want the model to be more "general" than the training data.
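To put numbers on the gap, here is a minimal sketch that computes the train/test difference and compares test accuracy with a majority-class baseline. The function name `overfitting_report` and the balanced label list are hypothetical; the question does not give the real class balance, so the baseline shown is only an illustration.

```python
from collections import Counter

def overfitting_report(train_acc, test_acc, test_labels):
    """Summarize the train/test gap and a majority-class baseline."""
    counts = Counter(test_labels)
    # Accuracy obtained by always predicting the most frequent class.
    baseline = max(counts.values()) / len(test_labels)
    return {
        "gap": train_acc - test_acc,
        "baseline": baseline,
        "beats_baseline": test_acc > baseline,
    }

# Assumption: a perfectly balanced three-class test set (not stated in the question).
labels = ["neg", "neu", "pos"] * 100
report = overfitting_report(1.00, 0.52, labels)
print(report)
```

Under that assumption the model still beats the baseline of roughly 33%, but a gap of about 48 percentage points is far from "almost the same".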

My answer to the following question gives guidance on using a learning curve for over-fitting diagnosis:

How to know if a learning curve from SVM model suffers from bias or variance?
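As a rough illustration of a learning curve, the sketch below trains on growing subsets and scores both the training subset and a held-out set. It is not the asker's setup: the data are synthetic with 20% flipped labels, and a 1-nearest-neighbour classifier stands in for logistic regression because, like the model in the question, it fits its training data perfectly. All function names here are hypothetical.

```python
import random

def make_data(n, noise, rng):
    """Two classes split at x = 0; a fraction `noise` of labels is flipped."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x > 0)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, points):
    """Fraction of `points` whose label matches the 1-NN prediction."""
    return sum(predict_1nn(train, x) == y for x, y in points) / len(points)

def learning_curve(train, valid, sizes):
    """For each size, fit on the first `size` training points and score both sets."""
    rows = []
    for size in sizes:
        subset = train[:size]
        rows.append((size, accuracy(subset, subset), accuracy(subset, valid)))
    return rows

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(400, noise=0.2, rng=rng)
valid = make_data(200, noise=0.2, rng=rng)
curve = learning_curve(train, valid, sizes=[25, 50, 100, 200, 400])
for size, train_acc, valid_acc in curve:
    print(f"n={size:4d}  train={train_acc:.2f}  valid={valid_acc:.2f}")
```

A training score that stays near 100% while the validation score stays well below it, even as the training set grows, is the high-variance (over-fitting) pattern. If both scores converge to a similarly low value, the problem is more likely bias.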
