
How large a gap between training and test error indicates over-fitting, for example in logistic regression? I am trying to classify 11,746 comments by sentiment into three classes using logistic regression. My training accuracy is about 100%, but my test accuracy is only about 52%. How big does the difference between the two have to be before it counts as over-fitting?

keramat

1 Answer


train accuracy is about 100% but in test I get accuracy about 52%

This is definitely over-fitting. We usually want performance on the training and test sets to be roughly the same. Moreover, in most cases we do not want the training accuracy to be 100%: the training data may contain noise, and we want the model to generalize beyond the training data rather than memorize it.
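To illustrate the point about noise, here is a minimal sketch using only the Python standard library. It is not the asker's setup: the data is synthetic (one feature, two classes, 20% of labels flipped), and a 1-nearest-neighbour classifier stands in for any model flexible enough to memorize its training set. The names `make_data`, `predict_1nn` and `accuracy` are made up for this example.

```python
import random

def make_data(n, noise, rng):
    """Points x in [0, 1); the true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise: the recorded label is wrong
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Return the label of the closest training point (pure memorization)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, data):
    """Fraction of points in `data` whose label matches the 1-NN prediction."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(200, noise=0.2, rng=rng)
test = make_data(200, noise=0.2, rng=rng)

train_acc = accuracy(train, train)  # each point is its own nearest neighbour
test_acc = accuracy(train, test)
gap = train_acc - test_acc
print(f"train={train_acc:.2f} test={test_acc:.2f} gap={gap:.2f}")
```

The model reproduces every noisy training label, so its training accuracy is perfect, but the memorized noise does not carry over to new data and the test accuracy is lower. A large train/test gap next to near-perfect training accuracy is the same pattern as in the question.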

My answer to the question linked below gives guidance on using learning curves to diagnose over-fitting.

How to know if a learning curve from SVM model suffers from bias or variance?
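For readers who want to try this, a learning curve can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked answer: the data is synthetic (10 Gaussian features, only the first one informative), the logistic regression is a plain gradient-descent implementation written with the standard library, and `make_data`, `fit_logreg` and `accuracy` are hypothetical helper names.

```python
import math
import random

def make_data(n, d, rng):
    """n rows of d Gaussian features; only feature 0 carries signal, the rest are noise."""
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]
        y = int(x[0] + rng.gauss(0, 1) > 0)
        rows.append((x, y))
    return rows

def sigmoid(z):
    # Clamp z so math.exp cannot overflow for large |z|.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg(rows, epochs=200, lr=0.5):
    """Unregularized logistic regression fitted by batch gradient descent."""
    d = len(rows[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, y in rows:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    return w, b

def accuracy(model, rows):
    """Fraction of rows classified correctly with a decision threshold at 0."""
    w, b = model
    hits = sum((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == bool(y)
               for x, y in rows)
    return hits / len(rows)

rng = random.Random(0)  # fixed seed so the run is repeatable
train_pool = make_data(400, 10, rng)
val = make_data(400, 10, rng)

# Learning curve: refit on growing prefixes of the training pool and
# record training and validation accuracy at each size.
curve = []
for n in (20, 50, 100, 200, 400):
    subset = train_pool[:n]
    model = fit_logreg(subset)
    curve.append((n, accuracy(model, subset), accuracy(model, val)))

for n, tr, va in curve:
    print(f"n={n:4d} train={tr:.2f} val={va:.2f} gap={tr - va:.2f}")
```

When reading the output, a gap that stays large as `n` grows suggests high variance (over-fitting), while two curves that meet at a low accuracy suggest high bias. For over-fitting, the usual remedies are more data, fewer features, or regularization.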

Haitao Du