I am trying to understand whether my results indicate overfitting. I have the following results, using a different feature set to build each model:
Model 1
Total classified: 4696
Score: 1.0 # from cross validation
Score length 3
Confusion matrix:
[[2348    0]
 [   0 2348]]
Logistic Regression 1
              precision  recall  f1-score  support
         0.0       0.96    0.97      0.97      585
         1.0       0.76    0.67      0.71       76
    accuracy                         0.94      661
   macro avg       0.86    0.82      0.84      661
weighted avg       0.94    0.94      0.94      661
and
Model 2
Total classified: 4696
Score: 0.65 # from cross validation
Score length 3
Confusion matrix:
[[2154  194]
 [  66 2282]]
Logistic Regression 2
              precision  recall  f1-score  support
         0.0       0.96    0.97      0.96      585
         1.0       0.73    0.68      0.71       76
    accuracy                         0.93      661
   macro avg       0.85    0.83      0.84      661
weighted avg       0.93    0.93      0.93      661
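For context, here is roughly how these numbers are produced. This is only a sketch with scikit-learn and synthetic placeholder data: `make_classification`, the variable names, and the split sizes are my assumptions, not the actual pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import confusion_matrix, classification_report

# Placeholder data standing in for my real features/labels (X, y)
X, y = make_classification(n_samples=4696, weights=[0.5, 0.5], random_state=0)

model = LogisticRegression(max_iter=1000)

# "Score" above is the mean of 3 cross-validation folds ("Score length 3")
scores = cross_val_score(model, X, y, cv=3)
print("Score:", scores.mean())

# Confusion matrix over the full 4696 samples, predicted after fitting
model.fit(X, y)
print(confusion_matrix(y, model.predict(X)))

# Classification report on a held-out split (support 661 above)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=661, random_state=0)
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te)))
```

Note that in this sketch the confusion matrix is computed on the same data the model was fit on, which is exactly what can make it look perfect.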
It seems clear to me, looking at model 1's confusion matrix
[[2348    0]
 [   0 2348]]
and at its cross-validation score (1.0), that I have an overfitting problem. However, I would like to ask the following questions, all related to this topic:
- in the second model, I get a score of 0.65 and an imperfect confusion matrix. Would it be fair to say that it is not overfitting, given that the other metrics in the contingency table (recall, F1 score, ...) are not so far from that value? (The problem is a classification one, with imbalanced data.)
- what about the accuracy in the contingency table?
- is there anything else that I need to consider?
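One concrete check I have in mind for these questions is comparing training accuracy against cross-validated accuracy, since a large gap between the two is the usual overfitting signature. A minimal sketch, again with synthetic placeholder data rather than my real features:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder data; in my case X, y would be the real feature matrix/labels
X, y = make_classification(n_samples=4696, random_state=0)
model = LogisticRegression(max_iter=1000)

# Accuracy estimated on held-out folds vs. accuracy on the training data itself.
# train ~1.0 with CV ~0.65 would point to overfitting; similar values would not.
cv_acc = cross_val_score(model, X, y, cv=3).mean()
train_acc = model.fit(X, y).score(X, y)
print(f"train accuracy: {train_acc:.3f}, CV accuracy: {cv_acc:.3f}")
```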
Thank you in advance for any answers and comments clarifying this (challenging) concept.