Suppose I have an imbalanced dataset that consists of 90% positive points and 10% negative points, and I create a "dumb" model that predicts every point as positive. For 100 points, the confusion matrix of this model will be:

|                 | Predicted Positive | Predicted Negative |
| --------------- | ------------------ | ------------------ |
| Actual Positive | TP = 90            | FN = 0             |
| Actual Negative | FP = 10            | TN = 0             |
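For concreteness, here is a minimal sketch of how this confusion matrix could be reproduced with scikit-learn; encoding positives as 1 and negatives as 0 is my own choice:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# 100 points: 90 actually positive (1), 10 actually negative (0)
y_true = np.array([1] * 90 + [0] * 10)

# The "dumb" model predicts positive for every single point
y_pred = np.ones_like(y_true)

# labels=[1, 0] puts the positive class in the first row/column,
# so rows are actual [pos, neg] and columns are predicted [pos, neg]
print(confusion_matrix(y_true, y_pred, labels=[1, 0]))
# [[90  0]
#  [10  0]]
```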
Now the precision computed from this confusion matrix will be:
Precision = the number of true positives out of the number of points the model predicted as positive.
So, Precision = TP / (TP + FP) = 90 / (90 + 10) = 0.9
And the recall will be:
Recall = the number of true positives out of the number of points that are actually positive.
So, Recall = TP / (TP + FN) = 90 / (90 + 0) = 1.0
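To double-check both numbers, here is a small scikit-learn sketch (again assuming the 1/0 label encoding from above):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([1] * 90 + [0] * 10)  # 90 positives, 10 negatives
y_pred = np.ones_like(y_true)           # always predict positive

print(precision_score(y_true, y_pred, pos_label=1))  # 0.9 -> 90 / (90 + 10)
print(recall_score(y_true, y_pred, pos_label=1))     # 1.0 -> 90 / (90 + 0)
```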
As we can see, both precision and recall are high, so the F1-score of this model is:
F1-Score = 2 * (P * R) / (P + R) = 2 * (0.9 * 1.0) / (0.9 + 1.0) ≈ 0.947
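The same value checked with scikit-learn's f1_score (same assumed 1/0 encoding):

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([1] * 90 + [0] * 10)
y_pred = np.ones_like(y_true)

# F1 is the harmonic mean of precision (0.9) and recall (1.0)
print(f1_score(y_true, y_pred, pos_label=1))  # 0.9473...

# Same value by hand: 2 * P * R / (P + R)
p, r = 0.9, 1.0
print(2 * p * r / (p + r))                    # 0.9473...
```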
This is a high score, so according to the F1-score the model is very good. How can the F1 score give such a good result for a dumb model on an imbalanced dataset?