
Below is the report of my out-of-bag precision, recall, and F1-score when using scikit-learn:

              precision    recall    f1-score     support
pos            0.72        0.47       0.57          97929
neg            0.61        0.82       0.70          98071
avg / total    0.67        0.65       0.64         196000

I thought the F1-score would be symmetric, but it isn't (i.e., I get a different F1-score for pos and for neg, even though this is a binary classification problem).

Is the F1-score not symmetric?
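For reference, the same behaviour shows up on a tiny synthetic example (made-up labels for illustration, not my actual data):

```python
from sklearn.metrics import classification_report

# Hypothetical binary labels: the per-class F1 scores differ
# even though the problem is binary.
y_true = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "neg", "neg", "neg", "pos"]

print(classification_report(y_true, y_pred))
```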

Josh

1 Answer


Let's normalize the confusion matrix, i.e. $TP + FP + FN + TN = 1$. We have: $F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} = 2 \cdot \frac{\frac{TP}{TP+FP} \cdot \frac{TP}{TP+FN}}{\frac{TP}{TP+FP} + \frac{TP}{TP+FN}} = \frac{2\,TP}{2\,TP + FP + FN} = \frac{2\,TP}{TP + 1 - TN}$

Therefore: $\text{F1 score symmetric} \leftrightarrow \frac{2\,TP}{TP + 1 - TN} = \frac{2\,TN}{TN + 1 - TP} \leftrightarrow TP(TN + 1 - TP) = TN(TP + 1 - TN) \leftrightarrow TP(1-TP) = TN(1-TN) \leftrightarrow (TN = TP) \vee (TN = 1 - TP)$.

So the F1-score is symmetric only in special cases, namely when $TN = TP$, or when $TN = 1 - TP$ (which, under the normalization above, means $FP + FN = 0$, i.e. a perfect classifier).
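As a quick numeric check of this condition (toy numbers, not taken from the question's data), swapping the two classes swaps $TP \leftrightarrow TN$ and $FP \leftrightarrow FN$ in the F1 formula:

```python
# Per-class F1 from a normalized confusion matrix (entries sum to 1).
def f1(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)

# Symmetric case: TN == TP, so both classes get the same F1.
f1_pos_sym = f1(0.3, 0.2, 0.2)  # TP=0.3, FP=0.2, FN=0.2, TN=0.3
f1_neg_sym = f1(0.3, 0.2, 0.2)  # class swap: TP<->TN and FP<->FN
print(f1_pos_sym, f1_neg_sym)   # equal

# Generic case: TN != TP and TN != 1 - TP, so the F1 scores differ.
f1_pos = f1(0.4, 0.1, 0.2)      # TP=0.4, FP=0.1, FN=0.2, TN=0.3
f1_neg = f1(0.3, 0.2, 0.1)      # TN=0.3, with FP and FN swapped
print(f1_pos, f1_neg)           # different
```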


By the same token, precision and recall are generally not symmetric either, but the AUROC (Area Under the ROC Curve) always is. As a result, when presenting results, one typically distinguishes the positive class (-P) from the negative one (-N):

[image: table of evaluation metrics reported separately for the positive (-P) and negative (-N) classes]
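The symmetry of the AUROC can be checked directly: swapping the two classes and negating the scores leaves it unchanged (a sketch with made-up data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic labels and continuous scores loosely correlated with them.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
scores = rng.random(1000) * 0.5 + y * 0.3

auc_pos = roc_auc_score(y, scores)
auc_neg = roc_auc_score(1 - y, -scores)  # swap classes, flip scores
print(auc_pos, auc_neg)                  # identical
```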

Franck Dernoncourt