I am aware of this post and it gives me great intuition about the difference between IoU and the F1 score, but I don't quite understand how the difference between the equations creates that effect.
- F1 / Dice: $$\frac{2TP}{2TP+FP+FN}$$
- IoU / Jaccard: $$\frac{TP}{TP+FP+FN}$$
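
One thing I did work out myself (my own algebra, so please double-check it) is that the two scores are monotonically related on a single confusion matrix:

$$\text{IoU} = \frac{\text{F1}}{2 - \text{F1}}, \qquad \text{F1} = \frac{2\,\text{IoU}}{1 + \text{IoU}}$$

So on any one instance they rank predictions identically; the difference must lie in how strongly each one penalizes a given amount of error.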
Copying and pasting from that post, the difference between the two metrics is described as:
> In general, the IoU metric tends to penalize single instances of bad classification more than the F score quantitatively even when they can both agree that this one instance is bad. Similarly to how L2 can penalize the largest mistakes more than L1, the IoU metric tends to have a "squaring" effect on the errors relative to the F score. So the F score tends to measure something closer to average performance, while the IoU score measures something closer to the worst case performance.
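
To convince myself numerically, I wrote a small sketch (Python, with made-up TP/FP/FN counts of my own; nothing here comes from the linked post):

```python
def f1_score(tp, fp, fn):
    """Dice / F1: 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

def iou_score(tp, fp, fn):
    """Jaccard / IoU: TP / (TP + FP + FN)."""
    return tp / (tp + fp + fn)

# Two hypothetical segmentations of a 100-pixel object (counts are made up):
cases = [
    ("mostly right", 90, 5, 5),    # few mislabeled pixels
    ("badly wrong ", 50, 25, 25),  # half the pixels mislabeled
]
for name, tp, fp, fn in cases:
    print(f"{name}: F1 = {f1_score(tp, fp, fn):.3f}, IoU = {iou_score(tp, fp, fn):.3f}")

# Output:
# mostly right: F1 = 0.947, IoU = 0.900  (gap ~0.05)
# badly wrong : F1 = 0.667, IoU = 0.500  (gap ~0.17)
```

I can see that IoU drops faster as the prediction gets worse, which matches the "squaring" intuition above, but I still can't see it directly from the equations.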
Now, how does multiplying the true positives by 2 in the numerator and denominator create this effect, especially with regard to something like image segmentation? I'm having a hard time wrapping my mind around it.