I have compared the rankings obtained by evaluating 10+ classifiers with these two metrics:
- Jaccard score
- F1 score
The two rankings show a perfect correlation, and this result holds on 50+ datasets. When I compare with other metrics, there are clear variations in the rankings produced. A minimal sketch of the comparison is below.
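
For reproducibility, here is a sketch of the kind of comparison I ran, assuming scikit-learn's `jaccard_score` and `f1_score` and Spearman correlation from `scipy`; the three classifiers and the synthetic dataset are illustrative placeholders, not my actual setup:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, jaccard_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative stand-ins for the 10+ classifiers I actually compared.
classifiers = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(random_state=0),
}

# Placeholder for one of the 50+ datasets.
X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

jaccard_scores, f1_scores = [], []
for name, clf in classifiers.items():
    y_pred = clf.fit(X_tr, y_tr).predict(X_te)
    jaccard_scores.append(jaccard_score(y_te, y_pred))
    f1_scores.append(f1_score(y_te, y_pred))

# Rank correlation between the orderings the two metrics induce:
# 1.0 in every run, on every dataset I tried.
rho, _ = spearmanr(jaccard_scores, f1_scores)
print(f"Spearman rho between Jaccard and F1 rankings: {rho:.3f}")
```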
Is there something in the respective definitions of these two metrics that ensures they produce the same rankings of classifiers? In other words, is there a monotone relation between the two?
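
For reference, these are the standard binary-case definitions I am using, in terms of true positives (TP), false positives (FP), and false negatives (FN):

$$
\mathrm{Jaccard} = \frac{TP}{TP + FP + FN},
\qquad
F_1 = \frac{2\,TP}{2\,TP + FP + FN}
$$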