
On the sklearn documentation page for the F1 score, there is a section that explains each option for the `average` parameter. Under the `weighted` option, it says: "it can result in an F-score that is not between precision and recall."

I would like to know why this happens. Thanks

2 Answers


The F1 score uses a harmonic mean rather than the arithmetic mean, which accounts for the difference.

Tavi
  • Hi, this answer doesn’t address the problem because of the [generalized mean inequality](https://en.wikipedia.org/wiki/Generalized_mean). The harmonic mean always falls between the minimum and maximum (inclusive). – Arya McCarthy Apr 06 '21 at 18:09
  • You can improve this answer by considering the role of the weights. – Arya McCarthy Apr 06 '21 at 18:31

It appears this can already happen with the macro average option. The statement in the documentation is ambiguous, but I assume the precision and recall that fail to bound the averaged $F_1$ are themselves averaged in the same way.

Here's a simple example: $TP=TN=4$, $FP=1$, $FN=16$. Then $$\begin{align*} \operatorname{precision}(1)&=\frac{TP}{TP+FP}=0.8, \\ \operatorname{recall}(1)&=\frac{TP}{TP+FN}=0.2, \\ \operatorname{precision}(0)&=\frac{TN}{TN+FN}=0.2, \\ \operatorname{recall}(0)&=\frac{TN}{TN+FP}=0.8 \end{align*}$$

and so $F_1(1)=F_1(0)=\frac{2\cdot 0.8\cdot 0.2}{0.8+0.2}=0.32$, which makes the macro-averaged $F_1$ also $0.32$. But the macro-averaged precision and recall are both $0.5$, so the averaged $F_1$ falls below both of them.
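To double-check the arithmetic, here is a minimal sketch in plain Python that recomputes the per-class and macro-averaged scores from the four counts above. The helper `f1` and the variable names are mine, chosen for illustration, and nothing here calls sklearn itself.

```python
# Confusion-matrix counts from the example above (class 1 is "positive").
TP, TN, FP, FN = 4, 4, 1, 16


def f1(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    return 2 * precision * recall / (precision + recall)


# Per-class precision and recall; for class 0 the roles of TP/TN and FP/FN swap.
precision = {1: TP / (TP + FP), 0: TN / (TN + FN)}
recall = {1: TP / (TP + FN), 0: TN / (TN + FP)}
f1_per_class = {c: f1(precision[c], recall[c]) for c in (0, 1)}

# Macro averaging: the unweighted mean over the two classes.
macro_precision = sum(precision.values()) / 2
macro_recall = sum(recall.values()) / 2
macro_f1 = sum(f1_per_class.values()) / 2

print(f"macro precision = {macro_precision:.2f}")
print(f"macro recall    = {macro_recall:.2f}")
print(f"macro F1        = {macro_f1:.2f}")
```

Within each class the $F_1$ still lies between that class's precision and recall, as the harmonic mean must. Only after averaging across classes does the $F_1$ end up outside the range of the averaged precision and recall. If scikit-learn is installed, `f1_score`, `precision_score` and `recall_score` with `average="macro"` on labels built from the same counts should agree with these numbers.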

Ben Reiniger