Questions tagged [f1]
48 questions
6
votes
1 answer
F1 score, PR or ROC curve for regression
Due to my background as a pure biologist, I've been struggling with a comment received from a reviewer about the accuracy test used in my regression study. While I stick to MSE, MAE and R² as the metrics for determining the accuracy of my regression…

Tofu King
- 83
- 5
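A minimal sketch of the regression metrics the excerpt names, assuming scikit-learn; the arrays are invented illustration data (F1, PR and ROC curves are classification metrics and do not apply directly to regression):

```python
# Compute MSE, MAE and R^2 for a regression model's predictions.
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # invented ground-truth values
y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # invented model predictions

print("MSE:", mean_squared_error(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("R2: ", r2_score(y_true, y_pred))
```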
4
votes
1 answer
Use F1 or maximum F1 for model comparisons?
I am comparing an ML classifier to a bunch of other benchmark classifiers by F1 score. By AUPRC, my classifier does worse than the other benchmark methods. When I compared F1 scores, however, I got the curious result that my classifier does better than…

John Smith
- 91
- 4
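A hedged sketch of the two quantities the question contrasts, computed from one set of invented scores: AUPRC aggregates the whole precision-recall curve, while the maximum F1 picks out the single best threshold.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

y_true  = np.array([0, 0, 1, 1, 1, 0, 1, 0])                     # invented labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.6, 0.55])   # invented scores

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# Drop the final (precision=1, recall=0) point, which has no threshold.
p, r = precision[:-1], recall[:-1]
f1 = 2 * p * r / np.clip(p + r, 1e-12, None)

print("AUPRC: ", average_precision_score(y_true, y_score))
print("max F1:", f1.max(), "at threshold", thresholds[f1.argmax()])
```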
4
votes
0 answers
What is the difference between "normal" F1 and macro-average F1 score in binary classification
Please note that I am always talking about binary classification here, not multi-class classification.
For unbalanced binary datasets it is good practice to use the F1 score, where the positive label is always the rare case.
Now…

Dieshe
- 243
- 1
- 9
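A small sketch of the distinction the question asks about, with invented labels where class 1 is the rare positive: sklearn's default average='binary' scores only the positive class, while average='macro' averages the F1 of both classes.

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # invented labels, class 1 is rare
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # invented predictions

print("binary F1:", f1_score(y_true, y_pred, average='binary'))  # F1 of class 1 only
print("macro  F1:", f1_score(y_true, y_pred, average='macro'))   # mean of both class F1s
```

Here the positive-class F1 is 0.5 while the macro F1 is 0.6875, because the easy majority class pulls the macro average up.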
3
votes
1 answer
Why use the harmonic mean for precision and recall (F1 score) instead of just the product of precision and recall?
General question here: I understand the purpose of using the harmonic mean to generate the F1 score for model evaluation. I'm not exactly sure, though, why we don't just take the product of precision and recall (precision × recall) and use that…

Daniel
- 173
- 6
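A quick numeric comparison, with values invented for illustration. One practical difference: the harmonic mean of two equal rates p is p itself, while their product is p², so the product drifts off the 0–1 scale of a single rate even for a perfectly balanced classifier.

```python
def harmonic(p, r):
    # F1 is the harmonic mean of precision and recall: 2PR / (P + R).
    return 2 * p * r / (p + r)

for p, r in [(0.9, 0.9), (0.9, 0.1), (0.5, 0.5)]:
    print(f"P={p}, R={r}: harmonic={harmonic(p, r):.3f}, product={p * r:.3f}")
```

Note how P = R = 0.9 keeps a harmonic mean of 0.9 but yields a product of only 0.81.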
3
votes
2 answers
Why does the 'weighted' f1-score result in a score not between precision and recall?
On sklearn's F1 score documentation page there's a section that explains each of the options for the average parameter. Under the weighted option, it says: "it can result in an F-score that is not between precision and recall."
I would like to know why this…

George McIntire
- 31
- 1
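A toy construction (labels invented) that reproduces the docs' caveat: when one class has high precision but low recall and the other the reverse, each class-wise F1 sits near its class's smaller value, so the support-weighted F1 can fall below both the weighted precision and the weighted recall.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0] * 10 + [1] * 10
y_pred = [0] + [1] * 19   # one correct "0", everything else predicted "1"

for name, fn in [("precision", precision_score),
                 ("recall   ", recall_score),
                 ("F1       ", f1_score)]:
    print(name, fn(y_true, y_pred, average='weighted'))
```

Here the weighted precision is about 0.76 and the weighted recall 0.55, yet the weighted F1 is roughly 0.44, outside the interval between them.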
3
votes
2 answers
Reporting F1 Scores
I have a question with regard to the proper way to report F1 scores. Say I am comparing two algorithms, one with an F1 score of 0.71 and the other of 0.82.
Is it correct to say:
"Algorithm 2 obtained an F1 score 11 points higher than algorithm 1"
or…

astel
- 1,388
- 5
- 17
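For concreteness, reading the excerpt's numbers at face value: the absolute difference is $0.82 - 0.71 = 0.11$, i.e. 11 percentage points, whereas the relative improvement would instead be $0.11 / 0.71 \approx 15.5\%$, which is why the choice of wording matters.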
2
votes
1 answer
The harmonic mean is used in the F1 score because it is a conservative metric: how does being conservative help?
I was reading Jurafsky, 3rd edition, chapter 4, pages 12–13.
Can you explain why it is good to give more weight to the smaller of the two terms, namely $\frac{1}{Precision}$ or $\frac{1}{Recall}$?
Here is the link to the book chapter (freely available from the…

user27286
- 279
- 1
- 7
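A short worked case, with values assumed for illustration, of why the harmonic mean is conservative: since $F_1 = \frac{2}{\frac{1}{P} + \frac{1}{R}} = \frac{2PR}{P+R}$, the larger reciprocal (i.e. the smaller of $P$ and $R$) dominates the sum. With $P = 0.9$ and $R = 0.1$, the arithmetic mean is $0.5$, but $F_1 = \frac{2 \cdot 0.9 \cdot 0.1}{1.0} = 0.18$, so a classifier cannot hide a poor recall behind a high precision.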
2
votes
2 answers
Classifier can predict time series 1 day in advance, but not more. Why?
To ask the question more precisely: when doing time-series classification, I observe that the classifier's predictions are good if the test data directly follows the training data in chronological order. But when the train and test sets are separated in time (even by very…

Data Man
- 61
- 7
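One hedged way to probe the effect described above: scikit-learn's TimeSeriesSplit accepts a gap parameter that separates the train and test folds in time, so re-scoring with a growing gap isolates the chronology effect. The data and model below are placeholders, not the asker's setup.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                                    # placeholder features
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)  # placeholder labels

for gap in [0, 10, 50]:
    cv = TimeSeriesSplit(n_splits=5, gap=gap)  # leave `gap` samples between train and test
    scores = cross_val_score(LogisticRegression(), X, y, cv=cv, scoring="f1")
    print(f"gap={gap}: mean F1 = {scores.mean():.3f}")
```

A score that degrades as gap grows points at drift in the data rather than at the classifier itself.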
2
votes
1 answer
F1 score macro-average
In this question I'll differentiate by using lower case for class-wise scores, e.g. prec, rec, f1, which are vectors, and upper case for the aggregate macro-averages Prec, Rec, F1. My formulae below are written mainly from the perspective of R, as that's my…

Mobeus Zoom
- 220
- 1
- 5
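A sketch of the two aggregation routes the question's notation separates, with invented multiclass labels: averaging the class-wise f1 vector (what sklearn's average='macro' does) generally differs from taking the harmonic mean of the macro-averaged Prec and Rec.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]   # invented labels
y_pred = [0, 0, 1, 2, 1, 1, 2, 2, 2, 0]   # invented predictions

f1_vec = f1_score(y_true, y_pred, average=None)       # class-wise vector f1
print("mean of class-wise f1 (macro F1):", f1_vec.mean())

P = precision_score(y_true, y_pred, average="macro")  # aggregate Prec
R = recall_score(y_true, y_pred, average="macro")     # aggregate Rec
print("harmonic mean of Prec and Rec:   ", 2 * P * R / (P + R))
```

On these labels the first route gives about 0.707 and the second about 0.721, so the two definitions are genuinely different quantities.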
2
votes
2 answers
F1 score in a multilabel classification paper: is macro, weighted or micro F1 used?
I read this paper on a multilabel classification task. The authors evaluate their models on F1 score, but they do not mention whether this is the macro, micro or weighted F1 score.
They only mention:
We chose F1 score as the metric for evaluating our…

chefhose
- 121
- 6
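Since the paper leaves the choice ambiguous, here is a sketch (indicator matrices invented) showing that the three candidates genuinely disagree on the same predictions, which is why the omission matters.

```python
import numpy as np
from sklearn.metrics import f1_score

Y_true = np.array([[1, 0, 1],   # invented multilabel ground truth
                   [0, 1, 0],
                   [1, 1, 0],
                   [1, 0, 1]])
Y_pred = np.array([[1, 0, 0],   # invented multilabel predictions
                   [0, 1, 1],
                   [1, 0, 0],
                   [1, 0, 1]])

for avg in ("macro", "micro", "weighted"):
    print(avg, f1_score(Y_true, Y_pred, average=avg))
```

Here macro, micro and weighted come out to roughly 0.72, 0.77 and 0.76 respectively, three different numbers from one prediction matrix.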
1
vote
1 answer
How to calculate F1, Precision, and Recall for Multi-Label Multi-Class Classification
I have a predictive model as…

asmgx
- 239
- 2
- 9
1
vote
1 answer
Model accuracy versus F1
When training a model (classifier) in TensorFlow, an accuracy value is returned. What is the interpretation of an accuracy of, say, 0.79? Furthermore, how does accuracy relate to other evaluations of predictions, such as F1?

DIGSUM
- 111
- 2
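A minimal sketch with invented labels of how the two numbers can diverge on the same predictions: accuracy counts every correct prediction, including true negatives, while the default binary F1 only rewards the positive class.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]

print("accuracy:", accuracy_score(y_true, y_pred))  # fraction of all correct
print("F1:      ", f1_score(y_true, y_pred))        # positive class only
```

Here accuracy is 0.80 (close to the 0.79 in the question) while F1 is only about 0.33, because most of the "correct" predictions are easy negatives.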
1
vote
1 answer
F1 score gives a good value on an imbalanced dataset
Suppose I have an imbalanced dataset that consists of 90% positive points and 10% negative points, and I create a "dumb" model which always predicts every point as positive. The confusion matrix of this problem will be:
Now the Precision of…

Mauj Mishra
- 55
- 2
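Working the excerpt's own numbers through: with 90% positive points, the always-positive model gets precision $0.9$, recall $1.0$, and $F_1 = \frac{2 \cdot 0.9 \cdot 1.0}{1.9} \approx 0.947$. A sketch that reproduces this:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1] * 90 + [0] * 10   # 90% positive, 10% negative
y_pred = [1] * 100             # "dumb" model: predict every point as positive

print("precision:", precision_score(y_true, y_pred))  # 0.90
print("recall:   ", recall_score(y_true, y_pred))     # 1.00
print("F1:       ", f1_score(y_true, y_pred))         # ~0.947
```

This is why, for this class balance, the F1 score should be computed with the rare class as the positive label, as the earlier question on this page also notes.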
1
vote
0 answers
Average Precision vs average F1
Average precision computes the area under the precision-recall curve by the trapezoidal rule (or midpoint rule). However, we could also compute the F1 score at every threshold and then take the average. Is there a benefit in considering the area…

displayname
- 488
- 2
- 6
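A sketch (scores and labels invented) computing both aggregates side by side: average precision as the area under the precision-recall curve, and the mean of the per-threshold F1 scores.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

y_true  = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0])                       # invented labels
y_score = np.array([0.2, 0.8, 0.6, 0.4, 0.9, 0.1, 0.3, 0.7, 0.5, 0.45])  # invented scores

precision, recall, _ = precision_recall_curve(y_true, y_score)
# Drop the final (precision=1, recall=0) point, which has no threshold.
p, r = precision[:-1], recall[:-1]
f1 = 2 * p * r / np.clip(p + r, 1e-12, None)

print("average precision:    ", average_precision_score(y_true, y_score))
print("mean per-threshold F1:", f1.mean())
```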
1
vote
1 answer
On which set (train/val/test) do people calculate F1 score, precision and recall?
This may be a stupid question, but when I was looking at the definitions of precision, recall, etc., it was not mentioned anywhere on which set (training/validation/test) these metrics should be calculated…

Curaçao Hajek
- 133
- 7
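For what it's worth, a sketch of the common convention (an assumption spelled out here, since the definitions the asker read leave it open): fit on the training split and report precision, recall and F1 on a held-out test split.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support

# Placeholder imbalanced data; any real dataset would take its place.
X, y = make_classification(n_samples=400, weights=[0.8], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)           # fit on the training split only
p, r, f1, _ = precision_recall_fscore_support(
    y_te, model.predict(X_te), average="binary")       # score on the held-out test split
print(f"test precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```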