What's the meaning of the F1 score in binary classification with an equal number of positive and negative test samples?

Question

I am using a CNN to classify movie reviews from the IMDB dataset.

The dataset provides 25,000 training samples and 25,000 test samples, each set half positive and half negative.

After training, I evaluated the CNN on the test set using the Keras `evaluate` function.

The output is:

score: 0.339593573859
accuracy: 0.885

I'd like higher accuracy, but I can work on that. What really worries me is the F1 score. Isn't it too low?

I have looked up its meaning, but it seems to me that on a test set split evenly between the two classes it is not very informative. Am I wrong?
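For reference, here is a minimal sketch of how F1 relates to accuracy on a balanced test set. The confusion-matrix counts are hypothetical, chosen only to reproduce an accuracy of 0.885 on 12,500 positive and 12,500 negative samples; they are not the actual predictions of the model above. The helper name `binary_metrics` is also made up for illustration. With these assumptions, the F1 score of the positive class stays close to the accuracy however the errors are distributed. That suggests the `score` of 0.3396 may not be an F1 score at all: Keras `evaluate` returns the loss first, followed by any compiled metrics, though the compile call is not shown here, so this is an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 (positive class) from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 12,500 positives and 12,500 negatives, 2,875 errors in total,
# which gives accuracy = 22,125 / 25,000 = 0.885.
mixed = binary_metrics(tp=11000, fp=1375, fn=1500, tn=11125)

# The two extreme ways to distribute the same 2,875 errors:
all_false_negatives = binary_metrics(tp=9625, fp=0, fn=2875, tn=12500)
all_false_positives = binary_metrics(tp=12500, fp=2875, fn=0, tn=9625)

for name, m in [("mixed", mixed),
                ("all errors are FN", all_false_negatives),
                ("all errors are FP", all_false_positives)]:
    print(f"{name}: accuracy={m['accuracy']:.3f}, f1={m['f1']:.3f}")
```

Under these assumed counts, the three cases share the same accuracy while F1 only moves within a narrow band around it, so on a balanced test set F1 mostly shows whether the errors lean towards false positives or false negatives.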

Related: [How to know that your machine learning problem is hopeless?](https://stats.stackexchange.com/q/222179/1352) and [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) - the arguments here apply equally to F1. — Stephan Kolassa, Jan 05 '18 at 14:15
@StephanKolassa I think they are saying that accuracy is not good when you use it on unbalanced data. In my case data is balanced so accuracy is "true". But my question is: what's wrong with the score? — Francesco Pegoraro, Jan 05 '18 at 14:21
[Accuracy is not a good measure even for balanced data.](https://stats.stackexchange.com/a/312787/1352) The same arguments apply to the F1 (or [any other F$\beta$ score](https://en.wikipedia.org/wiki/F1_score)). Whether your score is "good enough" is addressed [in the other link I posted](https://stats.stackexchange.com/q/222179/1352). — Stephan Kolassa, Jan 05 '18 at 14:30
@StephanKolassa I am sorry you didn't understand my question. But thank you very much for trying! — Francesco Pegoraro, Jan 05 '18 at 14:44
My understanding is that you worry that your F1 score is too low. My comment is that (a) [it's hard to know whether a given accuracy (or F1 score) is "good enough"](https://stats.stackexchange.com/q/222179/1352), and (b) [accuracy (or F1) is not a good measure, anyway](https://stats.stackexchange.com/q/312780/1352). If I misunderstand your question, perhaps you can clarify? — Stephan Kolassa, Jan 05 '18 at 15:25
I honestly think you are exaggerating the matter. The task was given to me for my ML exam, and one paper reports a 5% error rate on it, so I don't think linking me to "how to know that your ML problem is hopeless" helps at all. — Francesco Pegoraro, Jan 05 '18 at 15:47
If you read [the thread I linked](https://stats.stackexchange.com/q/222179/1352), you will see that the theme is how to know whether a given ML performance can be improved or not. "Hopelessness" is just in the title. As such, I think it is completely appropriate for your question "What is really worrying me is the f1 score. Isn't it too low?" If not, and if I misunderstood your question, I would very much welcome a clarification of what I am missing. — Stephan Kolassa, Jan 05 '18 at 19:13


0 Answers