I'm trying to understand whether my model performs well or not. I have a binary classifier that labels sentences as important or not important for summarization (extractive approach) on a specific corpus.
The dataset is imbalanced: class 1 has 8K samples, class 0 has 32K.
As I understand it, accuracy is not a valid metric here, and neither is ROC AUC, because of the imbalance. So I use average precision (AP), F1 and F2. As a baseline I used an SVM; it shows ROC AUC 78% but AP only 20%.
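For reference, here is a rough sketch of the no-skill level for average precision. It assumes the evaluation data has the same 8K/32K class ratio as above, and relies on the fact that a classifier ranking sentences at random is expected to get an AP of roughly the positive-class share. The function name is only illustrative.

```python
# Class counts taken from the post; the helper name is hypothetical.
n_pos = 8_000   # class 1 (important sentences)
n_neg = 32_000  # class 0

def no_skill_average_precision(n_pos, n_neg):
    """Approximate expected AP of a random ranking: the positive prevalence."""
    return n_pos / (n_pos + n_neg)

baseline_ap = no_skill_average_precision(n_pos, n_neg)
print(f"no-skill AP ~ {baseline_ap:.0%}")
```

Under that assumption, an AP of 20% is about what random ranking would give, so AP values are probably best read relative to this level rather than relative to 50%.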
I use a CNN with word embeddings and get these values:
Accuracy: 27%
ROC AUC: 70%
Average precision: 40%
Precision: class 1 - 27%, class 0 - 95%
Recall: class 1 - 85%, class 0 - 55%
F2: 59%
F1: 41%
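As a sanity check on these numbers, here is a small sketch that recomputes the derived metrics from the per-class precision and recall. The F-beta formula is the standard one. The accuracy part assumes the evaluation set has the same 8K/32K split as the full dataset, which may not be true. Function names are illustrative.

```python
def f_beta(precision, recall, beta):
    """Standard F-beta score: beta > 1 weights recall more than precision."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom

def accuracy_from_recalls(recall_pos, recall_neg, n_pos, n_neg):
    """Accuracy is the class-size-weighted average of the per-class recalls."""
    correct = recall_pos * n_pos + recall_neg * n_neg
    return correct / (n_pos + n_neg)

# Class 1 precision and recall, and class 0 recall, as reported above.
p1, r1, r0 = 0.27, 0.85, 0.55

f1 = f_beta(p1, r1, beta=1.0)
f2 = f_beta(p1, r1, beta=2.0)
# ASSUMPTION: evaluation set has the same 8K/32K class split as the dataset.
acc = accuracy_from_recalls(r1, r0, n_pos=8_000, n_neg=32_000)

print(f"F1 = {f1:.2f}, F2 = {f2:.2f}, implied accuracy = {acc:.2f}")
```

F1 and F2 come out at about 0.41 and 0.59, matching the values listed. The implied accuracy is about 0.61. Since accuracy is a weighted average of the two recalls, it should fall between 55% and 85% for any class split, so the 27% accuracy figure may be worth double-checking.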
For my task, it matters most that the classifier finds the positive class (high recall), but I'm worried about the number of false positives.
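One way to look at this trade-off is to sweep the decision threshold over the predicted scores and watch how recall and false positives move together. A minimal sketch follows. The labels and scores are made up for illustration (they are not my model's outputs), and the helper name is hypothetical.

```python
def precision_recall_at(y_true, scores, threshold):
    """Return (precision, recall, false_positives) for class 1 at a threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, fp

# Toy data, purely illustrative: 4 positives, 6 negatives.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.55, 0.3, 0.2, 0.65, 0.35]

for t in (0.3, 0.5, 0.7):
    p, r, fp = precision_recall_at(y_true, scores, t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}  false positives={fp}")
```

Raising the threshold lowers the false positive count at the cost of recall, so the "right" operating point depends on how many false positives the summarization step can tolerate.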
When can I say that a model is good enough? I've read several articles whose authors say different things: some say F1 should be above 50%, others say 40-50% can be acceptable.
So my question is: does my model perform well, or should I tune it further?
If you have any questions, don't hesitate to ask.