I am looking for a performance measure that allows a fair comparison across datasets with different class ratios.
My use case: I want to find out which dataset can be modelled best with binary classification. Each dataset has an active minority class that I am interested in, and the datasets differ in their class ratios (it is actually the same dataset, with the classes produced by different discretization thresholds):
- dataset A has active class ratio 1/10
- dataset B has active class ratio 2/10
- dataset C has active class ratio 3/10
ROC AUC does not seem suitable, as it gives over-optimistic results for highly imbalanced datasets. Hence it will probably favor dataset A.
Average Precision (a.k.a. area under the precision-recall curve) is not suitable either: its baseline, i.e. the score of a random classifier, equals the active class ratio, so it will favor dataset C.
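To make the baseline issue concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions: the labels and scores are simulated (uninformative random scores, not my real data), and the helper names `average_precision`, `roc_auc` and `random_baselines` are made up for this example. It scores a random classifier at the three class ratios above.

```python
import random

def average_precision(labels, scores):
    """AP as the mean of the precision values at the rank of each positive."""
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits

def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) form; assumes no tied scores."""
    order = sorted(range(len(labels)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, i in enumerate(order, start=1) if labels[i] == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def random_baselines(ratio, n=20000, seed=0):
    """Score a classifier with uninformative random scores (simulated data)."""
    rng = random.Random(seed)
    n_pos = int(round(ratio * n))
    labels = [1] * n_pos + [0] * (n - n_pos)
    scores = [rng.random() for _ in labels]  # independent of the labels
    return average_precision(labels, scores), roc_auc(labels, scores)

for name, ratio in [("A", 0.1), ("B", 0.2), ("C", 0.3)]:
    ap, auc = random_baselines(ratio)
    print(f"dataset {name}: ratio={ratio:.1f}  AP={ap:.3f}  ROC AUC={auc:.3f}")
```

In this simulation the random-classifier AP should land close to the active class ratio of each dataset, while the random-classifier ROC AUC should stay near 0.5 regardless of the ratio. So the AP values of A, B and C start from different baselines and are not directly comparable.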
Any help would be very much appreciated.