I would like some feedback from experienced practitioners on this.
I am working on an imbalanced, multiclass stock market dataset for educational purposes (each sample has exactly one of three labels). The class distribution looks like this:
label 0: 2503 samples
label 1: 234 samples
label -1: 32 samples
0 represents a hold signal, 1 represents a buy signal, and -1 represents a sell signal.
I have been reading up on which evaluation metric to use and concluded that the F1 score is the one to go with. The next issue is whether to use micro or macro averaging. If I understand correctly, micro should be used if I wish to bias towards the majority label (in this case 0, which represents hold), while macro biases towards the minority labels (in this case, I assume both 1 and -1?).
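To make the difference concrete, here is a minimal sketch in plain Python that computes both averages for the class counts above. The helper name `f1_scores` and the "always predict hold" baseline are assumptions made up for illustration; they are not one of the models mentioned in this post.

```python
from collections import Counter


def f1_scores(y_true, y_pred, labels):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, every class weighted equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(
        sum(tp[c] for c in labels),
        sum(fp[c] for c in labels),
        sum(fn[c] for c in labels),
    )
    return micro, macro


labels = [0, 1, -1]
y_true = [0] * 2503 + [1] * 234 + [-1] * 32
y_pred = [0] * len(y_true)  # hypothetical baseline: always predict "hold"

micro, macro = f1_scores(y_true, y_pred, labels)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

With one label per sample, micro-averaged F1 works out to the same number as plain accuracy, so a classifier that never predicts buy or sell still gets a high micro score. Macro averaging gives each of the three classes equal weight, so the same classifier is penalized for scoring zero on both minority classes.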
The choice of evaluation metric seems to depend on the domain of the data and on the relative importance of the labels. In the case of a stock market, where I do not yet have enough domain knowledge as a student, which option is better?
P.S.: I feel like micro is the one to use, since most of the time in a stock market we want to hold the stock rather than buy or sell. What do you think? I am also using SVM, Random Forest, and Naive Bayes for this prediction problem.