
I am working on a binary classification problem with an imbalanced dataset (75:25). Class 0 makes up only 25% of the data (the minority class).

My objective is to correctly predict the 0s as 0s, i.e. to maximize the recall/f1-score for class 0.

However, I get the impression that the scoring functions only focus on maximizing the metric for the positive/majority class. Is that the case? I might be wrong.

For example, the code below focuses on maximizing the f1-score of the positive class (which is the majority class in my data):

model = GridSearchCV(rfc, param_grid, cv=skf, scoring='f1')
model.fit(ord_train_t, y_train)

But my objective is to maximize the f1-score of the minority class (the negative class, label 0), which is the more costly and important class in my case.

Is my only option therefore to invert the labels, i.e. map the 1s to 0s and the 0s to 1s?

Isn't there a method available that focuses on maximizing the metrics for the minority class? Or is my understanding incorrect, and the metrics work the same way for both classes, with no preference between the majority and minority class during binary classification metric optimization?

Was it wrong on my part to code the labels this way? Should the class I want to predict always be labeled 1?

The Great
  • Why do you assume that metrics focus only on the positive class? Notice also that majority class and positive class are not the same. – Tim Feb 08 '22 at 10:30
  • But when I print `best_score_`, I see the `f1-score` value that corresponds to the majority class (which is the positive class in my case)? – The Great Feb 08 '22 at 10:31
  • Do algorithms work by maximizing the f1-score for both classes? When I invert the labels, I see an improvement of about 15 points in the f1-score. Not great, but still an improvement. Hence I assumed it works that way. – The Great Feb 08 '22 at 10:32
  • Most algorithms maximize a likelihood in fitting the model, which is something different than the F1 score. This is related to optimizing a forecasting algorithm on MSE, but then evaluating it on MAPE ([Kolassa, 2020](https://doi.org/10.1016/j.ijforecast.2019.02.017)). That said, the simplest approach in your case is probably indeed to just invert the labels. – Stephan Kolassa Feb 08 '22 at 10:37
  • Note that if you only care about predicting the true $0$s as $0$s, you can predict everything as a $0$. – Dave Feb 08 '22 at 11:14
  • @Dave - but when it is the minority class, I am not able to predict everything as zero... – The Great Feb 08 '22 at 11:16
  • I mean that you set aside all fancy modeling techniques and call every case a $0$. – Dave Feb 08 '22 at 11:19
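
A quick numeric check of Dave's point, as a minimal sketch with made-up 75:25 labels (not from the thread): a "model" that calls every case a 0 gets perfect recall for class 0, but its precision for class 0 is just the base rate.

import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([0] * 25 + [1] * 75)  # 25% minority class 0, 75% class 1
y_pred = np.zeros_like(y_true)          # "model" that predicts 0 for every case

print(recall_score(y_true, y_pred, pos_label=0))     # 1.0  -- every true 0 is caught
print(precision_score(y_true, y_pred, pos_label=0))  # 0.25 -- but 75% of the predicted 0s are wrong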

1 Answer


It is not true that all metrics focus on the positive class. Some metrics do; precision and recall are examples of such metrics. However, there are also equivalents of those metrics for the negative class, as you can learn from this Wikipedia article. It is up to you which metrics you choose and what exactly you want to measure. If you want to use something like precision, then you are correct: you just need to re-map the class labels. The labels are arbitrary and it is up to you what you label as 1 and what as 0, so this is only about the interpretability of the results and metrics.
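
To make the re-mapping idea concrete, here is a minimal sketch (the data, parameter grid, and CV setup are made up for illustration). As an aside not covered above, scikit-learn's make_scorer(f1_score, pos_label=0) reaches the same goal without touching the labels:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Toy 75:25 data standing in for the question's dataset
X, y = make_classification(n_samples=1000, weights=[0.25, 0.75], random_state=0)

rfc = RandomForestClassifier(random_state=0)
param_grid = {'n_estimators': [100, 300]}
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Option 1: re-map the labels so the class of interest becomes the "positive" class 1
model_flipped = GridSearchCV(rfc, param_grid, cv=skf, scoring='f1')
model_flipped.fit(X, 1 - y)

# Option 2: keep the labels and score F1 with label 0 treated as the positive class
f1_class0 = make_scorer(f1_score, pos_label=0)
model_class0 = GridSearchCV(rfc, param_grid, cv=skf, scoring=f1_class0)
model_class0.fit(X, y)

print(model_flipped.best_score_, model_class0.best_score_)  # the two scores should agree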

Imbalanced classes are a completely different problem. Some metrics are a very poor choice for imbalanced classes; accuracy is probably the most notorious example. We have a separate thread on metrics for imbalanced data that I recommend to you.
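
For instance, with a 75:25 split a "model" that always predicts the majority class already scores 75% accuracy while being useless for class 0. A minimal sketch with made-up labels:

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = np.array([0] * 25 + [1] * 75)
y_pred = np.ones_like(y_true)  # always predict the majority class 1

print(accuracy_score(y_true, y_pred))                          # 0.75 -- looks respectable
print(f1_score(y_true, y_pred, pos_label=0, zero_division=0))  # 0.0  -- worthless for class 0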

Tim