
This question might be trivial, but I have trouble understanding this passage, taken from here:

> The `balanced_accuracy_score` function computes the balanced accuracy, which avoids inflated performance estimates on imbalanced datasets. It is the macro-average of recall scores per class or, equivalently, raw accuracy where each sample is weighted according to the inverse prevalence of its true class. Thus for balanced datasets, the score is equal to accuracy.

I have particular trouble with the last sentence. So let's take the definition as sklearn provides it for the binary case: $$\text{bal}_{\text{acc}} := \frac{1}{2}\left(\frac{\text{TP}}{\text{TP}+\text{FN}} + \frac{\text{TN}}{\text{TN}+\text{FP}}\right)$$ The way I understood it, for a balanced dataset, $\text{TP} = \text{TN}$. Therefore, the equation above can be rewritten as: $$\text{bal}_{\text{acc}} = \frac{1}{2}\left(\frac{\text{TP}}{\text{TP}+\text{FN}} + \frac{\text{TP}}{\text{TP}+\text{FP}}\right) = \frac{1}{2}\left(\frac{\text{TP}\left(\text{TP}+\text{FP}\right)}{\left(\text{TP}+\text{FN}\right)\left(\text{TP}+\text{FP}\right)} + \frac{\text{TP}\left(\text{TP}+\text{FN}\right)}{\left(\text{TP}+\text{FP}\right)\left(\text{TP}+\text{FN}\right)}\right) = \frac{1}{2}\,\frac{\text{TP}\left(\text{TP}+\text{FN}\right) + \text{TP}\left(\text{TP}+\text{FP}\right)}{\left(\text{TP}+\text{FP}\right)\left(\text{TP}+\text{FN}\right)}$$


Unfortunately, I do not see how we can recover the "classical" accuracy: $$\text{acc} := \frac{\text{TP}+\text{TN}}{\text{TP}+\text{TN}+\text{FP}+\text{FN}}$$ Assuming that $\text{TP}=\text{TN}$ again, we have: $$\text{acc} = \frac{2\text{TP}}{2\text{TP}+\text{FP}+\text{FN}}$$ Could anybody help, please?
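
For what it's worth, here is a minimal numerical check with scikit-learn's `accuracy_score` and `balanced_accuracy_score` (the labels are made-up toy data):

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Balanced toy dataset: four true samples of each class
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# An imperfect classifier: TP = 3, FN = 1, TN = 2, FP = 2
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]

print(accuracy_score(y_true, y_pred))           # 0.625
print(balanced_accuracy_score(y_true, y_pred))  # 0.625
```

The two scores do coincide here, even though TP = 3 and TN = 2 differ, which is exactly what I fail to reproduce algebraically.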

Hermi
    Shouldn't it be that, for balanced data, the total number of positives equals the total number of negatives, that is to say TP + FN = TN + FP? TN and TP are functions of your classifier. – einar Feb 14 '22 at 10:56
    Per @einar, "balanced" does not refer to your *classifications*, but to a balanced *true* prevalence of positive and negative samples in your dataset. Also, you may be interested in [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) and [Is threshold moving unnecessary in balanced classification problem?](https://stats.stackexchange.com/q/531048/1352) and [Example when using accuracy as an outcome measure will lead to a wrong conclusion](https://stats.stackexchange.com/q/368949/1352). – Stephan Kolassa Feb 14 '22 at 11:16

1 Answer


Since no real answer came to this, I will just expand on my comment above:

"Balance" is a property of the dataset: there are as many positives in total as there are negatives in total, that is, TP + FN = TN + FP. TP (true positives) and TN (true negatives) are properties of your classifier: they are the numbers of positives and negatives that it gets right.

If we take TP + FN = TN + FP, we have

$$ \frac{1}{2} \left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right) = \\ \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TP + FN}\right) = \\ \frac{1}{2}\left(\frac{TP + TN}{TP + FN}\right) = \\ \frac{1}{2}\left(\frac{TP + TN}{\frac{1}{2}(TP + FN + TN + FP)}\right) = \\ \frac{TP + TN}{TP + FN + TN + FP}, $$

as per the quoted text. (The third equality uses the fact that TP + FN, being equal to TN + FP, is half of the grand total TP + FN + TN + FP.)
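
If it helps, here is a quick numeric sanity check of the algebra in plain Python, with arbitrary counts chosen only so that TP + FN = TN + FP:

```python
# Arbitrary confusion-matrix counts for a balanced dataset:
# 100 positive samples (TP + FN) and 100 negative samples (TN + FP)
TP, FN = 80, 20
TN, FP = 70, 30

balanced_accuracy = 0.5 * (TP / (TP + FN) + TN / (TN + FP))
accuracy = (TP + TN) / (TP + TN + FP + FN)

print(balanced_accuracy)  # 0.75
print(accuracy)           # 0.75
```

Note that TP ≠ TN here: the two scores coincide because the true class counts are equal, not because the classifier gets the same number of each class right.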

My view, for what it's worth, is that "imbalanced data" is a fake problem produced by unwarranted focus on accuracy as a target for model selection. I encourage you to check out @Stephan Kolassa's links in the comments.

einar