
I have an imbalanced dataset with 5 classes. The distribution is roughly:

Class 1: 0.5
Class 2: 0.25
Class 3: 0.15
Class 4: 0.05
Class 5: 0.05

I've been testing some different classification algorithms on this data set. To evaluate the algorithms, I've looked at overall accuracy and the kappa statistic (as well as the good ol' confusion matrix of course).

In almost every case, higher accuracy corresponds with a higher kappa score.

So if I screened classifiers by accuracy instead of kappa, I'd get almost exactly the same results (i.e. I'd end up picking the same classifier).

In his blog post, Tom Fawcett says:

> Don’t use accuracy (or error rate) to evaluate your classifier!

He then goes on to explain that, if you must use a single-number metric, ROC, F1, and kappa are all better replacements.

On my data set there's barely a difference between accuracy and kappa. Is that just a feature of my data set? On what type of data set would you expect to see a more distinct difference between the two metrics?
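To make the comparison concrete, here is a minimal sketch of how both numbers come out of the same confusion matrix. The function name `accuracy_and_kappa` and the 1,000-sample counts are made up for illustration; they only mirror the class proportions above and are not my actual results. It assumes the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the accuracy itself and p_e is the agreement expected by chance.

```python
def accuracy_and_kappa(cm):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)

    # Observed agreement: the fraction on the diagonal, i.e. plain accuracy.
    p_o = sum(cm[i][i] for i in range(k)) / n

    # Chance agreement: from the true-class and predicted-class marginals.
    true_totals = [sum(cm[i]) for i in range(k)]
    pred_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(true_totals[i] * pred_totals[i] for i in range(k)) / (n * n)

    # Guard against division by zero when chance agreement is already 1.
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else 0.0
    return p_o, kappa


# Hypothetical data: 1,000 samples following the 0.5/0.25/0.15/0.05/0.05 split,
# scored by a classifier that always predicts class 1.
class_counts = [500, 250, 150, 50, 50]
majority_cm = [[count, 0, 0, 0, 0] for count in class_counts]

acc, kappa = accuracy_and_kappa(majority_cm)
print(acc, kappa)
```

In this made-up case the always-predict-class-1 classifier gets an accuracy of 0.5 but a kappa of 0, so the two metrics can disagree when a model leans on the majority class.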

colorlace
  • Imbalanced: https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models (besides other problems with it) – Tim May 24 '18 at 13:57
  • The idea is to train an algorithm on classes that are as balanced as possible. Otherwise, the metrics you obtain (e.g. kappa, OA, precision, recall) will be biased toward the largest class, yielding (falsely) high values. You should consider trying to balance your classes, but if that is not possible, perhaps the Matthews correlation coefficient (https://en.wikipedia.org/wiki/Matthews_correlation_coefficient) can help you find a less biased metric. – iamgin May 24 '18 at 14:03
  • Related: [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) In particular, if I may say so, [my answer](https://stats.stackexchange.com/a/312787/1352). – Stephan Kolassa May 24 '18 at 14:06
