The precision vs. recall tradeoff is the one most commonly evaluated while developing models, but sensitivity vs. specificity addresses a similar issue. When should one of these pairs of metrics be preferred over the other?

In general, many different metrics can be calculated from the standard TP, FP, TN, and FN counts. Many of them are listed at https://en.wikipedia.org/wiki/Confusion_matrix. It is very unclear to me when any one of these metrics should be used over the others; there seems to be a lot of redundancy.
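To make the overlap concrete, here is a minimal sketch (with made-up confusion-matrix counts) computing the four metrics from the question; note that recall and sensitivity are the same formula under two names:

```python
# Hypothetical TP/FP/TN/FN counts, purely for illustration.
tp, fp, tn, fn = 80, 10, 90, 20

precision   = tp / (tp + fp)   # of predicted positives, how many are truly positive
recall      = tp / (tp + fn)   # of true positives, how many were caught
sensitivity = tp / (tp + fn)   # identical to recall by definition
specificity = tn / (tn + fp)   # of true negatives, how many were caught

print(f"precision   = {precision:.3f}")
print(f"recall      = {recall:.3f}")       # same value as sensitivity
print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
```

So the real choice is between pairing recall/sensitivity with precision (which conditions on the *predicted* class) or with specificity (which conditions on the *actual* negative class).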

John S
  • You shouldn't use *any* of them, because they are *all* misleading. [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) [Is accuracy an improper scoring rule in a binary classification setting?](https://stats.stackexchange.com/q/359909/1352) [Classification probability threshold](https://stats.stackexchange.com/q/312119/1352) The same problems apply to sensitivity and specificity, and indeed to all evaluation metrics that rely on hard classifications. – Stephan Kolassa May 03 '21 at 06:20
  • Instead, use probabilistic classifications, and evaluate these using [proper scoring rules](https://stats.stackexchange.com/tags/scoring-rules/info). – Stephan Kolassa May 03 '21 at 06:20
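As a sketch of what the second comment suggests (not the commenter's own code), one can score the model's predicted probabilities directly with proper scoring rules such as the Brier score or log loss; the labels and probabilities below are made up for illustration:

```python
# Score predicted *probabilities* with proper scoring rules,
# instead of thresholding them into hard classes first.
from sklearn.metrics import brier_score_loss, log_loss

y_true = [0, 1, 1, 0, 1, 0, 1, 1]                   # hypothetical true labels
y_prob = [0.1, 0.8, 0.6, 0.3, 0.9, 0.2, 0.4, 0.7]   # model's estimated P(y = 1)

print("Brier score:", brier_score_loss(y_true, y_prob))  # lower is better
print("Log loss:   ", log_loss(y_true, y_prob))          # lower is better
```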

0 Answers