
I'm working on a binary classification problem and I'm trying to assess how well my model identifies the positive class (ideally as a probability). I'm using the positive predictive value (PPV) statistic. The issue with the PPV is that I can move the classification threshold far to the right (on the score axis) and get a very high PPV, but also a very high miss rate (low sensitivity). However, if I form the product PPV * sensitivity, I can search for the classification threshold that maximizes that product. This product seems to be an excellent statistic for the model's performance on the positive class, but I can't find any reference on it, and I need something that I and others can interpret.

In my opinion, sensitivity * precision says something like (in cancer detection, say) "the fraction of positive diagnoses the model detects that are truly positive"; technically, something like the probability of a true positive given that the case was detected and diagnosed positive by the model. I've generated a bunch of fake datasets and confusion matrices, moved the classification threshold around, and this sensitivity * precision product does seem to reflect how well the model is doing on the positive class (a minimal sketch of the threshold sweep is at the end of the post). So my questions are:

  • Does PPV * sensitivity make any sense as a statistic?
  • If this product makes sense, what is it saying about the model's performance? Does it have a name?
  • Is it a probability (PPV is a probability while sensitivity seems to be a ratio)?

Thanks.
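
Here is roughly what I mean by the threshold sweep: a minimal sketch in Python (assuming scikit-learn is available; the fake data below just stands in for the datasets I've been generating):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(0)

# Fake data, similar to what I've been generating: scores for the
# negative class centred lower than scores for the positive class.
y_true = np.concatenate([np.zeros(1000, dtype=int), np.ones(200, dtype=int)])
y_score = np.concatenate([rng.normal(0.3, 0.15, 1000),
                          rng.normal(0.7, 0.15, 200)])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)

# precision and recall have one more entry than thresholds (the final
# point is the degenerate end of the curve), so drop it before forming
# the product.
product = precision[:-1] * recall[:-1]
best = int(np.argmax(product))

print(f"threshold = {thresholds[best]:.3f}, "
      f"PPV = {precision[best]:.3f}, "
      f"sensitivity = {recall[best]:.3f}, "
      f"product = {product[best]:.3f}")
```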

  • Threshold-based metrics are generally discouraged. Why do you not use the log loss or Brier score to evaluate the probability outputs of your model? – Dave Jul 14 '21 at 17:25
  • I guess because I want to know the model's chances of correctly detecting the positive class, not its overall performance. – iXombi Jul 14 '21 at 17:57
  • Have you read Frank Harrell's blog posts ([1](https://www.fharrell.com/post/class-damage/) [2](https://www.fharrell.com/post/classification/)) about classification vs probability estimation? That you're concerned about tradeoffs between false positives and false negatives tells me that you regard those errors differently. – Dave Jul 14 '21 at 18:00
  • The model will not *truly detect the positive class*. All it will do is to give you a *predicted probability* for an instance of being in the positive class. You can assess whether these predicted probabilities are well-calibrated using [proper scoring rules](https://stats.stackexchange.com/tags/scoring-rules/info). In addition (and this is a *separate* topic), you can take *decisions* and assess the costs of wrong decisions. See [here](https://stats.stackexchange.com/q/312119/1352). – Stephan Kolassa Jul 14 '21 at 20:36
  • Further to Dave's point, sensitivity *and* precision are highly controversial. I would go so far as to call them useless, and worse, actively misleading. See [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) Every criticism against accuracy at that thread applies equally to sensitivity and precision. A fortiori, the product of two useless and misleading measurements will also be useless and misleading. – Stephan Kolassa Jul 14 '21 at 20:38
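
For concreteness, a minimal sketch of the proper-scoring-rule evaluation suggested in the first comment (assuming scikit-learn; `y_true` and `y_prob` here are hypothetical labels and predicted probabilities, only to show the calls):

```python
import numpy as np
from sklearn.metrics import brier_score_loss, log_loss

# Hypothetical labels and predicted probabilities for the positive class.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.2, 0.6])

print("Brier score:", brier_score_loss(y_true, y_prob))  # lower is better
print("log loss:   ", log_loss(y_true, y_prob))           # lower is better
```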

0 Answers