I'm working on a binary classification problem and trying to assess how well my model classifies the positive class (ideally as a probability). I'm currently using the positive predictive value (PPV). The issue with PPV alone is that I can move the classification threshold to the far right on the score axis and get a very high PPV, but also a very high miss rate (low sensitivity).

However, if I form the product PPV * sensitivity, I can search for the classification threshold that maximizes that product. This product seems to be an excellent statistic for the model's performance on the positive class, but I can't find any reference on it, and I need something that I and others can interpret. In my opinion, sensitivity * precision says something like (in, say, cancer detection) "the fraction of detected positive diagnoses that are truly positive": technically, the probability that a case is truly positive given that the model diagnoses it positive, multiplied by the probability that a truly positive case is detected at all.

I've generated a bunch of fake datasets and confusion matrices, swept the classification threshold around, and this sensitivity * precision does track how well the model is doing on the positive class (a minimal sketch of the sweep follows the questions below). So my questions are:
- Does PPV * sensitivity make any sense as a statistic?
- If this product makes sense, what is it saying about the model's performance? Does it have a name?
- Is it a probability (PPV is a probability while sensitivity seems to be a ratio)?
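For concreteness, here is a minimal sketch of the threshold sweep I mean, using scikit-learn on synthetic data; the dataset, model, and threshold grid are illustrative placeholders, not my actual setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for my data: an imbalanced binary problem (illustrative only).
X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]  # score for the positive class

# Sweep the classification threshold and record PPV * sensitivity at each point.
thresholds = np.linspace(0.05, 0.95, 19)
products = []
for t in thresholds:
    y_pred = (scores >= t).astype(int)
    ppv = precision_score(y_test, y_pred, zero_division=0)  # guard: nothing flagged positive
    sens = recall_score(y_test, y_pred)
    products.append(ppv * sens)

best = thresholds[int(np.argmax(products))]
print(f"Threshold maximizing PPV * sensitivity: {best:.2f}")
```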
Thanks.