
I know that we typically assess the performance of classifiers with metrics like accuracy, ROC curves, etc., because we want to know whether a classifier can accurately predict an outcome. But what if we are more interested in the probabilistic output of a classifier? That is, the quantity we care about is not the predicted class but the probability assigned to that class. How can we assess the performance of the model in that case?

Peter
  • It is perhaps worth looking into the Brier score https://en.wikipedia.org/wiki/Brier_score – Cam May 29 '18 at 01:57
  • There are metrics that measure how closely a probability aligns with a label. The most popular are the log-loss (used in logistic regression, classification trees, gradient boosting, and neural networks) and the Brier score. Additionally, the AUC measures how well the probabilities put the observations in order, that is, how often a positive class is assigned a larger probability than a negative class. – Matthew Drury May 29 '18 at 04:08
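To make the metrics mentioned in these comments concrete, here is a minimal sketch using scikit-learn (the synthetic data, model, and variable names are illustrative assumptions, not from the thread) showing how log loss, the Brier score, and AUC are computed from predicted probabilities rather than hard class labels:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss, brier_score_loss, roc_auc_score

# Illustrative data and model; substitute your own classifier and hold-out set.
X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p = clf.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

print("log loss   :", log_loss(y_test, p))          # probability accuracy, lower is better
print("Brier score:", brier_score_loss(y_test, p))  # mean squared error of probabilities, lower is better
print("AUC        :", roc_auc_score(y_test, p))     # ranking quality only, higher is better
```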

1 Answer


Look into proper scoring rules, including but not limited to the Brier score and the log score. These scoring rules measure the performance of probabilistic forecasts. A proper scoring rule is optimized in expectation (maximized if the score is a reward, minimized if it is a loss such as the Brier score or log loss) when the predicted probabilities match the true probabilities. There also exists a taxonomy of score propriety, which includes "semi-proper" scoring rules. This Cross Validated answer provides a helpful description of the different types of score propriety.
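As a quick numerical sketch of why these rules are called proper (this simulation is an illustration added here, not part of the original answer): if outcomes are generated with a true probability of 0.7, both the Brier score and the log score, written as losses, are minimized in expectation by forecasting exactly 0.7:

```python
import numpy as np

rng = np.random.default_rng(0)
true_p = 0.7
y = rng.binomial(1, true_p, size=200_000)  # outcomes drawn with true probability 0.7

for q in [0.3, 0.5, 0.6, 0.7, 0.8, 0.9]:
    brier = np.mean((q - y) ** 2)                                  # Brier score (loss form)
    log_score = -np.mean(y * np.log(q) + (1 - y) * np.log(1 - q))  # log score (loss form)
    print(f"forecast {q:.1f}  Brier {brier:.4f}  log score {log_score:.4f}")

# Both losses bottom out at the forecast q = 0.7, matching the data-generating
# probability; no miscalibrated forecast can do better in expectation, which is
# the defining property of a proper scoring rule.
```

In practice, one compares models by averaging such scores over held-out data rather than by simulation.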

Brash Equilibrium