I'm curious whether there are any useful metrics for evaluating classification models based on the numeric probabilities they output.
Traditionally, I would train a classification model, generate factor predictions on the test set, and use a confusion matrix or ROC curve to pick the best model. In this instance, however, I'm interested in evaluating the models by looking at the numeric probabilities directly.
Update
An example of what I'm talking about is this: I fit multiple models and have each predict classes on the test set. Usually I can then create a confusion matrix for each model:
Model 1 (rows = predicted, columns = actual):

        Yes   No
  Yes    10    5
  No      2   13

Model 2 (rows = predicted, columns = actual):

        Yes   No
  Yes     3   11
  No      8    8
From the confusion matrices, I can clearly tell that model 1 is more accurate than model 2: its accuracy is (10 + 13)/30 ≈ 0.77, versus (3 + 8)/30 ≈ 0.37 for model 2.
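For concreteness, here is a rough sketch of how I pull the usual label-based metrics out of those two matrices (Python with NumPy, purely as an illustration since no language is tied to the question; I'm taking rows as predicted and columns as actual):

```python
import numpy as np

# Confusion matrices from above: rows = predicted (Yes, No), columns = actual (Yes, No).
model_1 = np.array([[10,  5],
                    [ 2, 13]])
model_2 = np.array([[ 3, 11],
                    [ 8,  8]])

for name, cm in [("Model 1", model_1), ("Model 2", model_2)]:
    accuracy = np.trace(cm) / cm.sum()       # (TP + TN) / total
    sensitivity = cm[0, 0] / cm[:, 0].sum()  # TP / (TP + FN), i.e. recall on "Yes"
    specificity = cm[1, 1] / cm[:, 1].sum()  # TN / (TN + FP)
    print(f"{name}: accuracy={accuracy:.2f}, "
          f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```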
How would I evaluate the two models if I had them give me numeric probabilities instead? For instance:
Model-1 Preds   Model-2 Preds   Test Set
     .59             .25           No
     .14             .08           No
     ...             ...           ...
     .33             .29           Yes
I have thought about discretizing the probabilities, or converting the yes/no labels into 1s and 0s and calculating the residuals. I just wanted to know whether there are more formal best practices for this case.
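To make the residual idea concrete, here is a rough sketch on just the three visible rows from the table above, with Yes encoded as 1 and No as 0 (Python with scikit-learn, again only as an illustration; the mean squared residual computed this way is what's usually called the Brier score, and log loss is shown alongside it):

```python
import numpy as np
from sklearn.metrics import brier_score_loss, log_loss

# Only the three visible rows from the table above; the elided rows are left out.
# "Yes" is encoded as 1, "No" as 0.
y_true = np.array([0, 0, 1])
model_1_probs = np.array([0.59, 0.14, 0.33])
model_2_probs = np.array([0.25, 0.08, 0.29])

for name, probs in [("Model 1", model_1_probs), ("Model 2", model_2_probs)]:
    brier = brier_score_loss(y_true, probs)  # mean squared residual vs. 0/1 outcome
    ll = log_loss(y_true, probs)             # heavily penalizes confident wrong probabilities
    print(f"{name}: Brier score={brier:.3f}, log loss={ll:.3f}")
```

Lower is better for both, and either score can be compared across the two models without discretizing the probabilities first.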