
I am looking for a metric to evaluate the quality of a classifier that, for a given object, outputs a probability for each known class. Say I use the classifier on an object of class A, and it classifies the object as A with 75% probability and as C with 25% probability.

One option is to simply take the probability assigned to the true class, in this case the 75%. Are there any other metrics?

The classifier uses a database of classes with different objects, but not all objects of one class share the same properties found in a query object, so not all objects of a class may be relevant.
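To make the "just take the probability of the true class" option concrete, here is a minimal Python sketch; the probability values and class labels are made up for illustration:

```python
import numpy as np

# Hypothetical per-class probabilities: rows are objects, columns are the
# known classes A, B, C.
probs = np.array([
    [0.75, 0.00, 0.25],   # object of class A: 75% A, 25% C
    [0.10, 0.80, 0.10],   # object of class B
    [0.20, 0.30, 0.50],   # object of class C
])
true_class = np.array([0, 1, 2])  # index of each object's true class

# Probability the classifier assigned to the true class, averaged over
# all objects, as one summary score.
true_class_prob = probs[np.arange(len(true_class)), true_class]
print(true_class_prob.mean())  # ~0.683
```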

1 Answer


If your objects do belong 100% to one class each, then you can

  • set a cutoff and calculate "hard" memberships from that => this gives the usual sensitivity & Co. (for an explanation of their meaning, see e.g. my answer here)

  • varying this cutoff yields receiver operating characteristic (ROC) curves or specificity-sensitivity diagrams etc. (both points are sketched in code after this list)
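
A minimal sketch of both bullets, assuming a binary problem and using scikit-learn; the labels and scores are invented for illustration:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, auc

# Hypothetical hard reference labels and predicted probabilities of the
# positive class.
y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([0.75, 0.60, 0.40, 0.80, 0.20, 0.55, 0.90, 0.10])

# Fixed cutoff -> "hard" memberships -> the usual sensitivity & Co.
cutoff = 0.5
y_hard = (scores >= cutoff).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_hard).ravel()
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))

# Varying the cutoff -> ROC curve and its area.
fpr, tpr, thresholds = roc_curve(y_true, scores)
print("AUC:", auc(fpr, tpr))
```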

If you want to stay with the 0 - 100% scores as output, sensitivity & Co. can be extended to cover that. You can also use regression-type performance measures such as the mean absolute error (MAE) or root mean squared error (RMSE), but in that case you should calculate those values separately for each (reference) class; see the sketch at the end of this answer.
Some presentations about the basic idea are at softclassval.r-forge.r-project.org (most of the extensions will not be needed unless the reference also takes values between 0 and 100%). [Paper is waiting for the OK from the co-authors.]
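
Here is a minimal sketch of the per-class MAE/RMSE idea; the exact conventions in softclassval may differ, and this version assumes a hard (one-hot) reference against which the soft scores are compared:

```python
import numpy as np

# Hypothetical predicted per-class probabilities (rows: objects, columns:
# classes) and hard reference labels.
probs = np.array([
    [0.75, 0.00, 0.25],
    [0.10, 0.80, 0.10],
    [0.20, 0.30, 0.50],
    [0.60, 0.30, 0.10],
])
true_class = np.array([0, 1, 2, 0])
reference = np.eye(probs.shape[1])[true_class]  # one-hot reference memberships

# MAE / RMSE between scores and reference, separately per reference class.
for c in range(probs.shape[1]):
    mask = true_class == c
    err = probs[mask] - reference[mask]
    print(f"class {c}: MAE={np.abs(err).mean():.3f}, "
          f"RMSE={np.sqrt((err ** 2).mean()):.3f}")
```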
