Suppose I have a categorical response $Y$ and multiple categorical features $X$, and I want to fit a model to predict $Y$.
If all I care about is the eventual distribution of $Y$ (say, in terms of class percentages), can I somehow use that as the evaluation criterion for my model, rather than the actual predicted class labels?
Originally I thought that maybe, if I fitted some model, e.g.
library(randomForest)

set.seed(1)  # for reproducibility
y <- factor(sample(seq(3), 100, replace = TRUE))            # 3-class response
x <- matrix(sample(seq(5), 500, replace = TRUE), ncol = 5)  # 5 categorical features
x <- data.frame(apply(x, 2, as.factor))
rf <- randomForest(y ~ ., data = x)
then I could get the predicted class distribution like
test_probs <- predict(rf, x, type = "prob")  # per-class probabilities
test_ratio <- colSums(test_probs)
test_ratio <- test_ratio / sum(test_ratio)   # normalize to a distribution
and, e.g., calculate the MSE between this and the observed class distribution (inside a CV loop, of course). But even then, the class distribution across the CV folds will always be very similar. Would this be usable if I set the number of folds high enough that the between-fold variance becomes large, or is it always a bad idea?
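For concreteness, here is a minimal sketch of the CV loop I have in mind (the fold count, seed, and the choice of MSE on the class proportions are just illustrative assumptions):

```r
library(randomForest)

set.seed(1)  # for reproducibility
y <- factor(sample(seq(3), 100, replace = TRUE))
x <- matrix(sample(seq(5), 500, replace = TRUE), ncol = 5)
df <- data.frame(y = y, apply(x, 2, as.factor))

k <- 10  # number of CV folds (illustrative choice)
folds <- sample(rep(seq(k), length.out = nrow(df)))
fold_mse <- numeric(k)

for (i in seq(k)) {
  # fit on everything except fold i
  rf <- randomForest(y ~ ., data = df[folds != i, ])

  # predicted class distribution on the held-out fold
  probs <- predict(rf, df[folds == i, ], type = "prob")
  pred_dist <- colSums(probs) / sum(probs)

  # observed class distribution on the held-out fold
  true_dist <- prop.table(table(df$y[folds == i]))

  # MSE between the two distributions (both ordered by levels(y))
  fold_mse[i] <- mean((pred_dist - as.numeric(true_dist))^2)
}

mean(fold_mse)  # average distributional error across folds
```

My worry is exactly what this sketch shows: with random folds, `true_dist` barely varies from fold to fold, so the criterion mostly measures how well the model reproduces one nearly constant distribution.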