8

Consider the three scoring rules in the case of a binary prediction:

  1. Log: sum(log(ifelse(outcome, probability, 1-probability))) / n
  2. Brier: sum((outcome-probability)**2) / n
  3. Sphere: sum(ifelse(outcome, probability, 1-probability)/sqrt(probability**2+(1-probability)**2)) / n
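
For concreteness, here is a minimal runnable R sketch of the three scores — just the formulas above wrapped in functions, assuming `outcome` is a logical (or 0/1) vector and `probability` is the predicted P(outcome = 1):

```r
# Mean score per prediction; log and sphere are higher-is-better,
# Brier (squared error) is lower-is-better.
log_score <- function(outcome, probability) {
  mean(log(ifelse(outcome, probability, 1 - probability)))
}
brier_score <- function(outcome, probability) {
  mean((outcome - probability)^2)
}
sphere_score <- function(outcome, probability) {
  mean(ifelse(outcome, probability, 1 - probability) /
         sqrt(probability^2 + (1 - probability)^2))
}
```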

What is the intuition behind them? When should I use one and not the other? I am especially interested in the case of low prevalence (e.g., 0.1%).

PS. This is to evaluate the results from my calibration algorithm which I asked about before.

sds
  • possible duplicate of [Justifying and choosing a proper scoring rule](http://stats.stackexchange.com/questions/126965/justifying-and-choosing-a-proper-scoring-rule) – sds Apr 24 '15 at 20:31
  • 2
    do you think your own post is a duplicate? As I read the linked thread, it does not (currently) answer all the questions I understand in your Q here. I would not vote to close, as I would be interested in answers to your questions (+1 from before). But you can always delete your own Q, if you want. – gung - Reinstate Monica Apr 24 '15 at 21:07
  • 1
    @gung: I would love to see an answer too, but the referenced question and its answer is highly related and I wanted to point that out. I think a "possible dupe" is a good way, especially since you clearly indicated your disagreement (thank you!) and thus made the actual closing unlikely. :-) – sds Apr 24 '15 at 21:27
  • 1
    You can simply add a comment to your Q w/ a link saying that it is related or may also be of interest to readers. That would accomplish what you set out to do here. I would not flag your Q for closing as a duplicate. – gung - Reinstate Monica Apr 24 '15 at 23:17
  • Regarding only 1.: the intuition is that it is the log-likelihood function for a binary outcome $Y$, which we know has certain optimality properties when maximized to fit statistical models. – Frank Harrell Nov 15 '20 at 11:52

2 Answers

2

One place where log scoring may be inappropriate: the comparison of human forecasters (who may tend to overstate their confidence).

Log scoring strongly penalizes very overconfident wrong predictions: a wrong prediction made with 100% confidence receives an infinite penalty. For example, suppose a commentator says "I am 100% sure that Smith will win the election," and Smith then loses. Under log scoring, the average score of all the commentator's predictions is now permanently stuck at $-\infty$, the worst possible value. Yet somebody who has made a single wrong 100%-confidence prediction is surely a better forecaster than somebody who makes them all the time, and a scoring rule should be able to distinguish the two.
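
A quick numeric illustration with hypothetical forecasts, showing how a single $\log(0)$ term dominates the average:

```r
# Three reasonable forecasts plus one 100%-confident miss:
outcome     <- c(TRUE, TRUE, FALSE, FALSE)   # last one: Smith loses
probability <- c(0.9,  0.8,  0.1,   1.0)     # "100% sure Smith wins"
mean(log(ifelse(outcome, probability, 1 - probability)))
#> [1] -Inf   -- one log(0) term drags the average to -Inf forever
```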

fblundun
  • 1
    I would say that "over-penalizing over-confidence" is a _feature_, not a _bug_. – sds Dec 20 '20 at 18:44
0

Log

The expected surprisal of the prediction when we discover the actual value (up to sign: the surprisal of an event we assigned probability $p$ is $-\log p$, so a higher average log score means the outcomes surprised us less).
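
As a sanity check of this reading, the log score is proper: under a true probability $q$, its expectation is maximized by honestly forecasting $p = q$. A quick numerical check in R:

```r
# Expected log score of forecast p when the true probability is q = 0.3.
q <- 0.3
expected_log <- function(p) q * log(p) + (1 - q) * log(1 - p)
optimize(expected_log, c(0.001, 0.999), maximum = TRUE)$maximum
#> approximately 0.3 -- honesty is optimal
```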

Brier

$L^2$, RMSE, OLS: the mean squared distance between the forecast probability and the 0/1 outcome.

However, the fact that $p=2$ is the only value that turns the $L^p$ norm into a proper scoring rule detracts from this intuition.
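
To see why other exponents fail, take $p=1$: under a true probability $q = 0.3$, the expected absolute error is minimized at the boundary, not at $q$, so it rewards dishonest extreme forecasts:

```r
# Expected absolute error of forecast p when the true probability is q = 0.3.
q <- 0.3
expected_abs <- function(p) q * abs(1 - p) + (1 - q) * abs(0 - p)
optimize(expected_abs, c(0, 1))$minimum
#> close to 0 (the boundary), not 0.3 -- the L^1 "score" is not proper
```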

Sphere

The cosine of the angle between the prediction vector $(p, 1-p)$ and the outcome vector $(0,1)$ or $(1,0)$.

Note that the angle itself is not a proper scoring rule, which also detracts from the intuition.
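
A one-prediction check that the formula in the question really is this cosine, using outcome TRUE so that the outcome vector is $(1,0)$:

```r
p        <- 0.7
forecast <- c(p, 1 - p)
actual   <- c(1, 0)        # unit vector, so no |actual| term needed below
cosine   <- sum(forecast * actual) / sqrt(sum(forecast^2))
sphere   <- p / sqrt(p^2 + (1 - p)^2)
all.equal(cosine, sphere)
#> [1] TRUE
```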

sds
  • Although you answered your own question, your answer is incoherent as a response to your interesting question. For example, "Log: the expected surprisal of the prediction when we discover the actual value" is incoherent as an answer to the question, "What is the intuition behind them? When should I use one and not the other?" – Tripartio Nov 11 '20 at 07:00
  • @Tripartio: do you have a better answer? – sds Nov 11 '20 at 11:46
  • 1
    No I don't. I upvoted the question because I would really like to learn the answer. However, I posted a related question: https://stats.stackexchange.com/questions/495935/non-mathematical-explanation-of-how-to-interpret-and-evaluate-scoring-rules-in-r – Tripartio Nov 11 '20 at 13:03