Comparing weather predictions

Question

I'd like to know how to compare model predictions against binary outcomes in the following example, and to be pointed to further reading on the subject.

Specific Example - Comparing weathermen:

It either rains or doesn't rain each day; the overall (ensemble) probability of rain is 0.5.
There are two weathermen. The lazy one (weatherman A) just says there's a 50% chance of rain every day, but the hard-working one (weatherman B) always gives it either an 80% or a 20% chance of rain.
Weathermen A and B are both correct over a long period of time, i.e. their stated probabilities match the observed frequencies of rain.
Weatherman B's confusion matrix:

$\begin{array}{c | c c} & \textrm{shine} & \textrm{rain} \\ \hline \textrm{shine} & 0.4 & 0.1 \\ \textrm{rain} & 0.1 & 0.4 \end{array}$
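To make the setup concrete, here is a minimal simulation sketch of weatherman B. It assumes (this is my reading, not stated explicitly above) that B says 80% and 20% equally often, that it rains with exactly the stated probability, and that B's "call" is whichever outcome he gives more than 50%. Rows are taken to be B's call and columns the actual weather; since the matrix is symmetric, the orientation does not change the numbers. The function name `simulate_weatherman_b` is made up for illustration.

```python
import random


def simulate_weatherman_b(n_days=200_000, seed=0):
    """Return joint frequencies of (B's call, actual weather) over n_days."""
    rng = random.Random(seed)
    counts = {
        ("shine", "shine"): 0,
        ("shine", "rain"): 0,
        ("rain", "shine"): 0,
        ("rain", "rain"): 0,
    }
    for _ in range(n_days):
        # Assumption: B forecasts 80% or 20% rain with equal frequency.
        p_rain = 0.8 if rng.random() < 0.5 else 0.2
        # Assumption: B is calibrated, so it rains with the stated probability.
        actual = "rain" if rng.random() < p_rain else "shine"
        # B's call is the outcome he considers more likely.
        call = "rain" if p_rain > 0.5 else "shine"
        counts[(call, actual)] += 1
    return {key: count / n_days for key, count in counts.items()}


freqs = simulate_weatherman_b()
for call in ("shine", "rain"):
    print(call, [round(freqs[(call, actual)], 3) for actual in ("shine", "rain")])
```

Under these assumptions the simulated frequencies should come out close to the 0.4 / 0.1 / 0.1 / 0.4 entries in the matrix above.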

Question(s)

It's clear that weatherman B is better since his predictions are actually useful, but how would one mathematically justify that weatherman B is better?

One ad hoc metric I've come up with is

$1 - \left( P(\textrm{shine} \mid \textrm{shine}) + P(\textrm{rain} \mid \textrm{rain}) \right)$

but this metric would break down in a place such as Arizona, where it's almost always "shine".
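A small sketch of that breakdown, assuming the two terms in the metric are read as the diagonal entries of the confusion matrix (the entries sum to 1, so they behave like joint frequencies). The "always shine" forecaster and the 95% shine rate are hypothetical numbers picked only to mimic a dry climate; `ad_hoc_metric` is an illustrative name.

```python
def ad_hoc_metric(joint):
    """1 minus the sum of the diagonal entries (forecast and outcome agree)."""
    return 1 - (joint[("shine", "shine")] + joint[("rain", "rain")])


# Weatherman B, using the confusion matrix from the question.
weatherman_b = {
    ("shine", "shine"): 0.4,
    ("shine", "rain"): 0.1,
    ("rain", "shine"): 0.1,
    ("rain", "rain"): 0.4,
}

# Hypothetical dry climate: it shines on 95% of days and the forecaster
# always says "shine", so he never provides any information.
always_shine = {
    ("shine", "shine"): 0.95,
    ("shine", "rain"): 0.05,
    ("rain", "shine"): 0.0,
    ("rain", "rain"): 0.0,
}

print("weatherman B:", round(ad_hoc_metric(weatherman_b), 2))
print("always shine:", round(ad_hoc_metric(always_shine), 2))
```

With these numbers the uninformative forecaster scores 0.05 against 0.2 for weatherman B, so under this reading the metric rewards the skewed climate rather than the quality of the forecasts.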

If not duplicate, seems to me to be clearly on-topic and does not look like homework (this has been flagged/voted as either self-study or off-topic). — Juho Kokkala, May 14 '18 at 05:05
Possible duplicate of [Methods for evaluating predictive models in two-outcome systems](https://stats.stackexchange.com/questions/48419/methods-for-evaluating-predictive-models-in-two-outcome-systems) — Juho Kokkala, May 14 '18 at 05:06
