I have 3 independent sources for tomorrow's weather forecast:
- 100% probability for snow, this source is 80% accurate
- 50% probability for snow, this source is 60% accurate
- 0% probability for snow, this source is 40% accurate
The accuracy of each source is defined as: $$\frac{\text{Number of correct forecasts}}{\text{Total number of forecasts}}$$

What is the best estimate of the probability of snow tomorrow?
Essentially it is an extension of a previous question, but where each source has a different accuracy or reliability.
Also, the selected answer there suggested using the geometric mean. That ignores the fact that one of the probabilities is 0, which collapses the entire answer to 0. Intuitively this makes no sense: the forecast with the lowest accuracy should not override a more accurate forecast just because of numerical considerations.
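To illustrate the collapse, here is a minimal sketch that assumes a plain, unweighted geometric mean of the three forecast probabilities. The linked answer may have used a different (e.g. normalised) variant, and the helper name `geometric_mean` is my own:

```python
import math

# Forecast probabilities of snow from the three sources.
forecasts = [1.0, 0.5, 0.0]

def geometric_mean(values):
    """Plain geometric mean: the n-th root of the product of n values."""
    return math.prod(values) ** (1.0 / len(values))

# A single zero factor makes the whole product zero,
# regardless of what the other sources say.
collapsed = geometric_mean(forecasts)
print(collapsed)  # 0.0
```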
My intuition was to weight the probabilities by the accuracies: $$\frac{1.0 \times 0.8 + 0.5 \times 0.6 + 0 \times 0.4} {0.8 + 0.6 + 0.4}$$ but the interviewer insisted on solving it using Bayes' theorem.
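For reference, the accuracy-weighted mean above works out as follows. This is just a quick sketch of the arithmetic; the function name `accuracy_weighted_mean` is made up for illustration:

```python
# Forecast probabilities of snow and the stated accuracy of each source.
forecasts = [1.0, 0.5, 0.0]
accuracies = [0.8, 0.6, 0.4]

def accuracy_weighted_mean(probs, weights):
    """Mean of probs, each weighted by the matching entry in weights."""
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)

# (0.8 + 0.3 + 0.0) / 1.8
estimate = accuracy_weighted_mean(forecasts, accuracies)
print(round(estimate, 4))  # 0.6111
```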
In addition, if one of my sources has 100% accuracy, then it makes no sense to compute a weighted mean with the other sources.
I could weight the sources as in AdaBoost:
$$\alpha_m = \frac{1}{2}\ln\left( \frac{1 - \epsilon_m}{\epsilon_m}\right)$$
where $\epsilon_m$ is the error rate of source $m$ (one minus its accuracy), so e.g. for the source with 80% accuracy ($\epsilon_m = 0.2$), the weight would be
$$\alpha_m = \frac{1}{2}\ln\left( \frac{0.8}{0.2}\right) = \frac{1}{2}\ln(4)$$
etc. Is this an acceptable solution for this question?
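Applying the AdaBoost formula to all three sources gives the following weights. This is only a sketch, assuming the error rate of each source is one minus its stated accuracy; `adaboost_weight` is a name I made up:

```python
import math

# Stated accuracy of each source.
accuracies = [0.8, 0.6, 0.4]

def adaboost_weight(accuracy):
    """alpha = 0.5 * ln((1 - error) / error), with error = 1 - accuracy."""
    error = 1.0 - accuracy
    return 0.5 * math.log(accuracy / error)

weights = [adaboost_weight(a) for a in accuracies]
for a, w in zip(accuracies, weights):
    print(f"accuracy={a:.1f}  alpha={w:.4f}")
# accuracy=0.8  alpha=0.6931
# accuracy=0.6  alpha=0.2027
# accuracy=0.4  alpha=-0.2027
```

Note that the source with 40% accuracy gets a negative weight, and an accuracy of exactly 100% would make the error zero, so the weight is undefined (it grows without bound as the error approaches zero).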
In any case I'd be very happy to see how it can be solved using Bayes.
I have seen a few other questions similar to it, but none exactly the same.