Other than a calibration plot, is there a way to decide how good one model's predictive probabilities are compared to another model's?
I'm not interested in error rates, as I find them ineffective at the level of precision I'm looking for.
The only quantity of interest is the predictive probability distribution, since I am pricing contracts using these probabilities.
EDIT:
I have no faith in scoring rules, based on the following experience with several different classifiers.
I simulated data from a known model, then fit both the known model and a deliberately worse model to the simulated data. The Brier and log scores do not agree that the known model is superior, even though the two models' class probabilities are materially different.
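A minimal sketch of the sort of experiment I mean (the logistic data-generating process, the coefficients, and the scikit-learn scoring calls here are illustrative placeholders, not my actual models):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, log_loss

rng = np.random.default_rng(0)

# Simulate from a known logistic model: y depends on x1 and x2.
n = 20000
X = rng.normal(size=(n, 2))
p_true = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] - 2.0 * X[:, 1])))
y = rng.binomial(1, p_true)

X_train, X_test = X[: n // 2], X[n // 2 :]
y_train, y_test = y[: n // 2], y[n // 2 :]

# "Known" model: correctly specified, uses both predictors.
known = LogisticRegression().fit(X_train, y_train)
# Deliberately worse model: drops the stronger predictor x2.
worse = LogisticRegression().fit(X_train[:, [0]], y_train)

p_known = known.predict_proba(X_test)[:, 1]
p_worse = worse.predict_proba(X_test[:, [0]])[:, 1]

# Lower is better for both proper scoring rules.
for name, p in [("known", p_known), ("worse", p_worse)]:
    print(f"{name}: Brier={brier_score_loss(y_test, p):.4f}, "
          f"log={log_loss(y_test, p):.4f}")
```

In a clean setup like this, both scores should favor the correctly specified model on average with enough test data, which is why the disagreement I'm seeing in my actual experiment puts me off scoring rules.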