I need to predict (estimate) probabilities of (rare) events when the training data only contains the yes/no indicator.
I.e., my target (dependent) variable is binary (logical).
What I need is not just to predict yes/no, but to estimate the probability of yes for each observation.
If I use logistic regression, then the model output is, indeed, an estimate of the probability. What if I am using a different model, e.g., vw (because, e.g., it is faster and outperforms logistic regression as a binary classifier)?
So, I have a model which produces a score for each observation, and I want to convert that score to a probability.
It is natural to use total variation distance to evaluate the probability prediction, which motivated my previous question. The accepted answer there suggests Liblinear with L1 loss, but that produces a binary classifier, not a probability estimator.
So, how do I calibrate model scores so that they actually estimate the event probabilities?
Currently, I train a logistic regression with the score as its single independent variable to map the scores to probabilities. Can I do better?
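For concreteness, here is a minimal sketch of the kind of score-to-probability mapping I mean. It is not my actual pipeline: the scores and labels are synthetic, the "true" coefficients (2 and -3) are made up for the demo, and `fit_score_calibrator` is a hypothetical helper name. It fits p = sigmoid(a * score + b) by Newton's method on the log loss, using only the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_score_calibrator(scores, labels, n_iter=50, ridge=1e-9):
    """Fit p = sigmoid(a * score + b) by Newton's method on the log loss.

    Hypothetical helper for illustration; `ridge` only stabilizes the
    2x2 Hessian inversion and does not change the optimum.
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g_a = g_b = 0.0            # gradient of the log loss
        h_aa = h_ab = h_bb = 0.0   # Hessian entries
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            r = p - y
            w = p * (1.0 - p)
            g_a += r * s
            g_b += r
            h_aa += w * s * s
            h_ab += w * s
            h_bb += w
        h_aa += ridge
        h_bb += ridge
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve H * [da, db] = [g_a, g_b] for the 2x2 system.
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a -= da
        b -= db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Synthetic stand-in for real model scores (assumption: not real data).
random.seed(0)
scores = [random.gauss(0.0, 1.0) for _ in range(5000)]
labels = [1 if random.random() < sigmoid(2.0 * s - 3.0) else 0 for s in scores]

a, b = fit_score_calibrator(scores, labels)
probs = [sigmoid(a * s + b) for s in scores]
print("a =", a, "b =", b)
print("mean predicted probability:", sum(probs) / len(probs))
print("observed event rate:       ", sum(labels) / len(labels))
```

Because the fit includes an intercept, the mean predicted probability matches the observed event rate on the training data, but the mapping is restricted to a sigmoid shape in the score, which is the limitation I am asking about.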