I've trained a binary classifier for a language identification problem. The training data consists of $n$ sentences from language A and $n$ sentences from language B, where the $n$ language-B sentences are selected uniformly at random from a larger corpus of $N$ sentences belonging to language B. The rationale is to create a balanced dataset for training.
Next, I'm using the classifier to identify the language of $m$ sentences from a new corpus, for which I don't have the ground-truth language. For each such sentence, the classifier predicts a probability $p$ that the sentence belongs to language A (or equivalently, a probability $1-p$ that it belongs to language B).
I'd like to have some measure of how strongly the classifier believes the sentence belongs to A rather than B, so I've used $dp := |p - (1-p)| = |2p - 1|$. One concern is how the random selection of $n$ out of $N$ sentences for the training set affects $p$. One simple solution that comes to mind is to conduct $k$ experiments (say $k = 10$), each with an independently drawn training set, and for each unknown sentence report the average and standard deviation of $dp$ over the $k$ experiments.
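For concreteness, here is a minimal sketch of the procedure I have in mind, assuming a scikit-learn-style text classifier. The corpora, feature choice, and classifier below are placeholder assumptions for illustration, not my actual setup:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the real corpora (illustrative assumption; in practice
# these would be the actual sentence lists).
corpus_a = ["the cat sat on the mat", "where is the station",
            "i would like some coffee", "it rains a lot here"]        # language A (size n)
corpus_b_full = ["el gato esta en la alfombra", "donde esta la estacion",
                 "quisiera un cafe por favor", "aqui llueve mucho",
                 "buenos dias a todos", "no entiendo la pregunta",
                 "la casa es muy grande", "gracias por la ayuda"]     # language B (size N)
unknown = ["the rain in spain", "el perro come pan"]                  # m unlabeled sentences

n = len(corpus_a)   # balanced training set: n sentences from A, n from B
k = 10              # number of repetitions

rng = np.random.default_rng(0)
dp_runs = []
for _ in range(k):
    # Draw a fresh balanced training set: all n A-sentences plus n random B-sentences.
    sample_b = list(rng.choice(corpus_b_full, size=n, replace=False))
    X_train = corpus_a + sample_b
    y_train = [1] * n + [0] * n   # 1 = language A, 0 = language B

    clf = make_pipeline(
        CountVectorizer(analyzer="char", ngram_range=(1, 3)),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(X_train, y_train)

    p = clf.predict_proba(unknown)[:, 1]   # column 1 = P(label 1) = P(language A)
    dp_runs.append(np.abs(2 * p - 1))      # dp = |p - (1 - p)| = |2p - 1|

dp_runs = np.asarray(dp_runs)              # shape (k, m)
dp_mean = dp_runs.mean(axis=0)             # average dp per unknown sentence
dp_std = dp_runs.std(axis=0, ddof=1)       # spread due to the random subsample of B

for sent, mu, sd in zip(unknown, dp_mean, dp_std):
    print(f"{sent!r}: mean dp = {mu:.3f}, std = {sd:.3f}")
```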
Questions:
- Does it make sense to use $dp$ as such a measure?
- Does it make sense to report the average and std of $dp$ over $k$ experiments? If so, what should be the value of $k$?
Note that there are, of course, questions of how we know the classifier is any good, whether the probability $p$ is calibrated, etc. I'm ignoring these issues for simplicity's sake.