
Suppose I want to train a deep neural network to perform classification or regression, but I want to know how confident the prediction will be. How could I achieve this?

My idea is to compute the cross-entropy for every training datum, based on its prediction performance in the neural network above. Then I would train a second neural network for regression, which would take each datum as input and output its cross-entropy (one output node). You would then use both networks in practice -- one to predict the label/value, and the other to predict the confidence of the first network. (....But would I then need a third network to predict the confidence of the second network, and so on...?!)
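To make the proposal concrete, here is a minimal numpy sketch of the per-datum cross-entropy values I have in mind as regression targets for the second network (the probabilities and labels are made-up):

```python
import numpy as np

# Hypothetical class probabilities output by the first (classification) network
probs = np.array([[0.9, 0.1],
                  [0.6, 0.4],
                  [0.2, 0.8]])
labels = np.array([0, 0, 1])  # true class indices

# Per-example cross-entropy: -log p(true class).
# These scalars would be the regression targets for the second network.
ce = -np.log(probs[np.arange(len(labels)), labels])
```

A confidently correct prediction (0.9 on the true class) gets a small loss, an uncertain one (0.6) a larger loss, so the second network would be trained to predict "how wrong" the first tends to be on each input.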

Is this a valid idea? Moreover, is it a standard idea commonly used? If not, what would you suggest?

Ferdi
Karnivaurus
  • Prediction values can be interpreted as confidence. – yasin.yazici Mar 05 '16 at 05:36
  • Perhaps you may take a bootstrap approach, replicating your model over n samples and building a variance estimator and perhaps a confidence interval for your predictions. – D.Castro Mar 11 '16 at 12:13
  • 1
    see my answer to a similar question here http://stats.stackexchange.com/a/247568/56940 – utobi Nov 24 '16 at 07:45
  • For classification, as some have answered, the probabilities are themselves some measure of your confidence. For regression, you may find [my answer](http://stats.stackexchange.com/a/247619/106369) from a very similar question useful. – etal Nov 24 '16 at 07:13

2 Answers


Perhaps I am misunderstanding the question, but for classification it seems to me the standard way is to have an output neuron for each of the N classes.

Then the N-vector of [0, 1] output values represents the probability of the input belonging to each class, and so can be interpreted as the "confidence" you want to obtain.

giorgiosironi
  • The output is usually a softmax layer and that's how you get the value of the neurons to fall inside $[0,1]$. – horaceT Sep 11 '16 at 14:07

For folks who are interested in NN prediction confidence estimation, you may wish to take a look at Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (Gal and Ghahramani, 2016). Briefly, it demonstrates how the variance of a network's predictions over many forward passes with dropout kept active can be used to estimate prediction confidence. This approach can be employed for networks designed for classification or for regression.
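The core mechanic can be sketched in a few lines of numpy. This is only a toy illustration of Monte Carlo dropout, not the paper's full treatment: the "network" below is a random untrained one-hidden-layer model, and the hyperparameters are made-up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained network: one ReLU hidden layer, scalar output.
# Weights are random here purely for illustration.
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 1))

def predict_with_dropout(x, p=0.5):
    h = np.maximum(x @ W1, 0.0)        # ReLU hidden layer
    mask = rng.random(h.shape) > p     # keep each unit with probability 1 - p
    h = h * mask / (1.0 - p)           # inverted-dropout scaling
    return (h @ W2).item()

x = rng.normal(size=(1, 4))

# Key point: keep dropout ACTIVE at test time and run many stochastic passes
samples = [predict_with_dropout(x) for _ in range(200)]
mean = np.mean(samples)  # predictive mean
std = np.std(samples)    # spread across passes ~ model uncertainty
```

Inputs on which the stochastic passes disagree (large `std`) are the ones the model is less confident about.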

lebedov