I have a trained image classifier and 30 test images, 24 of which are classified correctly. With a decision threshold of 0.5, the remaining 6 images are misclassified as false positives. I'd like to compute a 95% confidence interval for the mean error, and the following seemed like a reasonable methodology: compute the mean error $\bar x$, find the sample standard deviation $\sigma$, and obtain the 95% confidence interval as:
$$\bar x \pm 1.96\,\frac{\sigma}{\sqrt{30}}$$
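Concretely, this is how I intend to evaluate that interval (a minimal sketch; `ci95` is just a throwaway helper name, and I'm using the sample standard deviation via `ddof=1`):

```python
import numpy as np

def ci95(values):
    """Normal-approximation 95% confidence interval for the mean of `values`."""
    xbar = np.mean(values)
    sigma = np.std(values, ddof=1)              # sample standard deviation
    half_width = 1.96 * sigma / np.sqrt(len(values))
    return xbar - half_width, xbar + half_width
```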
My first question is whether this approach is appropriate for a problem of this nature; my second is how to compute $\bar x$. Since 24 images are correctly classified, their associated errors are zero, and the error of each misclassified image is its predicted score's distance from the decision threshold, i.e., $|x - 0.5|$. I'm therefore computing the mean over 24 zeros and 6 error values. If this approach is sound, is it correct to say that we have 95% confidence that predictions greater than $0.5 + \bar x + 1.96\frac{\sigma}{\sqrt{30}}$ are not erroneous (i.e., not false positives)?
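To make the full calculation explicit, here is a sketch of what I have in mind; the six false-positive scores below are made up purely for illustration:

```python
import numpy as np

# Hypothetical predicted scores for the 6 false positives (invented for illustration).
fp_scores = np.array([0.55, 0.58, 0.62, 0.66, 0.71, 0.80])

# The 24 correctly classified images contribute zero error; the 6 false positives
# contribute their distance from the 0.5 threshold.
errors = np.concatenate([np.zeros(24), np.abs(fp_scores - 0.5)])

xbar = errors.mean()
sigma = errors.std(ddof=1)                      # sample standard deviation
half_width = 1.96 * sigma / np.sqrt(len(errors))

# The candidate "safe" threshold my second question is about.
threshold = 0.5 + xbar + half_width
print(f"mean error = {xbar:.4f}, 95% CI = ({xbar - half_width:.4f}, {xbar + half_width:.4f})")
print(f"candidate threshold = {threshold:.4f}")
```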