I have a set of 9000 outcomes and I wish to estimate (with a 95% confidence interval) the number of 'positive' examples. After viewing 413 examples I found 13 positives (a positive rate of 13/413 = 0.0315). Using the cumulative binomial distribution I can estimate the total number of positives with a 95% confidence interval.
However, the calculated positive rate is based on a small sample, and adding a single further positive example can shift the 95% confidence interval by a large amount. How can I incorporate the size of my sample into the estimate so that it is more robust?
My MATLAB code:
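N_found = 13;    % positives found in the sample (13 of the 413 viewed)
N_excl = 400;    % negatives found in the sample
N_tot = 9000;    % total number of outcomes
% Quantiles 0.025 and 0.975 give a two-sided 95% interval ([0.05 0.95] would give 90%)
binoinv([0.025 0.975], N_tot-N_found-N_excl, N_found/(N_found+N_excl)) + N_found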
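
To illustrate the sensitivity I am worried about, here is a minimal sketch that reruns the same calculation with 13 and with 14 positives in the sample. It is the same plug-in approach as above, so it still treats the estimated rate as if it were known exactly. The names q, counts and ci_all are only for illustration, and binoinv requires the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: rerun the plug-in binomial interval for 13 and 14 sample positives.
N_excl = 400;            % negatives seen in the sample
N_tot  = 9000;           % total number of outcomes
q      = [0.025 0.975];  % quantiles for a two-sided 95% interval

counts = [13 14];                   % positives found in the sample
ci_all = zeros(numel(counts), 2);   % one [lower upper] row per count

for k = 1:numel(counts)
    N_found = counts(k);
    n_seen  = N_found + N_excl;     % sample size
    p_hat   = N_found / n_seen;     % plug-in positive rate
    % Quantiles of the positives among the unseen outcomes, plus those already found
    ci_all(k, :) = binoinv(q, N_tot - n_seen, p_hat) + N_found;
    fprintf('N_found = %d: p_hat = %.4f, interval = [%d, %d]\n', ...
            N_found, p_hat, ci_all(k, 1), ci_all(k, 2));
end
```

Comparing the two rows of ci_all shows how far the interval moves when a single extra positive is found.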