
Suppose every day an agent randomly exhibits one of two behaviours (behaviourA or behaviourB), and the probabilities of exhibiting behaviourA and behaviourB are unknown.

After n days, we will have a sample of n days' data.

Suppose n = 1000 and we have witnessed behaviourA 800 times and behaviourB 200 times. We could then estimate the probability of behaviourA as 80% and the probability of behaviourB as 20%.

Question

How do we measure how confident we are of these probabilities, and does that change when the sample size is small?

Example with small sample size

Suppose n = 10, with 8 x behaviourA and 2 x behaviourB. The estimated probabilities would be identical, but we couldn't be as confident in those estimates as we could with n = 1000. My guess is that we need some way of penalising the small sample size, but I am not sure how to accomplish that.
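To make the intuition concrete, here is a minimal sketch that computes the usual standard error of a sample proportion, sqrt(p(1 - p)/n), for both sample sizes. The function name `proportion_standard_error` is made up for illustration, and treating the days as independent draws with a fixed probability is an assumption, not something the data above can confirm.

```python
import math


def proportion_standard_error(successes: int, n: int) -> tuple[float, float]:
    """Return (estimated proportion, standard error of that estimate).

    Assumes n independent days, each with the same unknown probability
    of behaviourA (a binomial model).
    """
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


# Same 80% estimate, very different uncertainty.
for successes, n in [(8, 10), (800, 1000)]:
    p_hat, se = proportion_standard_error(successes, n)
    print(f"n={n}: estimate={p_hat:.2f}, standard error={se:.4f}")
```

With 100 times as many days the standard error is 10 times smaller, which is one way of expressing why the n = 10 estimate deserves less confidence. For very small n this normal-approximation quantity is only a rough guide.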

stevec
  • For (much) more on this topic, see [this search](https://stats.stackexchange.com/search?q=binomial+confidence+interval+small+score%3A2). – whuber Apr 02 '20 at 13:28
  • @whuber thanks very much. There appear to be a lot of ways to ask the same or similar questions. I wouldn't have thought to google for a lot of the terms in the similar questions. Thanks also for the search link, very helpful. Really appreciate it – stevec Apr 02 '20 at 23:10
  • You have identified one of the challenges of running this site, Steve: nowadays, almost ten years after the site was started, most questions already have answers, but finding them can be difficult. Would it surprise you that the best search term in this context is ["Clopper"](https://stats.stackexchange.com/search?q=clopper)?! (It's best used in combination with other terms, but even by itself it works well.) – whuber Apr 03 '20 at 11:34
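The "Clopper" search term in the comment above points at the Clopper-Pearson (exact) binomial confidence interval. Below is a rough sketch of that interval using only the Python standard library, applied to the two samples from the question. The helper names (`binom_cdf`, `bisect_root`, `clopper_pearson`) and the 95% level are illustrative assumptions, and the bisection approach is just one way to compute the bounds, not necessarily what the linked answers use.

```python
import math


def binom_pmf(i: int, n: int, p: float) -> float:
    """P(X = i) for X ~ Binomial(n, p), computed in log space (0 < p < 1)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
               + i * math.log(p) + (n - i) * math.log(1 - p))
    return math.exp(log_pmf)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def bisect_root(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Root of an increasing function f on (lo, hi) with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided (1 - alpha) Clopper-Pearson interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect_root(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


for k, n in [(8, 10), (800, 1000)]:
    lo, hi = clopper_pearson(k, n)
    print(f"n={n}: 95% interval for P(behaviourA) = ({lo:.3f}, {hi:.3f})")
```

Both samples give the same 80% point estimate, but the interval for n = 10 is far wider than the one for n = 1000, which is the "penalty" for the small sample that the question asks about.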

0 Answers