Let's say I know the true population proportion of a particular attribute: for example, 15% of people are left-handed. Imagine I take a random sample of 1000 people from around the world. How do I compute a 95% confidence interval for the number of people I expect to be left-handed in this sample?
-
Welcome to CV. This isn't a [confidence interval:](https://stats.stackexchange.com/questions/26450) it's a calculation of percentiles of the sampling distribution, which is *Binomial:* that word makes a good search term. – whuber Apr 14 '20 at 22:34
-
@whuber This makes sense, but when I look up binomial sampling, all the answers/examples assume that I want to calculate the true population proportion based on a sample proportion. But I want the opposite, and I can't seem to find anything on that. Can you direct me toward something that could help clear up my confusion? – Bob Apr 14 '20 at 23:05
-
Outline: Using the normal approximation to a binomial distribution you seek $P(L \le X \le U) \approx P[(L-\mu)/\sigma < Z < (U-\mu)/\sigma] = .95,$ where $X \sim \mathsf{Binom}(n,p)$ and $Z$ is standard normal. So $(U - \mu)/\sigma \approx 1.96.$ Evaluate $\mu$ and $\sigma$ and solve for $L$ and $U.$ // In my answer I show how to get L and U using R statistical software. – BruceET Apr 14 '20 at 23:50
1 Answer
Exact binomial computation. If $X \sim \mathsf{Binom}(n=1000, p=0.15),$ then
$P(128 \le X \le 172) = 0.954,$ according to exact
computations using R statistical software.
[In R, qbinom is the inverse of a binomial CDF (the quantile function), and dbinom is the binomial PMF.]
qbinom(c(.025,.975), 1000, .15)
[1] 128 172
sum(dbinom(128:172, 1000, .15))
[1] 0.9538581
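As a cross-check on the exact computation, here is a minimal sketch that recovers the same coverage from the binomial CDF and then confirms it by simulation. The variable names, the seed, and the number of replications are assumptions made for illustration only.

```r
# Sketch: verify the coverage of the interval returned by qbinom().
n <- 1000; p <- 0.15

# Lower and upper 2.5% quantiles of Binom(n, p)
bounds <- qbinom(c(0.025, 0.975), n, p)
lower <- bounds[1]; upper <- bounds[2]

# For integer-valued X, P(lower <= X <= upper) = F(upper) - F(lower - 1)
coverage <- pbinom(upper, n, p) - pbinom(lower - 1, n, p)

# Monte Carlo check: fraction of simulated counts that fall in the interval
# (seed and replication count are arbitrary choices)
set.seed(1)
x <- rbinom(100000, n, p)
sim_coverage <- mean(x >= lower & x <= upper)

print(c(lower = lower, upper = upper))
print(c(exact = coverage, simulated = sim_coverage))
```

Because $X$ is discrete, no interval of integers has coverage of exactly 0.95; the quantile-based interval covers slightly more.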
If this is from a problem in an elementary statistics or probability class that is not using software, then you may be expected to get a good approximate value by using the fact that the distribution of $X$ is approximately $\mathsf{Norm}(\mu = np, \sigma = \sqrt{np(1-p)}),$ where $n = 1000, p = 0.15.$ You could use either software or printed tables of the standard normal CDF for that computation.
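For a hand computation along those lines, the arithmetic works out as follows, assuming the usual rounded critical value 1.96 from a standard normal table:

$$\mu = np = 1000(0.15) = 150, \qquad \sigma = \sqrt{np(1-p)} = \sqrt{127.5} \approx 11.29,$$

$$L \approx \mu - 1.96\,\sigma \approx 127.87, \qquad U \approx \mu + 1.96\,\sigma \approx 172.13.$$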
One method using R is shown below. The normal quantiles 127.87 and 172.13 round to the integers 128 and 172. With a continuity correction, the area under the density curve of $\mathsf{Norm}(150, 11.29)$ between 127.5 and 172.5 is about 0.954. So the normal approximation method gives essentially the same integer boundaries.
n = 1000; p = .15
mu = n*p; sg = sqrt(n*p*(1-p))
mu; sg
[1] 150
[1] 11.29159
qnorm(c(.025,.975), mu, sg)
[1] 127.8689 172.1311
diff(pnorm(c(127.5,172.5), mu, sg))
[1] 0.9536984
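To make the comparison with the exact result explicit, here is a minimal sketch that solves for the two boundaries, rounds them to integers, and evaluates both the continuity-corrected normal area and the exact binomial probability. Rounding to the nearest integer is an assumption of this sketch, as are the variable names.

```r
# Sketch: normal approximation boundaries versus exact binomial coverage.
n <- 1000; p <- 0.15
mu <- n * p                    # mean of Binom(n, p)
sg <- sqrt(n * p * (1 - p))    # standard deviation of Binom(n, p)

z <- qnorm(0.975)              # critical value, about 1.96
L <- mu - z * sg               # solve (L - mu)/sg = -z
U <- mu + z * sg               # solve (U - mu)/sg =  z

# Round to the nearest integers (assumed rounding rule)
L_int <- round(L); U_int <- round(U)

# Continuity-corrected normal area over the integer range
approx_cov <- diff(pnorm(c(L_int - 0.5, U_int + 0.5), mu, sg))

# Exact binomial probability over the same integer range
exact_cov <- sum(dbinom(L_int:U_int, n, p))

print(c(L = L_int, U = U_int))
print(c(normal = approx_cov, exact = exact_cov))
```

With $np = 150$ and $n(1-p) = 850$ both large, the two probabilities agree to about three decimal places.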
The following figure shows the binomial distribution (blue histogram) and the approximating normal density curve (black curve).
