When sampling a given population many times the ratio of confidence intervals with the population proportion seems wrong

Question

I wrote a Python script generating a population of yes/no votes with 50% of the votes set randomly to yes.

Then I take samples of 50 votes many times (say 10,100,100,10000 and 100000) and test for each sample whether its confidence interval for a given confidence level of 95% contains the population proportion.

I expected the ratio between the number of CIs which contain and the number of all generated CIs to become closer and closer to 0.95, but here is what I get:

  num of samples   ratio
  ---------------- ----------
  10               1.0
  100              0.99
  1000             0.95
  10000            0.9366
  100000           0.93337
  1000000          0.935186

This is looks like it gets closer and closer to 0.935 or so.

Is this still likely correct or is there rather a bug in my program?

Some Details about my procedure:

I calculate the confidence interval CI from the confidence level cl like this

\begin{alignat*}{2} \text{CI}\; =\; \hat p\; &\pm\; &z^\star\:&\times\:\sqrt{\frac{\hat p \left(1 - \hat p\right)}{n}} \quad \text{with}\; z^\star \text{corresponding to a confidence level of}\; 95\% \end{alignat*}

or in Python code:

sigma_p_hat = math.sqrt(p_hat * (1 - p_hat) / n)
cdf = 0.5+cl/2
z_star = stats.norm.ppf(cdf)
E = z_star * sigma_p_hat
CI = pd.Interval(p_hat-E, p_hat+E, closed='both'

What type of confidence interval did you use? Many of them only have approximate coverage, and the 'exact' types for proportions are approximate! — Michael Lew, Nov 15 '19 at 01:27
@MichaelLew Good point! I have added my method of calculating the confidence interval to the question. Re *approximation*: I thought the approximation only goes wrong for confidence levels *very* close to 100%. 95% is still quite a bit off. — halloleo, Nov 15 '19 at 05:18
@Glen_b The ratio between CIs which contain $p$ and all generated CIs. Will add this to the question as well. — halloleo, Nov 15 '19 at 08:20
With a sample of 50, there are only 51 possible values of $\hat p$ and therefore only $51$ possible confidence intervals. That alone indicates it is unreasonable to expect the coverage to be exactly 95%. Are you therefore wondering why it is *less* than 95%? — whuber, Nov 15 '19 at 20:34
@whuber Very interesting aspect! haven't thought of the *granularity* of the intervals. I think **this is worth an answer**. Would you mind putting this in one? — halloleo, Nov 18 '19 at 00:19
@whuber On second thoughts:I understand that the intervals are pretty granular, and I assume this will _somehow_ limit the possible values for the "CI contains p" / "all CIs ratio", but I don't understand exactly why this is so...Maybe the ratio can still vary more finely? — halloleo, Nov 18 '19 at 00:29

score 1 · Answer 1 · answered Nov 15 '19 at 20:10

1

The coverage by the normal approximation that you have used is erratic, unreliable and frequently low. That method, often called the Wald method, should not be used despite its prominence in textbooks.

See here for more information: Discrete functions: Confidence interval coverage?

answered Nov 15 '19 at 20:10

Michael Lew

10,995
2
29
47

I had a look at the answer link you provided. It says: "...they [Neyman's confidence intervals] provide coverage over all possible parameter values in the long run." But isn't that what I attempt with my repeated sampling and its counting toward a ratio? – halloleo Nov 18 '19 at 00:23

When sampling a given population many times the ratio of confidence intervals with the population proportion seems wrong

Some Details about my procedure:

1 Answers1