I am currently taking a statistics course where the following scenario comes up frequently:

Suppose a sample of size $n$ is taken from a population. $X$ is a binomial variable. The number of successes in the sample is $n\hat{p}$. The confidence interval for estimating the population proportion $p$ is:
$$ \Bigg(\hat{p} - \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}t_{.975, n-1},\; \hat{p} + \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}t_{.975, n-1}\Bigg), $$ with $\hat{p}$ being the unbiased point estimator for the population proportion $p$.

The instructor has repeatedly emphasized the difference between the following two statements, but I don't understand what that difference is:

  1. "With 95% confidence, the limits of the confidence interval contain the population proportion."
  2. "The sample proportion falls within the limits of the confidence interval 95% of the time"

What is the difference between these two statements?

Ben
elbecker
  • [Here](https://stats.stackexchange.com/a/552522/307000) is a related thread you might find helpful. – Geoffrey Johnson Nov 28 '21 at 20:26
  • Your statement of the confidence interval doesn't have hats on the ps. That's ambiguous. If you know the true population proportion, you can use that confidence interval to make the 2nd statement. – gung - Reinstate Monica Nov 28 '21 at 20:31

2 Answers

The sample proportion is different to the population proportion. The first one is a known quantity that you can compute from your sample, whereas the second one is the unknown quantity that you are making an inference about. The second statement here is incorrect, since the sample proportion always falls within the confidence interval (by construction).
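The "by construction" point can be checked numerically. The sketch below is only illustrative: the function name `proportion_ci` and the counts (130 successes out of 400) are made up, and it assumes the standard normal quantile as a stand-in for $t_{.975, n-1}$, which is close for large $n$.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Return (lower, upper) for p_hat +/- crit * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    # Assumption: normal quantile used in place of t_{.975, n-1} (near-identical for large n).
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical sample: 130 successes in 400 trials.
lower, upper = proportion_ci(successes=130, n=400)
p_hat = 130 / 400
midpoint = (lower + upper) / 2

# p_hat is the midpoint of the interval, so it can never fall outside it.
print(lower, p_hat, upper)
```

Whatever counts are plugged in, the interval is symmetric about $\hat{p}$, so a "95% of the time" statement about $\hat{p}$ itself has no content.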

Ben
  • Thank you! When you say "the sample proportion always falls within the confidence interval by construction" does that mean this, particular sample proportion always falls in the confidence interval? Or any sample proportion of size n falls within that interval? Because my understanding was that the interval covers 95% of the sampling distribution of the population proportion, which means that a sample could technically fall in the outer tails. – elbecker Nov 28 '21 at 20:05
  • The sample proportion is the value $\hat{p}$ in your exposition --- it is the centre of the interval (by construction). – Ben Nov 28 '21 at 20:07
  • @elbecker the sample proportion always falls within the interval you computed *using that same sample* – hobbs Nov 29 '21 at 04:54

> The sample proportion falls within the limits of the confidence interval 95% of the time

It's not quite clear what this is claiming. If it's read as "95% of the times that a sample is taken and a confidence interval is calculated, the sample proportion lies within the confidence interval", that's not correct; confidence intervals are generally constructed to include the sample proportion (I suppose technically you could have a CI process that leads to this not being the case, but such a process would be highly nonstandard). If it means "If you continue taking more samples from the distribution that this sample came from, the sample proportion of 95% of those samples will be within the CI constructed from this sample", that's wrong (but slightly less wrong). Confidence intervals are not calculated around the percentage of samples being in the interval, and that percentage depends on where this particular $\hat{p}$ happened to land. The interval is centred on $\hat{p}$ rather than on $p$, so a new sample proportion has to absorb the sampling error of both samples; for large $n$ the difference of two independent sample proportions has a standard deviation about $\sqrt{2}$ times that of one, which puts the long-run fraction near $P(|Z| \le 1.96/\sqrt{2}) \approx 83\%$ rather than 95%.
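The second reading can be probed with a small simulation. This is a rough sketch under stated assumptions, not a definitive analysis: the true proportion `p = 0.3`, the sample size `n = 400`, the trial count, the seed, and the helper names are all made up, and the normal quantile stands in for $t_{.975, n-1}$. Each trial builds an interval from one sample and then asks whether the proportion from a fresh, independent sample lands inside it.

```python
import random
from math import sqrt
from statistics import NormalDist

# Assumption: normal quantile used in place of t_{.975, n-1}.
Z = NormalDist().inv_cdf(0.975)


def sample_proportion(rng, p, n):
    """Proportion of successes in n Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n)) / n


def interval(p_hat, n):
    """Interval p_hat +/- Z * sqrt(p_hat * (1 - p_hat) / n)."""
    half_width = Z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def fresh_inside_rate(p=0.3, n=400, trials=2000, seed=0):
    """Fraction of trials where a fresh sample's p_hat lies in an earlier sample's interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lower, upper = interval(sample_proportion(rng, p, n), n)
        fresh = sample_proportion(rng, p, n)  # independent new sample
        hits += lower <= fresh <= upper
    return hits / trials


rate = fresh_inside_rate()
print(rate)
```

The large-sample argument suggests a value in the neighbourhood of $P(|Z| \le 1.96/\sqrt{2}) \approx 0.83$, since the difference of two independent sample proportions has roughly twice the variance of one; the exact figure printed depends on the seed and on the assumed `p` and `n`.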

> With 95% confidence, the limits of the confidence interval contain the population proportion.

The meaning of this is also a bit fuzzy. The precise statement is that the CI was created with a process that has a 95% probability of producing a CI that includes the population proportion, and the term "confidence" is sometimes used to convey this idea, but it often results in people confusing it for "The probability of the population proportion being in the confidence interval", which is not correct.
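The "process" reading can be illustrated by repeating the whole procedure many times and counting how often the resulting interval contains the true proportion. The sketch below is a hedged illustration only: the true proportion `p = 0.3`, the sample size `n = 400`, the trial count, the seed, and the function name `coverage_rate` are assumptions, and the normal quantile is used in place of $t_{.975, n-1}$.

```python
import random
from math import sqrt
from statistics import NormalDist

# Assumption: normal quantile used in place of t_{.975, n-1}.
Z = NormalDist().inv_cdf(0.975)


def coverage_rate(p=0.3, n=400, trials=2000, seed=1):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its sample proportion.
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        half_width = Z * sqrt(p_hat * (1 - p_hat) / n)
        # Does this sample's interval contain the (normally unknown) p?
        covered += p_hat - half_width <= p <= p_hat + half_width
    return covered / trials


coverage = coverage_rate()
print(coverage)
```

Under these assumptions the printed fraction should sit close to 0.95. Each individual interval either contains $p$ or it does not; the 95% describes the long-run behaviour of the procedure, not any single interval.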

Acccumulation