
I've been reading a bit about the confidence intervals on Wikipedia. The section on misunderstandings says:

A 95% confidence level does not mean that for a given realized interval there is a 95% probability that the population parameter lies within the interval (i.e., a 95% probability that the interval covers the population parameter). According to the strict frequentist interpretation, once an interval is calculated, this interval either covers the parameter value or it does not; it is no longer a matter of probability. The 95% probability relates to the reliability of the estimation procedure, not to a specific calculated interval.

I read through an article making the same point. It concludes by saying:

The nature of confidence intervals is that they encompass the true value with some probability. In our example, a confidence interval tends to encompass the true value in 90% of trials. But that does not mean that for some specific interval, there is a 90% chance of finding the true value within the interval.

But I really don't understand the point of this argument. What is the difference between saying "there is a 90% probability that the true value is within our interval" and "there is a 90% probability that our interval included the true value"? Is it just a philosophical issue or can it really lead to incorrect conclusions?

  • For an extreme example of the distinction, see https://stats.stackexchange.com/questions/6652/what-precisely-is-a-confidence-interval/502199#502199. – fblundun Jan 28 '21 at 12:50
  • https://stats.stackexchange.com/search?tab=votes&q=confidence%20interval%20probability – whuber Jan 28 '21 at 13:25
  • Sorry, duplicate. Didn't find it in my quick search. Not deleting because someone already answered and gained upvotes. – relatively_random Jan 28 '21 at 15:02

1 Answer


There is a difference, because you have already obtained your sample and estimated your statistics. At this point there is no more probability: your confidence interval either contains the true value or it does not.

It is the same logic as throwing a die: before you throw it, there is a 1/6 probability of getting a 6; after you have thrown it, there is no probability anymore. Either you obtained a 6 or you did not.

For this reason you cannot make any probabilistic claims about your particular obtained interval, only about the expected frequency of coverage if you were to repeat the experiment many times and estimate many intervals.
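That frequency interpretation is easy to illustrate with a simulation (a minimal sketch, not part of the original answer; the true mean, sigma, and sample size are arbitrary choices): draw many samples, build a known-sigma 95% z-interval for the mean from each, and count how often the intervals cover the true mean.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, sigma, n, trials = 10.0, 2.0, 50, 10_000
z = 1.96  # two-sided 95% critical value of the standard normal

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, sigma, n)
    half_width = z * sigma / np.sqrt(n)  # known-sigma z-interval
    xbar = sample.mean()
    if xbar - half_width <= true_mean <= xbar + half_width:
        covered += 1

print(covered / trials)  # close to 0.95
```

The 95% describes the long-run fraction of intervals that cover the true mean; it says nothing extra about any single interval once its endpoints are computed.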

user2974951
  • The distinction between "probabilistic results" and "expected frequency" seems specious. The answer certainly is *not* what you claim in the first paragraph. See the duplicate thread for discussion of these issues. – whuber Jan 28 '21 at 13:27
  • I don't get your dice analogy. If you throw a die and see 6, there is no uncertainty involved anymore. This is most certainly not the case with confidence intervals. They're more like: a die was cast, you didn't see the answer, and someone who is known to lie 10% of the time told you it was 6. In that case, would it not be fair to say that there's a 90% probability that the die showed 6? – relatively_random Jan 28 '21 at 15:06
  • @relatively_random To be more precise, you throw a dice and you don't look at the result. The die has been cast, the result has happened, you just don't know what it is. But it either happened or it didn't. – user2974951 Jan 28 '21 at 15:27
  • Right, it either happened or it didn't. But you don't know whether it did or didn't happen, you can only assign probabilities to the outcomes that have possibly happened. And that probability is 90% for 6, 10% for not 6. In other words: if you were to bet on what really happened based on this information from the known witness, you should accept the bet if the payoff is more than 10/90 times the amount invested. – relatively_random Jan 28 '21 at 15:45
  • @relatively_random your calculation that there's a 90% probability that the die showed 6 assumes that whether the person lies is independent of the rolled number and that they choose which lie to tell by randomly picking one of the five wrong numbers. For example, if they lie with probability 0.6 when a 1 is rolled and always tell the truth otherwise, and if the lie is always that a 6 was rolled, then your probability that a 6 was really rolled given that they claim it was should be 5/8 by Bayes' Theorem, not 90%. The distinction you originally asked about has a similar explanation. – fblundun Jan 28 '21 at 16:51
  • @user2974951 suppose I've rolled a die and haven't looked at it. I'm going to flip a coin. If it lands heads, I will reveal the value of the die; if it lands tails I will reroll the die and reveal the rerolled value. Do you have a way to think about the probability that I will end up revealing a 6? Is it an unknowable number between 1/12 and 1/2? – fblundun Jan 28 '21 at 17:00
  • @fblundun Excellent counter-example. Love it. I feel like the misconception is actually starting to become clear. So basically, the singleton the upgraded unreliable witness gives us is still a 90% confidence interval because they give us correct numbers 90% of the time. But the probability that what they told us is correct is not 90%. – relatively_random Jan 28 '21 at 17:08
  • @fblundun Now that I think about it, this modified liar doesn't technically produce valid confidence intervals. According to the definition from Wikipedia, the 90% probability of capturing the parameter in the interval needs to apply for _all_ possible parameter values. But I get the point - the two statements from the question are not the same and they will, in general, require different answers. – relatively_random Jan 29 '21 at 09:56
  • @user2974951 correction: my last comment should have said "... an unknowable number between 1/12 and *7/12*?" – fblundun Jan 29 '21 at 10:09
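The numbers in fblundun's modified-liar example can be checked exactly (a sketch under that comment's assumptions: a fair die, a witness who lies with probability 0.6 only when a 1 is rolled, and whose only lie is to claim a 6). The witness reports a correct number 90% of the time, yet the posterior probability that a 6 was really rolled, given a claim of "6", is 5/8 rather than 90%:

```python
from fractions import Fraction

# Fair die: each face has prior probability 1/6.
prior = Fraction(1, 6)

# The witness claims "6" when a 6 was rolled (always truthful),
# or when a 1 was rolled and the witness lies (probability 0.6 = 3/5).
p_claim6_and_roll6 = prior * 1
p_claim6_and_roll1 = prior * Fraction(3, 5)
p_claim6 = p_claim6_and_roll6 + p_claim6_and_roll1

# Bayes' theorem: P(roll was 6 | witness claims "6").
posterior = p_claim6_and_roll6 / p_claim6

print(p_claim6)    # 4/15
print(posterior)   # 5/8
```

This mirrors the distinction in the question: the 90% is a property of the witness's reporting procedure over repeated rolls, not the probability that any one specific report is correct.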