Why is a 100 heads run surprising?

Question

Assume we have a fair coin. We flip it 100 times. The outcome is all heads.

Why is the all-heads outcome more surprising to us than a "more random looking" outcome with less regularity?

Don't all outcomes have the same probability, $2^{-100}$?

And to make this a more statistical question: what intuition do tests like $\chi^2$ try to capture? If a sequence with a lot of regularity and a sequence with much less regularity both have the same probability, why would I distinguish between them? Why would I consider one more surprising than the other?

Originally asked at MSE, but I didn't get a satisfactory answer there.

What do you mean by "regularity"? Also, are you assuming that the coin flips are independent? If they are, then every sequence of 100 successive flips has the same "surprise" as measured by its joint probability. If they are not independent, then not all sequences of 100 successive flips have the same "surprise" as measured by their joint probability. — mhdadk, Mar 29 '21 at 23:42
This website works best when users ask one question at a time. The introduction of the $\chi^2$ test is entirely unrelated to the question about coin flips. To help us answer the coin flip question, can you elaborate on what is unsatisfactory about the MSE explanation? — Sycorax, Mar 29 '21 at 23:44
If I were to paraphrase, I'd guess that the question here is "If all length-$n$ sequences are equiprobable, why is a sequence of $n$ heads surprising?" The point of departure is that there's one and only one way to make a sequence of all heads, but there's more than one way to make every sequence with a *mix* of heads and tails ($n \ge 2$). We can fully derive the *distribution of the number of heads in any fixed-length sequence* from first principles and show how this is the case; see: https://stats.stackexchange.com/questions/256563 — Sycorax, Mar 29 '21 at 23:54
I am comparing outcomes, not the event of all heads vs. the event of a sequence with less regularity. If I were comparing events, what you said would make sense. — Kaveh, Mar 30 '21 at 00:14
If the coin is fair, $p(T)=p(H)=0.5$, the flips are independent, and you're conducting exactly $n$ flips, then you're correct: all sequences are equally probable. There doesn't seem to be a question here, unless you're asking a psychological question about what makes humans surprised. — Sycorax, Mar 30 '21 at 00:16
Why the MSE answers are not satisfactory: because they essentially say we shouldn't be surprised, whereas we are. I sense there is something about a prior on the coin and whether it is fair, but I'm not sure if that is it or something else. — Kaveh, Mar 30 '21 at 00:16
Why mention statistical tests: because they are used to catch non-random behavior, e.g. $\chi^2$ is used to test whether a pseudorandom number generator's output is random enough, or whether a sequence of outcomes in roulette is surprising (which, if I understand correctly, was one of the reasons Pearson came up with the test). — Kaveh, Mar 30 '21 at 00:18
Are you asking about the case where the coin is fair, or the case where $p(H)$ is unknown? Or something else? Are the coin flips independent? Is the number of flips fixed? In the question, you state the coin is fair, but in a comment you say that "there is something about a prior on the coin." — Sycorax, Mar 30 '21 at 00:29
I have a simple question really: Why do 100 heads surprise us? Why do repeated results in roulette surprise us? Why doesn't a string of repeated 1s look random to us? Is this a psychological and cognitive weakness, or is there really a statistical reason for our surprise? — Kaveh, Mar 30 '21 at 02:02
"Apophenia" is a psychological term for perceiving meaningful patterns where none exist. But questions about psychology are not on-topic here. If your question is statistical in nature, you'll need to edit your question to explain why you find it surprising for an event with probability $2^{-100}$ to occur once in $2^{100}$ experiments, as well as the exact setting of the experiment. — Sycorax, Mar 30 '21 at 02:37
@Kaveh, I've no idea why this would be closed; I think it's a good question. But I think the answer from MSE about Kolmogorov complexity or minimum description length is quite good. We expect that a sequence of coin flips will be one that we can't describe in a few words, because the average number of words needed to describe a sequence of coin flips is quite big. So "all heads" is surprising because it has a lower than average description length, in the same way that a 7ft tall person is surprising because they are far from the average height. — Flounderer, Mar 30 '21 at 03:21
I think I can point to a fundamental reason that has mathematical and psychological appeal. Independent outcomes are *exchangeable.* Thus we are naturally led to perceive any sequence of outcomes $a_1a_2\ldots a_{100}$ as a representative of a class of outcomes--an "event"--comprising all permutations of that sequence $a_{\sigma(1)}a_{\sigma(2)}\ldots a_{\sigma(100)}.$ When the sequence includes $k$ heads, this event has cardinality $\binom{100}{k}$ and the chance of its associated event is $\binom{100}{k}2^{-100}.$ Although this doesn't fully address the psychology, it's a good start. — whuber, Mar 30 '21 at 16:49
Yes, that is interesting. Following that, I think there is more going on. E.g. consider alternating 0s and 1s. I think that would be a bit less surprising than all heads, but still more surprising than an outcome that lacks regularity. So I think regularity plays a role. Similarly, if I run statistical tests on this, like a runs test, I think the result would be closer to what we get for all heads than to what we get for an outcome without regularity. — Kaveh, Mar 31 '21 at 03:23
Yes, that's what I mean by "not fully address the psychology." The closest people have come to quantifying what "lacks regularity" might mean is embodied in random number testing: see https://en.wikipedia.org/wiki/Diehard_tests for instance. Another approach (explained for 2D arrays of numbers rather than 1D) is suggested at https://stats.stackexchange.com/questions/17109. — whuber, Apr 01 '21 at 01:42
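As a rough illustration of the description-length idea raised in the comments above, here is a minimal Python sketch. It uses zlib's compressed size as a crude, computable stand-in for Kolmogorov complexity (which is not computable in general). The helper `compressed_size`, the fixed seed, and the three example sequences are assumptions made for illustration; they do not come from the thread.

```python
import random
import zlib


def compressed_size(flips: str) -> int:
    """Number of bytes zlib needs for the sequence (a crude proxy for description length)."""
    return len(zlib.compress(flips.encode("ascii"), 9))


rng = random.Random(0)  # arbitrary fixed seed so the example is reproducible

sequences = {
    "all heads": "H" * 100,
    "alternating": "HT" * 50,
    "irregular": "".join(rng.choice("HT") for _ in range(100)),
}

# Each of these sequences has probability 2**-100 under a fair coin,
# yet the regular ones admit a much shorter description.
sizes = {name: compressed_size(seq) for name, seq in sequences.items()}
for name, size in sizes.items():
    print(f"{name:12s} {size} bytes")
```

The regular sequences should compress to noticeably fewer bytes than the irregular one, which matches the intuition that "all heads" and "alternating" both look less random than a typical outcome even though all three are equally probable.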

score 5 · Answer 1 · answered Mar 29 '21 at 23:43

You're right; there's nothing special in terms of likelihood about 100 heads. One deserves to get equally excited about 20 heads, then 3 tails, then 6 more heads, then all tails.

This leads to my slant on this, based on Bayes' rule: seeing 100 heads in a row leads us to doubt our certainty that it's a fair coin with $f=0.5$—our hypothesis $\mathcal{H}$ about the data-generating process. There are other data-generating processes that could exist, which would better explain the 100 heads.

$$ p(\mathcal{H} \mid D) = \frac{p(D \mid \mathcal{H}) \times p(\mathcal{H})}{p(D)} $$

Even if all hypotheses are equally likely a priori, these observations are far more naturally explained by a bent coin, or even a two-headed trick coin! That's why they surprise us: they make us question our model of the world. A more even dispersion of heads and tails (a member of the 'typical set' for this distribution) does not lend the same credence to these alternative models.
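To make the Bayes' rule argument concrete, here is a minimal Python sketch. The three candidate hypotheses (a fair coin, a bent coin with $f=0.9$, and a two-headed coin with $f=1$), the equal priors, and the name `posterior` are assumptions chosen for illustration, not part of the argument above.

```python
# Hypothetical candidate models: probability of heads under each hypothesis.
HYPOTHESES = {"fair": 0.5, "bent": 0.9, "two_headed": 1.0}


def posterior(heads: int, tails: int) -> dict:
    """Posterior p(H | D) over HYPOTHESES, assuming equal priors and independent flips."""
    prior = 1.0 / len(HYPOTHESES)
    # Likelihood of one specific sequence with the given counts: f^heads * (1 - f)^tails.
    unnormalized = {
        name: prior * f**heads * (1.0 - f) ** tails
        for name, f in HYPOTHESES.items()
    }
    evidence = sum(unnormalized.values())  # p(D)
    return {name: value / evidence for name, value in unnormalized.items()}


all_heads = posterior(heads=100, tails=0)
mixed = posterior(heads=50, tails=50)

print("100 heads:        ", all_heads)
print("50 heads, 50 tails:", mixed)
```

Under these assumed alternatives, 100 heads moves almost all posterior mass away from the fair coin, while a specific 50/50 sequence (which has the same probability $2^{-100}$ under the fair coin) leaves the fair coin as the overwhelmingly favored hypothesis.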

The $\chi^2$ part of the question seems unrelated.

score 3 · Answer 2 · answered Mar 29 '21 at 23:47

Arya's answer is great. I'll offer the frequentist take. First off, the outcomes we usually care about are not equiprobable under common assumptions. Many sequences are equivalent under the assumption that one flip tells you nothing about the next flip. If this is a dubious assumption, then we could talk about probabilities of individual sequences, but under this assumption it's the number of heads that matters most. We call this assumption "independence".

Under independence, TTHHH is equivalent to HTHHT in this sense: both contain three heads. The probability of seeing a given number of heads in a sequence follows a well-studied distribution called the binomial. I will leave that with you to research.
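For reference, a short Python sketch of the head-count probabilities for a fair coin, using the binomial formula $\binom{n}{k}2^{-n}$. The function name `prob_k_heads` is an assumption introduced here for illustration.

```python
from math import comb


def prob_k_heads(n: int, k: int) -> float:
    """Probability of exactly k heads in n independent flips of a fair coin."""
    # comb(n, k) sequences share this head count, each with probability 2**-n.
    return comb(n, k) / 2**n


n = 100
print("sequences with 100 heads:", comb(n, 100))
print("sequences with 50 heads: ", comb(n, 50))
print("P(100 heads) =", prob_k_heads(n, 100))
print("P(50 heads)  =", prob_k_heads(n, 50))
```

Only one sequence has 100 heads, while a very large number of sequences have 50 heads, so the event "50 heads" is far more probable than the event "100 heads" even though each individual sequence has probability $2^{-100}$.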

Thank you. I have a BS in pure math, I know what iid and binomial distribution are. :) I know the number of heads distribution. I am comparing all heads to a single outcome, say 111001011011011000110... — Kaveh, Mar 30 '21 at 10:40
