33

As we all know, if you flip a coin that has an equal chance of landing heads as it does tails, then if you flip the coin many times, half the time you will get heads and half the time you will get tails.

When discussing this with a friend, they said that if you were to flip the coin 1000 times, and let's say the first 100 times it landed heads, then the chances of landing tails were increased (the logic being that if the coin is unbiased, then by the time you have flipped it 1000 times you will have roughly 500 heads and 500 tails, so tails must be more likely).

I know that to be a fallacy, as past results don't influence future results. Is there a name for that particular fallacy? Also, is there a better explanation of why this is fallacious?

Jeromy Anglim
oggmonster
    If you flip a coin 100 times and it lands heads 100 times, the odds are that it is not an unbiased coin. – Robert Jun 08 '14 at 22:38
  • 1
    @Robert How so? Since each flip is independent of the other, the chance that it would be H 100x is the same as if it were a mismatched sequence of H & T, or 100x T – yuritsuki Jun 08 '14 at 23:02
  • It always amused me that we refer to "the Law of Averages" when in fact the "law" is nonexistent, as demonstrated in the post above, and yet we refer to Einstein's "Theory" of Relativity, which has been proven in many different ways over the years. When a gambler has a long losing streak and yet continues to gamble, saying, "My luck has got to change," well, no it doesn't. – user2953747 Jun 08 '14 at 23:13
  • 11
    @thinlyveiledquestionmark I'd like to play poker with you... but only if I'm allowed to deal. I think Robert means that the realization of 100 H in 100 trials would shift his belief from the coin being fair to the coin being unfair. Given this data 100 H in 100 trials, you would have to be possessed of a very strong prior on $\Pr(H)$ to not appreciably shift the posterior. – Sycorax Jun 09 '14 at 00:59
  • @user2953747 Yes, but "Theory" in that sense is "the set of things that follow from some axioms/postulates." For any set of axioms, there's a corresponding theory. The question of science is about "which theories are consistent, and which theories are *true in this world?*" – Joshua Taylor Jun 09 '14 at 15:30
  • 5
    @thinlyveiledquestionmark You have to be careful. Given independent flips, each 100-flip sequence of H or T is equally likely: 100H is as likely as 50H 50T, is as likely as HTHTHTHT...HT, and so on. But it is **much less likely** to get 100H than it is to get a total of 50 heads, because there are $10^{29}$ different ways to have 50 flips come up heads and 50 flips come up tails. – Lagerbaer Jun 09 '14 at 19:03
  • Think about it this way. Suppose you have a million dollars in pennies in a very large bin. You dump it out, "flipping" all the coins. You discard the ~50 million of them that came up tails, refill the bin with the rest and repeat. Now you're down to 25 million. Keep doing that and after 16 times you're very likely to have a coin that came up heads 16 times in a row. To make a coin that came up heads 100 times in a row in this manner you'd need to start by converting all the mass in the universe into pennies, which would collapse into a black hole when you put them in the bin. – Eric Lippert Jun 09 '14 at 22:21
  • 4
    Robert's idea is perfectly valid and may be the source of "the fallacy" in the first place. Our brains are wired in the Bayesian, not frequentist sense. The "perfect" information such as the "absolutely fair coin" rarely exists in nature. Thus, 100 Heads on 100 tries will practically lead us to believe that $P(Heads) > 0.5$ – PA6OTA Jun 10 '14 at 03:54
  • I think it is simpler than all of that. Each event instance happens in its own "time bubble". Once it occurs, it no longer exists; therefore it is impossible to influence ANYTHING in relation to the "physics of the past". Even shorter: this is the equivalent of a dog chasing its own tail. – user258203 Jun 09 '14 at 21:47
  • No, this is **not** entirely correct; see my [answer](http://stats.stackexchange.com/a/103264/45382) and the links therein. For starters, ask "*at what point does the law of large numbers start to take effect?*" – Nikos M. Jun 13 '14 at 14:10
  • @PA6OTA --- Actually, it is still a fallacy from a Bayesian POV. If the true probability of heads is unknown, a Bayesian would update their estimate of the probability of heads accordingly --- a large excess of heads would increase the inferred probability of heads. The "Gambler's Fallacy" does exactly the opposite, expecting tails when there have been "too many" heads. – ratsalad Jul 08 '14 at 14:36
  • 1
    @ratsalad I was not discussing the OP but rather Robert's comment about believing more Heads in the past leads us to believe more Heads in the future. My point is: reasoning like this is mathematically valid, if you have the suitable (Bayesian) framework. The inverse reasoning (many Heads in the past lead to a dearth of Heads in the future) is actually also rooted in practice, when sampling without replacement from a finite population. – PA6OTA Jul 08 '14 at 18:03

7 Answers

46

It's called the Gambler's fallacy.

abaumann
35

The first sentence of this question incorporates another (related) fallacy:

"As we all know, if you flip a coin that has an equal chance of landing heads as it does tails, then if you flip the coin many times, half the time you will get heads and half the time you will get tails."

No, we won't get that: we won't get heads half the time and tails half the time. If we did, the Gambler would not be so mistaken after all. The mathematical expression for this verbal statement is as follows: for some "large" (but finite) $n'$, we have $n_{h} = \frac {n'}{2}$, where $n_{h}$ denotes the number of times the coin lands heads. Since $n'$ is finite, $n'+1$ is also finite and a distinct value from $n'$. So what happens after the $(n'+1)$-th flip has been made? Either it landed heads, or it didn't. In both cases, $n_h$ has just stopped being equal to "half the number of tosses".

But perhaps what we really meant was an "unimaginably large" $n$? Then we state

$$\lim_{n\rightarrow \infty}n_{h} = \frac n{2}$$

But here the RHS ("right-hand side") contains $n$, which, by the LHS ("left-hand side"), has passed over to infinity. So the RHS is also infinite, and what this statement says is that the number of times the coin will land heads is itself infinite if we toss the coin an infinite number of times (the division by $2$ is negligible):

$$\lim_{n\rightarrow \infty}n_{h} = \frac n{2} = \infty$$

This is an essentially correct, but useless statement, and obviously not what we have in mind.

In all, the statement in the question does not hold, irrespective of whether "total tosses" is considered finite or not.

Perhaps then we should state

$$\lim_{n\rightarrow \infty}\frac {n_{h}}{n} = \frac 1{2} \;\;?$$

First, this translates into "the ratio of the number of heads to the total number of tosses tends to the value $1/2$ as the number of tosses tends to infinity", which is a different statement - no "half of the total tosses" here. Also, this is how probability is still sometimes perceived - as a deterministic limit of relative frequencies. The problem with this statement is that its LHS is an indeterminate form: both the numerator and the denominator go to infinity.

Hmmm, let's bring in the random variable arsenal. Define a random variable $X_i$ as taking the value $1$ if the $i$-th toss came up heads, $0$ if it came up tails. Then we have $$ \frac {n_{h}}{n} = \frac 1n \sum_{i=1}^nX_i$$

Can we now at least state

$$\lim_{n\rightarrow \infty}\frac 1n \sum_{i=1}^nX_i = \frac 1{2} \;\;?$$

No. This is a deterministic limit. It would have to hold over all possible realizations of the sequence of the $X$'s, and nothing guarantees that such a limit even exists, let alone that it equals $1/2$. In fact, such a statement can only be seen as a constraint on the sequence, and it would destroy the independence of the tosses.

What we can say is that this sample average converges in probability ("weakly") to $1/2$ (Bernoulli's Weak Law of Large Numbers),

$$\lim_{n\rightarrow \infty}\text {Pr}\left(\left|\frac 1n \sum_{i=1}^nX_i-\frac 12 \right|<\varepsilon\right) =1 , \;\;\;\forall \varepsilon >0$$

and, in the case under consideration, that it also converges almost surely ("strongly") (Borel's Strong Law of Large Numbers):

$$\text {Pr}\left(\lim_{n\rightarrow \infty}\frac 1n \sum_{i=1}^nX_i=\frac 12 \right) =1$$

But these are statements about the probability associated with the difference between $n_h/n$ and $1/2$, not about the limit of the difference $n_h-n_t$ (which, according to the false statement, should be zero - and it is not).

Admittedly, it takes some dedicated intellectual effort to really understand these two statements and how they differ (in "theory" and in "practice") from some of the previous ones - I do not claim such a deep understanding for myself as yet.
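
As a minimal simulation sketch of these two statements (Python with NumPy; the seed and sample sizes are arbitrary choices), one can watch the relative frequency $n_h/n$ settle toward $1/2$ even though the raw head count $n_h$ itself just keeps growing:

```python
import numpy as np

rng = np.random.default_rng(seed=7)   # arbitrary seed, for reproducibility

n = 1_000_000
x = rng.integers(0, 2, size=n)        # X_i: 1 = heads, 0 = tails
n_h = np.cumsum(x)                    # running head count after each toss

for k in (100, 10_000, 1_000_000):
    print(f"n = {k:>9,}: n_h = {n_h[k - 1]:>7,}, n_h/n = {n_h[k - 1] / k:.4f}")
```

The ratio stabilizes near $0.5$, but nothing in the output suggests $n_h$ is heading toward "half of some fixed total" - exactly the distinction drawn above.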

Alecos Papadopoulos
15

This fallacy has many names.

1) It's probably best known as the Gambler's fallacy

2) It's also sometimes called the 'law of small numbers' (also see here), because it rests on the idea that population characteristics must be reflected even in small samples - which I think is a neat name for its contrast with the law of large numbers. Unfortunately, the same name is applied to the Poisson distribution (and is also sometimes used by mathematicians to mean something else again), so it can be confusing.

3) Among people who believe the fallacy it is sometimes called the 'law of averages'. It tends to be invoked in particular after a run without some outcome, to argue that the outcome is 'due' - but of course no such short-run law exists. Nothing acts to 'compensate' for an initial imbalance; the only way an initial discrepancy is wiped out is by the sheer volume of later values, which themselves average $1/2$.
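
To put numbers on that last point - illustrative arithmetic only, assuming the first 100 tosses all came up heads and the later tosses average out at $1/2$ - the initial excess is never 'compensated' for, only swamped:

$$\frac{100 + \tfrac12 \cdot 900}{1000} = 0.55\,, \qquad \frac{100 + \tfrac12 \cdot 99\,900}{100\,000} = 0.5005$$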

Consider an experiment in which a fair coin is tossed repeatedly; let $H_i$ be the number of heads and $T_i$ the number of tails observed up to the end of the $i$-th trial. Note that $H_i + T_i = i$.

It's interesting to note that in the long run (i.e. as $n\to\infty$), while $\frac{H_n}{n}$ converges in probability to $\frac{1}{2}$, $E|H_n-T_n|$ grows with increasing $n$ - indeed, it grows without bound; there is nothing "pushing it back toward $0$".
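
A short simulation sketch makes both halves of that statement concrete (Python with NumPy; the replication counts are arbitrary, and the $\sqrt{2n/\pi}$ comparison is the standard half-normal approximation for $E|H_n-T_n|$, added here for reference):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
reps = 20_000   # independent coin-flip experiments per sample size

for n in (100, 10_000, 1_000_000):
    heads = rng.binomial(n, 0.5, size=reps)   # H_n for each experiment
    gap = np.abs(2 * heads - n)               # |H_n - T_n|, since T_n = n - H_n
    print(f"n = {n:>9,}: mean H_n/n = {heads.mean() / n:.4f}, "
          f"mean |H_n - T_n| = {gap.mean():8.1f} "
          f"(sqrt(2n/pi) = {np.sqrt(2 * n / np.pi):8.1f})")
```

The proportion column hugs $1/2$ ever more tightly, while the absolute gap grows roughly like $\sqrt{n}$ - nothing pushes it back toward $0$.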

Glen_b
2

Just to note that if you get a huge run of heads or tails in a row, you may be better off revisiting your prior assumption that the coin is fair.
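
As a minimal sketch of what that revision could look like (Python with SciPy; the conjugate Beta(100, 100) prior is just an illustrative choice of a fairly strong initial belief in fairness):

```python
from scipy import stats

a, b = 100, 100                # assumed Beta(100, 100) prior on Pr(heads)
heads, tails = 100, 0          # the observed run: 100 heads, no tails

# Conjugate Beta-Binomial update: posterior is Beta(a + heads, b + tails)
posterior = stats.beta(a + heads, b + tails)

print(f"posterior mean of Pr(heads):   {posterior.mean():.3f}")   # ~0.667
print(f"posterior Pr(Pr(heads) > 0.5): {posterior.sf(0.5):.4f}")  # ~1.0
```

Even with a prior that heavily favours fairness, 100 straight heads pulls the posterior well away from $0.5$ - toward *more* heads, which is the opposite of what the Gambler's fallacy predicts.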

Avraham
1

Are you thinking of 'stochastic'? The flip of a fair coin (or the roll of a fair die) is stochastic (i.e. independent) in the sense that it does not depend on a previous flip of the coin. Assuming a fair con, the fact that the coin had been flipped a hundred times with a hundred heads resulting does not change the fact that the next flip has a 50/50 chance of being heads.

In contrast, drawing a card from a deck of cards without replacement is not stochastic in this sense, because each draw changes the likelihood of drawing a given card on the next draw (if the drawing were done with replacement, it would be stochastic).
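
A small sketch of the contrast (Python; the choice of aces is arbitrary, and the deck example conditions on an ace being removed at each draw, just to make the shifting odds visible):

```python
# Coin: the conditional probability of heads never moves
p_heads = 0.5                  # the same before and after any run of outcomes

# Deck: drawing without replacement changes the odds at every step;
# here we suppose each draw happens to remove an ace
aces, cards = 4, 52
for draw in range(1, 4):
    print(f"draw {draw}: P(ace) = {aces}/{cards} = {aces / cards:.4f}")
    aces, cards = aces - 1, cards - 1   # one ace leaves the deck
```

The coin's probability is a constant; the deck's probability is a function of everything drawn so far.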

user63551
  • stochastic does not mean independent – Ben Voigt Jun 09 '14 at 05:35
  • 1
    *"Assuming a fair con...the next flip has a 50/50 chance of being heads"*, I think you do have a deep philosophical truth here. You could expand the answer to explain what happens if it's an unfair (AKA regular?) con. – hyde Jun 09 '14 at 15:39
0

Adding on to Glen_b's and Alecos's responses, let's define $X_n$ to be the number of heads in the first $n$ trials. A familiar result using the normal approximation to the binomial is that $X_n$ is approximately $N(n/2, \sqrt{n/4})$ (the second argument being the standard deviation). Now, before observing the first 100 tosses, your friend is correct that there is a good chance that $X_{1000}$ will be close to 500. In fact,

$P( 469 < X_{1000} < 531) \approx .95$.

However, after observing $X_{100} = 100$, let's define $Y_{900}$ to be the number of heads in the last 900 trials; then

$P( 469 < X_{1000} < 531 \mid X_{100}=100) = P( 369 < Y_{900} < 431) \approx .1$

since $Y_{900}$ is approximately $N(450, 15)$.

Thus, after observing 100 heads in the first 100 trials, there is no longer a high probability of observing close to 500 successes in the first 1000 trials, assuming of course that the coin is fair. Note that this is a concrete example illustrating that an initial imbalance is unlikely to be compensated for in the short run.

Further, note that if $n=1,000,000$, then

$P(499,020 < X_{1,000,000} < 500,980) \approx .95$

but the impact of the imbalance in the first 100 tosses is negligible in the long run since

$P(499,020 < X_{1,000,000} < 500,980 \mid X_{100} = 100) = P( 498,920 < Y_{999,900} < 500,880) \approx .949$
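
All four figures are easy to double-check against exact binomial probabilities; a quick sketch (Python with SciPy, using the same fair-coin assumption):

```python
from scipy import stats

# X_1000 ~ Binomial(1000, 1/2): P(469 < X < 531) = P(470 <= X <= 530)
x = stats.binom(1000, 0.5)
print(x.cdf(530) - x.cdf(469))            # ~0.95

# Given X_100 = 100: Y_900 ~ Binomial(900, 1/2), P(369 < Y < 431)
y = stats.binom(900, 0.5)
print(y.cdf(430) - y.cdf(369))            # ~0.10

# The million-toss case: the early imbalance is negligible
x6 = stats.binom(1_000_000, 0.5)
print(x6.cdf(500_979) - x6.cdf(499_020))  # ~0.95

y6 = stats.binom(999_900, 0.5)
print(y6.cdf(500_879) - y6.cdf(498_920))  # ~0.949
```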

jsk
0

You are referring to the Gambler's fallacy, although this is not entirely correct.

Indeed, if the problem is phrased as "given an assumed fair coin, one observes a given sequence of outcomes; what is the estimate of the elementary probabilities of the coin?", this becomes more apparent.

Indeed, the "fallacy" relates only to (assumed) fair coins, where the various products of probabilities are equal. However, this entails an interpretation that is in contrast to (the study of) similar cases with a coin having another (non-symmetric/biased) probability distribution.

This is like the fallacy, found in many statistical studies, of taking correlation to imply causality. But of course correlation can be a hint of a causal relation, or at least of a common cause, in some cases.

Nikos M.