Suppose that a fair coin is tossed repeatedly until a head is obtained for the first time.
- What is the expected number of tosses that will be required?
- What is the expected number of tails that will be obtained before the first head is obtained?
This can be answered using the geometric distribution as follows:
The probability of observing k - 1 failures (tails) before the first success (heads), where p is the probability of success on each toss, is given by:
$$P(X=k)=(1-p)^{k-1}p$$
with k being the total number of tosses including the first 'heads' that terminates the experiment.
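As a sanity check, this PMF agrees with R's built-in geometric distribution; note that dgeom counts the k - 1 failures rather than the k tosses, so its argument is shifted by one. A minimal sketch:
p <- 1/2
k <- 1:5
(1 - p)^(k - 1) * p # PMF as written above
dgeom(k - 1, p)     # R's geometric PMF; agrees with the line above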
The expected value of X for a given p is $1/p$; with $p = 1/2$ this gives $2$.
The expected value follows by differentiating the geometric series. Since $\sum_{x=0}^{\infty} r^x = \frac{1}{1-r}$ for $|r| < 1$, differentiating term by term gives $\sum_{x=1}^{\infty} x\, r^{x-1} = \frac{d}{dr} \frac{1}{1-r} = \frac{1}{(1-r)^2}$, which can be plugged into the expression:
$$E(X) = \sum_{x=1}^{\infty} x\,(1-p)^{x-1}p = \frac{p}{1-p} \sum\limits_{x = 1}^{\infty} x\, r^x = \frac{p}{1-p}\, r\, \left(\frac{d}{dr} \frac{1}{1-r}\right) = \frac{p}{1-p}\, \frac{r}{(1-r)^2},$$
where $r = 1 - p$. Substituting back, this simplifies to $E(X) = \frac{1}{p}$, justifying its use above.
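A quick numerical check of this series in R, truncated at an arbitrary cutoff of k = 1000 (beyond which the terms are negligible):
p <- 1/2
k <- 1:1000
sum(k * (1 - p)^(k - 1) * p) # 2, matching 1/p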
Alternatively, we could use the negative binomial distribution, interpreted as the number of failures before the first success. Its probability mass function gives the probability of n failures before attaining r successes, given probability p of success in each Bernoulli trial:
$$p(n;r,p) ={n+r-1\choose r-1} p^r (1-p)^n$$
The expected number of trials, n + r, is given by the general formula:
$$E(n+r)=\frac{r}{p}$$
Given our known parameters: r = 1 and p = 0.5,
$$E(n + r;\ 1, 0.5) = \frac{r}{p} = \frac{1}{0.5} = 2$$
Hence we can expect to make two tosses to get the first head, with the expected number of tails being $E(n+r) - r = 1$.
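This parameterization matches R's dnbinom, which likewise counts the failures before the size-th success, so both expectations can be checked by truncating the series (the cutoff of 1000 is arbitrary):
n <- 0:1000
sum(n * dnbinom(n, size = 1, prob = 0.5))     # ~1, expected number of tails
sum(n * dnbinom(n, size = 1, prob = 0.5)) + 1 # ~2, expected number of tosses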
We can run a Monte Carlo simulation to verify this:
set.seed(1)
p <- 1/2
reps <- 10000 # Total number of simulations.
tosses_to_HEAD <- numeric(reps) # Vector to hold the number of tosses per game.
for (i in 1:reps) {
  head <- 0 # Reset 'head' to 0 on every loop.
  counter <- 0 # Same for the local variable 'counter'.
  while (head == 0) {
    head <- head + rbinom(1, 1, p) # Toss a coin and add to 'head'.
    counter <- counter + 1 # Add 1 to 'counter'.
  }
  tosses_to_HEAD[i] <- counter # Record the number of tosses once a head is obtained.
}
mean(tosses_to_HEAD)
[1] 2.0097
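The sample mean is close to the theoretical value of $2$. Since the number of tosses has variance $(1-p)/p^2 = 2$, the standard error over $10000$ replications is about $\sqrt{2/10000} \approx 0.014$, so a deviation of $0.0097$ is well within sampling noise.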
Model the game by drawing a ticket out of a box. There are two kinds of tickets. On one is written "Stop, you tossed heads"; on the other is written "Continue, you tossed tails." The expected number of additional tosses in the first case is $0$ while the expected number of additional tosses in the second case is $x$, say--we don't know it yet and have to figure it out.
Write these expectations on their respective tickets: these are the values of the tickets.
The three things we do know are:
- The chance of drawing a "Stop" ticket (with value $0$) is $p$.
- The chance of drawing a "Continue" ticket (with value $x$) is $1-p$.
- The expectation of this single draw is, by definition, the sum of the probability-weighted values on all kinds of tickets:
$$p\times 0 + (1-p)\times x = (1-p)x.$$
Let us interpret this number: it is the expected number of additional tosses that will be needed until a head appears. Since draws of tickets correspond to coin tosses, adding in the one draw needed to obtain a ticket gives us the expected number of tosses--which is just $x$ itself. Equating these two expressions,
$$x = 1 + (1-p)x.$$
Solving for $x$ answers the first question. Since the number of tails is always one less than the number of draws, the expected number of tails also must be one less than the expected number of draws. Therefore $x-1$ answers the second question.
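The fixed-point equation $x = 1 + (1-p)x$ can also be solved numerically by iterating it, which mirrors the recursive structure of the argument; a small sketch (the starting value is arbitrary):
p <- 1/2
x <- 0 # Arbitrary starting guess
for (i in 1:50) x <- 1 + (1 - p) * x # Iterate the ticket-box recursion
x # Converges to 1/p = 2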
A second intuitively clear solution can be obtained by contemplating a very long sequence of $n$ tosses. How many games were played? Answer: the number of heads (plus one more incomplete game if the sequence ends with a series of tails). How many heads are expected? Answer: $pn$. Call this number $h$. The Weak Law of Large Numbers asserts that the actual number of heads is highly likely to be very close to $pn$ provided $n$ is sufficiently large. Therefore the average game length, a number between $n/(h+1)$ and $n/h$, will be arbitrarily close to $n/(pn) = 1/p$, whence $x$ itself must equal $1/p$.
This leads to an extremely efficient way to simulate the distribution of game lengths. Here is R code: it records "heads" as true values in a boolean array and computes the tosses between successive true values.
p <- 1/3 # Set the chance of heads
tosses <- runif(1e6) < p # Make a million tosses
sim <- diff(c(0, which(tosses))) # Game lengths: gaps between successive heads
hist(sim, xlab="Game length", main="Distribution") # Graph their distribution
mean(sim) # Report the average length
When I ran this code after setting the seed to $17$ (set.seed(17)), the output differed from $x = 1/p = 3$ by only a tiny amount.
Let X be the number of coin flips required until a head is obtained; we need to calculate E(X), the expected value of X.
We can condition E(X) on the outcome of the first flip. Let E(X|H) denote the expected number of remaining coin flips given that the first flip is a head. Similarly, let E(X|T) denote the expected number of remaining coin flips given that the first flip is a tail.
By first-step conditioning, we have
$E(X) = \frac{1}{2}\,(1 + E(X|H)) + \frac{1}{2}\,(1 + E(X|T))$
Now, since $E(X|H)$ denotes the remaining flips after getting a head on the first flip, it equals $0$: once the first head appears, no further flips are needed.
And $E(X|T) = E(X)$, since a tail means we have made no progress towards getting a head and the process starts over.
So $E(X) = \frac{1}{2}\,(1 + 0) + \frac{1}{2}\,(1 + E(X)) = 1 + \frac{1}{2}\,E(X)$,
which gives $E(X) = 2$.
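For completeness, the same first-step equation can be handed to a numerical root-finder in R; a minimal sketch:
f <- function(x) 0.5 * (1 + 0) + 0.5 * (1 + x) - x # E(X) equation rearranged as f(x) = 0
uniroot(f, c(0, 10))$root # 2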