15

Let us sum a stream of random variables, $X_i \overset{iid}\sim \mathcal{U}(0,1)$; let $Y$ be the number of terms we need for the total to exceed one, i.e. $Y$ is the smallest number such that

$$X_1 + X_2 + \dots + X_Y > 1.$$

Why does the mean of $Y$ equal Euler's number $e$?

$$\mathbb{E}(Y) = e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots $$
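
A quick simulation supports the claim numerically; here is a minimal sketch in Python with NumPy (the function name, seed and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def terms_to_exceed_one(rng):
    """Draw uniforms until the running total exceeds 1; return how many were needed."""
    total, count = 0.0, 0
    while total <= 1.0:
        total += rng.random()
        count += 1
    return count

samples = [terms_to_exceed_one(rng) for _ in range(100_000)]
print(np.mean(samples), np.exp(1))   # the sample mean is typically close to 2.7183
```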

Silverfish
  • I am posting this in the spirit of a self-study question, though I think I first saw this question over a decade ago. I can't recall how I answered it back then, though I'm sure it wasn't what sprang to mind when I saw this property mentioned in the thread [Approximate $e$ using Monte Carlo Simulation](http://stats.stackexchange.com/q/193990/22228). Since I suspect this to be a fairly common exercise question, I have opted to present a sketch rather than a complete solution, though I suppose the main "spoiler warning" belongs in the question itself! – Silverfish Feb 06 '16 at 18:02
  • I remain very interested in alternative approaches; I know this was included as a question in Gnedenko's *Theory of Probability* (originally in Russian but widely translated) but I don't know what solution was expected there, or posed elsewhere. – Silverfish Feb 06 '16 at 20:35
  • I [wrote](http://stats.stackexchange.com/questions/193990/approximate-e-using-monte-carlo-simulation) a simulation solution in MATLAB using your simplex method. I didn't know about the link to simplexes, it's so unexpected. – Aksakal Feb 07 '16 at 03:33

3 Answers

15

First observation: $Y$ has a more pleasing CDF than PMF

The probability mass function $p_Y(n)$ is the probability that $n$ is "only just enough" for the total to exceed unity, i.e. $X_1 + X_2 + \dots + X_n$ exceeds one while $X_1 + \dots + X_{n-1}$ does not.

The cumulative distribution function $F_Y(n) = \Pr(Y \leq n)$ simply requires that $n$ is "enough", i.e. $\sum_{i=1}^{n}X_i > 1$, with no restriction on how much the sum exceeds one by. This is a much simpler event to find the probability of.

Second observation: $Y$ takes non-negative integer values so $\mathbb{E}(Y)$ can be written in terms of the CDF

Clearly $Y$ can only take values in $\{0, 1, 2, \dots\}$, so we can write its mean in terms of the complementary CDF, $\bar F_Y$.

$$\mathbb{E}(Y) = \sum_{n=0}^\infty \bar F_Y(n) = \sum_{n=0}^\infty \left(1 - F_Y(n) \right)$$
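
If this identity is unfamiliar, it follows by writing $n$ as a sum of $n$ ones and swapping the order of summation:

$$\mathbb{E}(Y) = \sum_{n=1}^\infty n \Pr(Y=n) = \sum_{n=1}^\infty \sum_{m=0}^{n-1} \Pr(Y=n) = \sum_{m=0}^\infty \sum_{n=m+1}^\infty \Pr(Y=n) = \sum_{m=0}^\infty \Pr(Y>m) = \sum_{m=0}^\infty \bar F_Y(m).$$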

In fact $\Pr(Y=0)$ and $\Pr(Y=1)$ are both zero, so the first two terms of the sum are both one: $\mathbb{E}(Y) = 1 + 1 + \dots$.

As for the later terms, if $F_Y(n)$ is the probability that $\sum_{i=1}^{n}X_i > 1$, what event is $\bar F_Y(n)$ the probability of?

Third observation: the (hyper)volume of an $n$-simplex is $\frac{1}{n!}$

The $n$-simplex I have in mind occupies the volume under a standard unit $(n-1)$-simplex in the all-positive orthant of $\mathbb{R}^n$: it is the convex hull of $(n+1)$ vertices, in particular the origin plus the vertices of the unit $(n-1)$-simplex at $(1, 0, 0, \dots)$, $(0, 1, 0, \dots)$ etc.

*(Figure: the volumes of the 2-simplex and 3-simplex.)*

For example, the 2-simplex above with $x_1 + x_2 \leq 1$ has area $\frac{1}{2}$ and the 3-simplex with $x_1 + x_2 + x_3 \leq 1$ has volume $\frac{1}{6}$.
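
The value $\frac{1}{n!}$ is easy to check numerically, since the simplex volume is the probability that a uniform random point of $[0,1]^n$ lands inside it; here is a minimal NumPy sketch (seed and sample size are arbitrary):

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
n_points = 200_000

for n in (2, 3, 4, 5):
    # fraction of uniform points in [0,1]^n with coordinate sum at most 1
    points = rng.random((n_points, n))
    estimate = np.mean(points.sum(axis=1) <= 1.0)
    print(n, round(estimate, 4), round(1 / factorial(n), 4))
```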

For a proof that proceeds by directly evaluating an integral for the probability of the event described by $\bar F_Y(n)$, and links to two other arguments, see this Math SE thread. A related thread may also be of interest: Is there a relationship between $e$ and the sum of $n$-simplex volumes?

Silverfish
  • This is an interesting geometric approach, and easy to solve this way. Beautiful. [Here](https://en.wikipedia.org/wiki/Simplex#Volume)'s the equation for the volume of a simplex. I don't think there could be a more elegant solution, frankly. – Aksakal Feb 06 '16 at 19:05
  • +1 You can also obtain the full distribution of $Y$ from any of the approaches in my post at http://stats.stackexchange.com/questions/41467/consider-the-sum-of-n-uniform-distributions-on-0-1-or-z-n-why-does-the/43075#43075. – whuber Feb 06 '16 at 19:20
  • If I stumbled on this solution, there's no way *they* could force me to do it another way in school :) – Aksakal Feb 07 '16 at 03:38
11

Fix $n \ge 1$. Let $$U_i = X_1 + X_2 + \cdots + X_i \mod 1$$ be the fractional parts of the partial sums for $i=1,2,\ldots, n$. Because $X_{i+1}$ is uniform and independent of $X_1, \ldots, X_i$, adding it modulo $1$ makes $U_{i+1}$ uniform on $[0,1]$ and independent of $U_1, \ldots, U_i$; in particular, $U_{i+1}$ is just as likely to exceed $U_i$ as it is to be less than it. Since the $U_i$ are therefore i.i.d. uniform, all $n!$ orderings of the sequence $(U_i)$ are equally likely.
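
This symmetry is easy to check empirically; a small NumPy sketch for $n = 3$ (seed and replication count are arbitrary) tabulates how often each ordering of $(U_1, U_2, U_3)$ occurs:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
n, reps = 3, 60_000

orderings = Counter()
for _ in range(reps):
    u = np.cumsum(rng.random(n)) % 1.0      # fractional parts of the partial sums
    orderings[tuple(np.argsort(u))] += 1

for perm, count in sorted(orderings.items()):
    print(perm, count / reps)               # each of the 3! = 6 orderings occurs with frequency ~ 1/6
```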

Given the sequence $U_1, U_2, \ldots, U_n$, we can recover the sequence $X_1, X_2, \ldots, X_n$. To see how, notice that

  • $U_1 = X_1$ because both are between $0$ and $1$.

  • If $U_{i+1} \ge U_i$, then $X_{i+1} = U_{i+1} - U_i$.

  • Otherwise, $U_i + X_{i+1} \gt 1$, whence $X_{i+1} = U_{i+1} - U_i + 1$.

Among these $n!$ equally likely orderings there is exactly one in which the $U_i$ are already in increasing order. In that case no step wraps past an integer, so $1 \gt U_n = X_1 + X_2 + \cdots + X_n$; being one of $n!$ equally likely orderings, it has a chance $1/n!$ of occurring. In all the other orderings at least one step from $U_i$ to $U_{i+1}$ is out of order, which means the running total passed an integer at that step, so the sum of the $X_i$ must exceed $1$. (The event that the sum equals exactly $1$ has probability zero, which is why the weak and strict inequalities below give the same value.) Thus we see that

$$\Pr(Y \gt n) = \Pr(X_1 + X_2 + \cdots + X_n \le 1) = \Pr(X_1 + X_2 + \cdots + X_n \lt 1) = \frac{1}{n!}.$$
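
To see the recovery of the $X_i$ from the $U_i$ in action, here is a minimal NumPy sketch (variable names are mine) that computes the fractional partial sums and then inverts them with the rules above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

x = rng.random(n)                 # the original X_1, ..., X_n
u = np.cumsum(x) % 1.0            # fractional parts of the partial sums

# Invert the map: a drop from U_i to U_{i+1} signals a wrap past an integer
x_recovered = np.empty(n)
x_recovered[0] = u[0]
for i in range(1, n):
    diff = u[i] - u[i - 1]
    x_recovered[i] = diff if diff >= 0 else diff + 1.0

print(np.allclose(x, x_recovered))   # True: the map between (X_i) and (U_i) is one-to-one
```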

The identity $\Pr(Y \gt n) = 1/n!$ yields the probabilities for the entire distribution of $Y$, since for integral $n\ge 1$

$$\Pr(Y = n) = \Pr(Y \gt n-1) - \Pr(Y \gt n) = \frac{1}{(n-1)!} - \frac{1}{n!} = \frac{n-1}{n!}.$$

Moreover,

$$\mathbb{E}(Y) = \sum_{n=0}^\infty \Pr(Y \gt n) = \sum_{n=0}^\infty \frac{1}{n!} = e,$$

QED.
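
As a quick cross-check, the same mean also follows directly from the probability mass function just derived:

$$\mathbb{E}(Y) = \sum_{n=2}^\infty n\,\frac{n-1}{n!} = \sum_{n=2}^\infty \frac{1}{(n-2)!} = \sum_{k=0}^\infty \frac{1}{k!} = e.$$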

whuber
  • I have read it a couple of times, and I *almost* get it... I posted a couple of questions in the Mathematics SE as a result of the $e$ constant computer simulation. I don't know if you saw them. One of them came back before your kind explanation on Tenfold about the ceiling function of the $1/U(0,1)$ and the Taylor series. The second one was exactly about this topic, never got a response, until now... – Antoni Parellada Feb 07 '16 at 02:04
  • [here](http://math.stackexchange.com/q/1642647/152225) and [here](http://math.stackexchange.com/q/1641928/152225). – Antoni Parellada Feb 07 '16 at 04:52
  • And could you add the proof with the uniform spacings as well? – Xi'an Feb 08 '16 at 12:15
  • @Xi'an Could you indicate more specifically what you mean by "uniform spacings" in this context? – whuber Feb 08 '16 at 13:21
  • I am referring to your Poisson process simulation via the uniform spacing, in the thread [Approximate e using Monte Carlo Simulation](http://stats.stackexchange.com/q/193990/22228) for which I cannot get a full derivation. – Xi'an Feb 08 '16 at 14:58
  • Can anyone explain why it is that "there is exactly one sequence in which the $U_i$ are already in increasing order"? – simran Jul 29 '21 at 04:01
  • And if the sum of the $X_i$ is supposed to exceed $1$, then why do we have a less-than sign in the third-to-last equation? – simran Jul 29 '21 at 04:06
6

In Sheldon Ross' *A First Course in Probability* there is an easy-to-follow proof:

Modifying the notation in the OP a bit, let $U_i \overset{iid}\sim \mathcal{U}(0,1)$ and let $Y$ be the minimum number of terms needed for $U_1 + U_2 + \dots + U_Y > 1$, or expressed differently:

$$Y = \min\Big\{n: \sum_{i=1}^n U_i>1\Big\}$$

If instead we look for

$$Y(u) = \min\Big\{n: \sum_{i=1}^n U_i>u\Big\}$$ for $u\in[0,1]$, we can define $f(u)=\mathbb E[Y(u)]$, the expected number of uniform draws needed for their running total to exceed $u$.

We can apply the law of total expectation for continuous variables,

$$\mathbb E[X] = \mathbb E\big[\mathbb E[X\mid Z]\big]=\displaystyle\int_{-\infty}^{\infty}\mathbb E[X\mid Z=z]\,f_Z(z)\,dz,$$

to express $f(u)$ by conditioning on the outcome of the first uniform draw $U_1$. The resulting equation is manageable because the pdf of $U_1 \sim \mathcal{U}(0,1)$ is simply $f_{U_1}(x)=1$ on $[0,1]$:

$$f(u)=\displaystyle\int_0^1 \mathbb E[Y(u)|U_1=x]\,dx \tag 1$$

If the value $U_1=x$ we are conditioning on is greater than $u$, i.e. $x>u$, then $\mathbb E[Y(u)|U_1=x] =1$. If, on the other hand, $x <u$, then $\mathbb E[Y(u)|U_1=x] =1 + f(u - x)$, because we have already used one uniform draw and the running total still has the remaining distance $u - x$ to cover. Going back to equation (1):

$$f(u) = \int_u^1 1 \,dx + \int_0^u \big(1 + f(u - x)\big) \,dx = 1 + \displaystyle\int_0^u f(u - x) \,dx,$$ and substituting $w = u - x$ gives $f(u) = 1 + \displaystyle\int_0^u f(w) \,dw$.

If we differentiate both sides of this equation, we can see that:

$$f'(u) = f(u)\implies \frac{f'(u)}{f(u)}=1$$

with one last integration we get:

$$\log f(u) = u + c \implies f(u) = k \,e^u$$

We know that a single draw from the uniform distribution is always enough for the total to surpass $0$, so $f(0) = 1$. Hence $k = 1$ and $f(u)=e^u$. Therefore $f(1) = \mathbb E(Y) = e.$
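
A quick simulation check of $f(u) = e^u$ is straightforward; here is a minimal NumPy sketch (function name, seed and sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def draws_to_exceed(u, rng):
    """Number of uniform draws needed for the running total to exceed u."""
    total, count = 0.0, 0
    while total <= u:
        total += rng.random()
        count += 1
    return count

for u in (0.0, 0.25, 0.5, 0.75, 1.0):
    estimate = np.mean([draws_to_exceed(u, rng) for _ in range(50_000)])
    print(u, round(estimate, 3), round(np.exp(u), 3))   # the estimates track e^u
```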

Antoni Parellada