
For a uniformly distributed variable between 0 and 1 generated using

rand(1,10000)

this returns 10,000 random numbers between 0 and 1, and their mean is approximately 0.5. But if you first take the log of the sample and then the mean of the result:

mean(log(rand(1,10000)))

I would expect the result to be $\log 0.5 = -0.6931$, but instead the answer is approximately $-1$. Why is this so?
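For readers without MATLAB, the same behaviour is easy to reproduce; here is a sketch in Python with NumPy (an illustration, not part of the original question, which uses MATLAB):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(size=10_000)     # analogue of MATLAB's rand(1,10000)

print(x.mean())                  # close to 0.5
print(np.log(x).mean())          # close to -1, not log(0.5) = -0.6931
```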

Xi'an
Jeremy Dorner
  • A minute point ignored by either answer to date is that log 0 is indeterminate. I don't know whether MATLAB regards the distribution as having support (0, 1] or [0, 1), but this should be documented somewhere. Otherwise put, in principle your transformed distribution has an infinite left tail. – Nick Cox Oct 27 '19 at 08:19
  • @NickCox: apparently [it does not produce zeros or ones](https://stackoverflow.com/q/29813592/1154578). – Xi'an Oct 27 '19 at 18:48
  • Because the log of the mean isn't the same thing as the mean of the logs. – user207421 Oct 27 '19 at 23:58
  • @Xi'an Thanks for the link. So, MATLAB uses a support of $(0, 1)$, which certainly avoids some very occasional problems. But as this question might interest others too, check how your software behaves if it differs. – Nick Cox Oct 28 '19 at 14:36
  • Why would you think it *should* be so? Consider a uniform distribution between -1 and 1. E[x]=0. Then consider y=abs(x). abs(E[x])=0 but obviously E[abs(x)]>0. – MooseBoys Oct 28 '19 at 18:40
  • @NickCox: R `runif` also [shares this feature](https://stackoverflow.com/a/48428360/1154578) of never producing 0's and 1's. – Xi'an Oct 30 '19 at 07:30

4 Answers


This is another illustration of Jensen's inequality $$\mathbb E[\log X] < \log \mathbb E[X]$$ (since the function $x\mapsto \log(x)$ is strictly concave) and of the more general fact that the expectation of a transform is not the transform of the expectation when the transform is not linear (apart from a few exotic cases). (Most of my undergraduate students are, however, firm believers in the magical identity $\mathbb E[h(X)] = h(\mathbb E[X])$, if I only judge from the frequency with which this equality appears in their final exam papers.)
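The strict inequality is easy to check numerically; a minimal sketch in Python/NumPy (not in the original answer, which states the result analytically):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=100_000)    # X ~ Uniform(0, 1)

mean_of_log = np.log(x).mean()   # estimates E[log X], which is -1
log_of_mean = np.log(x.mean())   # estimates log E[X] = log 0.5

# strict concavity of log gives E[log X] < log E[X]
assert mean_of_log < log_of_mean
print(mean_of_log, log_of_mean)
```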

Xi'an

Consider two values symmetrically placed around $0.5$ - like $0.4$ and $0.6$ or $0.25$ and $0.75$. Their logs are not symmetric around $\log(0.5)$. $\log(0.5-\epsilon)$ is further from $\log(0.5)$ than $\log(0.5+\epsilon)$ is. So when you average them you get something less than $\log(0.5)$.

Similarly, if you take a teeny interval around a collection of such pairs of symmetrically placed values, you still get the average of the logs of each pair being below $\log(0.5)$... and it's a simple matter to move from that observation to the definition of the expectation of the log.

Indeed, usually, $E(t(X))\neq t(E(X))$ unless $t$ is linear.
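The symmetric-pair argument above can be verified directly; a small sketch in Python (illustrative only, with arbitrary choices of $\epsilon$):

```python
import math

# pairs symmetric around 0.5: (0.5 - eps, 0.5 + eps)
for eps in (0.1, 0.25, 0.4):
    pair_avg = (math.log(0.5 - eps) + math.log(0.5 + eps)) / 2
    # the average of the two logs always falls below log(0.5),
    # because log(0.5 - eps) is further from log(0.5) than log(0.5 + eps)
    assert pair_avg < math.log(0.5)
    print(eps, pair_avg, math.log(0.5))
```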

Glen_b
  • Great answer. Having studied signal processing, I would like to stress the importance of linearity as a concept to keep in mind. The last sentence is perfect in itself, but as you have a very "easy" (and good) explanation in the first two paragraphs, some people might be at a loss in the third. And as it is the most important to my mind, I feel elaborating on it a bit would be great. – Kami Kaze Oct 29 '19 at 07:55

It is worthwhile to note that if $X \sim \operatorname{Uniform}(0,1)$, then $-\log X \sim \operatorname{Exponential}(\lambda = 1)$, so that $\operatorname{E}[\log X] = -1$. Explicitly, $$f_X(x) = \mathbb 1(0 < x < 1) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$ implies $$Y = g(X) = -\log X$$ has density $$\begin{align*} f_Y(y) &= f_X(g^{-1}(y)) \left|\frac{dg^{-1}}{dy}\right| \\ &= \mathbb 1 \left( 0 < e^{-y} < 1 \right) \left| - e^{-y} \right| \\ &= e^{-y} \mathbb 1 (0 < y < \infty) \\ &= \begin{cases} e^{-y}, & y > 0 \\ 0, & \text{otherwise}. \end{cases} \end{align*}$$ Thus $Y \sim \operatorname{Exponential}(\lambda = 1)$ and its mean is $1$. This furnishes a very convenient method to generate exponentially distributed random variables via log-transformation of a uniform random variable on $(0,1)$.
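That inverse-transform recipe is easy to check empirically; a sketch in Python/NumPy (an illustration assuming nothing beyond the derivation above):

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.uniform(size=200_000)   # U ~ Uniform(0, 1)
y = -np.log(u)                  # Y = -log U ~ Exponential(rate = 1)

# for Exponential(1), both the mean and the variance equal 1
print(y.mean(), y.var())
```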

heropup

Note that the mean of a transformed uniform variable is just the average value of the transforming function over the domain (since every value in the domain is equally likely to be selected). For a uniform variable on $(a,b)$, with $a=0$ and $b=1$ here, this is simply

$$ \frac{1}{b-a}\int_a^b{t(x)}dx = \int_0^1{t(x)}dx $$

For example (analytically, and checked in R):

$$ \int_0^1{\log(x)}\,dx = (1\cdot \log(1)-1) - 0 = 0-1 = -1 $$

> mean(log(runif(1e6)))
[1] -1.000016
> integrate(function(x) log(x), 0, 1)
-1 with absolute error < 1.1e-15

$$ \int_0^1{x^2}dx = \frac{1}{3}(1^3-0^3) = \frac{1}{3} $$

> mean(runif(1e6)^2)
[1] 0.3334427
> integrate(function(x) x^2, 0, 1)
0.3333333 with absolute error < 3.7e-15

$$ \int_0^1{e^x}dx = e^1-e^0 = e-1 $$

> mean(exp(runif(1e6)))
[1] 1.718425
> integrate(function(x) exp(x), 0, 1)
1.718282 with absolute error < 1.9e-14
> exp(1)-1
[1] 1.718282
James