Short version: How can the joint entropy of two independent random variables be less than the entropy of their sum? The joint entropy should capture all the information that any scalar function of the two variables can, right?
Long version: Let $X$ and $Y$ be two independent normal random variables, each with mean $0$ and variance $\sigma^2$.
1. We know that the entropy of $X$ and of $Y$ is $H(X) = H(Y) = \ln(2\pi e \sigma^2)/2$ (derivation).
2. The variance of the random variable $SUM = X + Y$ is $2\sigma^2$.
3. Points 1 and 2 mean that $H(SUM) = \ln(2\pi e (2\sigma^2))/2$.
4. The sum of the entropies of two independent random variables is the entropy of their joint distribution, i.e. $H(X, Y) = H(X) + H(Y)$. In this particular case that gives $$H(X, Y) = 2 \cdot \frac{\ln(2\pi e \sigma^2)}{2} = \ln(2\pi e \sigma^2).$$
5. Now note that if $\sigma^2 = (\pi e)^{-1}$, then from points 3 and 4 $$H(X, Y) = 2 \cdot \frac{\ln 2}{2} = \ln 2 = \frac{\ln(2 \cdot 2)}{2} = H(SUM)$$ (a quick numerical check of this is sketched right after this list).
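Here is a minimal sanity-check sketch in Python, using only the closed-form Gaussian entropy $\tfrac{1}{2}\ln(2\pi e \sigma^2)$ from point 1 (the helper name `gaussian_entropy` is just my own label):

```python
import math

def gaussian_entropy(var):
    """Differential entropy (in nats) of a normal distribution with variance `var`."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

sigma2 = 1 / (math.pi * math.e)       # candidate tipping-point variance
H_X = gaussian_entropy(sigma2)        # = H(Y), since X and Y are identically distributed
H_joint = 2 * H_X                     # independence: H(X, Y) = H(X) + H(Y)
H_sum = gaussian_entropy(2 * sigma2)  # X + Y ~ N(0, 2*sigma^2)

print(H_joint, H_sum)                 # both print 0.6931... = ln(2)
```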
And if you decrease $\sigma$ below this value, then $H(SUM) > H(X,Y)$: from points 3 and 4, $H(SUM) - H(X,Y) = \frac{1}{2}\ln\frac{1}{\pi e \sigma^2}$, which is positive exactly when $\sigma^2 < (\pi e)^{-1}$. It seems quite remarkable that $(\pi e)^{-1}$ is an entropy tipping point for Gaussians. Do you know of any papers or books that make this observation? Is $N(0, (\pi e)^{-1})$ discussed as an alternative to $N(0,1)$ because of its neutrality in this context? And why is this happening at all? Shouldn't the joint entropy be greater than the entropy of any scalar function of $(X, Y)$, since the joint distribution seems more general than any such function?
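For completeness, a small sweep over $\sigma^2$ (same closed-form expressions as above, again just a sketch) showing the gap $H(SUM) - H(X,Y)$ changing sign at $\sigma^2 = (\pi e)^{-1}$:

```python
import math

# Gap H(SUM) - H(X, Y) = ln(4*pi*e*s2)/2 - ln(2*pi*e*s2) = ln(1/(pi*e*s2))/2
tipping = 1 / (math.pi * math.e)
for s2 in (0.5 * tipping, tipping, 2 * tipping, 1.0):
    gap = 0.5 * math.log(4 * math.pi * math.e * s2) - math.log(2 * math.pi * math.e * s2)
    print(f"sigma^2 = {s2:.4f}   H(SUM) - H(X,Y) = {gap:+.4f}")
```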