
Variance and standard deviation are often used as proxies for risk and volatility. I make the analogy to information theory as follows, correct me if it's wrong: a random variable $x\in \mathbb{R}$ that has no uncertainty is one that has zero volatility, and therefore is riskless.

If so, would this riskless variable have a Shannon (differential) entropy of $0$ because its outcome is fully certain and has zero uncertainty?

develarist

1 Answer


The following analysis will reveal just how little "uncertainty," measured in terms of variance (or anything related to it), is connected to Shannon entropy: the volatility may converge toward certainty while the entropy may grow without limit. This happens even when there is a vanishingly small probability that $X_n$ may differ from the constant $x.$

The idea is to exhibit your constant random variable $X$ as the limit of non-constant (but discrete) random variables. This sequence models a random variable with almost no uncertainty, while admitting that, to be realistic, there is a tiny chance, no matter how astronomically small, that $X$ could vary.

To keep this limiting process from being arbitrary, we would need to demonstrate that how one takes such a limit doesn't matter.

To this end, let your random variable $X$ almost surely have the constant value $x:$ that is, $\Pr(X=x)=1.$ Let $\mathscr{R}\subset \mathbb{R}$ be any countable set of real numbers that includes $x$ among its elements. $\mathscr{R}$ represents alternative possible values of $X.$ It must be (at most) countable so that the Shannon entropy can be defined. Nevertheless, countable sets of real numbers abound and can usefully model almost anything. For instance, the set $\mathbb Q$ of all rational numbers is countable.

Let $X_1,X_2, \ldots, X_n,\ldots$ be a sequence of random variables with values in $\mathscr{R}$ that converge in distribution to $X.$ This means that for sufficiently large indexes $n,$ almost all of the probability of $X_n$ is concentrated near $x.$

Let's see what happens to the "uncertainties" and the entropies in such a sequence. For any real number $y,$ let

$$\pi_n(y) = \Pr(X_n=y)$$

be the probability distribution of $X_n.$ The "volatility" is usually defined as a continuous function of the variance, so let's assume the variances converge to zero, thereby modeling your riskless variable:

$$0 = \lim_{n\to\infty} \operatorname{Var}(X_n) = \lim_{n\to\infty} \sum_{y\in\mathscr{R}} \pi_n(y)y^2 - \left(\sum_{y\in\mathscr{R}} \pi_n(y) y\right)^2.$$

By definition, the entropy is

$$H_n = H(X_n) = -\sum_{y\in\mathscr R} \pi_n(y)\log(\pi_n(y)).$$
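As a concrete illustration of these two formulas, here is a minimal Python sketch (not part of the original answer; the helper name `variance_and_entropy` is chosen just for this example) that computes the variance and Shannon entropy of a finitely supported distribution:

```python
import numpy as np

def variance_and_entropy(values, probs):
    """Variance and Shannon entropy (in nats) of a finitely supported distribution."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mean = np.sum(probs * values)
    var = np.sum(probs * values**2) - mean**2
    # Values with zero probability contribute nothing to the entropy.
    nz = probs > 0
    entropy = -np.sum(probs[nz] * np.log(probs[nz]))
    return var, entropy

# A fair coin on {0, 1}: variance 1/4, entropy log(2).
print(variance_and_entropy([0.0, 1.0], [0.5, 0.5]))
```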

We would like to show the entropies must converge to $0,$ too. But contemplate the following sequence of variables where $x=0,$ $\mathscr{R}=\{0, 1/1, 1/2, 1/3, \ldots\},$ and $\pi_n$ is given by

$$\pi_n(y) = \begin{cases} 1-1/n & y=0 \\ 1/n^{n+1} & 1/y\in\{1,2,\ldots,n^n\} \end{cases}$$

This sequence converges to $X$ because all the probability piles up around $x=0.$ The random variable $X_n$ is like $X$ but with a swarm of tiny gnats surrounding it (the numbers $1, 1/2, \ldots, 1/n^n$). As $n$ grows, the number of gnats grows too (and very rapidly), but their sizes shrink so fast that collectively these gnats have only an inconsequential probability ($1/n$ in toto).

By ignoring the subtracted term and using a crude upper bound of $1$ for the values $1/i$ in the main term of the variance formula, we may estimate that

$$\operatorname{Var}(X_n) = 0+\sum_{i=1}^{n^n} \frac{1}{n^{n+1}}\left(\frac{1}{i}\right)^2 - \left(0+\sum_{i=1}^{n^n} \frac{1}{n^{n+1}}\frac{1}{i}\right)^2\le \sum_{i=1}^{n^n} \frac{1}{n^{n+1}}\left(1\right)^2 = \frac{1}{n}\to 0,$$

as it ought, but nevertheless

$$\begin{aligned} H_n &= -\left(1-\frac{1}{n}\right)\log\left(1-\frac{1}{n}\right) - \sum_{i=1}^{n^n}\frac{1}{n^{n+1}}\log\left(\frac{1}{n^{n+1}}\right)\\ & = -\left(1-\frac{1}{n}\right)\log\left(1-\frac{1}{n}\right) + \frac{1}{n}(n+1)\log\left(n\right) \\ &\ge \log(n) \to \infty, \end{aligned}$$

revealing that the entropy grows arbitrarily large.
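A quick numerical check of this example, given here as a Python sketch that is not part of the original answer (it uses natural logarithms for the entropy), shows the variance vanishing while the entropy keeps pace with $\log(n)$:

```python
import numpy as np

# X_n takes the value 0 with probability 1 - 1/n and each "gnat" value 1/i,
# for i = 1, ..., n^n, with probability 1/n^(n+1).
for n in range(2, 8):
    m = n**n                        # number of gnats
    p = 1.0 / n**(n + 1)            # probability of each gnat
    y = 1.0 / np.arange(1, m + 1)   # gnat locations 1, 1/2, ..., 1/n^n
    mean = p * y.sum()              # the atom at 0 contributes nothing here
    var = p * (y**2).sum() - mean**2
    p0 = 1.0 - 1.0 / n              # probability of the constant value 0
    entropy = -p0 * np.log(p0) - m * p * np.log(p)
    print(f"n={n}: Var={var:.2e}  H={entropy:.3f}  log(n)={np.log(n):.3f}")
```

The printed variances shrink rapidly (much faster than the crude $1/n$ bound) while the entropies climb past $\log(n),$ in line with the calculation above.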

Evidently this leaves the question unsettled. One solution is simply to declare that the entropy of a constant random variable is zero, which is what it needs to be in order for the axiomatic properties for combining entropies to hold. But the insight afforded by this example ought to give us pause. It asks us to reflect on how our original random variable models reality and to consider, very carefully, the possibility that we might be overconfident in modeling a potentially risky return as being completely without risk. For more ruminations about such situations, turn to Nassim Nicholas Taleb.

whuber