
Suppose a random variable $X$ has a distribution with support on $[0,1]$, i.e., ${\rm Prob}\{ X\in[0,1]\}=1$. I want to maximize its variance subject to the constraint that $\mathbb{E}[X]=\mu\in[0,1]$.

My gut feeling is that it will be the two-point distribution with ${\rm Prob}[X=0]=1-\mu, {\rm Prob}[X=1]=\mu$, but a formal proof of that must surely involve calculus of variations... and, to put it mildly, I am rusty on that.

I also think that this problem may have come up in design of experiments: if $X$ is the design variable for an experiment that needs to produce as precise estimates as possible of the regression line $Y=a+bX+{\rm error}$, then the variance of these estimates is $\sigma^2(X'X)^{-1}$, and my recollection from my DOX course is that the optimal design is the two-point one with support at the extremes of the range.
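As a quick numerical illustration of that recollection (a sketch, not a proof; the designs and sample size below are my own assumptions): the variance of the slope estimate is $\sigma^2/\sum_i(x_i-\bar x)^2$, the $(2,2)$ entry of $\sigma^2(X'X)^{-1}$, so pushing the design points to the endpoints of $[0,1]$ maximizes the spread and minimizes that variance.

```python
import numpy as np

n = 10
# hypothetical designs for comparison: half the points at each endpoint
# versus an evenly spaced grid on [0, 1]
endpoint_design = np.array([0.0] * (n // 2) + [1.0] * (n // 2))
grid_design = np.linspace(0.0, 1.0, n)

def slope_variance_factor(x):
    """Return the (2,2) entry of (X'X)^{-1}, i.e. Var(b_hat) / sigma^2."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.inv(X.T @ X)[1, 1]

print(slope_variance_factor(endpoint_design))  # 0.4
print(slope_variance_factor(grid_design))      # larger than 0.4
```

The endpoint design gives $\sum_i(x_i-\bar x)^2 = 10\cdot 0.25 = 2.5$, hence a variance factor of $0.4$, smaller than any interior placement of the same number of points.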

StasK
  • Just use convexity. No calculus of variations needed. – cardinal Jan 13 '16 at 20:43
  • More explicitly, let $f$ be an arbitrary convex function on $[0,1]$ and consider $f(X) = f(0\cdot(1-X)+1\cdot X)$. Now apply Jensen. – cardinal Jan 13 '16 at 20:50
  • I'm sure this is answered on the site. A quick search turns up closely related (and harder) questions at http://stats.stackexchange.com/questions/18621 and http://stats.stackexchange.com/questions/142655. – whuber Jan 13 '16 at 20:52
  • @cardinal, can you please expand your demonstration and post it as a formal answer? Thanks. – StasK Jan 17 '16 at 17:33

2 Answers


I think I can develop a partial answer for a three-point distribution. Suppose I have ${\rm Prob}[X=0]=p_0, {\rm Prob}[X=1]=p_1$ and ${\rm Prob}[X=a]=p$ for some fixed $a,p\in(0,1)$. Then $$ \mathbb{E}[X] = ap + p_1 = \mu, $$ so that $p_1=\mu-ap, p_0=1-p-\mu+ap$ (some conditions are needed for these to be proper probabilities, $0\le p_0, p_1\le 1$; I will not spell them out and simply assume they hold). Then $$ \mathbb{V}[X]=a^2p + p_1 - \mu^2 = a^2p + \mu-ap -\mu^2=(a^2-a)p+\mu(1-\mu). $$ Considering this now as a function of $p$, we see that $a^2-a<0$ for $a\in(0,1)$, so $\mathbb{V}[X]$ increases as $p$ decreases, and hence is maximized at the boundary value $p=0$ (i.e., a two-point distribution).
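A quick numerical check of this algebra (the values of $a$ and $\mu$ below are arbitrary choices of mine, picked so that $p_0$ and $p_1$ stay in $[0,1]$):

```python
import numpy as np

a, mu = 0.3, 0.6  # assumed values; any a in (0,1), mu in (0,1) with valid p0, p1 work

def variance_three_point(p):
    """Variance of X with P(X=a)=p, P(X=1)=mu-a*p, P(X=0)=1-p-mu+a*p."""
    p1 = mu - a * p
    p0 = 1 - p - p1
    assert 0 <= p0 <= 1 and 0 <= p1 <= 1
    ex = a * p + p1              # equals mu by construction
    ex2 = a**2 * p + p1
    return ex2 - ex**2

for p in [0.0, 0.2, 0.4]:
    print(p, variance_three_point(p))
# the variance shrinks as p grows, matching (a^2 - a)p + mu*(1 - mu),
# and at p = 0 it equals mu*(1 - mu) = 0.24
```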

StasK

Let $f(x) = (x - \mu)^2$. Since $f$ is convex, we have $$ f(x) = f\bigl( (1-x)\cdot 0 + x\cdot 1 \bigr) \leq (1-x) f(0) + x f(1) $$ for all $x\in[0,1]$ and thus we get the bound $$\begin{align*} \mathrm{Var}(X) &= \mathbb{E}\bigl(f(X)\bigr) \\ &\leq \mathbb{E}(1-X) f(0) + \mathbb{E}(X) f(1) \\ &= (1-\mu)\mu^2 + \mu (1-\mu)^2 \\ &= \mu(1-\mu) \end{align*} $$ for all random variables $X\in[0,1]$ with $\mathbb{E}(X) = \mu$.

For the two-point random variable $X$ from the question, with $P(X=0)=1-\mu$ and $P(X=1) = \mu$ we have $\mathrm{Var}(X) = \mathbb{E}(X^2) - \mu^2 = (1-\mu)0^2 + \mu 1^2 - \mu^2 = \mu(1-\mu)$. Thus, the bound is sharp, and the two-point distribution indeed maximises the variance.
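A small Monte Carlo sanity check of both claims (a sketch, not part of the proof; the five-atom random distributions are my own construction): for arbitrary discrete distributions on $[0,1]$, the variance never exceeds $m(1-m)$ where $m$ is that distribution's own mean, while the two-point distribution attains the bound exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_var(x, w):
    """Mean and variance of a discrete distribution with atoms x, weights w."""
    m = w @ x
    return m, w @ (x - m) ** 2

# random five-point distributions on [0, 1]
max_slack = -np.inf
for _ in range(1000):
    x = rng.uniform(0, 1, size=5)   # random support points
    w = rng.dirichlet(np.ones(5))   # random probability weights
    m, v = mean_var(x, w)
    max_slack = max(max_slack, v - m * (1 - m))
print(max_slack)  # never positive: Var(X) <= m(1 - m) holds

# the two-point distribution attains the bound
mu = 0.3
m2, v2 = mean_var(np.array([0.0, 1.0]), np.array([1 - mu, mu]))
print(v2, mu * (1 - mu))  # equal
```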

[I suspect that the above is what cardinal's cryptic comments on the question hint at.]

jochen