Following up on "Maximum value of coefficient of variation for bounded data set", I came up with this question:

Say $X$ can take integer values in $[0, 20]$, and the mean of $X$ is known to be $0.005$.

What is the maximum variance of $X$?

colinfang
    Is $X$ a random variable or a set of data? In the former case, it is well known (and easy to derive) that all the probability of $X$ must reside at the extreme values and the conditions of the problem give two simultaneous equations for those two probabilities that are easily solved. But if $X$ represents a *dataset* (as in the referenced question), this is a quadratic integer program whose solution varies according to the size of the dataset. There is no closed-form solution in that case. – whuber Dec 13 '12 at 00:03

1 Answer

Keeping in mind the comment above: if $X$ is a random variable, then the maximal-variance distribution has probability $1-p$ at 0 and probability $p$ at 20. Since the mean is $20p = 0.005$, we get $p = 1/4000$, and the variance is $400 p (1-p) = 0.099975$.
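As a quick check of the arithmetic, here is a minimal Python sketch using exact fractions. The function name `max_variance_two_point` and its parameters are made up for illustration; it simply assumes all the mass sits at the two endpoints, as described above.

```python
from fractions import Fraction


def max_variance_two_point(lo, hi, mean):
    """Return (p, variance) for the two-point distribution on {lo, hi}
    that has the requested mean, where p is the probability at hi."""
    # The mean constraint lo*(1-p) + hi*p = mean gives p directly.
    p = (Fraction(mean) - lo) / Fraction(hi - lo)
    # Variance of a two-point distribution: (hi - lo)^2 * p * (1 - p).
    variance = (hi - lo) ** 2 * p * (1 - p)
    return p, variance


p, variance = max_variance_two_point(0, 20, Fraction(5, 1000))
print(p, float(variance))
```

For the numbers in the question this should reproduce $p = 1/4000$ and a variance of $0.099975$.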

The case where $X$ is a dataset doesn't have a closed-form solution, but is easy to solve for a particular sample size: begin with any configuration of the data that gives the right mean. Since $(x+1)^2+(y-1)^2 = x^2 + y^2 + 2(x-y+1)$, moving $x$ up and $y$ down by one unit each preserves the mean and increases the sum of squares by $2(x-y+1)$, which is positive whenever $x \ge y$. So you can iteratively pick any pair of datapoints that are not at the endpoints and move them one unit apart. Given a particular sample size, you can work out what the maximizing configuration is from this.
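The iterative procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the variance is the population variance (dividing by $n$), and the starting configuration (twenty 1s and 3980 zeros, so $n = 4000$) is just one choice that gives mean $0.005$. With integer data the sum $0.005\,n$ must be an integer, so $n$ has to be a multiple of 200.

```python
from fractions import Fraction


def population_variance(values):
    """Exact population variance (divide by n) of a list of integers."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return Fraction(sum(v * v for v in values), n) - mean ** 2


def spread_to_max_variance(data, lo=0, hi=20):
    """Repeatedly move a pair of non-endpoint points one unit apart.

    Each move keeps the sum (hence the mean) fixed and raises the sum of
    squares, so the loop stops once at most one point is strictly
    between lo and hi.
    """
    values = sorted(data)
    while True:
        interior = [i for i, v in enumerate(values) if lo < v < hi]
        if len(interior) < 2:
            return values
        i, j = interior[0], interior[-1]  # values[i] <= values[j]
        values[i] -= 1  # smaller point moves down
        values[j] += 1  # larger point moves up
        values.sort()


# Hypothetical starting dataset: n = 4000 with mean 20 / 4000 = 0.005.
start = [1] * 20 + [0] * 3980
result = spread_to_max_variance(start)
print(result.count(0), result.count(20), float(population_variance(result)))
```

For this particular $n$ the procedure ends with a single 20 and 3999 zeros, which matches the random-variable answer; for other sample sizes (for example $n = 200$, where the only option is a single 1) the maximum variance is smaller.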

petrelharp