9

Imagine an election where $n$ people make a binary choice: they vote for A or against it. The outcome is that $m$ people vote for A, and so A's result is $p=m/n$.

If I want to model these elections, I can assume that each person votes for A independently with probability $p$, leading to the binomial distribution of votes: $$\text{votes for A}\sim\mathsf{Binom}(n,p).$$ This distribution has mean $m=np$ and variance $np(1-p)$.

I can make other assumptions as well. For example, I can assume that the probability $p$ is itself a random variable coming from some distribution (e.g. beta); this can lead to a beta-binomial distribution of votes for A. Or I can assume that people vote in groups of $k$, where each group of $k$ people makes the same choice and it is A with probability $p$. The number of votes is then $k$ times a $\mathsf{Binom}(n/k,p)$ variable, i.e. a scaled binomial with variance $knp(1-p)$, which is $k$ times larger. In all these cases, the variance of the resulting distribution is larger than in the simplest binomial scheme.
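To make the comparison concrete, here is a minimal sketch that evaluates the three variances exactly. The function names and the numbers ($n=1000$, $p=2/5$, $k=10$, and a $\mathrm{Beta}(40,60)$ prior with mean $2/5$) are illustrative assumptions, not part of the question; the beta-binomial formula is the standard one.

```python
from fractions import Fraction

def binomial_variance(n, p):
    # n independent voters, each voting for A with probability p
    return n * p * (1 - p)

def grouped_variance(n, p, k):
    # n/k independent groups; each group casts k identical votes,
    # so the total is k * Binom(n/k, p)
    groups = n // k
    return k ** 2 * groups * p * (1 - p)

def beta_binomial_variance(n, a, b):
    # p ~ Beta(a, b), then votes | p ~ Binom(n, p)
    return Fraction(n * a * b * (a + b + n), (a + b) ** 2 * (a + b + 1))

# Illustrative values only (assumptions, not taken from the question)
n, p, k = 1000, Fraction(2, 5), 10
a, b = 40, 60  # Beta(40, 60) has mean 2/5, matching p

var_binom = binomial_variance(n, p)
var_groups = grouped_variance(n, p, k)
var_betabin = beta_binomial_variance(n, a, b)

print("binomial:     ", float(var_binom))
print("groups of k:  ", float(var_groups))
print("beta-binomial:", float(var_betabin))
```

With these particular numbers both alternative schemes come out roughly ten times more dispersed than the plain binomial, but that ratio depends entirely on the chosen $k$, $a$ and $b$.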

Can I claim that the binomial distribution has the smallest possible variance? In other words, can this claim be made precise, e.g. by specifying some reasonable conditions on the possible distributions? What would these conditions be?

Or is there maybe some reasonable distribution that has lower variance?

I can imagine lower variance, e.g. when all $n$ people agree in advance on how they will vote, and so $\text{votes for A}$ is not really a random variable, but a fixed number $m$. Then the variance is zero. Or maybe almost all of them agreed but a few people did not, and then one can have tiny variance around $m$. But this feels like cheating. Can one have smaller-than-binomial variance without any prearrangements, i.e. when each person votes in some sense randomly?

amoeba
  • 2
    A related question: [Are these data underdispersed? If so, what mechanisms may explain this?](http://stats.stackexchange.com/questions/123123) – amoeba Jun 03 '15 at 22:42
  • 2
    The Poisson binomial distribution has maximum variance when all $p_i$ are the same (i.e. when it reduces to the binomial), for fixed mean and $n$. https://en.m.wikipedia.org/wiki/Poisson_binomial_distribution – seanv507 Sep 23 '16 at 19:42
  • 2
    @seanv507 Thank you, yes. I realized this myself back in 2015, see my comment under whuber's answer. But if you want to post this as an answer (elaborating on what Poisson binomial is), I will be happy to upvote. – amoeba Sep 23 '16 at 19:48

2 Answers

11

No.

Suppose the voters consist of $n=2k$ married pairs. The husbands get together and decide to vote against their wives, who themselves choose randomly. The outcome is always $k$ votes for each of the candidates, with zero variance.

You might cry foul because the husbands are not voting randomly. Well, they are--they just happen to be tied closely with the random votes of their wives. If that bothers you, change things a bit by having each husband flip ten fair coins. If all ten are heads, he will vote with his wife; otherwise he votes against her. You can check that the election outcome still has small (albeit nonzero) variance, even though every vote is unpredictable.

The crux of the matter lies in the negative covariance between two voting blocs, males and females.
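The ten-coin variant can be checked exactly with a short sketch. It assumes, beyond what is stated above, that each wife votes for A with probability 1/2 and that different couples vote independently of each other; `pair_variance` and the value `k = 500` are hypothetical names and numbers chosen for illustration.

```python
from fractions import Fraction

def pair_variance(q):
    """Variance of the number of votes for A cast by one couple,
    where the husband copies his wife with probability q and
    opposes her otherwise. The wife votes for A with probability 1/2."""
    half = Fraction(1, 2)
    # (votes for A from the couple, probability of that outcome)
    outcomes = [
        (2, half * q),        # wife votes A, husband copies her
        (1, half * (1 - q)),  # wife votes A, husband opposes her
        (0, half * q),        # wife votes against, husband copies her
        (1, half * (1 - q)),  # wife votes against, husband opposes her
    ]
    mean = sum(v * pr for v, pr in outcomes)
    return sum((v - mean) ** 2 * pr for v, pr in outcomes)

k = 500                      # number of couples (illustrative assumption)
q = Fraction(1, 2) ** 10     # all ten fair coins come up heads

couples_var = k * pair_variance(q)   # couples assumed independent
binom_var = 2 * k * Fraction(1, 4)   # Binom(2k, 1/2) for comparison

print("couples: ", float(couples_var))
print("binomial:", float(binom_var))
```

Under these assumptions each couple contributes variance $q = 2^{-10}$, so the total is $k/1024$ against $k/2$ for the binomial, even though every individual vote is a fair coin flip when viewed on its own.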

whuber
  • 2
    Thanks, @whuber. It seems that there is another way to achieve lower variance too: voters should vote for A with different probabilities $p_i$ that are distributed around $p$. The compound distribution is apparently known as Poisson binomial, and if its mean $\sum p_i$ is fixed at $np$, then the variance will be *largest* for the binomial case when all $p_i=p$. If probabilities are not equal, the variance will necessarily be smaller. – amoeba Jun 03 '15 at 22:46
  • 2
    Sure: there are plenty of ways to achieve under-dispersion (as I see you belatedly realized!). I just thought this husband-wife example was sufficiently clear, amusing, and memorable to be worth writing down. Because it amounted to an answer, it would not have been appropriate to bury it in a comment (which is how it started out life). – whuber Jun 03 '15 at 22:48
3

Double-no (it maximises the variance)

The answer from whuber is excellent (except that he makes the highly unrealistic modelling assumption of attributing such obstinate behaviour to the wrong sex!). To supplement that answer, it is also worth examining what happens if you assume that the votes are independent. If we take the votes as mutually independent with probabilities $p_1,\ldots,p_n$, then the mean and variance of the total number of votes for A, $S_n$, are:

$$\mathbb{E}(S_n) = \sum_{i=1}^n p_i \qquad \qquad \mathbb{V}(S_n) = \sum_{i=1}^n p_i - \sum_{i=1}^n p_i^2.$$

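If we condition on a fixed expected value $\mu = \mathbb{E}(S_n)$, then it can be shown that the maximum variance is achieved when $p_1 = \cdots = p_n = \mu/n$. (To demonstrate this, note that the variance equals $\mu - \sum_{i=1}^n p_i^2$, so maximising it means minimising $\sum_{i=1}^n p_i^2$ subject to $\sum_{i=1}^n p_i = \mu$; you can set up the Lagrangian optimisation, or apply the Cauchy-Schwarz inequality, to attain this solution.) So not only does the binomial distribution not minimise the variance, it maximises the variance over all cases where we have independent votes with a given expected total.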
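A small numerical check of this claim, as a sketch: the four probabilities below are arbitrary illustrative values, and `poisson_binomial_variance` is a hypothetical helper name for the variance formula given above.

```python
from fractions import Fraction

def poisson_binomial_variance(probs):
    # Variance of a sum of independent Bernoulli(p_i) votes
    return sum(p * (1 - p) for p in probs)

# Unequal probabilities (arbitrary illustrative values)
unequal = [Fraction(1, 10), Fraction(3, 10), Fraction(5, 10), Fraction(7, 10)]
n = len(unequal)
mu = sum(unequal)            # fixed expected number of votes for A

# Binomial case with the same mean: every p_i equals mu / n
equal = [mu / n] * n

var_unequal = poisson_binomial_variance(unequal)
var_equal = poisson_binomial_variance(equal)

# The gap equals the sum of squared deviations of the p_i from mu / n,
# so it is zero only when all p_i coincide.
gap = sum((p - mu / n) ** 2 for p in unequal)

print("unequal p_i:", var_unequal)
print("equal p_i:  ", var_equal)
print("gap:        ", gap)
```

The identity $\mathbb{V}_{\text{equal}} - \mathbb{V}_{\text{unequal}} = \sum_{i=1}^n (p_i - \mu/n)^2$ holds for any choice of $p_i$ with the same mean, which is another way to see why the binomial case is the maximiser.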

Ben