2

The most common kind of deviation is the standard deviation.

$$ \text{Sd}(x) = \sqrt{\text{Mean}((x - \text{Mean}(x))^2)}$$

The standard deviation is very similar to the mean absolute deviation,

$$ \text{MAD}(x) = \text{Mean}(|x - \text{Mean}(x)|)$$

but it is often simpler to calculate and obeys nicer algebraic properties.
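To make the two formulas concrete, here is a minimal sketch in plain Python. The helper names `mean`, `sd`, and `mad` and the sample data are my own choices for illustration. Note that `sd` follows the formula above literally (dividing by $n$, not $n-1$).

```python
import math


def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def sd(xs):
    """Sd(x) = sqrt(Mean((x - Mean(x))^2)), dividing by n as in the formula."""
    m = mean(xs)
    return math.sqrt(mean([(x - m) ** 2 for x in xs]))


def mad(xs):
    """MAD(x) = Mean(|x - Mean(x)|), the mean absolute deviation."""
    m = mean(xs)
    return mean([abs(x - m) for x in xs])


# Hypothetical sample, chosen so the arithmetic is easy to check by hand.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sd(data))   # 2.0
print(mad(data))  # 1.5
```

For this sample the mean is 5, the squared deviations sum to 32, and the absolute deviations sum to 12, which gives the two values printed.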

But there are many other measures of dispersion. For example, there is the most common absolute deviation from the mean: $\text{Mode}(|x - \text{Mean}(x)|)$.

It is not clear to me why I should prefer the standard deviation over other measures of dispersion. I suppose the simplest answer is that it is the most thoroughly studied and best-known measure, so using anything else will confuse the people you are trying to communicate with.

I guess I would have to use these kinds of alternative measures for pathological distributions like the Cauchy distribution, though.
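As a rough illustration of the Cauchy case, the sketch below draws standard Cauchy samples and compares the sample standard deviation with the interquartile range. The IQR is my own choice of alternative measure here, not one named above, and the seed and sample sizes are arbitrary. The standard Cauchy has quartiles at $\pm 1$, so its IQR is 2, while its variance does not exist.

```python
import math
import random


def sample_cauchy(n, rng):
    """Draw n standard Cauchy variates by inverse-CDF sampling: tan(pi*(U - 1/2))."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def sd(xs):
    """Sample standard deviation (dividing by n)."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def iqr(xs):
    """Crude interquartile range: difference of order statistics near 75% and 25%."""
    s = sorted(xs)
    n = len(s)
    return s[(3 * n) // 4] - s[n // 4]


rng = random.Random(0)  # arbitrary seed, only for reproducibility
for n in (1_000, 10_000, 100_000):
    xs = sample_cauchy(n, rng)
    # The IQR should settle near 2; the sd has no finite limit to settle on.
    print(n, sd(xs), iqr(xs))
```

The sample standard deviation is dominated by whichever extreme draws happen to occur, so it should not be expected to stabilise as `n` grows.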

Alexis
  • Try to look for related threads. I think this question has been answered earlier, just cannot find the duplicate right now. – Richard Hardy Aug 09 '16 at 20:03
  • 2
    One reason variance is popular (and hence standard deviation) is that the variance of a sum of random variables has a very simple form (doubly so if they're independent). None of the main competitors does. – Glen_b Aug 09 '16 at 20:06
  • @Aksakal This is not a duplicate. This is not about squaring or square rooting but about the mean versus the mode and similar. – Molossus Spondee Aug 09 '16 at 20:10
  • 1
    This question is being asked in different forms over and over, that's my point. It's impossible to compare variance to every measure of dispersion. – Aksakal Aug 09 '16 at 20:12
  • 2
    I'd agree that the questions were not phrased identically, eg the other question talked about squaring vs square rooting, but the actual answers received there were far more general than the title suggests. – Silverfish Aug 09 '16 at 23:12
  • You're comparing the SD to the MAD. That is the question of whether to square or not. – gung - Reinstate Monica Aug 09 '16 at 23:46
  • @gung I'm comparing mean versus mode and similar and not square versus absolute value. – Molossus Spondee Aug 11 '16 at 01:59
  • @StevenStewart-Gallus your question is poorly formulated if your last comment is true. It's not clear that you're comparing mean vs mode from the question. That's why it should be either closed or edited. The question in its current form is all about dispersion, not "central tendency" – Aksakal Aug 11 '16 at 14:09

3 Answers

3

You can come up with an infinite number of dispersion measures. It's a lost cause to compare the variance to each and every one of them.

There are two features of variance that are attractive to me. First, it's a smooth function of the data. The mean absolute deviation, for instance, is not, because the absolute value has a kink at zero.

Second, it's one of the central moments:

$$\mu_k=\sum_ip_i(x_i-\mu_1)^k$$

Here $\mu_2$ is the variance.

Being a moment is important, for it defines the distribution when combined with all other moments. Other measures of dispersion are stand-alone metrics.
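As a small worked example of the central-moment formula, here is a sketch for a discrete distribution. The function name `central_moment` and the fair-die example are my own assumptions for illustration. It uses `fractions.Fraction` so the results are exact.

```python
from fractions import Fraction


def central_moment(values, probs, k):
    """k-th central moment: sum_i p_i * (x_i - mu_1)^k, with mu_1 the mean."""
    mu1 = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mu1) ** k for x, p in zip(values, probs))


# Hypothetical example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

print(central_moment(values, probs, 1))  # 0
print(central_moment(values, probs, 2))  # 35/12
print(central_moment(values, probs, 3))  # 0
```

The first central moment is always zero, the second is the variance (35/12 for the die), and the third vanishes here because the die is symmetric about its mean.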

Alexis
Aksakal
  • Well you could take the moment around the mean but couldn't you just as easily take the moment around the median? The nth moment is $\int^{\infty}_{-\infty} (x-c)^n f(x) \, \mathrm{d} x$ but c does not need to be the mean at all. – Molossus Spondee Aug 09 '16 at 20:23
  • @StevenStewart-Gallus, that wouldn't be a central moment. In fact, the non-central moment is defined as $\gamma_k=\sum_ip_ix_i^k$. So, the mean is not important here. The important part is squaring. – Aksakal Aug 09 '16 at 20:26
  • I don't follow. Why should I care for the central moment and not some other moment? – Molossus Spondee Aug 09 '16 at 20:41
  • @StevenStewart-Gallus, a full set of moments defines the distribution, central or non-central. The central second moment has a nice interpretation as a dispersion measure. All I'm saying is that this measure of dispersion is not a stand alone, it's like a piece which completes the puzzle, unlike other measures of dispersion – Aksakal Aug 09 '16 at 20:45
  • but you just said that other moments work too? The moment around the median seems to have just as nice an interpretation. – Molossus Spondee Aug 09 '16 at 21:15
  • @StevenStewart-Gallus, why are you stuck on the median? OP's question is about dispersion. Yes, you can build a set of moments around the median, then the dispersion would be around the median. The nice thing about the central moments is that the mean is itself a moment. So, when you talk about the mean and variance, you're talking about two moments, which are part of a full set of moments that define the distribution. If you don't see the elegance in this setup, then I guess you could prefer other measures of dispersion like MAD. – Aksakal Aug 09 '16 at 21:19
3

I'm surprised no one has mentioned the really essential property that variance is additive for independent random variables:

$$\mbox{Var}(a_1X_1+\cdots+a_nX_n)=\sum_{i=1}^na_i^2\mbox{Var}(X_i),$$

and the equally nice linearity properties that covariance shares. This becomes completely intractable without the square inside the expectation in the definition of variance.
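The identity can be checked exactly on a small example. The sketch below is only an illustration under my own assumptions: two independent discrete variables (a fair die and a fair coin), coefficients $a_1 = 2$ and $a_2 = -3$, and helper names `variance` and `linear_combination` that I made up for this purpose.

```python
from fractions import Fraction
from itertools import product


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = sum(p * x for x, p in dist.items())
    return sum(p * (x - mu) ** 2 for x, p in dist.items())


def linear_combination(dist1, dist2, a1, a2):
    """Distribution of a1*X1 + a2*X2 for independent X1 ~ dist1 and X2 ~ dist2."""
    out = {}
    for (x1, p1), (x2, p2) in product(dist1.items(), dist2.items()):
        v = a1 * x1 + a2 * x2
        # Independence: the joint probability is the product p1 * p2.
        out[v] = out.get(v, 0) + p1 * p2
    return out


# Hypothetical inputs: a fair die and a fair coin, with arbitrary coefficients.
die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
a1, a2 = 2, -3

lhs = variance(linear_combination(die, coin, a1, a2))
rhs = a1 ** 2 * variance(die) + a2 ** 2 * variance(coin)
print(lhs, rhs)  # 167/12 167/12
```

Here $\text{Var}(X_1) = 35/12$ and $\text{Var}(X_2) = 1/4$, so both sides equal $4 \cdot 35/12 + 9 \cdot 1/4 = 167/12$.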

Also, it is the variance, and not the $L_1$ norm, that gives rise to the central limit theorem. Specifically, $L_1$ (a finite first absolute moment) gives rise to the strong law of large numbers, but not to the fluctuations around it.

Alex R.
1

In some sense, a related and deeper question is why do people tend to use the $L_2$ norm instead of the $L_1$ norm or indeed other norms? In the two-dimensional Euclidean vector space, why do people tend to use $\sqrt{x^2+y^2}$ (i.e. $L_2$ norm) as a measure of distance instead of $|x| + |y|$ (i.e. $L_1$ norm)?

Here is how the $L_2$ and $L_1$ norms are related to the standard deviation and the mean absolute deviation, respectively:

Let $x$ be a mean zero random variable and $P$ be a probability measure.

The standard deviation is simply the $L_2$ norm:

$$ \left( \int |x|^2 \;dP \right) ^\frac{1}{2} $$

And the mean absolute deviation is simply the $L_1$ norm:

$$ \int |x| \; dP $$
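One standard relation between the two norms, added here as a short worked step and not part of the original argument: on a probability space the $L_1$ norm never exceeds the $L_2$ norm. Applying the Cauchy–Schwarz inequality to $|x|$ and the constant function $1$, and using $\int 1 \; dP = 1$, gives

$$ \int |x| \; dP = \int |x| \cdot 1 \; dP \le \left( \int |x|^2 \; dP \right)^\frac{1}{2} \left( \int 1^2 \; dP \right)^\frac{1}{2} = \left( \int |x|^2 \; dP \right)^\frac{1}{2} $$

So for a mean zero $x$, the mean absolute deviation is at most the standard deviation, with equality only when $|x|$ is almost surely constant.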

Matthew Gunn