Can we calculate the variance without using the mean as the 'base' point?
7
-
3Given $\mathbb{E}(X^2) – BloXX Mar 26 '19 at 07:56
-
Short answer: Lots of other ways to summarize variability (dispersion, spread, scale) but none of the others would be the variance. (In fact, the variance can be defined without reference to the mean.) – Nick Cox Mar 26 '19 at 08:28
-
Yes: given data $X,$ compute the covariance of $(X,X)$ as described at https://stats.stackexchange.com/a/18200/919. This method never computes the mean. – whuber Mar 26 '19 at 13:15
2 Answers
12
The median absolute deviation is defined as $$\text{MAD}(X) = \text{median}\,|X-\text{median}(X)|$$ and is considered an alternative to the standard deviation, but it is not the variance. In particular, the MAD always exists, whether or not $X$ has finite moments. For instance, the MAD of a standard Cauchy distribution is equal to one, since $$\underbrace{\Bbb P(|X-0|<1)}_\text{0 is the median}=\arctan(1)/\pi-\arctan(-1)/\pi=\frac{1}{2}$$
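
As a rough numerical check of the Cauchy claim, here is a minimal Python sketch using only the standard library. The function name `mad`, the seed, and the sample size are illustrative choices, not part of the answer; the Cauchy draws come from the inverse-CDF transform $\tan(\pi(U - 1/2))$ with $U$ uniform on $(0,1)$.

```python
import math
import random
import statistics


def mad(xs):
    """Unscaled median absolute deviation about the median."""
    xs = list(xs)
    m = statistics.median(xs)
    return statistics.median([abs(x - m) for x in xs])


# Simulate standard Cauchy draws via the inverse CDF (assumed seed and size).
rng = random.Random(0)
cauchy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(100_000)]

# The sample MAD should land near the population value of 1,
# even though the Cauchy distribution has no mean or variance.
print(mad(cauchy))
```

No scaling constant is applied here, so the result is the MAD exactly as defined in the formula above.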

brazofuerte

Xi'an
-
Newcomers to this idea should also watch out for mean absolute deviation from the mean (often just called mean deviation) and median absolute deviation from the mean. I don't recall mean absolute deviation from the median, but am open to examples. The abbreviation MAD, unfortunately, has been applied variously, so trust people's code first, then their algebraic or verbal definition, and a bare use of the abbreviation MAD not at all. In symmetric distributions, and some others, MAD as defined here is half the interquartile range. (Punning on MAD I resist as a little too obvious.) – Nick Cox Mar 26 '19 at 08:23
-
Also, note that software implementations of the median absolute deviation function can scale the MAD value by a constant factor from the form presented in this answer, so that its value coincides with the standard deviation for a normal distribution. – EdM Mar 26 '19 at 08:30
-
@EdM Excellent point. Personally I dislike that practice unless people use some different term. It's no longer the MAD! – Nick Cox Mar 26 '19 at 08:35
-
@NickCox: the appeal of centring on the median is that the quantity always exists, whether or not the distribution enjoys a mean. This is the definition found in [Wikipedia](https://en.wikipedia.org/wiki/Median_absolute_deviation#The_population_MAD). – Xi'an Mar 26 '19 at 09:20
-
MAD is [Mutually Assured Destruction](https://en.wikipedia.org/wiki/Balance_of_terror) – kjetil b halvorsen Mar 26 '19 at 10:05
-
[When does a distribution not have a mean or a variance?](https://stats.stackexchange.com/questions/70088/when-does-a-distribution-not-have-a-mean-or-a-variance) – smci Mar 26 '19 at 12:58
3
This question has already been answered on Math.stackexchange. To summarize the answers given there:
- You can use the identity that the variance equals $\overline{x^2} - \overline {x}^2$, which takes only one pass (computing the mean and the mean of the squares simultaneously), but can be more prone to roundoff error if the variance is small compared with the mean.
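- How about the sum of squared pairwise differences? Expanding the square gives $\sum_{1 \le i < j \le n}(x_i - x_j)^2 = n\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2$, so for the sample variance $v_X$ you can check by direct computation that
$$ v_X = \frac{1}{n(n-1)}\sum_{1 \le i < j \le n}(x_i - x_j)^2. $$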
- The sample variance can be calculated without explicitly computing the mean as: $$ v_{X}=\frac{1}{n-1}\left [ \sum_{i=1}^{n}x_{i}^{2}-\frac{1}{n}\left ( \sum_{i=1}^{n}x_{i} \right ) ^{2}\right ] $$
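
The three approaches in the list can be compared numerically. The following is a minimal Python sketch, not taken from the linked answers: the function names and the data are made up for illustration. The first function returns the population variance (divisor $n$), while the other two return the sample variance (divisor $n-1$), with the pairwise sum over $i<j$ divided by $n(n-1)$.

```python
import statistics
from itertools import combinations


def variance_one_pass(xs):
    """Population variance: mean of squares minus squared mean, one pass."""
    n, s, ss = 0, 0.0, 0.0
    for x in xs:
        n += 1
        s += x
        ss += x * x
    return ss / n - (s / n) ** 2


def variance_pairwise(xs):
    """Sample variance from squared differences over all pairs i < j."""
    n = len(xs)
    total = sum((a - b) ** 2 for a, b in combinations(xs, 2))
    return total / (n * (n - 1))


def variance_from_sums(xs):
    """Sample variance from the sum and the sum of squares only."""
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return (ss - s * s / n) / (n - 1)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative data

# Compare against the standard library's reference implementations.
print(variance_one_pass(data), statistics.pvariance(data))
print(variance_pairwise(data), statistics.variance(data))
print(variance_from_sums(data), statistics.variance(data))
```

The pairwise version costs $O(n^2)$ operations, so it is mainly of conceptual interest. The two sum-of-squares versions are $O(n)$ but share the roundoff caveat mentioned in the first bullet.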

Ferdi