
Way back when I was still a student, I was listening to one of my stats lecturers talk about robust statistics. He showed (on the board) a series of transformations/derivations which concluded that it is easier to measure the standard deviation than it is to measure the mean. I can't remember the details, but the conclusion has stuck with me. It might have had something to do with influence functions or something similar.

Anyone have any idea about what his reasoning could have been?

Chechy Levas
    "Easier" in what sense? – whuber Jun 05 '18 at 15:55
  • Tbh, I don’t remember exactly what sense but it was probably along the lines of being less sensitive to outliers. I realise I am not being very clear. I might dig out some of the old files I have related to the course to see what his argument was. If I find it I will post it here. – Chechy Levas Jun 05 '18 at 16:19
  • If it's sensitivity to outliers, then the opposite conclusion must hold. – whuber Jun 05 '18 at 16:20
  • OK I found it. It relates to a comparison of the standard error of the mean (from a sample used as an estimate of the population mean) vs the standard error of the standard deviation (from a sample used as an estimate of the population standard deviation). Indeed, the standard error of the mean is bigger than the standard error of the standard deviation. So in that sense it is easier to estimate standard deviations, than means. I will put some details as an answer shortly. – Chechy Levas Jun 05 '18 at 17:12

2 Answers


I found some lecture notes from the course I asked about. I will relate the core idea here, as best I can.

First, I am going to assume any reader is familiar with the standard error of the sample mean: $$se(\mu) =\frac{\sigma}{\sqrt n}$$

Second, from here we get that the standard error of the sample variance for any distribution is: $$se(\sigma^2)=\sqrt{\frac 1 n (\mu_4-\frac{n-3}{n-1}\sigma^4)}$$ where $\mu_4$ is the fourth central moment: $E(X-\mu)^4$.

For a normal distribution $\mu_4 = 3\sigma^4$, so (using $3-\frac{n-3}{n-1}=\frac{2n}{n-1}$) this simplifies to $$se(\sigma^2)=\frac{\sqrt 2 \sigma^2}{\sqrt{n-1}}$$
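If it helps to see this numerically, below is a minimal Monte Carlo sketch (numpy; the sample size, $\sigma$, and replication count are arbitrary choices) checking that the spread of the sample variance for normal data matches the formula above:

```python
# Monte Carlo check of se(sigma^2) for normal data (a sketch; parameters are arbitrary)
import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 30, 2.0, 200_000

samples = rng.normal(0.0, sigma, size=(reps, n))
sample_vars = samples.var(axis=1, ddof=1)        # unbiased sample variances

empirical_se = sample_vars.std(ddof=1)           # observed spread of the estimator
theoretical_se = np.sqrt(2) * sigma**2 / np.sqrt(n - 1)

print(empirical_se, theoretical_se)              # the two should agree closely
```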

Lastly, we use the delta method to derive an approximation of the standard error of a transformed parameter with a known standard error. Here the parameter with the known standard error is the variance, and the transformation is $g(\theta)=\sqrt \theta$. The link in the second point uses the delta method for just this purpose, and arrives at the result

$$se(\sigma)\approx\frac 1 {2\sigma}se(\sigma^2)$$

Substituting in the value of $se(\sigma^2)$, we get

$$se(\sigma)\approx\frac \sigma {\sqrt {2(n-1)}}$$

So the conclusion is that, for a normal distribution and large $n$, the standard error of the sample standard deviation is smaller than the standard error of the sample mean by a factor of roughly $\frac 1 {\sqrt 2}$, and so the standard deviation is, in this (very limited) sense, easier to estimate.
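A small simulation along the same lines (again numpy, with arbitrary parameters) illustrates the comparison between the two standard errors:

```python
# Comparing the spread of the sample mean and the sample sd for normal data
# (a sketch; parameters are arbitrary)
import numpy as np

rng = np.random.default_rng(1)
n, sigma, reps = 30, 2.0, 200_000

samples = rng.normal(0.0, sigma, size=(reps, n))
se_mean = samples.mean(axis=1).std(ddof=1)       # spread of the sample mean
se_sd = samples.std(axis=1, ddof=1).std(ddof=1)  # spread of the sample sd

print(se_mean, sigma / np.sqrt(n))               # ~ sigma / sqrt(n)
print(se_sd, sigma / np.sqrt(2 * (n - 1)))       # ~ sigma / sqrt(2(n-1))
print(se_sd / se_mean)                           # ~ 1/sqrt(2) for large n
```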

So the result isn't as general as what my memory suggested, but interesting nonetheless.

PS: the result holds for any distribution and sample size where the kurtosis is small enough; specifically, when

$$\frac{\mu_4}{\sigma^4} < 4 + \frac{n-3}{n-1}$$
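And here is a sketch of a counterexample (assuming exponential data, whose kurtosis $\mu_4/\sigma^4 = 9$ violates the condition above), where the ordering reverses:

```python
# For the exponential distribution mu_4 / sigma^4 = 9, which exceeds
# 4 + (n-3)/(n-1), so the sample sd should be *harder* to estimate than the mean.
# (A sketch; the scale and sample size are arbitrary.)
import numpy as np

rng = np.random.default_rng(2)
n, reps = 30, 200_000

samples = rng.exponential(scale=1.0, size=(reps, n))   # sigma = 1 for scale 1
se_mean = samples.mean(axis=1).std(ddof=1)
se_sd = samples.std(axis=1, ddof=1).std(ddof=1)

print(se_mean, se_sd)   # se_sd comes out larger, reversing the normal-case ordering
```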

Chechy Levas
  • I don't see how you obtain your expression for the se of $\sigma^2$ from the reference you give. Indeed, it's not correct: the se generally depends on the central *fourth* moment of the underlying distribution. The delta method is only an approximation, further limiting the generality of your conclusion. – whuber Jun 05 '18 at 19:41
  • @whuber I should have mentioned that the formula is only valid for a normal distribution (see the question in the link). The full formula, valid for any distribution (which does depend on some moments from the underlying distribution) is given in the answer in the link. I will edit this answer to mention this, and the fact that the delta method is only an approximation. – Chechy Levas Jun 06 '18 at 05:03
  • +1 The clear analysis in the edited version is convincing. Thank you for sharing it! – whuber Jun 06 '18 at 14:44

It can't be about JUST 'measuring' them. The main reason is that to 'measure' the standard deviation, you first need to 'measure' the mean. Thus, the statement mentioned in your question doesn't make much sense.

Also, the mean is a measure of central tendency and the standard deviation is a measure of spread. They need to be studied together to better understand the distribution of a variable.

I think the lecturer actually meant something else and perhaps you interpreted it in a different way!

Cheers

  • The lecturer was commenting that, for a normal distribution, the standard error of the standard deviation is smaller than the standard error of the mean. So, loosely speaking, it is easier to measure standard deviation than it is to measure the mean. – Chechy Levas Jun 06 '18 at 06:23
  • In fact, the standard deviation can be defined without reference to the mean. So, there is no need to measure the mean first. It's a matter of pedagogy and convention to define it in terms of the mean. I'll add a reference when I can. – Nick Cox Jun 06 '18 at 07:18
  • See Stuart, A. and Ord, J.K. 1994. _Kendall's Advanced Theory of Statistics. Volume I: Distribution Theory._ London: Edward Arnold p.59 for the variance being half the mean square of all possible pairwise differences (and so the SD being the square root of that). This definition does not mention the mean. – Nick Cox Jun 06 '18 at 10:19
  • I think that's a different way of saying this: say we have two variables a, b. pairwise difference = (a-b), mean square = ((a-b)^2)/2. Variance = ((a-(a+b)/2)^2 + (b-(a+b)/2)^2) /2 = ((a-b)^2)/2. Thus, both are same...Its an alternate way of writing var = E (X^2) - (E(X))^2...it essentially includes mean in the definition... – Sameer Saurabh Jun 06 '18 at 11:06
  • Sameer, you can estimate a spread without estimating the mean at all. One approach is to use a multiple of an interquartile range, for instance. But if you want the SD itself, it too can be computed without ever computing the mean, by using an algebraically equivalent formula in terms of all *differences* of the data values. – whuber Jun 06 '18 at 14:02
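For what it's worth, a minimal numerical check (numpy, arbitrary data) of the pairwise-difference identity mentioned in the comments above: the $(n-1)$-divisor sample variance equals half the mean of all squared pairwise differences, so the sd can be computed without ever forming the mean.

```python
# Sample variance from pairwise differences alone, no mean computed
# (a sketch; the data are arbitrary)
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=50)
n = len(x)

diffs = x[:, None] - x[None, :]                 # all ordered pairwise differences
var_pairwise = (diffs ** 2).sum() / (2 * n * (n - 1))

print(var_pairwise, x.var(ddof=1))              # the two agree
print(np.sqrt(var_pairwise), x.std(ddof=1))     # and so do the sds
```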