
Consider a sample $x=\{x_1,...,x_n\}$. Define the average as $\bar x$. Consider the following formula:

$$ \dfrac{\sum_{i=1}^n\left(\dfrac{x_i}{\bar x} \right)^c}{n} $$

or equivalently:

$$ \dfrac{\sum_{i=1}^n\left(1 + \dfrac{\epsilon_i}{\bar x} \right)^c}{n} $$

where $\epsilon_i = x_i - \bar x$ and $c$ is a constant.

To me, these formulas "look like" a measure of dispersion around the mean. But I have not found which known measure they resemble (at least nothing from this long list). So, my questions:

  1. Do they measure dispersion? Maybe for particular values of $c$ only, e.g. $c=2$ or $c=1$?

  2. If so, do these have a name?

Richard Hardy
luchonacho
  • Note that if c = 1, then you always get 1 as a result. (Assuming n = N.) – Daniel Dostal May 05 '18 at 14:09
  • For a standard normal and $c=1$, the value is 1. For $c < 1$, it is undefined with high probability and for $c>1$ it converges to infinity as $n$ grows. Maybe $x_i$ should be positive? – Michael M May 06 '18 at 08:43
  • @MichaelM $x_i$ is always positive in my case. So would you conclude from your analysis that the measure is not related to dispersion? Maybe you can add an answer? – luchonacho May 09 '18 at 15:42
  • In my opinion, I think the preferred term here is "spread" or "variability" and not "dispersion". If you could supply a source to where this formula came from, I'd be happy to take that into consideration. See the related discussion on meta [here](https://stats.meta.stackexchange.com/a/4836/8013). – AdamO May 24 '18 at 16:10

3 Answers

To facilitate analysis, define the $c$-raw-moment estimator:

$$m_c \equiv \frac{1}{n} \sum_{i=1}^n x_i^c.$$

With a little algebra, your measure can be rewritten as:

$$r_c =\frac{m_c}{m_1^c} $$

Hence, your measure is the ratio of the $c$th raw sample moment to the $c$th power of the first raw sample moment (the sample mean). It has no special name, since it is not in common use for any particular problem; it is best identified descriptively, as I have done above. It would not be a good measure of dispersion (even for $c=2$), since raw sample moments do not capture dispersion well.
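As a quick sanity check of the identity $r_c = m_c / m_1^c$, here is a minimal sketch in plain Python. The function names `ratio_stat`, `raw_moment` and `moment_ratio` and the sample values are illustrative assumptions, not part of the question.

```python
def ratio_stat(x, c):
    """Statistic from the question: mean of (x_i / mean(x)) ** c."""
    mean = sum(x) / len(x)
    return sum((xi / mean) ** c for xi in x) / len(x)


def raw_moment(x, c):
    """c-th raw sample moment m_c = (1/n) * sum(x_i ** c)."""
    return sum(xi ** c for xi in x) / len(x)


def moment_ratio(x, c):
    """r_c = m_c / m_1 ** c."""
    return raw_moment(x, c) / raw_moment(x, 1) ** c


x = [1.0, 2.0, 3.0, 4.0]  # arbitrary positive sample (assumption)
for c in (0.5, 1, 2, 3):
    # The two columns should agree up to floating-point error.
    print(c, ratio_stat(x, c), moment_ratio(x, c))
```

The two forms agree because $\bar x^c$ is constant in $i$ and can be pulled out of the sum.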

Ben

In my opinion it depends on how you define the term "measure of dispersion". I would define it as any statistic $S(x)$ with this property:

$$ S(cx) = c^2S(x) \quad \lor \quad S(cx) = |c|S(x) $$ for any $c \in \mathbb{R}.$

If you use this definition, your statistic is not a measure of dispersion.
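To illustrate why the statistic fails this definition, here is a small sketch. It assumes a positive scale factor, written `k` to avoid a clash with the exponent $c$ in the question; the name `ratio_stat` and the sample values are also assumptions. Rescaling leaves every ratio $x_i/\bar x$ unchanged, so the statistic stays the same, whereas the variance scales by $k^2$.

```python
import statistics


def ratio_stat(x, c):
    """Statistic from the question: mean of (x_i / mean(x)) ** c."""
    mean = sum(x) / len(x)
    return sum((xi / mean) ** c for xi in x) / len(x)


x = [1.0, 2.0, 3.0, 4.0]       # arbitrary positive sample (assumption)
k = 1000.0                     # hypothetical unit change, e.g. km to m
scaled = [k * xi for xi in x]

# The ratio statistic is scale-invariant ...
print(ratio_stat(x, 2), ratio_stat(scaled, 2))

# ... while the population variance is multiplied by k ** 2.
print(statistics.pvariance(x), statistics.pvariance(scaled))
```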

Daniel Dostal
  • What is the logic of defining your measure of dispersion in that way and not another? – luchonacho May 13 '18 at 09:47
  • The first definition doesn't work, but the second one is at least a necessary criterion: it expresses the idea that any measure of dispersion must not depend on the units in which $x$ is given. Any other behavior than this would give rise to a "dispersion" that depends on how you write down the values of $x$--for instance, its "dispersion" in kilometers could differ from its dispersion in miles. In many applications that would be nonsensical, because the choice of units is usually considered arbitrary. – whuber May 23 '18 at 21:20
  • whuber: It's one definition. I've edited my response to make it clearer. I'm not sure I understood your comment. The definition fits almost any measure of dispersion (variance, SD, MAD, mean absolute deviation...). As far as I know, dispersion should change when you switch from miles to kilometers (as should location). I guess that by "dispersion" you mean relative dispersion. – Daniel Dostal May 24 '18 at 15:56

As it turns out, a second-order Taylor expansion reveals a variance term. We have that:

$$ x_i \equiv \bar x + \epsilon_i$$

Which is:

$$ x_i = \bar x \left(1 + \frac{\epsilon_i}{\bar x}\right)$$

Take the second-order Taylor expansion of $x_i^c$ around $\bar x$:

$$ x_i^c \approx \bar x^c \left(1 + c\frac{\epsilon_i}{\bar x} + \frac{c(c-1)}{2}\frac{\epsilon^2_i}{\bar x^2}\right)$$

Thus, it follows that:

$$ \sum_i x_i^c \approx \bar x^c \left(n + \frac{c(c-1)}{2}\frac{nV(x)}{\bar x^2}\right)$$

since by definition $\sum_i \epsilon_i = 0$, and $\sum_i \epsilon_i^2 = nV(x)$ (with $V(x)$ the variance computed with divisor $n$).

Dividing by $n \bar x^c$, the formula I was interested in is then approximately equal to:

$$ 1 + \frac{c(c-1)}{2} CV^2$$

where $CV^2 = \frac{V(x)}{\bar x^2}$ is the squared coefficient of variation, a nice dimensionless measure of dispersion!
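Averaging the second-order expansion over $i$ (the constant term averages to 1 and the linear term vanishes) gives $1 + \frac{c(c-1)}{2}CV^2$ as the approximation to the statistic. A rough numeric check, assuming a simulated positive sample with small relative spread; the names `ratio_stat` and `cv_squared`, the seed and the uniform range are all illustrative assumptions:

```python
import random


def ratio_stat(x, c):
    """Statistic from the question: mean of (x_i / mean(x)) ** c."""
    mean = sum(x) / len(x)
    return sum((xi / mean) ** c for xi in x) / len(x)


def cv_squared(x):
    """Squared coefficient of variation, using the divisor-n variance."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    return var / mean ** 2


random.seed(0)
# Positive sample with small spread relative to its mean (assumption).
x = [random.uniform(9.0, 11.0) for _ in range(1000)]

for c in (0.5, 2, 3):
    exact = ratio_stat(x, c)
    approx = 1 + c * (c - 1) / 2 * cv_squared(x)
    print(c, exact, approx)
```

For $c=2$ the expansion has no higher-order terms, so the two values coincide up to rounding; for other $c$ the approximation should be expected to degrade as $\epsilon_i/\bar x$ grows.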

luchonacho