
When summarizing a location in a distribution, we can take the mean into account by simply calculating $x-\mu$. We can take the standard deviation into account by calculating a Z score $(x-\mu)/\sigma$. Is there a typical way to take higher moments into account?

Thomas Johnson
  • I do not understand what you mean by "take into account". Given $\mu$ and $\sigma$, when you specify $z = (x-\mu)/\sigma$ you *uniquely determine $x = \sigma z + \mu$*. What more do you want? – whuber Dec 12 '14 at 16:16
  • @whuber That's true, and you can construct a similar expression given just $\mu$. But standardizing by calculating Z is often more useful than just using $\mu$. So my question is, is there an expression that standardizes not just by the standard deviation, but also by the skewness? Is there an expression that standardizes by the skewness and kurtosis? Etc. – Thomas Johnson Dec 12 '14 at 16:26
  • Sure: you can standardize by any two statistics that locate the distribution and provide a scale for it. (Although the skewness will not work--it tells you nothing at all about the scale--the cube root of the absolute third central moment will.) The possibilities are infinite. What you need to tell us is the *why* of your question: what is the purpose of doing this? That would give us information to recommend choices of those statistics. – whuber Dec 12 '14 at 18:11
  • It's largely an exercise in feature engineering. I'm trying to create useful features for a classifier. I'm hoping to provide more information summarizing the location of the point in a (highly non-normal) empirical distribution, without overwhelming the classifier by providing tons of high-noise features. – Thomas Johnson Dec 12 '14 at 18:40
  • You won't do it this way! I think you would be better served by asking the question you really have in mind, rather than proposing something that will not help you at all and asking for commentary on it. – whuber Dec 12 '14 at 19:47
  • @whuber Why do you say it won't help me at all? – Thomas Johnson Dec 13 '14 at 00:10
  • If you are creating features according to values of the probability density (or estimates thereof), the moments tell you extremely little, as explained at http://stats.stackexchange.com/a/84213. – whuber Dec 13 '14 at 00:22
  • @whuber Thanks, that question is very helpful! I'm still intellectually interested in this question – Thomas Johnson Dec 13 '14 at 20:52

1 Answer


I am not aware of such expressions containing higher moments. However, if your aim is to summarize by a single number how $x$ relates to the distribution, taking into account the shape of the distribution beyond mean and variance, I suggest reporting the value of the cumulative distribution function at $x$, that is, the probability \begin{equation} P(X \leq x) \end{equation} where $X$ is a random variable following your distribution.
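For an empirical distribution, one way to estimate this probability is the fraction of observations that are less than or equal to $x$. Below is a minimal sketch of that idea, next to the z-score for comparison. The function names `empirical_cdf` and `z_score` and the sample values are made up for illustration, and the population standard deviation is an arbitrary choice here.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def empirical_cdf(sample, x):
    """Estimate P(X <= x) as the fraction of observations <= x."""
    ordered = sorted(sample)
    # bisect_right counts how many sorted values are <= x
    return bisect_right(ordered, x) / len(ordered)


def z_score(sample, x):
    """Standardize x by the sample mean and (population) standard deviation."""
    return (x - mean(sample)) / pstdev(sample)


# Hypothetical right-skewed sample, chosen only for illustration
sample = [1, 1, 2, 2, 3, 3, 4, 5, 8, 21]
x = 5

print(z_score(sample, x))        # 0.0 (x equals the sample mean)
print(empirical_cdf(sample, x))  # 0.8 (yet 80% of the observations are <= x)
```

In this skewed sample the point $x = 5$ has a z-score of zero, which might suggest it sits in the middle, while the empirical CDF places it at the 80th percentile. That difference is the shape information the z-score alone does not carry.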

Juho Kokkala