Definition of sample excess kurtosis?

Question

Wikipedia http://en.wikipedia.org/wiki/Kurtosis#Sample_kurtosis calculates the sample excess kurtosis as

length(x) * sum((x - mean(x))^4)/(sum((x - mean(x))^2)^2) - 3

On the other hand, I learned from https://stackoverflow.com/a/21484052/156458 that the function kurtosis from the R package fBasics (actually from timeDate) calculates the sample excess kurtosis as:

sum((x - mean(x))^4/as.numeric(var(x))^2)/length(x) - 3

where var() is the unbiased variance estimator (divisor length(x) - 1), so it is equivalent to

(length(x)-1)^2/length(x) * sum((x - mean(x))^4)/(sum((x - mean(x))^2)^2) - 3

So I wonder if the function kurtosis in the packages fBasics and timeDate is wrong? Thanks!
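The equivalence claimed above can be checked numerically. The following is a minimal sketch, assuming an arbitrary made-up data vector `x`; it evaluates the Wikipedia expression, the expression quoted for timeDate/fBasics, and its rewritten form directly, without calling either package.

```r
# Hypothetical example data; any numeric vector with nonzero variance would do.
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 7.8, 2.7)
n <- length(x)
d <- x - mean(x)

# Wikipedia form: n * sum(d^4) / sum(d^2)^2 - 3
k_wiki <- n * sum(d^4) / sum(d^2)^2 - 3

# Expression quoted for timeDate/fBasics; var() divides by n - 1
k_r <- sum(d^4 / as.numeric(var(x))^2) / n - 3

# Rewritten form with the explicit (n - 1)^2 / n factor
k_r_alt <- (n - 1)^2 / n * sum(d^4) / sum(d^2)^2 - 3

all.equal(k_r, k_r_alt)  # the two R-style expressions agree
k_wiki - k_r             # the Wikipedia value is larger
```

The gap `k_wiki - k_r` equals `(2 - 1/n) * sum(d^4) / sum(d^2)^2`, which is always positive, so the two conventions never coincide on the same data.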

Alecos Papadopoulos · Accepted Answer · 2014-02-01T03:45:41.600

7

The Wikipedia equation uses the biased, maximum likelihood estimator for the sample variance (divide by $n$), while, as you say, the function from the R packages uses the bias-corrected formula (divide by $n-1$).

The difference in the two formulas is therefore that the maximum likelihood formula of Wikipedia multiplies the moment quotient by $n$, while the R-function multiplies by

$$ \frac {(n-1)^2}{n} = \frac {n^2 -2n +1}{n} = n -2 + \frac 1n$$
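To spell the comparison out, write $Q$ for the moment quotient and $k_{\text{Wiki}}$, $k_{\text{R}}$ for the two sample excess kurtosis values (these three labels are introduced here only for illustration):

$$ Q = \frac{\sum_i (x_i-\bar x)^4}{\left(\sum_i (x_i-\bar x)^2\right)^2}, \qquad k_{\text{Wiki}} = n\,Q - 3, \qquad k_{\text{R}} = \frac{(n-1)^2}{n}\,Q - 3 $$

$$ k_{\text{R}} + 3 = \left(1-\frac 1n\right)^2 \left(k_{\text{Wiki}} + 3\right), \qquad k_{\text{Wiki}} - k_{\text{R}} = \left(2 - \frac 1n\right) Q $$

Since $nQ$ stays bounded as $n$ grows for a population with finite fourth moment, $Q$ is of order $1/n$ and the gap between the two values shrinks at that rate.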

The paper

Joanes, D. N., & Gill, C. A. (1998). Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 47(1), 183-189.

discusses in some detail different formulas for sample kurtosis (and skewness). They write that the Wikipedia formula you mention (call it "$g_2$") is typical of "older times" (citing Cramér's 1946 textbook). They also report that, at least when they wrote the paper, MINITAB and BMDP were using the same formula as R/timeDate (call it "$b_2$"), while SAS, SPSS and Excel were using a third formula for kurtosis, call it "$G_2$" (you can find it in the Wikipedia article under "estimation of population kurtosis").
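For reference, the three conventions can be written side by side in R. This is a sketch rather than any package's implementation: the function name `kurtosis_variants` is made up here, and the $G_2$ line uses the standard expression $G_2 = \frac{n-1}{(n-2)(n-3)}\left[(n+1)\,g_2 + 6\right]$.

```r
# Sample excess kurtosis under the three conventions g2, b2 and G2.
# kurtosis_variants is a hypothetical helper name, not a package function.
kurtosis_variants <- function(x) {
  n <- length(x)
  stopifnot(n > 3)                      # G2 is undefined for n <= 3
  d  <- x - mean(x)
  m2 <- mean(d^2)                       # divide-by-n second central moment
  m4 <- mean(d^4)                       # divide-by-n fourth central moment
  g2 <- m4 / m2^2 - 3                   # the Wikipedia formula
  b2 <- ((n - 1) / n)^2 * (g2 + 3) - 3  # same as m4 / var(x)^2 - 3
  G2 <- (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
  c(g2 = g2, b2 = b2, G2 = G2)
}

# Small made-up sample with one large value
kurtosis_variants(c(1, 2, 3, 4, 10))
```

On this five-point sample $g_2$ and $b_2$ come out negative while $G_2$ comes out positive, which shows how far apart the conventions can be when $n$ is tiny.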

They perform various tests and find:

1) In small samples, the formulas can even disagree in sign (i.e. one formula gives negative kurtosis while another gives positive).

2) For normal populations, in mean-squared error terms, $$MSE(g_2) < MSE (b_2) < MSE (G_2) $$ For sample size $100$ the MSE differences are negligible (+1%) to small (+9%).

3) They also compare the three in MSE terms for chi-square distributions with various degrees of freedom and various sample sizes. Here the main results are:
a) For sample sizes of $100$ or more, the three formulas are very close, irrespective of degrees of freedom.
b) For smaller sample sizes the ranking changes, with $G_2$ being better for d.f. $=1$, and $b_2$ being better when the d.f. are $50$ or $100$.
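Point 1) can be illustrated with a short simulation. Because $G_2$ has the same sign as $(n+1)\,g_2 + 6$, the two disagree in sign exactly when $-6/(n+1) < g_2 < 0$. The sketch below, with an arbitrary seed, sample size and replication count chosen only for illustration, estimates how often that happens for small normal samples.

```r
set.seed(1)    # arbitrary seed, for reproducibility only
n    <- 10     # a small sample size chosen for illustration
reps <- 2000   # number of simulated samples

signs_differ <- replicate(reps, {
  x  <- rnorm(n)
  d  <- x - mean(x)
  g2 <- mean(d^4) / mean(d^2)^2 - 3
  G2 <- (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
  sign(g2) != sign(G2)
})

mean(signs_differ)  # share of samples where g2 and G2 disagree in sign
```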

Overall, no clear "winner" emerges from this paper.
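A rough Monte Carlo version of the normal-population comparison might look like the sketch below. It is not the paper's own calculation: `kurt3` and `sim_mse` are hypothetical helper names, and the seed, sample sizes and replication count are arbitrary. The true excess kurtosis of the normal distribution is 0, so the MSE of each estimator is just the mean of its squared values.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# The three estimators for one sample (hypothetical helper).
kurt3 <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  g2 <- mean(d^4) / mean(d^2)^2 - 3
  b2 <- ((n - 1) / n)^2 * (g2 + 3) - 3
  G2 <- (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
  c(g2 = g2, b2 = b2, G2 = G2)
}

# Simulated MSE of each estimator for normal samples of size n.
sim_mse <- function(n, reps = 5000) {
  est <- replicate(reps, kurt3(rnorm(n)))  # 3 x reps matrix
  rowMeans(est^2)                          # target value is 0
}

sim_mse(20)
sim_mse(100)
```

Swapping `rnorm(n)` for `rchisq(n, df)` and subtracting the chi-square excess kurtosis `12 / df` before squaring would give the corresponding non-normal comparison.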

edited Feb 01 '14 at 03:45

answered Feb 01 '14 at 03:15


Thanks! (1) On page 9 of [Analysis of Financial Time Series by Ruey S. Tsay](http://books.google.com/books?id=OKUGARAXKMwC&lpg=PP1&dq=financial%20time%20series%20tsay&pg=PA9#v=onepage&q&f=false), the factor in their sample kurtosis is $(length(x)-1)$. Is it a different definition from those in your post? (2) Ironically, the book used fBasics and timeDate to calculate sample excess kurtosis on [page 12](http://books.google.com/books?id=OKUGARAXKMwC&lpg=PP1&dq=financial%20time%20series%20tsay&pg=PA12#v=onepage&q&f=false). Is this an inconsistency? – Tim Feb 01 '14 at 03:30
Yes, the formula on page 9 is yet a fourth variant, and so it seems inconsistent with the results on page 12, if the latter use the $b_2$ formula. – Alecos Papadopoulos Feb 01 '14 at 03:36
Thanks! In the version used by the fBasics and timeDate packages, what is the rationale or purpose of having a *biased* estimator for the fourth central moment in the numerator and an *unbiased* estimator for the variance in the denominator? – Tim Feb 02 '14 at 02:42
I can't really tell. Sometimes these things happen by accident. For example, everybody knows about the bias-corrected version of the variance, but we don't think much about bias for higher moments. So the function may have been built by using the sample analogue for the fourth moment and the already existing function for the variance, which had the bias correction term. – Alecos Papadopoulos Feb 02 '14 at 02:45
Thanks! Do you know what the purpose or rationale is for the version on page 9 of Analysis of Financial Time Series by Ruey S. Tsay, where the factor in their sample kurtosis is $(n-1)$? – Tim Feb 02 '14 at 02:52
That looks like the general principle "subtract one when estimating one more parameter" to reflect loss of "degrees of freedom". – Alecos Papadopoulos Feb 02 '14 at 02:59
Thanks! In [Tsay's book](http://books.google.com/books?id=OKUGARAXKMwC&lpg=PP1&dq=financial%20time%20series%20tsay&pg=PA10#v=onepage&q&f=false), he says under the normality assumption, his estimates of skewness and of kurtosis "are distributed asymptotically as normal with zero mean and variances 6/n and 24/n , respectively; see Snedecor and Cochran (1980, p. 78)." I wonder if the null distribution for the other three types of estimates that we have pointed out so far all have the same asymptotic normal distributions? – Tim Feb 03 '14 at 00:56

Glen_b · Answer 2 · 2014-02-01T14:20:08.153

Neither version is "wrong" any more than the fact that there is more than one sample standard deviation makes one of them wrong.

e.g. sample standard deviations include:

the ML version ($s_n$),
the variance-unbiased version ($s_{n-1}$),
there are also the MMSE-for-variance, the unbiased-for-standard-deviation version, the approximately-unbiased-for-standard-deviation version (all usually taken at the normal, giving divisors of $n+1$ and $n-1.5$ for the first and last respectively and a complicated expression for the middle one, which can be worked out from the expression for $c_4$ at the link), and some others.

None of them are 'wrong' as definitions of sample standard deviation. They're all sample estimates, and they all may legitimately be called 'standard deviation'. One of them is more common, but that doesn't make it more correct.
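The variants listed above can be computed side by side. This is a sketch under the usual normal-theory definitions: `sd_variants` is a made-up helper name, and the constant is $c_4 = \sqrt{2/(n-1)}\;\Gamma(n/2)/\Gamma((n-1)/2)$, for which $E[s_{n-1}] = c_4\,\sigma$ under normality.

```r
# Several legitimate "sample standard deviations" of the same data.
# sd_variants is a hypothetical helper name, not a package function.
sd_variants <- function(x) {
  n  <- length(x)
  ss <- sum((x - mean(x))^2)  # sum of squared deviations
  # c4 via log-gamma to avoid overflow for large n
  c4 <- sqrt(2 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))
  c(
    ml                 = sqrt(ss / n),             # ML version, s_n
    var_unbiased       = sqrt(ss / (n - 1)),       # s_{n-1}, what sd() returns
    mmse_var           = sqrt(ss / (n + 1)),       # MMSE for variance at the normal
    sd_unbiased        = sqrt(ss / (n - 1)) / c4,  # unbiased for sigma at the normal
    sd_approx_unbiased = sqrt(ss / (n - 1.5))      # approximately unbiased for sigma
  )
}

# Made-up example data
sd_variants(c(2, 4, 4, 4, 5, 5, 7, 9))
```

All five values are estimates of the same population quantity; they differ only in the divisor or correction factor applied to the same sum of squares.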

The same applies to other sample moments and quantities based on them, including the kurtosis, the skewness, sample covariances, measures of sample correlation, and so on.
