
Tweedie distributions are a family of distributions from the exponential dispersion family that have a power-law mean-variance relationship:

\begin{align} \mathbb E[X] &= \mu \\ \operatorname{Var}[X]&=\phi \mu^p \end{align}

What is the formula for skewness?


For small integer values of $p$ ($p=0,1,2,3$), these are well-known distributions (Gaussian, Poisson, gamma, and inverse Gaussian, respectively). The most interesting case is $1<p<2$, which corresponds to the compound Poisson-gamma distribution: it has a point mass at zero and is continuous for $x>0$. Formulas for the density, and for the Poisson and gamma parameters in terms of $\mu$, $\phi$, and $p$, can be found e.g. in Dunn & Smyth (2005), "Series evaluation of Tweedie exponential dispersion model densities".
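To make the point mass at zero concrete, here is a minimal sketch. It assumes the Dunn & Smyth parametrization in which the number of gamma terms is Poisson with rate $\lambda=\mu^{2-p}/\big((2-p)\phi\big)$, so that $P(X=0)=e^{-\lambda}$. The function name `tweedie_prob_zero` is made up for illustration.

```python
import math

def tweedie_prob_zero(mu, phi, p):
    """P(X = 0) for a Tweedie distribution with 1 < p < 2."""
    if not 1 < p < 2:
        raise ValueError("the point mass at zero exists only for 1 < p < 2")
    # Poisson rate of the number of gamma terms in the compound sum
    lam = mu ** (2 - p) / ((2 - p) * phi)
    # X is exactly zero when the Poisson count is zero
    return math.exp(-lam)

print(tweedie_prob_zero(mu=1.0, phi=10.0, p=1.5))  # exp(-0.2), about 0.82
```

With $\mu=1$, $\phi=10$, $p=1.5$ the rate is $\lambda=0.2$, so roughly 82% of the probability mass sits at zero.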

I couldn't find the skewness formula anywhere so I derived it myself, and am posting this Q&A to share the result.

amoeba
  • In this question and your answer wherever you write "$0\lt p \lt 1$" it seems you [should be writing "$1\lt p \lt 2$."](https://en.wikipedia.org/wiki/Tweedie_distribution#Examples) Is this a correct impression? – whuber Jan 30 '18 at 00:21
  • Thank you @whuber, this was a mistake. I fixed the parameter range in both Q and A. – amoeba Jan 30 '18 at 08:01

1 Answer


The exponential dispersion family is a broad family of distributions allowed in GLMs. The general form of its PDF can be written as follows:

$$f(x;\theta,\phi)=a(x,\phi)\exp\Big[\frac{1}{\phi}\big(x\theta-\kappa(\theta)\big)\Big].$$

The term $\kappa(\theta)$ is denoted with kappa because it is intimately related to the cumulants. Specifically, the cumulant generating function (CGF) is given by

$$K(t;\theta,\phi)=\frac{1}{\phi}\big(\kappa(\theta + t\phi)-\kappa(\theta)\big)$$

(see Wikipedia or Eq 2.6 in Jørgensen 1987, or Jørgensen's The Theory of Dispersion Models, 1997. Note that with $\phi=1$ the family reduces to the natural exponential family, see Wikipedia for its CGF.)

It follows that the first three cumulants are given by:

\begin{align} \kappa_1 &= \kappa'(\theta)\\ \kappa_2 &= \phi\kappa''(\theta)\\ \kappa_3 &= \phi^2\kappa'''(\theta) \end{align}

(Again note that for the natural exponential family cumulants are simply derivatives of $\kappa(\theta)$.)
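Spelling out the step from the CGF to the cumulants: each derivative of $K$ with respect to $t$ brings down one factor of $\phi$ by the chain rule, and the $n$-th cumulant is the $n$-th derivative evaluated at $t=0$.

\begin{align} K'(t) &= \kappa'(\theta+t\phi) & \Rightarrow\quad \kappa_1 &= K'(0) = \kappa'(\theta)\\ K''(t) &= \phi\,\kappa''(\theta+t\phi) & \Rightarrow\quad \kappa_2 &= K''(0) = \phi\kappa''(\theta)\\ K'''(t) &= \phi^2\kappa'''(\theta+t\phi) & \Rightarrow\quad \kappa_3 &= K'''(0) = \phi^2\kappa'''(\theta) \end{align}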

For a Tweedie distribution it must hold that

\begin{align} \kappa_1 &= \kappa'(\theta) = \mu\\ \kappa_2 &= \phi\kappa''(\theta) = \phi\mu^p \end{align}

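Here primes denote derivatives with respect to $\theta$, so differentiating the first identity gives $\mu'=\kappa''(\theta)=\mu^p$. It follows that $$\kappa_3=\phi^2\kappa'''(\theta)=\phi^2(\kappa''(\theta))'=\phi^2(\mu^p)'=\phi^2p\mu^{p-1}\mu'=\phi^2p\mu^{p-1}\mu^p=\phi^2p\mu^{2p-1}.$$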

Now we can compute skewness:

$$\operatorname{Skewness}[X]=\frac{\kappa_3}{\kappa_2^{3/2}}=\frac{\phi^2p\mu^{2p-1}}{(\phi\mu^p)^{3/2}}=\phi^{1/2}p\mu^{p/2-1}.$$

As a sanity check, this formula yields the correct values for $p=0$, $p=1$, and $p=2$: $0$ for the Gaussian, $\mu^{-1/2}$ for the Poisson (where $\phi=1$), and $2\phi^{1/2}$ for the gamma (where $1/\phi$ is the shape parameter).
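The special cases can also be checked numerically. The sketch below compares the formula against the textbook skewness of each named distribution. The function name `tweedie_skewness` and the parameter values are arbitrary choices for illustration. The $p=3$ (inverse Gaussian) case is included as an extra check.

```python
import math

def tweedie_skewness(mu, phi, p):
    """Skewness sqrt(phi) * p * mu**(p/2 - 1) of a Tweedie distribution."""
    return math.sqrt(phi) * p * mu ** (p / 2 - 1)

mu = 2.5

# p = 0: Gaussian with variance phi, skewness 0
assert tweedie_skewness(mu, phi=4.0, p=0) == 0.0

# p = 1, phi = 1: Poisson(mu), skewness 1/sqrt(mu)
assert math.isclose(tweedie_skewness(mu, phi=1.0, p=1), 1 / math.sqrt(mu))

# p = 2: gamma with shape k = 1/phi, skewness 2/sqrt(k)
k = 3.0
assert math.isclose(tweedie_skewness(mu, phi=1 / k, p=2), 2 / math.sqrt(k))

# p = 3: inverse Gaussian with shape lam = 1/phi, skewness 3*sqrt(mu/lam)
lam = 7.0
assert math.isclose(tweedie_skewness(mu, phi=1 / lam, p=3), 3 * math.sqrt(mu / lam))
```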

Let's verify that it works correctly for $1<p<2$:

```python
import numpy as np
import scipy.stats

# Tweedie random generation, using compound Poisson-Gamma representation
def tweediernd(n=1, p=1.5, phi=10, mu=1):
    # See Dunn & Smyth (2005) for these formulas
    lambd = mu**(2-p)/(2-p)/phi   # Poisson rate
    alpha = -(2-p)/(1-p)          # gamma shape
    beta = phi*(p-1)*mu**(p-1)    # gamma scale

    x = np.zeros(n)
    for i in range(n):
        # Sum of N i.i.d. gamma variables, N ~ Poisson(lambd); an empty sum is exactly 0
        x[i] = np.sum(np.random.gamma(alpha, scale=beta,
                      size=np.random.poisson(lambd)))
    return x

np.random.seed(42)
x = tweediernd(n=10000)
print('Mean:    ', np.mean(x))            # expected: mu = 1
print('Variance:', np.var(x))             # expected: phi * mu**p = 10
print('Skewness:', scipy.stats.skew(x))   # expected: sqrt(10)*1.5 = 4.74
```

This yields:

```
Mean:     0.996421833721
Variance: 9.86859188577
Skewness: 4.763172234662853
```
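The simulation can be complemented by an exact check that has no sampling noise. For a compound Poisson sum the $k$-th cumulant equals $\lambda\,\mathbb E[Y^k]$, where $Y$ is a single gamma term, so the first three cumulants can be computed in closed form from the same Poisson and gamma parameters and compared with $\mu$, $\phi\mu^p$, and $\phi^2p\mu^{2p-1}$. This is only a sketch, and the function name `compound_poisson_gamma_cumulants` is made up for illustration.

```python
import math

def compound_poisson_gamma_cumulants(mu, phi, p):
    """First three cumulants of the compound Poisson-gamma sum, for 1 < p < 2."""
    lam = mu ** (2 - p) / ((2 - p) * phi)   # Poisson rate
    a = (2 - p) / (p - 1)                   # gamma shape
    b = phi * (p - 1) * mu ** (p - 1)       # gamma scale
    # Raw moments of Gamma(shape=a, scale=b): E[Y^k] = b^k * a(a+1)...(a+k-1)
    m1 = a * b
    m2 = a * (a + 1) * b ** 2
    m3 = a * (a + 1) * (a + 2) * b ** 3
    # For a compound Poisson sum, the k-th cumulant is lam * E[Y^k]
    return lam * m1, lam * m2, lam * m3

mu, phi, p = 1.0, 10.0, 1.5
k1, k2, k3 = compound_poisson_gamma_cumulants(mu, phi, p)

assert math.isclose(k1, mu)                                # mean
assert math.isclose(k2, phi * mu ** p)                     # variance
assert math.isclose(k3, phi ** 2 * p * mu ** (2 * p - 1))  # third cumulant

skewness = k3 / k2 ** 1.5
assert math.isclose(skewness, math.sqrt(phi) * p * mu ** (p / 2 - 1))
print(skewness)  # sqrt(10) * 1.5, about 4.74
```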
amoeba
  • [Clark and Thayer](https://www.casact.org/pubs/dpp/dpp04/04dpp117.pdf) give skewness and kurtosis but I bet they were not the first; I'd expect someone like Tweedie or Jorgensen. But anyway: Clark, David R. and Charles A. Thayer. 2004. “A Primer on the Exponential Family of Distributions.” CAS Discussion Paper Program, 117-148. Their skewness agrees with yours. – Glen_b Nov 14 '17 at 09:17
  • Oh, I should have mentioned ... it's on the last page. – Glen_b Nov 14 '17 at 09:31
  • Yes, took me some time to find :-) I was scrolling downwards. – amoeba Nov 14 '17 at 09:33
  • I just searched for "Tweedie" in it. There's only about 4 hits – Glen_b Nov 14 '17 at 09:36
  • Oh, it's probably in Johnson & Kotz or Encyclopedia of statistical sciences. – Glen_b Nov 14 '17 at 09:37