
There are a number of questions on this site that ask for the asymptotic distribution or moments of some function of the sample mean for IID data (see e.g., here, here, here, here and here). All these questions are for specific functions and distributions, but they all seem to employ the same general method. So, is there any general result that can be applied to an arbitrary function of the sample mean for IID data from an arbitrary distribution?

Suppose you have some IID data $X_1,X_2,X_3,... \sim \text{IID Dist}$ from a fixed distribution (which doesn't have to be the normal distribution) and you form the sample mean $\bar{X}_n$ from the first $n$ data points. Suppose you also have some arbitrary function $g$. Is there any general form for the asymptotic distribution of $g(\bar{X}_n)$?

Ben

1 Answer


The question is a little too general in its present form to get a useful result. Nevertheless, with some slight restrictions we can get a useful general form for the asymptotic distribution using the delta method. To do this, let's assume that the underlying distribution for the data has a finite mean $\mu$ and finite variance $\sigma^2$. This allows us to apply the central limit theorem, which says that for large $n$ the sample mean is approximately distributed as $\bar{X}_n \sim \text{N}(\mu, \sigma^2/n)$, so that $\bar{X}_n \rightarrow \mu$ in probability as $n \rightarrow \infty$. Since the sample mean gets closer and closer to the true mean in the limit, the asymptotic distribution of $g(\bar{X}_n)$ will be fully determined by the local behaviour of the function $g$ in a neighbourhood of the point $\mu$.

To proceed further, let's make some assumptions about this local behaviour. Specifically, we will reduce the scope of the allowable functions by assuming that $g$ is analytic at $\mu$ (i.e., the function $g$ is infinitely differentiable at this point and is equal to its Taylor series in a neighbourhood of this point). Thus, for all points $x$ in a neighbourhood of $\mu$ we can write the function $g$ as:

$$g(x) = g(\mu) + \sum_{k=1}^\infty \frac{g^{(k)}(\mu)}{k!} \cdot (x-\mu)^k.$$

The asymptotic distribution depends on which derivatives of the function are zero at the point $\mu$. Assume that $g$ is not constant in a neighbourhood of $\mu$ (otherwise $g(\bar{X}_n)$ is degenerate in the limit), so that at least one derivative is non-zero at $\mu$, and let $K = \min \{ k = 1,2,3,... \mid g^{(k)}(\mu) \neq 0 \}$ denote the order of the first non-zero derivative of $g$ at the mean, which means we have $g^{(1)}(\mu) = \cdots = g^{(K-1)}(\mu) = 0$. To facilitate our analysis we also define the standardised sample mean $Z_n \equiv \sqrt{n} (\bar{X}_n-\mu)/\sigma$. We can now write the Taylor expansion of interest as:

$$\begin{aligned} g(\bar{X}_n) &= g(\mu) + \sum_{k=K}^\infty \frac{g^{(k)}(\mu)}{k!} \cdot (\bar{X}_n-\mu)^k \\[6pt] &= g(\mu) + \sum_{k=K}^\infty \frac{g^{(k)}(\mu)}{k!} \cdot \sigma^k \cdot \frac{Z_n^k}{n^{k/2}}. \\[6pt] \end{aligned}$$

As we take $n \rightarrow \infty$ the classical central limit theorem gives $Z_n \rightarrow \text{N}(0,1)$ in distribution, so the term of order $k$ in this expansion is of size $n^{-k/2}$ and the terms with $k > K$ converge to zero more rapidly than the term of order $K$. The asymptotic distribution of our function at the sample mean is therefore determined by the $K$th term of the Taylor series:

$$\begin{aligned} g(\bar{X}_n) &\sim g(\mu) + \frac{g^{(K)}(\mu)}{K!} \cdot \sigma^K \cdot \frac{Z_n^K}{n^{K/2}}. \\[6pt] \end{aligned}$$
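Rearranging this approximation gives a slightly more formal statement of the same result. This is only a sketch under the assumptions already made; the symbol $Z$ (a standard normal random variable) and the notation $\overset{d}{\rightarrow}$ (convergence in distribution) are introduced here and do not appear in the derivation above:

$$n^{K/2} \big( g(\bar{X}_n) - g(\mu) \big) \overset{d}{\rightarrow} \frac{g^{(K)}(\mu)}{K!} \cdot \sigma^K \cdot Z^K, \qquad Z \sim \text{N}(0,1).$$

For $K=1$ this reduces to the usual first-order delta method, $\sqrt{n} \big( g(\bar{X}_n) - g(\mu) \big) \overset{d}{\rightarrow} \text{N}(0, g^{(1)}(\mu)^2 \sigma^2)$.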

We can see from this result that the asymptotic distribution is heavily dependent on the order value $K$. If $K=1$ then the asymptotic distribution is a normal distribution, if $K=2$ then it is a shifted and scaled chi-squared distribution with one degree of freedom, if $K=3$ then it is the distribution of a shifted and scaled cube of a standard normal random variable (see here for discussion), and so on.
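The $K=1$ and $K=2$ cases can be checked numerically. The following is a minimal simulation sketch, not part of the original derivation: it assumes NumPy is available, uses Exponential(1) data (so $\mu = \sigma = 1$) as an arbitrary non-normal example, and the helper name `simulate_scaled_error`, the test functions, the sample sizes and the seed are all illustrative choices.

```python
import math
import numpy as np


def simulate_scaled_error(g, mu, K, n, reps, rng):
    """Draw `reps` values of n^(K/2) * (g(Xbar_n) - g(mu)).

    The data are assumed to be IID Exponential(1), so mu = sigma = 1
    (an illustrative choice, not something fixed by the theory).
    """
    samples = rng.exponential(scale=1.0, size=(reps, n))
    xbar = samples.mean(axis=1)  # one sample mean per replication
    return n ** (K / 2) * (g(xbar) - g(mu))


rng = np.random.default_rng(0)  # fixed seed for reproducibility
mu, sigma = 1.0, 1.0
n, reps = 400, 5000

# K = 1: g(x) = exp(x) has g'(mu) = e, so the limit is N(0, (e * sigma)^2).
err1 = simulate_scaled_error(np.exp, mu, 1, n, reps, rng)

# K = 2: g(x) = cos(x - mu) has g'(mu) = 0 and g''(mu) = -1, so the limit
# is -(sigma^2 / 2) times a chi-squared variable with one degree of freedom.
err2 = simulate_scaled_error(lambda x: np.cos(x - mu), mu, 2, n, reps, rng)

print("K=1: simulated sd  ", err1.std(), " limiting sd  ", math.e * sigma)
print("K=2: simulated mean", err2.mean(), " limiting mean", -0.5 * sigma**2)
```

In the $K=2$ case every simulated value is non-positive, as expected for a negatively scaled chi-squared limit, whereas the $K=1$ values are spread symmetrically around zero.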

Ben