
The bivariate normal distribution with mean $\mu$ and covariance matrix $\Sigma$ can be re-written in polar coordinates with radius $r$ and angle $\theta$. My question is: What is the sampling distribution of $\hat{r}$, that is, of the distance from a point $x$ to the estimated center $\bar{x}$ given the sample covariance matrix $S$?

Background: The true distance $r$ from a point $x$ to the mean $\mu$ follows a Hoyt distribution. With eigenvalues $\lambda_{1} > \lambda_{2}$ of $\Sigma$, its shape parameter is $q=\frac{1}{\sqrt{((\lambda_{1}+\lambda_{2})/\lambda_{2})-1}} = \sqrt{\lambda_{2}/\lambda_{1}}$, and its scale parameter is $\omega = \lambda_{1} + \lambda_{2}$. The cumulative distribution function is known to be the symmetric difference between two Marcum Q-functions.
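For concreteness, a minimal R sketch of this parameter mapping (the covariance matrix below is an arbitrary example, not one from the diagram):

```r
## Hoyt shape q and scale omega from the eigenvalues of a covariance matrix
Sigma  <- matrix(c(4, 1.5, 1.5, 1), nrow = 2)                 # arbitrary example
lambda <- eigen(Sigma)$values                                 # sorted so that lambda[1] >= lambda[2]
q      <- 1 / sqrt((lambda[1] + lambda[2]) / lambda[2] - 1)   # = sqrt(lambda[2] / lambda[1])
omega  <- lambda[1] + lambda[2]                               # = E[r^2]
c(q = q, omega = omega)
```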

Simulation suggests that plugging in the estimates $\bar{x}$ and $S$ for $\mu$ and $\Sigma$ into the true cdf works well for large samples, but not for small ones. The following diagram shows the results of 200 repetitions of the following procedure (an R sketch of one repetition is given below the figure):

  • simulating 20 2D normal vectors for each combination of given $q$ ($x$-axis), $\omega$ (rows), and quantile (columns)
  • for each sample, calculating the given quantile of the observed radius $\hat{r}$ to $\bar{x}$
  • for each sample, calculating the quantile from the theoretical Hoyt (2D normal) cdf, and from the theoretical Rayleigh cdf after plugging in the sample estimates $\bar{x}$ and $S$.

[Figure: estimated versus empirical quantiles of $\hat{r}$ for each combination of $q$ (x-axis), $\omega$ (rows), and quantile (columns)]

As $q$ approaches 1 (the distribution becomes circular), the estimated Hoyt quantiles approach the estimated Rayleigh quantiles, which are unaffected by $q$. As $\omega$ grows, the difference between the empirical quantiles and the estimated ones increases, notably in the tail of the distribution.
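For reference, a minimal R sketch of one repetition of the procedure above, showing only the empirical quantile and the plug-in Rayleigh quantile (the Hoyt plug-in quantile via the Marcum-Q cdf is omitted; $\mu$, $\Sigma$, the sample size, the probability level, and the choice $\hat{\omega} = \operatorname{tr}(S)$ for the Rayleigh scale are assumptions made here for illustration):

```r
## One repetition: 20 bivariate normal vectors, empirical quantile of r-hat
## versus the Rayleigh quantile after plugging in the sample estimates
library(MASS)                                   # for mvrnorm()
set.seed(1)
mu    <- c(0, 0)
Sigma <- matrix(c(4, 1.5, 1.5, 1), nrow = 2)    # arbitrary example
p     <- 0.9                                    # probability level of the quantile
x     <- mvrnorm(20, mu, Sigma)
xbar  <- colMeans(x)
r.hat <- sqrt(rowSums(sweep(x, 2, xbar)^2))     # distances to the estimated center
omega.hat <- sum(diag(cov(x)))                  # plug-in scale estimate tr(S)
c(empirical = unname(quantile(r.hat, p)),
  rayleigh  = sqrt(-omega.hat * log(1 - p)))    # Rayleigh quantile with E[r^2] = omega.hat
```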

caracal
    What's the question? – John Feb 03 '14 at 13:17
  • @John I highlighted the question: "What is the sampling distribution of [radius] $r$, that is, of the distance from a point $x$ to the estimated center $\bar{x}$ given the sample covariance matrix $S$?" – caracal Feb 03 '14 at 13:23
  • Why $\hat{r}$ as opposed to $\hat{r^2}$? – SomeEE Feb 14 '14 at 22:06
  • @MathEE $\hat{r}$ simply because the literature I know of is concerned with the distribution of (true) $r$, not (true) $r^{2}$. Note that this is unlike the situation with the Mahalanobis distance discussed in [this question](http://stats.stackexchange.com/q/20543/1909). Of course, results for the distribution of $\hat{r}^{2}$ would be very welcome. – caracal Feb 15 '14 at 10:05

1 Answer


As you mentioned in your post, we know the distribution of the distance $||x_i - \mu||$ to the true center when $\mu$ is given (it is Hoyt), so we also know the distribution of the estimate $\widehat{r^2_{true}}$ of the true $r^2$.

We want to find the distribution of $$\widehat{r^2} = \frac{1}{N}\sum_{i=1}^N (x_i-\overline{x})^T(x_i-\overline{x})$$ where $x_i$ are expressed as column vectors.

We now do the standard trick

$$\begin{eqnarray*} \widehat{r^2_{true}} &=& \frac{1}{N}\sum_{i=1}^N(x_i - \mu)^T(x_i-\mu)\\ &=& \frac{1}{N}\sum_{i=1}^N(x_i-\overline{x} + \overline{x} -\mu)^T(x_i-\overline{x} + \overline{x}-\mu)\\ &=&\left[\frac{1}{N}\sum_{i=1}^N(x_i - \overline{x})^T(x_i-\overline{x})\right] + (\overline{x} - \mu)^T(\overline{x}-\mu) \hspace{20pt}(1)\\ &=& \widehat{r^2} + (\overline{x}-\mu)^T(\overline{x}-\mu) \end{eqnarray*} $$ where $(1)$ arises from the equation $$\frac{1}{N}\sum_{i=1}^N(x_i-\overline{x})^T(\overline{x}-\mu) = (\overline{x} - \overline{x})^T(\overline{x} - \mu) = 0$$ and its transpose.

Notice that $\widehat{r^2}$ is the trace of the sample covariance matrix $S$ (in its $\frac{1}{N}$ version) and that $(\overline{x}-\mu)^T(\overline{x}-\mu)$ depends only on the sample mean $\overline{x}$. Since $\overline{x}$ and $S$ are independent for normal samples, we have written $$\widehat{r_{true}^2} = \widehat{r^2} + (\overline{x}-\mu)^T(\overline{x}-\mu)$$ as the sum of two independent random variables. We know the distributions of $\widehat{r^2_{true}}$ and of $(\overline{x} - \mu)^T(\overline{x}-\mu)$, so we are done via the standard trick: for independent summands the characteristic functions (or moment generating functions) multiply, so the one of $\widehat{r^2}$ is the quotient of the other two.
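A quick numerical check of this decomposition and of the trace identity (a sketch; $\mu$, $\Sigma$ and $N$ are arbitrary choices):

```r
## Check: r2.true.hat = r2.hat + ||xbar - mu||^2, and r2.hat = trace of the 1/N covariance
library(MASS)
set.seed(1)
N     <- 20
mu    <- c(1, 2)
Sigma <- matrix(c(4, 1.5, 1.5, 1), nrow = 2)
x     <- mvrnorm(N, mu, Sigma)
xbar  <- colMeans(x)
r2.true.hat <- mean(rowSums(sweep(x, 2, mu)^2))    # (1/N) sum ||x_i - mu||^2
r2.hat      <- mean(rowSums(sweep(x, 2, xbar)^2))  # (1/N) sum ||x_i - xbar||^2
r2.true.hat - (r2.hat + sum((xbar - mu)^2))        # 0 up to rounding error
r2.hat - (N - 1) / N * sum(diag(cov(x)))           # 0: the trace of the 1/N covariance estimate
```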

Edited to add:

$||x_i-\mu||$ is Hoyt so it has pdf $$f(\rho) = \frac{1+q^2}{q\omega}\rho e^{-\frac{(1+q^2)^2}{4q^2\omega} \rho^2}I_0\left(\frac{1-q^4}{4q^2\omega} \rho^2\right)$$ where $I_0$ is the $0^{th}$ modified Bessel function of the first kind.

This means that the pdf of $||x_i-\mu||^2$ is $$f(\rho) = \frac{1}{2}\frac{1+q^2}{q\omega}e^{-\frac{(1+q^2)^2}{4q^2\omega}\rho}I_0\left(\frac{1-q^4}{4q^2\omega}\rho\right).$$

To ease notation set $a = \frac{1-q^4}{4q^2\omega}$, $b=-\frac{(1+q^2)^2}{4q^2\omega}$ and $c=\frac{1}{2}\frac{1+q^2}{q\omega}$; note that $b<0$ and $0 < a < -b$, so the pdf above is $c\,e^{b\rho}I_0(a\rho)$.
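As a sanity check on this density (a sketch; the values of $q$ and $\omega$ are arbitrary, and `besselI` is R's modified Bessel function of the first kind):

```r
## Density of ||x_i - mu||^2, written with a, b, c, compared against simulated squared distances
set.seed(1)
q <- 0.5; omega <- 4
a  <- (1 - q^4) / (4 * q^2 * omega)
b  <- -(1 + q^2)^2 / (4 * q^2 * omega)
cc <- (1 + q^2) / (2 * q * omega)
f.r2 <- function(rho) cc * exp(b * rho) * besselI(a * rho, nu = 0)
integrate(f.r2, 0, Inf)                        # should be ~1
lambda1 <- omega / (1 + q^2)                   # eigenvalues giving sqrt(lambda2/lambda1) = q
lambda2 <- omega - lambda1
r2 <- rnorm(1e5, sd = sqrt(lambda1))^2 + rnorm(1e5, sd = sqrt(lambda2))^2
hist(r2, breaks = 100, freq = FALSE)
curve(f.r2(x), from = 0, to = max(r2), add = TRUE, col = "red")
```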

The moment generating function of $||x_i-\mu||^2$ is $$\begin{cases} \frac{c}{\sqrt{(s+b)^2-a^2}} & s+b < -a\\ +\infty & \text{ else}\\ \end{cases}$$ using the Laplace transform $\int_0^\infty e^{-p\rho}I_0(a\rho)\,d\rho = \frac{1}{\sqrt{p^2-a^2}}$ for $p > a$.

Thus the moment generating function of $\widehat{r^2_{true}}$ is $$\begin{cases} \frac{c^N}{((s/N+b)^2-a^2)^{N/2}} & s/N+b < -a\\ +\infty & \text{else} \end{cases}$$ and, since $\overline{x}-\mu \sim \mathcal{N}(0, \Sigma/N)$ has the same $q$ but scale $\omega/N$ (so that $a$, $b$, $c$ become $Na$, $Nb$, $Nc$), the moment generating function of $||\overline{x} - \mu||^2$ is $$\begin{cases} \frac{Nc}{\sqrt{(s+Nb)^2-(Na)^2}} = \frac{c}{\sqrt{(s/N+b)^2-a^2}} & s/N+b < -a\\ +\infty & \text{ else} \end{cases}$$

This implies that the moment generating function of $\widehat{r^2}$ is the quotient $$\begin{cases} \frac{c^{N-1}}{((s/N+b)^2-a^2)^{(N-1)/2}} & s/N+b < -a\\ +\infty & \text{ else}. \end{cases}$$

Applying the inverse Laplace transform gives that $\widehat{r^2}$ has pdf $$g(\rho) = \frac{\sqrt{\pi}Nc^{N-1}}{\Gamma(\frac{N-1}{2})}\left(\frac{2 a}{N\rho}\right)^{(2 - N)/2} e^{b N \rho} I_{N/2-1}(a N \rho),$$ where $I_{N/2-1}$ is the modified Bessel function of the first kind of order $N/2-1$.
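To cross-check this density numerically, a sketch comparing it to simulated values of $\widehat{r^2}$ (again with arbitrary $q$, $\omega$, $N$, and using the constants $a$, $b$, $c$ defined above):

```r
## Density g of r2.hat versus simulation
library(MASS)
set.seed(1)
q <- 0.5; omega <- 4; N <- 20
a  <- (1 - q^4) / (4 * q^2 * omega)
b  <- -(1 + q^2)^2 / (4 * q^2 * omega)
cc <- (1 + q^2) / (2 * q * omega)
g <- function(rho)
  sqrt(pi) * N * cc^(N - 1) / gamma((N - 1) / 2) *
    (2 * a / (N * rho))^((2 - N) / 2) * exp(b * N * rho) * besselI(a * N * rho, nu = N / 2 - 1)
integrate(g, 0, Inf)                              # should be ~1
lambda1 <- omega / (1 + q^2); lambda2 <- omega - lambda1
r2.hat <- replicate(5000, {
  x <- mvrnorm(N, c(0, 0), diag(c(lambda1, lambda2)))
  mean(rowSums(sweep(x, 2, colMeans(x))^2))       # (1/N) sum ||x_i - xbar||^2
})
hist(r2.hat, breaks = 60, freq = FALSE)
curve(g(x), from = min(r2.hat), to = max(r2.hat), add = TRUE, col = "red")
```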

SomeEE
  • Thank you! I'll have to work out the details before accepting. – caracal Feb 16 '14 at 18:43
  • $\widehat{r^{2}_{\text{true}}} \sim \text{Hoyt}$, and $||\bar{x}-\mu||^{2} \sim \mathcal{N}(0, \frac{1}{N}\Sigma)$? Then the characteristic function of $\widehat{r^{2}}$ is the product of the two characteristic functions as explained [here](http://en.wikipedia.org/wiki/Characteristic_function_%28probability_theory%29#Basic_manipulations_of_distributions). That indeed answers my question. Do you know how we might suitably transform $\widehat{r^{2}}$ such that its distribution is known without access to $\Sigma$? Like the Mahalanobis distance, or the univariate $t$ statistic? – caracal Feb 18 '14 at 13:26
  • I've edited my response to a full answer. Please let me know if you agree. – SomeEE Feb 18 '14 at 20:03
  • I am not sure about unknown $\Sigma$. The obvious thing to do would be to try to "divide" $\widehat{r^2}$ by the sample covariance $S$ which would look like a sum of Mahalanobis distances, i.e. consider $\frac{1}{N} \sum_{i=1}^N(x_i - \overline{x})^T S^{-1}(x_i-\overline{x})$. Unfortunately this sum is always constant (it equals the dimension, here $2$, for the $\frac{1}{N}$ version of $S$). – SomeEE Feb 18 '14 at 20:04
  • Thanks for continuing to work on the answer! I'm not sure about the distribution of $||x_{i}-\mu||^{2}$. I'm not able to deal with this analytically, but a quick simulation of $r^{2}$ gives a different distribution than $\Gamma(q, \frac{\omega}{q})$: [R simulation code](https://gist.github.com/anonymous/9094590). Although it could well be that I don't correctly understand the $\Gamma$ parametrization. – caracal Feb 19 '14 at 15:43
  • Yes, my mistake. The resource I was using was denoting a Nakagami-m distribution with the parameters $q$ and $\omega$ which made me think I was working with a Nakagami-q (Hoyt). I'll see what I come up with. – SomeEE Feb 19 '14 at 17:18
  • Let me know if your simulation matches now. – SomeEE Feb 19 '14 at 20:57
  • Thank you for your edits! It will take me some time to cross-check with a simulation, life interferes... – caracal Feb 23 '14 at 10:09