
The bivariate normal distribution with mean $\mu$ and covariance matrix $\Sigma$ can be re-written in polar coordinates with radius $r$ and angle $\theta$. My question is: What is the sampling distribution of $\hat{r}$, that is, of the distance from a point $x$ to the estimated center $\bar{x}$ given the sample covariance matrix $S$?

Background: The true distance $r$ from a point $x$ to the mean $\mu$ follows a Hoyt distribution. With eigenvalues $\lambda_{1} > \lambda_{2}$ of $\Sigma$, its shape parameter is $q=\frac{1}{\sqrt{((\lambda_{1}+\lambda_{2})/\lambda_{2})-1}} = \sqrt{\lambda_{2}/\lambda_{1}}$, and its scale parameter is $\omega = \lambda_{1} + \lambda_{2}$. The cumulative distribution function is known to be the symmetric difference between two Marcum Q-functions.
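For concreteness, a minimal R sketch of this parameter mapping (the covariance matrix below is an arbitrary example, not one from the diagram):

```r
## Hoyt shape q and scale omega from the eigenvalues of a covariance matrix
Sigma  <- matrix(c(4, 1.5, 1.5, 1), nrow = 2)                 # arbitrary example
lambda <- eigen(Sigma)$values                                 # sorted so that lambda[1] >= lambda[2]
q      <- 1 / sqrt((lambda[1] + lambda[2]) / lambda[2] - 1)   # = sqrt(lambda[2] / lambda[1])
omega  <- lambda[1] + lambda[2]                               # = E[r^2]
c(q = q, omega = omega)
```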

Simulation suggests that plugging in the estimates $\bar{x}$ and $S$ for $\mu$ and $\Sigma$ into the true cdf works well for large samples, but not for small ones. The following diagram shows the results of 200 repetitions of the following procedure (an R sketch of one repetition is given below the figure):

  • simulating 20 2D normal vectors for each combination of given $q$ ($x$-axis), $\omega$ (rows), and quantile (columns)
  • for each sample, calculating the given quantile of the observed radius $\hat{r}$ to $\bar{x}$
  • for each sample, calculating the quantile from the theoretical Hoyt (2D normal) cdf, and from the theoretical Rayleigh cdf after plugging in the sample estimates $\bar{x}$ and $S$.

[Figure: estimated versus empirical quantiles of $\hat{r}$ for each combination of $q$ (x-axis), $\omega$ (rows), and quantile (columns)]

As $q$ approaches 1 (the distribution becomes circular), the estimated Hoyt quantiles approach the estimated Rayleigh quantiles, which are unaffected by $q$. As $\omega$ grows, the difference between the empirical quantiles and the estimated ones increases, notably in the tail of the distribution.
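For reference, a minimal R sketch of one repetition of the procedure above, showing only the empirical quantile and the plug-in Rayleigh quantile (the Hoyt plug-in quantile via the Marcum-Q cdf is omitted; $\mu$, $\Sigma$, the sample size, the probability level, and the choice $\hat{\omega} = \operatorname{tr}(S)$ for the Rayleigh scale are assumptions made here for illustration):

```r
## One repetition: 20 bivariate normal vectors, empirical quantile of r-hat
## versus the Rayleigh quantile after plugging in the sample estimates
library(MASS)                                   # for mvrnorm()
set.seed(1)
mu    <- c(0, 0)
Sigma <- matrix(c(4, 1.5, 1.5, 1), nrow = 2)    # arbitrary example
p     <- 0.9                                    # probability level of the quantile
x     <- mvrnorm(20, mu, Sigma)
xbar  <- colMeans(x)
r.hat <- sqrt(rowSums(sweep(x, 2, xbar)^2))     # distances to the estimated center
omega.hat <- sum(diag(cov(x)))                  # plug-in scale estimate tr(S)
c(empirical = unname(quantile(r.hat, p)),
  rayleigh  = sqrt(-omega.hat * log(1 - p)))    # Rayleigh quantile with E[r^2] = omega.hat
```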

caracal
    What's the question? – John Feb 03 '14 at 13:17
  • @John I highlighted the question: "What is the sampling distribution of [radius] $r$, that is, of the distance from a point $x$ to the estimated center $\bar{x}$ given the sample covariance matrix $S$?" – caracal Feb 03 '14 at 13:23
  • Why $\hat{r}$ as opposed to $\hat{r^2}$? – SomeEE Feb 14 '14 at 22:06
  • @MathEE $\hat{r}$ simply because the literature I know of is concerned with the distribution of (true) $r$, not (true) $r^{2}$. Note that this is unlike the situation with the Mahalanobis distance discussed in [this question](http://stats.stackexchange.com/q/20543/1909). Of course, results for the distribution of $\hat{r}^{2}$ would be very welcome. – caracal Feb 15 '14 at 10:05

1 Answer


As you mentioned in your post, we know the distribution of the distance $||x_i - \mu||$ to the true center when $\mu$ is given (it is Hoyt), so we also know the distribution of the estimate $\widehat{r^2_{true}}$ of the true $r^2$.

We want to find the distribution of $$\widehat{r^2} = \frac{1}{N}\sum_{i=1}^N (x_i-\overline{x})^T(x_i-\overline{x})$$ where $x_i$ are expressed as column vectors.

We now do the standard trick

$$\begin{eqnarray*} \widehat{r^2_{true}} &=& \frac{1}{N}\sum_{i=1}^N(x_i - \mu)^T(x_i-\mu)\\ &=& \frac{1}{N}\sum_{i=1}^N(x_i-\overline{x} + \overline{x} -\mu)^T(x_i-\overline{x} + \overline{x}-\mu)\\ &=&\left[\frac{1}{N}\sum_{i=1}^N(x_i - \overline{x})^T(x_i-\overline{x})\right] + (\overline{x} - \mu)^T(\overline{x}-\mu) \hspace{20pt}(1)\\ &=& \widehat{r^2} + (\overline{x}-\mu)^T(\overline{x}-\mu) \end{eqnarray*} $$ where $(1)$ arises from the equation $$\frac{1}{N}\sum_{i=1}^N(x_i-\overline{x})^T(\overline{x}-\mu) = (\overline{x} - \overline{x})^T(\overline{x} - \mu) = 0$$ and its transpose.

Notice that $\widehat{r^2}$ is the trace of the sample covariance matrix $S$ (in its $\frac{1}{N}$ version) and that $(\overline{x}-\mu)^T(\overline{x}-\mu)$ depends only on the sample mean $\overline{x}$. Since $\overline{x}$ and $S$ are independent for normal samples, we have written $$\widehat{r_{true}^2} = \widehat{r^2} + (\overline{x}-\mu)^T(\overline{x}-\mu)$$ as the sum of two independent random variables. We know the distributions of $\widehat{r^2_{true}}$ and of $(\overline{x} - \mu)^T(\overline{x}-\mu)$, so we are done via the standard trick: for independent summands the characteristic functions (or moment generating functions) multiply, so the one of $\widehat{r^2}$ is the quotient of the other two.
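A quick numerical check of this decomposition and of the trace identity (a sketch; $\mu$, $\Sigma$ and $N$ are arbitrary choices):

```r
## Check: r2.true.hat = r2.hat + ||xbar - mu||^2, and r2.hat = trace of the 1/N covariance
library(MASS)
set.seed(1)
N     <- 20
mu    <- c(1, 2)
Sigma <- matrix(c(4, 1.5, 1.5, 1), nrow = 2)
x     <- mvrnorm(N, mu, Sigma)
xbar  <- colMeans(x)
r2.true.hat <- mean(rowSums(sweep(x, 2, mu)^2))    # (1/N) sum ||x_i - mu||^2
r2.hat      <- mean(rowSums(sweep(x, 2, xbar)^2))  # (1/N) sum ||x_i - xbar||^2
r2.true.hat - (r2.hat + sum((xbar - mu)^2))        # 0 up to rounding error
r2.hat - (N - 1) / N * sum(diag(cov(x)))           # 0: the trace of the 1/N covariance estimate
```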

Edited to add:

$||x_i-\mu||$ is Hoyt so it has pdf $$f(\rho) = \frac{1+q^2}{q\omega}\rho e^{-\frac{(1+q^2)^2}{4q^2\omega} \rho^2}I_0\left(\frac{1-q^4}{4q^2\omega} \rho^2\right)$$ where $I_0$ is the $0^{th}$ modified Bessel function of the first kind.

This means that the pdf of $||x_i-\mu||^2$ is $$f(\rho) = \frac{1}{2}\frac{1+q^2}{q\omega}e^{-\frac{(1+q^2)^2}{4q^2\omega}\rho}I_0\left(\frac{1-q^4}{4q^2\omega}\rho\right).$$

To ease notation set $a = \frac{1-q^4}{4q^2\omega}$, $b=-\frac{(1+q^2)^2}{4q^2\omega}$ and $c=\frac{1}{2}\frac{1+q^2}{q\omega}$; note that $b<0$ and $0 < a < -b$, so the pdf above is $c\,e^{b\rho}I_0(a\rho)$.
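As a sanity check on this density (a sketch; the values of $q$ and $\omega$ are arbitrary, and `besselI` is R's modified Bessel function of the first kind):

```r
## Density of ||x_i - mu||^2, written with a, b, c, compared against simulated squared distances
set.seed(1)
q <- 0.5; omega <- 4
a  <- (1 - q^4) / (4 * q^2 * omega)
b  <- -(1 + q^2)^2 / (4 * q^2 * omega)
cc <- (1 + q^2) / (2 * q * omega)
f.r2 <- function(rho) cc * exp(b * rho) * besselI(a * rho, nu = 0)
integrate(f.r2, 0, Inf)                        # should be ~1
lambda1 <- omega / (1 + q^2)                   # eigenvalues giving sqrt(lambda2/lambda1) = q
lambda2 <- omega - lambda1
r2 <- rnorm(1e5, sd = sqrt(lambda1))^2 + rnorm(1e5, sd = sqrt(lambda2))^2
hist(r2, breaks = 100, freq = FALSE)
curve(f.r2(x), from = 0, to = max(r2), add = TRUE, col = "red")
```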

The moment generating function of $||x_i-\mu||^2$ is $$\begin{cases} \frac{c}{\sqrt{(s+b)^2-a^2}} & s+b < -a\\ +\infty & \text{ else}\\ \end{cases}$$ using the Laplace transform $\int_0^\infty e^{-p\rho}I_0(a\rho)\,d\rho = \frac{1}{\sqrt{p^2-a^2}}$ for $p > a$.

Thus the moment generating function of $\widehat{r^2_{true}}$ is $$\begin{cases} \frac{c^N}{((s/N+b)^2-a^2)^{N/2}} & s/N+b < -a\\ +\infty & \text{else} \end{cases}$$ and, since $\overline{x}-\mu \sim \mathcal{N}(0, \Sigma/N)$ has the same $q$ but scale $\omega/N$ (so that $a$, $b$, $c$ become $Na$, $Nb$, $Nc$), the moment generating function of $||\overline{x} - \mu||^2$ is $$\begin{cases} \frac{Nc}{\sqrt{(s+Nb)^2-(Na)^2}} = \frac{c}{\sqrt{(s/N+b)^2-a^2}} & s/N+b < -a\\ +\infty & \text{ else} \end{cases}$$

This implies that the moment generating function of $\widehat{r^2}$ is the quotient $$\begin{cases} \frac{c^{N-1}}{((s/N+b)^2-a^2)^{(N-1)/2}} & s/N+b < -a\\ +\infty & \text{ else}. \end{cases}$$

Applying the inverse Laplace transform gives that $\widehat{r^2}$ has pdf $$g(\rho) = \frac{\sqrt{\pi}Nc^{N-1}}{\Gamma(\frac{N-1}{2})}\left(\frac{2 a}{N\rho}\right)^{(2 - N)/2} e^{b N \rho} I_{N/2-1}(a N \rho),$$ where $I_{N/2-1}$ is the modified Bessel function of the first kind of order $N/2-1$.
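To cross-check this density numerically, a sketch comparing it to simulated values of $\widehat{r^2}$ (again with arbitrary $q$, $\omega$, $N$, and using the constants $a$, $b$, $c$ defined above):

```r
## Density g of r2.hat versus simulation
library(MASS)
set.seed(1)
q <- 0.5; omega <- 4; N <- 20
a  <- (1 - q^4) / (4 * q^2 * omega)
b  <- -(1 + q^2)^2 / (4 * q^2 * omega)
cc <- (1 + q^2) / (2 * q * omega)
g <- function(rho)
  sqrt(pi) * N * cc^(N - 1) / gamma((N - 1) / 2) *
    (2 * a / (N * rho))^((2 - N) / 2) * exp(b * N * rho) * besselI(a * N * rho, nu = N / 2 - 1)
integrate(g, 0, Inf)                              # should be ~1
lambda1 <- omega / (1 + q^2); lambda2 <- omega - lambda1
r2.hat <- replicate(5000, {
  x <- mvrnorm(N, c(0, 0), diag(c(lambda1, lambda2)))
  mean(rowSums(sweep(x, 2, colMeans(x))^2))       # (1/N) sum ||x_i - xbar||^2
})
hist(r2.hat, breaks = 60, freq = FALSE)
curve(g(x), from = min(r2.hat), to = max(r2.hat), add = TRUE, col = "red")
```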

SomeEE
  • Thank you! I'll have to work out the details before accepting. – caracal Feb 16 '14 at 18:43
  • $\widehat{r^{2}_{\text{true}}} \sim \text{Hoyt}$, and $||\bar{x}-\mu||^{2} \sim \mathcal{N}(0, \frac{1}{N}\Sigma)$? Then the characteristic function of $\widehat{r^{2}}$ is the product of the two characteristic functions as explained [here](http://en.wikipedia.org/wiki/Characteristic_function_%28probability_theory%29#Basic_manipulations_of_distributions). That indeed answers my question. Do you know how we might suitably transform $\widehat{r^{2}}$ such that its distribution is known without access to $\Sigma$? Like the Mahalanobis distance, or the univariate $t$ statistic? – caracal Feb 18 '14 at 13:26
  • I've edited my response to a full answer. Please let me know if you agree. – SomeEE Feb 18 '14 at 20:03
  • I am not sure about unknown $\Sigma$. The obvious thing to do would be to try to "divide" $\widehat{r^2}$ by the sample covariance $S$ which would look like a sum of Mahalanobis distances, i.e. consider $\frac{1}{N} \sum_{i=1}^N(x_i - \overline{x})^T S^{-1}(x_i-\overline{x})$. Unfortunately this sum is always constant (it equals the dimension, here $2$, for the $\frac{1}{N}$ version of $S$). – SomeEE Feb 18 '14 at 20:04
  • Thanks for continuing to work on the answer! I'm not sure about the distribution of $||x_{i}-\mu||^{2}$. I'm not able to deal with this analytically, but a quick simulation of $r^{2}$ gives a different distribution than $\Gamma(q, \frac{\omega}{q})$: [R simulation code](https://gist.github.com/anonymous/9094590). Although it could well be that I don't correctly understand the $\Gamma$ parametrization. – caracal Feb 19 '14 at 15:43
  • Yes, my mistake. The resource I was using was denoting a Nakagami-m distribution with the parameters $q$ and $\omega$ which made me think I was working with a Nakagami-q (Hoyt). I'll see what I come up with. – SomeEE Feb 19 '14 at 17:18
  • Let me know if your simulation matches now. – SomeEE Feb 19 '14 at 20:57
  • Thank you for your edits! It will take me some time to cross-check with a simulation, life interferes... – caracal Feb 23 '14 at 10:09