
What is the expected magnitude, i.e. the Euclidean distance from the origin, of a vector drawn from a $p$-dimensional spherical normal $\mathcal{N}_p(\mu,\Sigma)$ with $\mu=\vec{0}$ and $\Sigma=\sigma^2 I$, where $I$ is the identity matrix?

In the univariate case this boils down to $E[|x|]$, where $x \sim \mathcal{N}(0,\sigma^2)$. This is the mean $\mu_Y$ of a folded normal distribution with mean $0$ and variance $\sigma^2$, which can be calculated as:

$\mu_Y = \sigma \sqrt{\frac{2}{\pi}} \, \exp\left(\frac{-\mu^2}{2\sigma^2}\right) + \mu \, \operatorname{erf}\left(\frac{\mu}{\sqrt{2}\,\sigma}\right) \stackrel{\mu=0}{=} \sigma \sqrt{\frac{2}{\pi}}$
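This univariate case is easy to sanity-check by simulation (a quick sketch; the value of `sigma` and the RNG seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0                           # arbitrary example value
x = rng.normal(0.0, sigma, size=1_000_000)

mc = np.abs(x).mean()                 # Monte Carlo estimate of E[|x|]
exact = sigma * np.sqrt(2 / np.pi)    # folded-normal mean with mu = 0
print(mc, exact)
```

The two printed values agree to about three decimal places at this sample size.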

Since the multivariate normal is spherical, I thought about simplifying the problem by switching to polar coordinates. Shouldn't the distance from the origin in any direction be given by a folded normal distribution? Could I integrate over all distances, multiplying by the (infinitesimal) probability of encountering a sample at that distance (e.g. $\mathrm{CDF}(r)-\mathrm{CDF}(r-h)$, $h \rightarrow 0$), and finally make the leap to more than one dimension by multiplying by the "number of points" on a hypersphere of dimension $p$, e.g. $2 \pi r$ for a circle, $4 \pi r^2$ for a sphere? I feel that this might be a simple question, but I'm not sure how to express that probability analytically as $h \rightarrow 0$.

Simple experiments suggest that the expected distance is proportional to $\sigma$, i.e. of the form $c\,\sigma$ (equivalently $c\sqrt{\sigma^2}$ when parameterized by the variance), but I'm stuck on how to make the leap to a multivariate distribution. By the way, a solution for $p \le 3$ would be fine.
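Such an experiment can be reproduced with a few lines of NumPy (illustrative only; the dimension, sample size, and $\sigma$ values are arbitrary). The ratio of the estimated mean distance to $\sigma$ stays constant across the three runs, showing the scaling is linear in $\sigma$:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 3                                        # example dimension
ratios = []
for sigma in (0.5, 1.0, 2.0):
    X = rng.normal(0.0, sigma, size=(200_000, p))
    est = np.linalg.norm(X, axis=1).mean()   # mean Euclidean distance
    ratios.append(est / sigma)
    print(sigma, est, est / sigma)
```

For $p=3$ the constant ratio comes out near $1.596 \approx \sqrt{8/\pi}$.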



2 Answers


The sum of squares of $p$ independent standard normal random variables has a chi-squared distribution with $p$ degrees of freedom. The magnitude is the square root of that random variable; its distribution is sometimes referred to as the chi distribution. (See this Wikipedia article.) The common standard deviation $\sigma$ enters as a simple scale factor.

Incorporating some of the comments into this answer:

The mean of the chi-distribution with $p$ degrees of freedom is $$ \mu=\sqrt{2}\,\,\frac{\Gamma((p+1)/2)}{\Gamma(p/2)} $$

Special cases as noted:

For $p=1$, the chi distribution is the folded (standard) normal distribution, with mean $\sqrt{2}\,\frac{\Gamma(1)}{\Gamma(1/2)}=\sqrt{\frac{2}{\pi}}$.

For $p=2$, the distribution is also known as the Rayleigh distribution (with scale parameter 1), and its mean is $\sqrt{2}\frac{\Gamma(3/2)}{\Gamma(1)}=\sqrt{2}\frac{\sqrt{\pi}}{2} = \sqrt{\frac{\pi}{2}}$.

For $p=3$, the distribution is known as the Maxwell distribution with parameter 1; its mean is $\sqrt{2}\frac{\Gamma(2)}{\Gamma(3/2)}=\sqrt{\frac{8}{\pi}}$.

When the common variance $\sigma^2$ is not 1, the means must be multiplied by $\sigma$.
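These three special cases can be verified directly against `scipy.stats.chi`, which implements the chi distribution (a small sketch; not part of the original answer):

```python
import numpy as np
from scipy.stats import chi

# mean of the chi distribution for p = 1, 2, 3 vs. the closed forms above
for p, exact in [(1, np.sqrt(2 / np.pi)),    # folded/half normal
                 (2, np.sqrt(np.pi / 2)),    # Rayleigh
                 (3, np.sqrt(8 / np.pi))]:   # Maxwell
    print(p, chi.mean(p), exact)
```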

user3697176
    More specifically, the distribution for the Euclidean distance to the origin with $p=2$ is called the Rayleigh distribution, with $p=3$ the Maxwell-Boltzmann distribution (both are chi distributions). – caracal Aug 17 '15 at 07:47
  • Perhaps one should add that the mean of the chi distribution is equal to $$\mu=\sqrt{2}\frac{\Gamma((p+1)/2)}{\Gamma(p/2)},$$ and OP needs to multiply it with $\sigma$. – amoeba Aug 18 '15 at 22:53

The answer by user3697176 gives all the needed information, but nonetheless, here is a slightly different view of the problem.

If $X_i \sim N(0,\sigma^2)$, then $Y = \sum_{i=1}^n X_i^2$ has a Gamma distribution with parameters $\left(\frac n2, \frac{1}{2\sigma^2}\right)$. Now, if $W \sim \Gamma(t,\lambda)$, then
$$f_W(w) = \frac{\lambda(\lambda w)^{t-1}}{\Gamma(t)}\exp(-\lambda w)\,\mathbf 1_{\{w\colon w > 0\}},$$
which of course enjoys the property that the area under the curve is $1$. This helps us find $E[\sqrt{W}]$ without actually explicitly evaluating an integral. We have that
\begin{align}
E[\sqrt{W}] &= \int_0^\infty \sqrt{w}\cdot \frac{\lambda(\lambda w)^{t-1}}{\Gamma(t)}\exp(-\lambda w)\, \mathrm dw\\
&= \frac{1}{\sqrt{\lambda}}\cdot\frac{\Gamma(t+\frac 12)}{\Gamma(t)} \int_0^\infty \frac{\lambda(\lambda w)^{t+\frac 12 -1}}{\Gamma(t+\frac 12)} \exp(-\lambda w)\, \mathrm dw\\
&= \frac{1}{\sqrt{\lambda}}\cdot\frac{\Gamma(t+\frac 12)}{\Gamma(t)},
\end{align}
since the remaining integrand is the density of a $\Gamma(t+\frac 12,\lambda)$ random variable and so integrates to $1$. Applying this to $Y$, with $t=\frac n2$ and $\lambda=\frac{1}{2\sigma^2}$, we get that
$$E\left[\sqrt{X_1^2+X_2^2+\cdots+X_n^2}\right] = \sqrt{2}\, \frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\,\sigma.$$
Those Gamma functions can be simplified further, and we will always get a $\Gamma(1/2) = \sqrt{\pi}$ in the denominator or the numerator according as $n$ is odd or even.
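The $E[\sqrt{W}]$ identity itself is easy to check by Monte Carlo (a sketch; $t$ and $\lambda$ are arbitrary example values, and note NumPy's `gamma` sampler takes a *scale* parameter, i.e. $1/\lambda$):

```python
import numpy as np
from scipy.special import gamma

rng = np.random.default_rng(2)
t, lam = 2.5, 0.8                     # arbitrary shape t and rate lambda

# W ~ Gamma(t, lam); numpy parameterizes by scale = 1/rate
w = rng.gamma(shape=t, scale=1.0 / lam, size=1_000_000)

mc = np.sqrt(w).mean()                # Monte Carlo estimate of E[sqrt(W)]
exact = gamma(t + 0.5) / (np.sqrt(lam) * gamma(t))
print(mc, exact)
```

Both values land near $1.68$ for these parameters.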

Dilip Sarwate