
Suppose $x\in R^n$ follows the uniform distribution on the surface of a unit sphere centered at 0 and $a$ is a vector in $R^n$. What can we say about the distribution of the following quantity?

$$\sum_i x_i^2 a_i $$

The mean seems to be $\frac{\sum_i a_i}{n}$; what about the variance?

This quantity is the value of a quadratic form with eigenvalues $a_1,a_2,\ldots,a_n$, evaluated at a point on the surface of the unit sphere.


Yaroslav Bulatov
  • So are you only asking about the variance of $\sum_i x_i^2 a_i$? Also, what exactly do you mean by "sample from Gaussian, normalize"? – mhdadk Jun 24 '21 at 23:51
  • I mean that it's a uniform distribution on a sphere; I've edited with a more precise definition from Mardia – Yaroslav Bulatov Jun 24 '21 at 23:53
  • Is $\langle x^2,a\rangle$ geometrically meaningful? Or where and how does it arise? – Matt F. Jun 25 '21 at 01:14
  • @MattF. it's the value of $x'Ax$ where $A$ is a diagonal matrix with $a_i$ as diagonal entries – Yaroslav Bulatov Jun 25 '21 at 01:19
  • The variance is $$\frac{2}{n^2(n+2)}\sum_{i<j}(a_i-a_j)^2$$ – Matt F. Jun 25 '21 at 02:53
  • Looks interesting! Proof/reference would be great. Another interpretation, it's the average value on unit circle for quadratic centered at 0 and Hessian A – Yaroslav Bulatov Jun 25 '21 at 03:04
  • Just to be clear, we are talking about a uniform random point on the *surface* of a sphere (as opposed to the interior of the ball), right? – Ben Jun 25 '21 at 07:50
  • @Ben, yes, as you can see from the stated mean. Eg in the case $n=2$, $a=(1,0)$, the expectation on the (edge of the) circle is $\int (\cos \theta)^2 d\theta/(2\pi)=1/2$, agreeing with the expression given, while the expectation over the interior is $\iint (r \cos \theta)^2 r \, dr \, d\theta/\pi=1/4$. – Matt F. Jun 25 '21 at 08:53
  • A multiple of $(x_1^2,x_2^2,\ldots,x_p^2)$ has a Dirichlet$(1/2,1/2,\ldots,1/2)$ distribution when the sphere is centered at the origin (as strongly implied by your quotation). That will lend itself to further analysis. It yields the moments easily. – whuber Jun 25 '21 at 14:03

1 Answer


Let's start with $m(a)=E[\langle x^2,a\rangle]$ and $s(a)=E[\langle x^2,a\rangle^2]$, so that the variance will be $s(a)-m(a)^2$.

We will calculate these by expressing $x$ in $n$-dimensional spherical coordinates, which use $n-1$ angles, so \begin{align} & & x_1&=\cos(\phi_1)\\ 0\ \le\ &\phi_1\le\pi & x_2&=\sin(\phi_1)\cos(\phi_2)\\ 0\ \le\ &\phi_2\le\pi & x_3&=\sin(\phi_1)\sin(\phi_2)\cos(\phi_3)\\ &\vdots & \vdots\\ 0\ \le\ & \phi_{n-2}\le\pi & x_{n-1}&=\sin(\phi_1)\cdots\sin(\phi_{n-2})\cos(\phi_{n-1})\\ 0\ \le\ &\phi_{n-1}\le2\pi & x_n&=\sin(\phi_1)\cdots\sin(\phi_{n-2})\sin(\phi_{n-1}) \end{align} and the element of $(n-1)$-dimensional surface area is $$dS = \sin^{n-2}(\phi_1)\sin^{n-3}(\phi_2)\cdots \sin(\phi_{n-2})\ d\phi_1\, d\phi_2 \cdots d\phi_{n-1}$$

We begin with some special cases, letting $b_i$ be the vector that agrees with $a$ on the $i$th coordinate and is zero elsewhere. First, \begin{align} m(b_1)&=\frac {\int\cdots\int a_1\cos^2(\phi_1)\, dS} {\int\cdots\int dS}\\ &=\frac{\int a_1\cos^2(\phi_1)\sin^{n-2}(\phi_1)\,d\phi_1}{\int \sin^{n-2}(\phi_1)\,d\phi_1}\\ &=a_1\left(1-\frac{\int \sin^n(\phi_1)\,d\phi_1} {\int \sin^{n-2}(\phi_1)\,d\phi_1}\right)\\ &=a_1\left(1-\frac{n-1}{n}\right)\\ &=a_1/n \end{align} where the integrals over the higher $\phi_i$ cancel, and the final ratio uses integration by parts. Since $m$ must be symmetric in all the coordinates, we get \begin{align} m(b_i)&=a_i/n\\ m(a)=m\left(\sum_i b_i\right)&=\sum_i m(b_i)=\frac{1}{n}\sum_i a_i \end{align} Similarly, \begin{align} s(b_i)&=3a_i^2/(n^2+2n)\\ s(b_i+b_j)-s(b_i)-s(b_j)&=2a_ia_j/(n^2+2n) \end{align}
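
One possible way to fill in the "Similarly" step is sketched here; it is not necessarily the computation the answer had in mind. Write $I_k=\int_0^\pi \sin^k(\phi)\,d\phi$, so that the same integration by parts gives $I_k/I_{k-2}=(k-1)/k$. Expanding $\cos^4=(1-\sin^2)^2$, \begin{align} E[x_1^4]&=\frac{\int \cos^4(\phi_1)\sin^{n-2}(\phi_1)\,d\phi_1}{\int \sin^{n-2}(\phi_1)\,d\phi_1} =1-2\,\frac{I_n}{I_{n-2}}+\frac{I_{n+2}}{I_{n-2}}\\ &=1-\frac{2(n-1)}{n}+\frac{(n+1)(n-1)}{(n+2)\,n} =\frac{3}{n(n+2)} \end{align} which gives $s(b_i)=3a_i^2/(n^2+2n)$ by symmetry. For the cross term, the constraint $\sum_k x_k^2=1$ can be used instead of a second integral (an assumption about the route, chosen for brevity): \begin{align} 1=E\left[\left(\sum_k x_k^2\right)^2\right]&=n\,E[x_1^4]+n(n-1)\,E[x_1^2x_2^2]\\ E[x_1^2x_2^2]&=\frac{1}{n(n-1)}\left(1-\frac{3}{n+2}\right)=\frac{1}{n(n+2)} \end{align} so that $s(b_i+b_j)-s(b_i)-s(b_j)=2a_ia_j\,E[x_i^2x_j^2]=2a_ia_j/(n^2+2n)$.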

Now by the polynomial identity $$\left(\sum x_i\right)^2=\sum_i x_i^2+\sum_{i<j} \left((x_i+x_j)^2 - x_i^2-x_j^2\right)$$ we also have \begin{align} s(a) &=s\left(\sum b_i\right)\\ &=\sum_i s(b_i)+\sum_{i<j}\left(s(b_i+b_j)-s(b_i)-s(b_j)\right)\\ &=\sum_i \frac{3a_i^2}{n^2+2n}+\sum_{i<j}\frac{2a_ia_j}{n^2+2n} \end{align}

So finally, the variance is \begin{align} v&=s(a)-m(a)^2\\ &=\sum_{i}a_i^2\left(\frac{3}{n^2+2n}-\frac{1}{n^2}\right) + \sum_{i<j} a_ia_j\left(\frac{2}{n^2+2n}-\frac{2}{n^2}\right)\\ &=\frac{2}{n^2(n+2)}\left((n-1)\sum_{i}a_i^2 - 2\sum_{i<j} a_ia_j\right)\\ &=\frac{2}{n^2(n+2)}\sum_{i<j}(a_i-a_j)^2 \end{align}
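
A quick numerical sanity check of this formula, as a minimal sketch: it samples uniform points on the sphere by normalizing standard Gaussian vectors and compares the empirical mean and variance with the closed forms. The function names (`variance_formula`, `sample_quadratic_form`, `monte_carlo_moments`), the test vector, the sample size and the seed are illustrative assumptions, not part of the answer.

```python
import random


def variance_formula(a):
    """Closed form 2 / (n^2 (n+2)) * sum_{i<j} (a_i - a_j)^2."""
    n = len(a)
    pair_sum = sum((a[i] - a[j]) ** 2 for i in range(n) for j in range(i + 1, n))
    return 2.0 * pair_sum / (n ** 2 * (n + 2))


def sample_quadratic_form(a, rng):
    """Draw x uniformly on the unit sphere and return sum_i x_i^2 a_i.

    A standard Gaussian vector divided by its norm is uniform on the sphere,
    so x_i^2 = g_i^2 / sum_k g_k^2.
    """
    g = [rng.gauss(0.0, 1.0) for _ in a]
    norm_sq = sum(v * v for v in g)
    return sum(ai * v * v for ai, v in zip(a, g)) / norm_sq


def monte_carlo_moments(a, num_samples=50_000, seed=0):
    """Return the empirical (mean, unbiased variance) of the quadratic form."""
    rng = random.Random(seed)
    values = [sample_quadratic_form(a, rng) for _ in range(num_samples)]
    mean = sum(values) / num_samples
    var = sum((v - mean) ** 2 for v in values) / (num_samples - 1)
    return mean, var


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]  # arbitrary example vector
    mc_mean, mc_var = monte_carlo_moments(a)
    print("mean:     Monte Carlo", mc_mean, "formula", sum(a) / len(a))
    print("variance: Monte Carlo", mc_var, "formula", variance_formula(a))
```

The two variance values should agree up to Monte Carlo noise, which shrinks roughly like one over the square root of the number of samples.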

Matt F.
  • +1. Re the expectations: because the $x_i$ are exchangeable, all the $x_i^2$ have the same expectation. Since these $n$ squares sum to unity, that common expectation must be $1/n.$ Re the covariances: the covariance matrix of $(x_i^2)$ is readily obtained from the marginal distribution derived at https://stats.stackexchange.com/a/520811/919. Some people might find these approaches more congenial than resorting to spherical coordinates. – whuber Jun 25 '21 at 18:30
  • that looks intense....but matches what I'm getting numerically, thanks! – Yaroslav Bulatov Jun 25 '21 at 18:44
  • @whuber, that is slicker! I am just happy that these calculations turned out nicer than most with high-dimensional spherical coordinates: no multiple integrals, no gamma function, and no alternating appearances of $\sqrt{\pi}$. – Matt F. Jun 25 '21 at 19:05
  • Indeed. But it is a little sobering to look at the full distribution for various vectors $(a_i)$ ;-). – whuber Jun 25 '21 at 19:33
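
The Dirichlet route mentioned in the comments can be checked against the answer's formula with a short sketch. It assumes the unit sphere, so that $(x_1^2,\ldots,x_n^2)$ itself is Dirichlet$(1/2,\ldots,1/2)$, and uses the standard Dirichlet variance and covariance formulas; the function names are hypothetical and chosen only for this illustration.

```python
def dirichlet_half_covariance(n):
    """Covariance matrix of Dirichlet(1/2, ..., 1/2) with n components.

    Standard Dirichlet moments with alpha_i = 1/2 and alpha_0 = n/2:
      Var(p_i)      = alpha_i (alpha_0 - alpha_i) / (alpha_0^2 (alpha_0 + 1))
      Cov(p_i, p_j) = -alpha_i alpha_j / (alpha_0^2 (alpha_0 + 1))
    """
    alpha = 0.5
    alpha0 = n * alpha
    denom = alpha0 ** 2 * (alpha0 + 1.0)
    var = alpha * (alpha0 - alpha) / denom
    cov = -alpha * alpha / denom
    return [[var if i == j else cov for j in range(n)] for i in range(n)]


def variance_via_dirichlet(a):
    """Variance of sum_i a_i p_i, computed as the quadratic form a' Sigma a."""
    n = len(a)
    sigma = dirichlet_half_covariance(n)
    return sum(a[i] * sigma[i][j] * a[j] for i in range(n) for j in range(n))


def variance_formula(a):
    """Closed form from the answer: 2 / (n^2 (n+2)) * sum_{i<j} (a_i - a_j)^2."""
    n = len(a)
    pair_sum = sum((a[i] - a[j]) ** 2 for i in range(n) for j in range(i + 1, n))
    return 2.0 * pair_sum / (n ** 2 * (n + 2))


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]  # arbitrary example vector
    print(variance_via_dirichlet(a), variance_formula(a))
```

Both functions return the same number up to floating-point rounding, since the Dirichlet covariance entries are $2(n-1)/(n^2(n+2))$ on the diagonal and $-2/(n^2(n+2))$ off it.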