Let $(x_1,…,x_n)$ be a random vector uniformly distributed on the $n$-dimensional unit sphere.

Is there a closed-form expression for the joint distribution of $(x_1, x_2)$?

kjetil b halvorsen
student_t

2 Answers


I want to flesh out John L's idea. Let $d\gt 2$ be the dimension of the space in which we will be working. (When $d=2$ the marginal is the uniform distribution on the circle -- that fully determines it, but it has no density function.) As you go through, note that the same analysis applies mutatis mutandis to finding the distribution of any proper subset of the coordinates, from $1$ through $d-1$ of them.

  1. The uniform distribution on the surface of the unit $d-1$ sphere $S^{d-1}\subset\mathbb{R}^d$ is the radial projection of the standard $d$-variate Normal distribution in $\mathbb{R}^d.$ See "How to generate uniformly distributed points on the surface of the 3-d unit sphere?"

  2. Writing $(Z_1,Z_2,\ldots,Z_d)$ for such a Normal variate and $|Z| = \sqrt{Z_1^2+\cdots + Z_d^2}$ for its norm, $(1)$ means $(X_1,\ldots, X_d) = (Z_1/|Z|,\ldots,Z_d/|Z|)$ has a uniform distribution on $S^{d-1}.$

  3. By definition, $U^2=Z_1^2+Z_2^2$ has a $\chi^2(2)$ distribution and $V^2=Z_3^2 + \cdots + Z_d^2$ has a $\chi^2(d-2)$ distribution. (This is one place we must require $d\gt 2.$)

  4. Because the $Z_i$ are independent, $(Z_1,Z_2)$ is independent of $(Z_3,\ldots,Z_d),$ whence $U^2$ and $V^2$ are independent.

  5. The ratio $\frac{U^2}{U^2+V^2} = X_1^2+X_2^2$ has a Beta$(1,d/2-1)$ distribution.

  6. Because $(Z_1,\ldots,Z_d)$ is spherically symmetric, so is $(X_1,\ldots,X_d),$ whence (via projection) $(X_1,X_2)$ is circularly symmetric.

  7. $(5)$ and $(6)$ say that in polar coordinates $(R,\Theta)$ with $X_1=R\cos\Theta$ and $X_2=R\sin\Theta,$ $R^2$ has a Beta distribution and $\Theta$ is independently uniformly distributed on (say) the interval $[0,2\pi).$

  8. Conditional on $(X_1,X_2),$ the remaining coordinates $(X_3,\ldots,X_d)$ must be uniformly distributed on the slice of the sphere determined by $(X_1,X_2).$ That slice has radius $\sqrt{1-R^2}.$
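
Steps 1 through 7 can be checked by simulation. The following is a minimal sketch in base R, not the code behind the figures below; the helper name `rsphere`, the seed, and the sample size are illustrative assumptions. It projects Normal draws onto the sphere and compares $R^2$ and $\Theta$ with the claimed Beta and uniform distributions.

```r
# Sketch only: rsphere, the seed, and the sample size are illustrative choices.
set.seed(17)

# Steps 1-2: radial projection of standard d-variate Normal draws onto S^(d-1)
rsphere <- function(n, d) {
  z <- matrix(rnorm(n * d), nrow = n)  # one Normal vector per row
  z / sqrt(rowSums(z^2))               # divide each row by its norm
}

d <- 12
x <- rsphere(1e4, d)

# Steps 5-7: polar coordinates of the first two components
r2 <- x[, 1]^2 + x[, 2]^2                    # R^2 = X_1^2 + X_2^2
theta <- atan2(x[, 2], x[, 1]) %% (2 * pi)   # angle mapped into [0, 2*pi)

# Kolmogorov-Smirnov comparisons with Beta(1, d/2 - 1) and Uniform(0, 2*pi)
ks_r2 <- ks.test(r2, "pbeta", 1, d / 2 - 1)
ks_theta <- ks.test(theta, "punif", 0, 2 * pi)
c(R2 = ks_r2$p.value, Theta = ks_theta$p.value)
```

Large p-values are consistent with the claims but do not prove them; in particular, this sketch does not test the independence of $R$ and $\Theta.$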

We can immediately write down expressions for the distribution. Since the density of a Beta distribution is $f(t;\alpha,\beta) = t^{\alpha-1}(1-t)^{\beta-1}/B(\alpha,\beta),$ setting $t=r^2$ gives the density for $R$ as

$$f_R(r;d) = \frac{2}{B\left(1,\frac{d}{2}-1\right)}\,r(1-r^2)^{d/2-2} = (d-2)\,r\,(1-r^2)^{d/2-2}$$

for $0\le r\le 1$ and $d\gt 2.$
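
As a quick numerical sanity check of this formula, here is a sketch in base R; the function names `f_R` and `f_R_beta` and the particular values of $d$ are illustrative assumptions. It confirms that the simplified expression agrees with the Beta density of $R^2$ multiplied by the Jacobian $2r,$ and that it integrates to $1.$

```r
# Sketch only: function names and the chosen values of d are illustrative.

# Simplified form of the density of R
f_R <- function(r, d) (d - 2) * r * (1 - r^2)^(d / 2 - 2)

# Same density written as the Beta(1, d/2 - 1) density of R^2 times the Jacobian 2r
f_R_beta <- function(r, d) 2 * r * dbeta(r^2, 1, d / 2 - 1)

r <- seq(0.05, 0.95, by = 0.05)
dims <- c(4, 7, 12)

# Largest discrepancy between the two forms, for each d
gaps <- sapply(dims, function(d) max(abs(f_R(r, d) - f_R_beta(r, d))))

# Total area under f_R on [0, 1], for each d
areas <- sapply(dims, function(d) integrate(f_R, 0, 1, d = d)$value)

rbind(d = dims, gap = gaps, area = areas)
```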

The joint density $f_{R,\Theta;d}$ is the product of $f_R$ and the density of $\Theta,$ given by $1/(2\pi)$ on the interval $[0,2\pi).$ Changing back to rectangular coordinates gives

$$f_{x_1,x_2}(x_1,x_2;d) = \frac{d-2}{2\pi}\,(1-x_1^2-x_2^2)^{d/2-2}\tag{*}$$

for $x_1^2+x_2^2\le 1.$ (See the example at https://stats.stackexchange.com/a/154298/919 for details.)
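
The change of variables can be spelled out as follows. Because the area element is $dx_1\,dx_2 = r\,dr\,d\theta,$ the joint density of $(R,\Theta)$ must be divided by $r$ when returning to rectangular coordinates, with $r^2 = x_1^2+x_2^2$:

$$f_{x_1,x_2}(x_1,x_2;d) = \frac{1}{r}\,f_R(r;d)\,\frac{1}{2\pi} = \frac{(d-2)\,r\,(1-r^2)^{d/2-2}}{2\pi\,r} = \frac{d-2}{2\pi}\,(1-x_1^2-x_2^2)^{d/2-2}.$$

As a check on the constant, integrating over the unit disk in polar coordinates gives

$$\int_0^{2\pi}\!\!\int_0^1 \frac{d-2}{2\pi}\,(1-r^2)^{d/2-2}\,r\,dr\,d\theta = (d-2)\left[-\frac{(1-r^2)^{d/2-1}}{d-2}\right]_0^1 = 1.$$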


As an example, I generated a million draws from the uniform distribution on $S^{11}$ (so that $d=12$). Here is a scatterplot matrix of the first five components (showing just the first thousand observations for clarity).

Figure 1

The circular symmetry of each pair is apparent.

Next is a histogram of all million values of $R$ on which the root-Beta density function $f_R$ is plotted, with excellent agreement:

Figure 2

The $(X_i+1)/2$ have Beta$((d-1)/2,(d-1)/2)$ distributions, as shown at "Distribution of scalar products of two random unit vectors in $D$ dimensions." This indeed is what one obtains from $(*)$ by integrating out one of the variables. Here is a histogram from the simulation with that Beta density overplotted:

Figure 3
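
The claim that integrating one variable out of $(*)$ yields this Beta marginal can also be checked numerically. Below is a minimal sketch in base R; the function names `f_joint`, `f_marg`, and `f_beta` and the evaluation points are illustrative assumptions, not part of the original simulation.

```r
# Sketch only: function names and evaluation points are illustrative.
d <- 12

# Joint density (*) of (X_1, X_2) on the unit disk
f_joint <- function(x1, x2, d) (d - 2) / (2 * pi) * (1 - x1^2 - x2^2)^(d / 2 - 2)

# Marginal of X_1: integrate (*) over x2 in [-sqrt(1 - x1^2), sqrt(1 - x1^2)]
f_marg <- function(x1, d) {
  h <- sqrt(1 - x1^2)
  integrate(function(x2) f_joint(x1, x2, d), -h, h)$value
}

# Density of X_1 implied by (X_1 + 1)/2 ~ Beta((d-1)/2, (d-1)/2);
# the factor 1/2 is the Jacobian of x -> (x + 1)/2
f_beta <- function(x1, d) dbeta((x1 + 1) / 2, (d - 1) / 2, (d - 1) / 2) / 2

x1 <- c(-0.8, -0.3, 0, 0.25, 0.6)
integrated <- sapply(x1, f_marg, d = d)
cbind(x1, integrated, beta = f_beta(x1, d))
```

The two density columns should agree up to the tolerance of the numerical integration.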

whuber
  • Hi, thank you for your detailed response! I actually realized that my question is a special case of https://mathoverflow.net/questions/359643/marginal-density-of-uniform-spherical-distribution. That said, interestingly, it looks like the constants differ between your answer and Iosif's (the key dependence on $(1 - x_1^2 - x_2^2)^{d / 2 - 2}$ is the same). I was wondering whether a difference in the derivations might have led to this? – student_t Apr 30 '21 at 19:06
  • @Student_t The Mathematics post doesn't explicitly give the constant, so there is no difference to note. The constant in my answer here is correct: look at the example and estimate the area under the curve. Clearly it's close to $1,$ as it should be. The fact that the curve agrees with the histogram shows the constant is correct. – whuber Apr 30 '21 at 19:11
  • Thanks for your fast reply! Sorry for the confusion and just to clarify, I was referring to the third answer to that question, and not the first, where there is an explicit form at the end of the answer. – student_t Apr 30 '21 at 20:10
  • @student_t Unfortunately, the order of answers varies for each user, so "third answer" doesn't identify any of them. Nevertheless, if you detect any difference in the constant, just perform the kinds of checks I have done here: integrate the density (numerically if you have to) and also compare it to simulated data. – whuber Apr 30 '21 at 21:00
  • Ah ok, btw I was referring to the answer that says it's: $$f_k(z_1,\dots,z_k) =\frac{2^{(k-n)/2}\Gamma(n/2)}{\pi^{n/2}\Gamma((n-k)/2)}(1-s_1)^{(n-k)/2-1}$$ for $s_1=\sum_1^k z_j^2\in(0,1)$ for k = 2. – student_t Apr 30 '21 at 22:14
  • Also, could you please say a bit more about how $(Z_1, ..., Z_d)$ spherically symmetric implies this for $X$'s in (6)? Thanks! This is needed to argue that $\theta$ is distributed uniformly over $[0, 2\pi)$ in (7). I initially thought that once we know the joint of the $Z$'s then we could use change of variables to obtain the joint of $X$'s, but $Z$ to $X$ is not one to one... – student_t May 04 '21 at 02:39
  • Something I'm not quite sure I understand is the case when $d = 2$ and the density is $\propto 0^{-1}$...which seems odd since things are well defined for the $d=2$ case. Any insights from others would be much appreciated! – student_t May 23 '21 at 00:49
  • I agree that's odd--and it's a good check to make. A review of the analysis indicates it applies only to $d\ge 3.$ I have now noted that in the answer. – whuber May 23 '21 at 18:38
  • Thanks and lastly, is there any geometrical intuition for why the relative density size "flips"? For $d=3$, it's saying the density is higher towards the perimeter (actually at infinity on the perimeter $0^{-1/2}$!), but then for high dimensions $d$, it's pretty much concentrated at the center of the circle. – student_t May 23 '21 at 23:31
  • This comes from observation $(8)$ and the dimensional scaling (volumes in $d-2$ space are proportional to the $d-2$ power of the size). That's why the factor $(1-r^2)^{(d-2)/2-1}$ appears in the density. – whuber May 24 '21 at 10:52

I don't think there is a closed form, but there are some observations below.

Let $Z_1,...,Z_d$ be iid standard normal.
Then, $\left(\frac{Z_1}{\sqrt{Z_1^2+...+Z_d^2}},...,\frac{Z_d}{\sqrt{Z_1^2+...+Z_d^2}} \right)$ is uniform on the $d$-dimensional unit sphere.
We want to find the joint distribution of $(X_1,X_2)=\left(\frac{Z_1}{\sqrt{Z_1^2+...+Z_d^2}},\frac{Z_2}{\sqrt{Z_1^2+...+Z_d^2}} \right)$.

First notice that $\left(\frac{1}{X_1^2},\frac{1}{X_2^2} \right)=\left(\frac{Z_1^2+\cdots+Z_d^2}{Z_1^2},\frac{Z_1^2+\cdots+Z_d^2}{Z_2^2} \right)=\left(1+\frac{Z_2^2+Z_3^2+\cdots+Z_d^2}{Z_1^2},1+\frac{Z_1^2+Z_3^2+\cdots+Z_d^2}{Z_2^2} \right)$.

Thus, $\left(\frac{1}{d-1}\left( \frac{1}{X_1^2}-1 \right),\frac{1}{d-1}\left( \frac{1}{X_2^2}-1 \right)\right)$ has the same distribution as $\left(\frac{(V_2+V_3)/(d-1)}{V_1},\frac{(V_1+V_3)/(d-1)}{V_2} \right)$ where $V_1,V_2,V_3$ are independent chi-square random variables with degrees of freedom 1, 1, and $d-2$ respectively.

$\frac{(V_2+V_3)/(d-1)}{V_1}$ and $\frac{(V_1+V_3)/(d-1)}{V_2}$ are dependent and each has an F-distribution with $d-1$ (numerator) and 1 (denominator) degrees of freedom. There are some other definitions of the bivariate F-distribution that have closed forms, but I don't think this one does. The density for $X_1$ or $X_2$ is $$f(x)=\frac{\Gamma(d/2)}{\sqrt{\pi}\,\Gamma((d-1)/2)}\,(1 - x^2)^{(d-3)/2}$$ for $x \in (-1,1)$. I don't think it is possible to find the distribution of $X_2$ given $X_1$ in an easy form.

The following R functions generate random numbers and calculate the joint distribution function by numerical integration.

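library(cubature)  # provides adaptIntegrate()

# Draw n observations of (X1, X2), where (X1, ..., Xd) is uniform on the unit sphere
rX1X2 <- function(n, d) {
  z1 <- rnorm(n)
  z2 <- rnorm(n)
  v3 <- rchisq(n, d - 2)  # stands in for Z3^2 + ... + Zd^2
  matrix(c(z1, z2) / sqrt(z1^2 + z2^2 + v3), ncol = 2, byrow = FALSE)
}

# Estimate P(X1 < x1, X2 < x2) by integrating the joint density of (Z1, Z2, V3)
# over the region where both inequalities hold
pX1X2 <- function(x1, x2, d) {
  f1 <- function(x, d, x1, x2) {
    den <- sqrt(x[1]^2 + x[2]^2 + x[3])
    ifelse((x[1] / den) < x1 & (x[2] / den) < x2,
           dnorm(x[1]) * dnorm(x[2]) * dchisq(x[3], d - 2), 0)
  }
  adaptIntegrate(f1, lowerLimit = c(-Inf, -Inf, 0), upperLimit = c(Inf, Inf, Inf),
                 d = d, x1 = x1, x2 = x2, maxEval = 10000)$integral
}

pX1X2(-0.2, 0.3, 5)  # estimated probability that X1 < -0.2 and X2 < 0.3
x <- rX1X2(100000, 5)
mean(x[, 1] < (-0.2) & x[, 2] < 0.3)  # estimated probability from simulation

# Verify that y = (1/X1^2 - 1)/(d - 1) has an F(4, 1) distribution when d = 5
y <- (1 / x[, 1]^2 - 1) / 4
p <- (1:99) / 100
plot(log(quantile(y, p)), log(qf(p, 4, 1)))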

Q-Q plot (log-scale) verifying that the transformed variable has an F-distribution.


John L
  • Spherical geometry will reveal a simple closed formula. – whuber Apr 20 '21 at 18:15
  • Here's another hint: $r^2 = X_1^2+X_2^2$ has a Beta distribution and $\arctan(X_2/X_1)$ has a uniform distribution. That fully describes the joint distribution. – whuber Apr 20 '21 at 20:02
  • @whuber OK, I can see that. I don't think I would have ever thought of using those transformations. – John L Apr 20 '21 at 20:33