
The von Mises-Fisher distribution is defined as

$$ \frac{\kappa^{p/2-1}}{2\pi I_{p/2-1}(\kappa)}\exp(\kappa \mu^Tx) $$

It is defined over the unit sphere, i.e. $\|x\|_2^2 = 1$. My question is: what is $E(x)$? I got a feeling it's simply $\mu$ but how would you prove this?

I'm asking mostly because I do not know how to integrate over a sphere.

Sycorax
sachinruk

2 Answers


You can derive the answer as follows. Start with the definition of the normalizing constant:

$$ \int \exp(\kappa \mu^{T} x) \, dx = \frac{(2\pi)^{p/2} I_{p/2-1}(\kappa)}{\kappa^{p/2-1}} $$

(Note I have corrected an error in the question.) Let $y = \kappa \mu$, so that $y$ is an unconstrained vector with $\kappa = \sqrt{y^T y}$. It is easy to show that $d\kappa/dy = y/\kappa = \mu$. Differentiating under the integral sign then gives

$$ \begin{aligned} \int x \exp(y^T x) \, dx &= \frac{d}{dy} \int \exp(y^T x) \, dx \\ &= \frac{d\kappa}{dy} \frac{d}{d\kappa} \int \exp(y^T x) \, dx \\ &= \mu \frac{d}{d\kappa} \frac{(2\pi)^{p/2} I_{p/2-1}(\kappa)}{\kappa^{p/2-1}} \\ &= \mu \left(\frac{I'_{p/2-1}(\kappa)}{I_{p/2-1}(\kappa)} - \frac{p/2-1}{\kappa}\right) \frac{(2\pi)^{p/2} I_{p/2-1}(\kappa)}{\kappa^{p/2-1}} \end{aligned} $$

Dividing by the normalizing constant yields the expectation:

$$ E(x) = \frac{\int x \exp(y^T x) \, dx}{\int \exp(y^T x) \, dx} = \mu \left(\frac{I'_{p/2-1}(\kappa)}{I_{p/2-1}(\kappa)} - \frac{p/2-1}{\kappa}\right) $$

Note that $I'$ can be written in terms of $I$, as explained on Wikipedia.
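As a rough numerical sanity check of the result above (a sketch, not part of the original derivation): the standard recurrence $I'_\nu(\kappa) = I_{\nu+1}(\kappa) + \frac{\nu}{\kappa} I_\nu(\kappa)$ turns the bracket into the ratio $I_{p/2}(\kappa)/I_{p/2-1}(\kappa)$, which for $p = 3$ equals $\coth\kappa - 1/\kappa$. The helper names below are made up for illustration, and the Bessel function is evaluated by a truncated power series, which is an assumption that is only reasonable for moderate $\kappa$.

```python
import math


def bessel_i(nu, x, terms=60):
    """Modified Bessel function I_nu(x) via its power series.

    Illustrative only: assumes nu >= 0 and a moderate x > 0.
    """
    return sum(
        (x / 2.0) ** (2 * k + nu) / (math.factorial(k) * math.gamma(k + nu + 1.0))
        for k in range(terms)
    )


def mean_length_derivative_form(p, kappa, h=1e-5):
    """I'_nu / I_nu - nu / kappa with nu = p/2 - 1.

    The derivative I'_nu is approximated by a central difference.
    """
    nu = p / 2.0 - 1.0
    d_i = (bessel_i(nu, kappa + h) - bessel_i(nu, kappa - h)) / (2.0 * h)
    return d_i / bessel_i(nu, kappa) - nu / kappa


def mean_length_ratio_form(p, kappa):
    """I_{p/2}(kappa) / I_{p/2-1}(kappa), the simplified form of the same factor."""
    return bessel_i(p / 2.0, kappa) / bessel_i(p / 2.0 - 1.0, kappa)


if __name__ == "__main__":
    kappa = 2.0
    for p in (2, 3, 5, 10):
        # The two columns should agree up to finite-difference error.
        print(p, mean_length_derivative_form(p, kappa), mean_length_ratio_form(p, kappa))
```

Both forms give a factor strictly between 0 and 1 for $\kappa > 0$, so $E(x)$ is a shrunken copy of $\mu$ rather than $\mu$ itself.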

Tom Minka

This is more an extended comment than a full answer. I am working on a problem that is, I think, somewhat related to the question, so I will share my thoughts.

In what sense are you interested in the expectation? For example, for $\kappa = 0$ I don't think you can define a sensible expectation that is a location "on" the sphere.

Let \begin{equation} c_p(\kappa) = \frac{\kappa^{p/2-1}}{(2\pi)^{p/2} I_{p/2-1}(\kappa)} \text{,} \end{equation} so that the normalization constant is easier to handle.

Note that the problem is symmetric under rotation, so you can take, say, $\mu = e_1 = (1, 0, 0, \ldots, 0)$. In this case the PDF is simply \begin{equation} P(x) = c_p(\kappa) \exp(\kappa x_1) \text{.} \end{equation}

If you then take the expectation of $x$ componentwise, you get for each $x_i$, $i = 2, 3, \ldots, p$, \begin{equation} \mathbb{E}[x_i] = \int_{x \in S^{p - 1}} x_i c_p(\kappa) \exp(\kappa x_1) \,dx \text{,} \tag{*} \end{equation} where $S^{p - 1} = \{x \in \mathbb{R}^p : |x| = 1\}$ is the $(p - 1)$-sphere on which the $p$-variate vMF distribution is defined. Now divide $S^{p - 1}$ into two "hemispheres" $H_1$ and $H_2$, \begin{align} H_1 &= \{x \in S^{p - 1} : x_i \ge 0\} \text{,} \\ H_2 &= \{x \in S^{p - 1} : x_i < 0\} \text{.} \end{align} It doesn't really matter which inequality is strict, since the integrand of (*) vanishes for $x_i = 0$ anyway.

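Now let's write (*) using $H_1$ and $H_2$ (remember that $i \ne 1$, so $x_i$ does not appear in the argument of $\exp$). The reflection $x_i \mapsto -x_i$ maps $H_2$ onto $H_1$ (up to a set of measure zero) and leaves both $x_1$ and the surface measure unchanged, so the integral over $H_2$ can be rewritten as an integral over $H_1$ with the sign of $x_i$ flipped: \begin{align} \mathbb{E}[x_i] &= \int_{x \in H_1 \cup H_2} x_i c_p(\kappa) \exp(\kappa x_1) \,dx \\ &= \int_{x \in H_1} x_i c_p(\kappa) \exp(\kappa x_1) \,dx + \int_{x \in H_2} x_i c_p(\kappa) \exp(\kappa x_1) \,dx \\ &= \int_{x \in H_1} x_i c_p(\kappa) \exp(\kappa x_1) \,dx + \int_{x \in H_1} (-x_i) c_p(\kappa) \exp(\kappa x_1) \,dx \\ &= 0 \text{.} \end{align}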
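The cancellation can also be seen empirically. The following is a minimal Monte Carlo sketch for the special case $p = 3$ with $\mu = e_1$, where $x_1$ has density proportional to $\exp(\kappa x_1)$ on $[-1, 1]$ and can be drawn by inverting its CDF. The function names, sample size and seed are arbitrary choices for illustration.

```python
import math
import random


def sample_vmf_s2(kappa, rng):
    """Draw one point from vMF(mu = e_1, kappa) on the unit sphere in R^3 (kappa > 0)."""
    u = rng.random()
    # Inverse CDF of x_1, whose density on [-1, 1] is proportional to exp(kappa * x_1).
    x1 = 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * kappa)) / kappa
    x1 = max(-1.0, min(1.0, x1))  # guard against rounding just outside [-1, 1]
    r = math.sqrt(1.0 - x1 * x1)
    # The direction orthogonal to mu is uniform on a circle.
    phi = 2.0 * math.pi * rng.random()
    return (x1, r * math.cos(phi), r * math.sin(phi))


def empirical_mean(kappa, n=50000, seed=0):
    """Componentwise sample mean of n draws."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(n):
        for j, value in enumerate(sample_vmf_s2(kappa, rng)):
            totals[j] += value
    return tuple(t / n for t in totals)


if __name__ == "__main__":
    # Second and third components should be near 0; the first should not.
    print(empirical_mean(2.0))
```

Only the component along $\mu$ survives the averaging, up to sampling noise.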

Well, the situation for $x_1$ is considerably harder, as evaluating $\mathbb{E}[x_1]$ would result in an ugly mess of Gamma and modified Bessel functions... For $\kappa = 0$, we have the uniform distribution on $S^{p - 1}$, so $\mathbb{E}[x_1] = 0$. For $\kappa > 0$, intuitively, there is more probability mass on the $x_1 \ge 0$ "hemisphere" than on the $x_1 < 0$ "hemisphere", so $\mathbb{E}[x_1]$ should be $> 0$.
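For anyone who prefers to "do the integration for real" numerically, here is a small sketch. It assumes the usual polar parametrization $x_1 = \cos\theta$, under which the surface element of $S^{p-1}$ is proportional to $\sin^{p-2}\theta \, d\theta$, so $\mathbb{E}[x_1]$ reduces to a ratio of two one-dimensional integrals. The function name and the midpoint rule are illustrative choices, not anything from the question.

```python
import math


def mean_x1(p, kappa, n=20000):
    """E[x_1] under vMF(e_1, kappa) on S^{p-1}, by the midpoint rule in theta.

    Uses x_1 = cos(theta) with weight exp(kappa * cos(theta)) * sin(theta)**(p - 2)
    on [0, pi]; the constant factors cancel in the ratio.
    """
    h = math.pi / n
    num = 0.0
    den = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        w = math.exp(kappa * math.cos(theta)) * math.sin(theta) ** (p - 2)
        num += math.cos(theta) * w
        den += w
    return num / den


if __name__ == "__main__":
    for kappa in (0.0, 0.5, 2.0, 10.0):
        # For p = 3 and kappa > 0 the exact value is coth(kappa) - 1/kappa.
        print(kappa, mean_x1(3, kappa))
```

The values start at 0 for $\kappa = 0$ and grow towards (but never reach) 1 as $\kappa$ increases, which matches the intuition about probability mass.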

If we accept the handwaving above (or do the integration for real), we get $\mathbb{E}[x] = \mathbb{E}[x_1] e_1 = \mathbb{E}[x_1] \mu$. Combined with the fact that $\mathbb{E}[x_1] > 0$, this is a vector pointing in the same direction as $\mu$. To get a vector that lies in $S^{p - 1}$, we can normalize to get \begin{equation} \frac{\mathbb{E}[x]}{|\mathbb{E}[x]|} = \frac{\mathbb{E}[x_1]\mu}{|\mathbb{E}[x_1]|} = \mu \text{.} \end{equation} If you remove the middle part of the equality above, the result will hold for any $\mu$ by rotation.

In this sense, yes, the expectation of the vMF distribution on the unit sphere is $\mu$ for $\kappa > 0$. For $\kappa = 0$, $\mathbb{E}[x] = 0$ (in the Euclidean sense) and the normalization will fail.


The distribution of $x_1$ (with $\mu = e_1$) is interesting for another reason, only slightly related to your question. It plays the same role for cosine similarity (or, after a bit of scaling and shifting, cosine distance) and the vMF distribution that the $\chi$-distribution plays for the Euclidean distance and the Gaussian distribution. That is, \begin{align} \chi &= |x - \mu| && \text{for $x \sim \mathcal{N}(\mu, 1)$,} \\ \text{like } x_1 &= \mu^T x = \frac{\mu^T x}{|\mu| \cdot |x|} && \text{for $x \sim \mathrm{vMF}(\mu, \kappa)$.} \end{align}

  • The expectation cannot be $\mu$ (assuming $\mu$ has unit length) because the expectation of $x$ is an average over points on the sphere (considered as a subset of $\mathbb{R}^d$) and therefore lies strictly within its interior. By symmetry the expectation is a multiple of $\mu$ and clearly that multiple lies in the interval $[0,1)$. The multiple can be found recursively because the expectation in $d$ dimensions is very simply related to that in $d-2$ dimensions. – whuber Sep 26 '14 at 17:24
  • Yes, but @Sachin_ruk wanted an expectation that lies "on" the sphere, therefore I suggested normalizing the expectation to obtain a vector on the sphere. Such a vector, of course, will not be the expectation of the vMF distribution, but the direction in which the expectation points. – Kristóf Marussy Sep 26 '14 at 17:46
  • "On the sphere" is not a phrase that appears in any form in the question. If you only wanted to establish that the expectation is parallel to $\mu$, you need merely observe that that is guaranteed by the rotational symmetry of the distribution around $\mu$: no calculations at all are needed. – whuber Sep 26 '14 at 19:16
  • He said "I got a feeling it's simply $\mu$ but how would you prove this?" -- that of course cannot be true, only that it points in the same direction as $\mu$. Yeah, now I see, the calculations were a bit overkill -- in fact, extremely overkill. I wanted to derive something a bit more complicated (the inner product of a vector $y \ne \mu$ and $x$) when I found this question, and I got a bit caught in the notation, and therefore was talking nonsense. :) – Kristóf Marussy Sep 26 '14 at 19:24
  • I did actually expect an answer on the sphere but it makes sense that it's not on the sphere. +1 for the effort though. – sachinruk Sep 27 '14 at 04:48