
Assume a random vector $X = [x_1, x_2, ..., x_n]$ whose components are independent Gaussian random variables. They are not, however, identically distributed: they may have arbitrary means and variances.

What is the probability that $x_k > x_i \forall i \neq k$? In other words, what is the probability that a realization of the random vector will yield $x_k$ as the maximum value in the vector? I'm looking for a closed form solution if one exists.

Here is how far I got with this problem. Assume $k=1$ without loss of generality.

$P\left(x_1 > x_i \forall_{i > 1}\right) = \int_{-\infty}^\infty p(x_1)P\left(x_1 > x_i \forall_{i>1}|x_1\right)dx_1$

$P\left(x_1 > x_i \forall_{i > 1}\right) = \int_{-\infty}^\infty p(x_1)\prod_{i=2}^n P\left(x_1 > x_i|x_1\right)dx_1$

$P\left(x_1 > x_i \forall_{i > 1}\right) = \int_{-\infty}^\infty p(x_1)\prod_{i=2}^nF_i(x_1)dx_1$

where $F_i(x)$ is the cumulative distribution function for $x_i$.

I'm honestly not sure where to go from here. Any suggestions on whether there is a way forward or whether there is a better approach? Numerical integration is an option to move forward but I'd prefer a closed form solution if possible as it would open up other options for investigation in a larger problem I'm attacking.
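For what it's worth, the final integral is one-dimensional, so even without a closed form it is cheap to evaluate numerically. Here is a sketch in Python (SciPy), with purely illustrative means and standard deviations:

```python
import numpy as np
from scipy import stats, integrate

# Illustrative (assumed) parameters: means and standard deviations of x_1..x_n
mu = np.array([1.0, 0.0, -0.5])
sigma = np.array([1.0, 2.0, 0.5])

def prob_k_is_max(k, mu, sigma):
    """P(x_k > x_i for all i != k) via the 1-D integral
    int p_k(x) * prod_{i != k} F_i(x) dx."""
    others = [i for i in range(len(mu)) if i != k]
    def integrand(x):
        return stats.norm.pdf(x, mu[k], sigma[k]) * np.prod(
            [stats.norm.cdf(x, mu[i], sigma[i]) for i in others], axis=0)
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return val

probs = [prob_k_is_max(k, mu, sigma) for k in range(len(mu))]
print(probs)        # one probability per component
print(sum(probs))   # ~1: some component is the maximum almost surely
```

In the i.i.d. case this reproduces $1/n$ for every $k$, and the probabilities always sum to $1$.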

Thanks!

UPDATE: The question-answer pair previously marked as answering this question does not, for two reasons: (1) that question seeks the probability that $x_k$ is the minimum, not the maximum, and (2) no closed form solution is offered there. Had a closed form solution been derived, the answer might have contained a similar approach that could be leveraged here, but none is present.

UPDATE-2: There is now a closed form solution proposed to the related problem for the minimum that provides the basis for solving the question posed here.

Chris
    And just to be clear, the random variables are not identically distributed. – Chris Jan 22 '16 at 23:47
  • Just to be clear, the random variables are not necessarily identically distributed, right? – Alex R. Jan 22 '16 at 23:59
  • 2
    If they were identically distributed (Gaussian with same mean and variance), as well as independent, it would be $\frac1n$ – Henry Jan 23 '16 at 00:02
  • @AlexR. exactly right. they are not identically distributed – Chris Jan 23 '16 at 01:28
  • 2
    That information (that they're not identically distributed) belongs in the question, not in the comments. – Glen_b Jan 23 '16 at 09:43
  • 1
    Before trying the general case, suggest you first solve the $n = 3$ case, as per method here: http://stats.stackexchange.com/questions/29051/what-is-a-method-to-calculate-precisely-py-geq-x-y-leq-z-given-three-inde Adopting this approach will let you use Multivariate Normal software to derive the probabilities (basically numerical integration), since closed form solutions do not generally exist, other than for some special cases. – wolfies Jan 23 '16 at 17:46
  • The stated duplicate (posted by mods) does NOT answer the question at hand. FIRST, the question posed here is for a MAX (there it is for a min); SECOND, the question here is for arbitrary symbolic means and variances; there it is for given numerical values; THIRD, here it is for open n; there it is for n = 5; FOURTH, the OP seeks a general symbolic method, whereas the linked question is just some numerical R code, and lacking in any theoretical rigour. In essence, higher standards are needed before marking questions as duplicates ... when important issues are still left open. – wolfies Jan 23 '16 at 20:21
  • 1
    @Glen_b moved the comment into the main question – Chris Jan 24 '16 at 07:49
  • 1
    @wolfies completely agree. this question is not a duplicate. edited the question to reflect that fact. – Chris Jan 24 '16 at 07:50
  • The max/min issue is perhaps not of central consequence due to symmetry but there at least some differences. However, the proposed duplicate contains a link to a question that comes [much closer](http://stats.stackexchange.com/questions/44139/what-is-px-1x-2-x-1x-3-x-1x-n) to being a duplicate, and which addresses the issue of a closed form solution. Please edit your question taking into account the information there. – Glen_b Jan 24 '16 at 07:54
  • @Glen_b will do. a friend just posted that answer after I asked him about this question. – Chris Jan 24 '16 at 08:08

2 Answers


$\newcommand{\N}{\mathcal N}\newcommand{\tr}{\mathrm{tr}}$Restating the question: let $X \sim \N(\mu, \Sigma)$, where $\Sigma = \mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2)$. What is $\Pr(\forall i \ne k, X_k \ge X_i)$?

Now, this happens if and only if the $(n-1)$-dimensional random vector $Y$, formed by dropping the $k$th component and subtracting each remaining component from $X_k$, is componentwise positive. Define the $(n-1) \times n$ matrix $A$ by taking the negative of the $(n-1) \times (n-1)$ identity and inserting a column of all $1$s as the $k$th column, so that $Y = A X$; if $k = n$, this is $A = \begin{bmatrix}-I & 1\end{bmatrix}$.

Then $Y \sim \N(A \mu, A \Sigma A^T)$. $A \mu$ is easy, and $A \Sigma A^T$ ends up dropping the $k$th row and column from $\Sigma$ and then adding $\sigma_k^2$ to each entry: $\Sigma' + \sigma_k^2 1 1^T$, where $\Sigma'$ drops $k$ from $\Sigma$.
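A quick numerical check of the identity $A \Sigma A^T = \Sigma' + \sigma_k^2 1 1^T$ (illustrative variances, with $k = n$ so that $A = [-I \;\; 1]$):

```python
import numpy as np

n, k = 4, 3                               # assumed sizes; k is the last index
sigma2 = np.array([1.0, 2.0, 0.5, 1.5])   # illustrative variances
Sigma = np.diag(sigma2)

# A = [-I | 1] with the column of ones in position k (here k = n-1)
A = np.hstack([-np.eye(n - 1), np.ones((n - 1, 1))])

lhs = A @ Sigma @ A.T
Sigma_prime = np.diag(np.delete(sigma2, k))        # Sigma with row/col k dropped
rhs = Sigma_prime + sigma2[k] * np.ones((n - 1, n - 1))
print(np.allclose(lhs, rhs))  # True
```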

Now, the probability in question is known naturally enough as a "Gaussian orthant probability". In general, these are difficult to get in closed form (here are a few; there are a bunch of algorithms out there to approximate them).

But we have a special form of covariance matrix here (a particularly simple rank-1 plus diagonal), which may yield a reasonable solution. Below is an effort towards that, but spoiler warning: I don't get to a closed form.


The probability in question is: \begin{align} \Pr\left( Y > 0 \right) = \int_{y \in (0, \infty)^{n-1}} \left( 2 \pi \right)^{-\frac{n-1}{2}} \lvert A \Sigma A^T \rvert^{-\frac12} \exp\left( -\frac12 (y - A \mu)^T (A \Sigma A^T)^{-1} (y - A \mu) \right) \mathrm{d}y .\end{align} To avoid those pesky $A \mu$ terms, define $Z = Y - A \mu$, $\mathcal Z = \{ z : \forall i, z_i > (- A \mu)_i \}$. Then we care about \begin{align} \Pr\left( Z > - A \mu \right) = \int_{z \in \mathcal{Z}} \left( 2 \pi \right)^{-\frac{n-1}{2}} \lvert A \Sigma A^T \rvert^{-\frac12} \exp\left( -\frac12 z^T (A \Sigma A^T)^{-1} z \right) \mathrm{d}z .\end{align}

Applying the matrix determinant lemma: \begin{align} \lvert A \Sigma A^T \rvert &= \left\lvert \Sigma' + \sigma_k^2 1 1^T \right\rvert \\&= (1 + \sigma_k^2 1^T \Sigma'^{-1} 1) \lvert \Sigma' \rvert \\&= \left( 1 + \sum_{i \ne k} \frac{\sigma_k^2}{\sigma_i^2} \right) \prod_{i \ne k} \sigma_i^2 ,\end{align} so at least the normalization constant is easy enough.
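Checking the lemma numerically, $\lvert \Sigma' + \sigma_k^2 1 1^T \rvert = (1 + \sigma_k^2 1^T \Sigma'^{-1} 1) \lvert \Sigma' \rvert$, with illustrative variances:

```python
import numpy as np

sigma2 = np.array([1.0, 2.0, 0.5, 1.5])   # illustrative variances, k = last
k = len(sigma2) - 1
s2 = np.delete(sigma2, k)                 # variances of the non-k components
M = np.diag(s2) + sigma2[k] * np.ones((len(s2), len(s2)))

det_direct = np.linalg.det(M)
det_lemma = (1 + np.sum(sigma2[k] / s2)) * np.prod(s2)
print(np.isclose(det_direct, det_lemma))  # True
```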

To tackle the exponent, apply Sherman-Morrison: \begin{align} \left( A \Sigma A^T \right)^{-1} &= \left( \Sigma' + \sigma_k^2 1 1^T \right)^{-1} \\&= \Sigma'^{-1} - \frac{\sigma_k^2 \Sigma'^{-1} 1 1^T \Sigma'^{-1}}{1 + \sigma_k^2 1^T \Sigma'^{-1} 1} \\&= \Sigma'^{-1} - \frac{1}{\frac{1}{\sigma_k^2} + \sum_{i \ne k} \frac{1}{\sigma_i^2}} \left[ \frac{1}{\sigma_i^2 \sigma_j^2} \right]_{ij} \\&= \Sigma'^{-1} - \frac{1}{\tr(\Sigma^{-1})} \left[ \frac{1}{\sigma_i^2 \sigma_j^2} \right]_{ij} \\ z^T (A \Sigma A^T)^{-1} z &= \sum_i \frac{z_i^2}{\sigma_i^2} - \frac{1}{\tr(\Sigma^{-1})} \sum_{ij} \frac{z_i z_j}{\sigma_i^2 \sigma_j^2} \end{align} and then the integral (after pulling out constants) is \begin{align} \int_{z \in \mathcal{Z}} &\exp\left( - \tfrac12 z^T (A \Sigma A^T)^{-1} z \right) \mathrm{d}z \\&= \int_{z \in \mathcal{Z}} \prod_i \exp\left( - \frac{z_i^2}{2 \sigma_i^2} \right) \prod_{ij} \exp\left( \frac{1}{2 \tr(\Sigma^{-1})} \frac{z_i z_j}{\sigma_i^2 \sigma_j^2} \right) \mathrm{d}z .\end{align}
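And a numerical check of the Sherman-Morrison form of the inverse, with illustrative variances:

```python
import numpy as np

sigma2 = np.array([1.0, 2.0, 0.5, 1.5])   # illustrative variances, k = last
k = len(sigma2) - 1
s2 = np.delete(sigma2, k)
M = np.diag(s2) + sigma2[k] * np.ones((len(s2), len(s2)))

inv_direct = np.linalg.inv(M)
# Sherman-Morrison form: Sigma'^{-1} - [1/(sigma_i^2 sigma_j^2)]_ij / tr(Sigma^{-1})
tr_inv = np.sum(1.0 / sigma2)             # tr(Sigma^{-1}), with k included
inv_sm = np.diag(1.0 / s2) - np.outer(1.0 / s2, 1.0 / s2) / tr_inv
print(np.allclose(inv_direct, inv_sm))    # True
```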

This integral seems amenable to something smarter than just blind numerical integration, but it's late now....

Danica

I will write $P(X_k \geq X_i\;\;\forall i \neq k)$ for the probability that $X_k \geq X_i$ for all $i \neq k$.

Let $X_1,\dots,X_n$ be independent (but not identically distributed) Gaussian random variables, i.e. $X_i \sim N(\mu_i,\sigma_i)$. Also let $Z_{k,i}=X_k - X_i$.

As @Dougal pointed out, the random variables $Z_{k,1},\dots,Z_{k,n}$ (with the omission of $Z_{k,k}$) are NOT independently distributed. Rather, they follow a multivariate Gaussian with $$ \mathrm{cov}(Z_{k,i},Z_{k,j})=\begin{cases} \sigma_k^2 & \mathrm{for}\;\; i \neq j \\ \sigma_k^2+\sigma_i^2 & \mathrm{for}\;\; i = j \end{cases} $$ The marginal distributions are given by $$Z_{k,i} \sim N\bigg(\mu_k-\mu_i,\sqrt{\sigma_k^2+\sigma_i^2}\bigg)\;\;\forall i \neq k.$$ Thus, $$ P(X_k \geq X_i\;\;\forall i \neq k) = P(Z_{k,i} \geq 0\;\;\forall i \neq k) = \int_0^{\infty}\!\cdots\int_0^{\infty} \phi\big(z \,\big|\, \mu_k,\Sigma_k\big)\prod_{\substack{i=1\\ i \neq k}}^n \mathrm{d}z_{k,i}, $$

where $\phi$ is the multivariate Gaussian pdf, and $\mu_k$ and $\Sigma_k$ are the mean vector and covariance matrix formed from the marginals and covariances above.

The above multivariate distribution can be really tough to integrate over, but most statistical software packages have functions for it. In R, for example, you may want to check out the mnormt or mvtnorm packages. However, I cannot speak to how accurate these methods are for very high dimensional problems.
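As a sketch in Python/SciPy instead (all parameter values illustrative), the orthant probability can be computed with `scipy.stats.multivariate_normal.cdf` and cross-checked against the one-dimensional integral from the question:

```python
import numpy as np
from scipy import stats, integrate

mu = np.array([1.0, 0.0, -0.5, 0.3])      # illustrative means
sigma2 = np.array([1.0, 2.0, 0.5, 1.5])   # illustrative variances
k = 0

others = [i for i in range(len(mu)) if i != k]
m = mu[k] - mu[others]                    # mean vector of the Z_{k,i}
C = np.diag(sigma2[others]) + sigma2[k]   # cov: sigma_k^2 everywhere,
                                          # plus sigma_i^2 on the diagonal
# P(Z >= 0) = P(-Z <= 0), and -Z ~ N(-m, C)
p_orthant = stats.multivariate_normal(mean=-m, cov=C).cdf(np.zeros(len(m)))

# Cross-check: the one-dimensional integral from the question
integrand = lambda x: stats.norm.pdf(x, mu[k], np.sqrt(sigma2[k])) * np.prod(
    [stats.norm.cdf(x, mu[i], np.sqrt(sigma2[i])) for i in others], axis=0)
p_1d, _ = integrate.quad(integrand, -np.inf, np.inf)

print(p_orthant, p_1d)   # the two values should agree to a few decimal places
```

Note that the multivariate CDF is itself computed by (quasi-Monte-Carlo) numerical integration, so expect agreement only to moderate precision.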

Zachary Blumenfeld
  • 1
    "Note that for a fixed $k$, the random variables $Z_{k,1}, \dots, Z_{k,n}$ are also independent Gaussian distributed" – this isn't true. $\mathrm{Cov}(Z_{k,1}, Z_{k,2}) = \mathrm{Cov}(X_k - X_1, X_k - X_2) = \mathrm{Cov}(X_k, X_k) - \mathrm{Cov}(X_k, X_2) - \mathrm{Cov}(X_1, X_k) + \mathrm{Cov}(X_1, X_2) = \sigma_k^2 - 0 - 0 + 0 \ne 0$. This is why conditioning on an event of probability 1 is somehow supposedly changing your probabilities later in the answer. – Danica Jan 23 '16 at 09:34
  • @Dougal you're right. So I guess I have to replace the product with a multivariate integral... – Zachary Blumenfeld Jan 23 '16 at 09:48