Theoretical formula
According to this, calculating the probability for two variables is simple:
$$P(A<B) = \Phi\left(\dfrac{\mu_B-\mu_A}{\sqrt{\sigma_A^2+\sigma_B^2}}\right)$$
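To make this concrete, here is the two-variable formula evaluated directly in Python (a minimal sketch; the parameters are made-up values for illustration):

import numpy as np
from scipy.stats import norm

# Hypothetical example: A ~ N(0, 1), B ~ N(1, 2**2)
mu_a, sd_a = 0.0, 1.0
mu_b, sd_b = 1.0, 2.0

# P(A < B) = Phi((mu_B - mu_A) / sqrt(sigma_A^2 + sigma_B^2))
p = norm.cdf((mu_b - mu_a) / np.sqrt(sd_a**2 + sd_b**2))
print(p)  # ~0.673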
If we look at the other link, we will see a much more complicated formula for multiple variables (because the events of each variable being the maximum are interdependent):
$$P(X_1 > \max_{i \geq 2} X_i)
= \int_{-\infty}^{\infty}\prod_{i=2}^n \Phi\left(\frac{\alpha-\mu_i}{\sigma_i}\right)
\frac{1}{\sigma_1}\phi\left(\frac{\alpha-\mu_1}{\sigma_1}\right)\,\mathrm d\alpha$$
Taking into account that $\Phi$ is itself an integral with no closed-form solution, we have a very complicated formula with $n$ integrals in it.
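For reference, a direct transcription of this multi-variable formula into Python might look as follows (a sketch with made-up parameters for three variables; `quad` handles the outer integral and `norm.cdf` supplies the $\Phi$ terms):

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Made-up parameters for three variables; X_1 has the largest mean
mu = np.array([2.0, 1.0, 0.0])
sigma = np.array([1.0, 1.0, 1.0])

def integrand(alpha):
    # product of Phi((alpha - mu_i) / sigma_i) over i = 2..n
    cdfs = np.prod(norm.cdf((alpha - mu[1:]) / sigma[1:]))
    # (1 / sigma_1) * phi((alpha - mu_1) / sigma_1)
    dens = norm.pdf((alpha - mu[0]) / sigma[0]) / sigma[0]
    return cdfs * dens

p = quad(integrand, -np.inf, np.inf)[0]
print(p)  # P(X_1 is the maximum of the three)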
Computational benchmarks
As proposed in the second link, the probabilities can also be estimated with Monte Carlo simulations.
Here are the benchmarks for integration of the theoretical formula, for Monte Carlo simulations, and for Ben's numerical calculation, all in Python, with the number of variables n = 100:

- Monte Carlo simulations: 531 ms (100,000 simulations; gives the correct answer)
- Numerical calculation from Ben's answer: 22.1 s (gives the correct answer)
- Integration of the theoretical formula: 23.3 s (gives a wrong answer)

All algorithms are adjusted to a precision of around 1e-3 for the probability of the last value (99).
With the integration of the theoretical formula I got a sum of probabilities equal to 0.92 rather than 1. Either it is a problem with the numerical integration in Python, or with how the formula is coded (note that `func` below divides by $\sqrt{\sigma_i^2+\sigma_j^2}$, as in the two-variable formula, rather than by $\sigma_j$ as in the multi-variable formula above), or the theoretical formula is wrong. Also, I was not able to vectorize that code, so it should be significantly slower than C. The simulations, in contrast, use vectorized numpy functions, so their time should be closer to C.
The accuracy of the simulation is estimated via the standard deviation over repeated runs (shown for the values 99, 98 and 97 respectively):
mean probability = 7.2469e-01, std=1.13458614e-03
mean probability = 2.2255e-01, std=7.49851686e-04
mean probability = 4.6310e-02, std=1.07472256e-03
...
It may be possible to evaluate the theoretical formula faster than the simulation with a good approximation of the integration: for example, drop all variables whose mean is more than 3 standard deviations below the maximum, use a piecewise-linear approximation of $\Phi$, and use only a few evaluation points in the numerical integration; the first idea is sketched below.
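As an illustration of the pruning idea only, here is a hypothetical helper (the `prune` function and the 3-sigma cutoff are my assumptions, not part of the benchmarks above):

import numpy as np

def prune(m, sd, k=3.0):
    # Hypothetical helper: keep only variables whose mean is within
    # k combined standard deviations of the leading mean, since the
    # others contribute negligibly to P(X_i is the maximum).
    best = np.argmax(m)
    gap = m[best] - m
    keep = gap <= k * np.sqrt(sd**2 + sd[best]**2)
    return np.where(keep)[0]

m = np.arange(100.)
sd = np.ones(100)
idx = prune(m, sd)  # here: indices 95..99 survive the cutoff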
The code for the benchmarked theoretical integration:
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

n = 100
m = np.arange(100.)  # means 0..99
sd = np.ones(100)    # all standard deviations equal to 1

def func(x, m, sd, i):
    # integrand for P(X_i is the maximum): density of X_i at x ...
    mul = 1/sd[i]*norm.pdf((x-m[i])/sd[i])
    # ... times the CDF terms for all other variables
    # (note: divides by sqrt(sd[i]**2+sd[j]**2) as in the two-variable
    # formula, not by sd[j] as in the multi-variable formula above)
    for j in range(n):
        if i != j:
            mul = mul*norm.cdf((x-m[j])/np.sqrt(sd[i]**2+sd[j]**2))
    return mul
Measuring the execution time of the theoretical integration formula in Jupyter:
%%time
# split each integral at m[i] to help quad with the two infinite tails
prob = np.array([quad(func, -np.inf, m[i], args=(m,sd,i), epsabs=1e-3)[0]
                 + quad(func, m[i], np.inf, args=(m,sd,i), epsabs=1e-3)[0]
                 for i in range(n)])
# 23.3 s
Measuring the execution time of the Monte Carlo simulations in Jupyter:
%%time
nn = 100000
# sample all n variables nn times: row i holds draws of N(m[i], sd[i]^2)
a = np.random.normal(size=(n,nn))*sd[:,None]+m[:,None]
# fraction of simulations in which each variable is the maximum
print((a==a.max(axis=0)).sum(axis=1)/nn)
# 531 ms
Calculating the standard deviations over 7 repetitions:
nn = 100000
# 7 independent repetitions of the simulation above
a = np.random.normal(size=(n,nn,7))*sd[:,None,None]+m[:,None,None]
# standard deviation of the estimated probabilities across the 7 runs
((a==a.max(axis=0)).sum(axis=1)/nn).std(axis=1)
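As a rough cross-check of these numbers (my addition, not part of the benchmark): each estimated probability is a binomial proportion, so its standard error should be about $\sqrt{p(1-p)/nn}$:

p_hat = 0.72469  # estimated probability for the value 99
se = np.sqrt(p_hat * (1 - p_hat) / 100000)  # ~1.4e-3, same order as above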
Ben's numerical approach in R gives the same result as the Monte Carlo simulations. The idea, given that all standard deviations here are 1, is that $P(X_i = \max_j X_j) = \mathbb{E}_e\big[\prod_{j\neq i}\Phi(m_i + e - m_j)\big]$ with $e \sim N(0,1)$, and the expectation is estimated by averaging over the draws in `e_vec`. Here is the same code in Python, which I used for benchmarking:
%%time
nn = 25000
e_vec = np.random.normal(size=nn)  # draws of e ~ N(0, 1)
a = np.empty((n, nn))
for i in range(n):
    # conditional probability that X_i beats all other variables, given e
    a[i] = np.prod(norm.cdf(m[i] - m[np.arange(len(m))!=i][:,None] + e_vec[None,:]), axis=0)
print(a.mean(axis=1))
# 22.1 s
Probabilities for the values 99, 98 and 97:
0.727, 0.225, 0.047
Code to calculate the standard deviation of Ben's numerical answer:
nn = 25000
a4std = np.empty((7, n))
# repeat Ben's calculation 7 times with fresh draws of e_vec
for ii in range(7):
    e_vec = np.random.normal(size=nn)
    a = np.empty((n, nn))
    for i in range(n):
        a[i] = np.prod(norm.cdf(m[i] - m[np.arange(len(m))!=i][:,None] + e_vec[None,:]), axis=0)
    a4std[ii] = a.mean(axis=1)
# standard deviation of the estimates across the 7 repetitions
print(a4std.std(axis=0))
Standard deviations for the values 99, 98 and 97:
1.84e-3, 1.77e-3, 9.9e-4