7

Let's say I have a normal distribution $N(\mu, \sigma^2)$ from which I have drawn $n$ i.i.d. samples $x_1, \dots, x_n$.

Now, let's define a random variable $Y = \max(x_1, \dots, x_n)$.

When $n=1$, the expected value of $Y$ is $\mu$. I would expect that as $n$ increases, the expected value of $Y$ should increase as well. Is it possible to determine the expected value of $Y$ for any value of $n$, in terms of $\mu$ and $\sigma$?

timleathart
  • There is not a nice closed form. Some approximations are discussed in https://math.stackexchange.com/questions/89030/expectation-of-the-maximum-of-gaussian-random-variables – khol May 02 '18 at 04:33

3 Answers

4

First note that
\begin{align}Y_n=\max\{X_1,\ldots,X_n\}&=\max\{\sigma\epsilon_1+\mu,\ldots,\sigma\epsilon_n+\mu\}\\&=\sigma\max\{\epsilon_1,\ldots,\epsilon_n\}+\mu\\&=\sigma\xi_n+\mu\end{align}
where $\epsilon_1,\ldots,\epsilon_n$ are i.i.d. $N(0,1)$ and $\xi_n$ is their maximum, hence $(\mu,\sigma)$ is also a location-scale parameter for the maximum. Asymptotically, the Normal distribution belongs to the domain of attraction of the Gumbel distribution, meaning that
$$\sqrt{2\log(n)}(\xi_n-d_n)\stackrel{{\cal L}}{\longrightarrow} G_0$$
with $G_0(x)=\exp\{-\exp(-x)\}$ the Gumbel cdf and
$$d_n = \sqrt{2\log(n)}-\dfrac{\log\log n + \log(4\pi)}{2\sqrt{2\log(n)}}$$
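
As a rough illustration of how this limit can be turned into a number, the sketch below replaces $\sqrt{2\log(n)}(\xi_n-d_n)$ by its Gumbel limit and takes expectations, using the fact that the standard Gumbel distribution has mean $\gamma \approx 0.5772$ (the Euler–Mascheroni constant). This extra step is an assumption added here, not part of the argument above, and the function name `gumbel_expected_max` is hypothetical. Convergence is slow, so the result should be read as a large-$n$ approximation only.

```python
import math

# Mean of the standard Gumbel distribution (Euler-Mascheroni constant).
EULER_GAMMA = 0.5772156649015329


def gumbel_expected_max(n, mu=0.0, sigma=1.0):
    """Approximate E[max of n i.i.d. N(mu, sigma^2)] via the Gumbel limit.

    Heuristic: E[xi_n] ~ d_n + gamma / sqrt(2 log n). Requires n >= 2,
    because log(log(n)) is undefined for n = 1.
    """
    if n < 2:
        raise ValueError("the asymptotic formula needs n >= 2")
    a_n = math.sqrt(2.0 * math.log(n))
    d_n = a_n - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * a_n)
    # Location-scale property: Y_n = sigma * xi_n + mu.
    return mu + sigma * (d_n + EULER_GAMMA / a_n)


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, gumbel_expected_max(n))
```

The approximation grows like $\sqrt{2\log n}$, so the expected maximum diverges as $n \to \infty$, but only very slowly.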

Xi'an
4

If we combine two of the answers in the thread "Approximate order statistics for normal random variables", we have, for the $r$th $\textit{smallest}$ order statistic (so $r=1$ is the minimum and $r=n$ is the maximum),

$$E[r,n] \approx \mu + \sigma \ \Phi^{-1} \left( \frac{r-\frac{\pi}{8}}{n-\frac{\pi}{4}+1}\right) $$

For the largest value we want $r=n,$ so we have

$$E[Y] \approx \mu + \sigma \ \Phi^{-1} \left( \frac{n-\frac{\pi}{8}}{n-\frac{\pi}{4}+1}\right) $$
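
A minimal sketch of how this approximation could be evaluated, using only the Python standard library (`statistics.NormalDist().inv_cdf` plays the role of $\Phi^{-1}$). The function name `approx_expected_max` is hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def approx_expected_max(n, mu=0.0, sigma=1.0):
    """Approximate E[max of n i.i.d. N(mu, sigma^2)] with the formula above.

    Uses the r = n case: mu + sigma * Phi^{-1}((n - pi/8) / (n - pi/4 + 1)).
    """
    p = (n - math.pi / 8.0) / (n - math.pi / 4.0 + 1.0)
    # inv_cdf is the inverse of the standard normal cdf, not 1 / Phi.
    return mu + sigma * NormalDist().inv_cdf(p)


if __name__ == "__main__":
    for n in (1, 10, 1_000, 1_000_000):
        print(n, approx_expected_max(n))
```

For $n=1$ the argument of $\Phi^{-1}$ is exactly $1/2$, so the formula returns $\mu$, as it should. The argument tends to 1 as $n$ grows, so the value keeps increasing without bound.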

soakley
  • Thanks. This is an interesting result. However, I am not sure how to interpret the case when $n$ tends to infinity. The posted answer gives $E[Y] = \mu + \sigma / \Phi$. However, intuitively, I would expect the answer to be $E[Y] = \mu + 3 \sigma$ or something along those lines. Am I missing something here? – Amrinder Arora Aug 13 '20 at 21:24
  • As $n$ tends to infinity, you should expect the sample maximum to increase. And that is exactly what the formula predicts. The argument in parentheses in the last formula above tends towards 1 as $n$ increases. The inverse cdf will continue to increase as $n$ increases. – soakley Oct 06 '20 at 13:25
  • Certainly, as n increases, the sample maximum is expected to increase. No question there. My point though was that the increase seems to be too little. For example, even if you have just a million samples, one would expect that we will see max value at least 3 SDs away from mean. While the posted result only seems to suggest 1/phi SD away. Clearly there is a small disconnect in the result or its interpretation. – Amrinder Arora Oct 06 '20 at 16:29
  • And actually, based on some other reading, we may expect the E[Y] to diverge as n tends to infinity. – Amrinder Arora Oct 06 '20 at 16:32
  • That is not 1/phi. It is the inverse of the standard normal cdf. – soakley Oct 06 '20 at 18:56
  • For a million observations and using the formula above, I get a value of 4.85 standard deviations above the mean. – soakley Oct 06 '20 at 19:48
  • Aah. Thank you @soakley. OK, so the problem was after all with the interpretation. PEBKAC. – Amrinder Arora Oct 07 '20 at 01:14
0

EDIT:

I found this paper referenced in a thread on the maths stack exchange (Approximate order statistics for normal random variables), so I had a look. Note that the quoted formula counts order statistics from the largest, so the maximum corresponds to $r=1$.

"In a sample of size n the expected value of the rth largest order statistic is given by

$$E(r,n)=\frac{n!}{(r-1)!(n-r)!}\int_{-\infty}^{\infty}x\{1-\Phi(x)\}^{r-1}\{\Phi(x)\}^{n-r}\phi(x)dx,$$

where $\phi(x)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}x^2\right)$ and $\Phi(x)=\int^x_{-\infty}\phi(z)dz.$"

  • Royston, J. P. (1982), 'Algorithm AS 177: Expected Normal Order Statistics (Exact and Approximate)', Journal of the Royal Statistical Society. Series C (Applied Statistics), 31(2):161-165.

So $Y$ is an order statistic. Let's label its density function $g_{(n)}(x)$, to indicate that it's the pdf of the variable in the $n$th position (i.e. it's the pdf of the maximum in the sample). Let's also label the normal $N(\mu, \sigma^2)$ density function as $f(x)$. It's a standard result that $$g_{(n)}(x)=n[F(x)]^{n-1}f(x),$$ where $F(x)$ is the cumulative distribution function of $N(\mu, \sigma^2)$ (as a reference, I suggest Mathematical Statistics (7th ed.) by Wackerly, Mendenhall, and Scheaffer, p.333).
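
Although this density has no simple closed-form mean, the expectation $E[Y]=\int x\, g_{(n)}(x)\,dx$ can be evaluated numerically. The sketch below is one possible way to do that, not the algorithm from the cited paper: it works with the standardized variable $z=(x-\mu)/\sigma$, so that $E[Y]=\mu+\sigma\, n\int z\,\Phi(z)^{n-1}\phi(z)\,dz$, and applies Simpson's rule on a truncated range. The function name `exact_expected_max`, the range $[-10, 10]$ and the step count are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def exact_expected_max(n, mu=0.0, sigma=1.0, lo=-10.0, hi=10.0, steps=4000):
    """Numerically integrate E[max of n i.i.d. N(mu, sigma^2)].

    Computes mu + sigma * n * integral of z * Phi(z)^(n-1) * phi(z) dz
    with composite Simpson's rule on [lo, hi]; steps must be even.
    """
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    std = NormalDist()  # standard normal: cdf is Phi, pdf is phi

    def integrand(z):
        return n * z * std.cdf(z) ** (n - 1) * std.pdf(z)

    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 on odd nodes, 2 on even
        total += weight * integrand(lo + i * h)
    return mu + sigma * total * h / 3.0


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        print(n, exact_expected_max(n))
```

Two known special cases give a quick sanity check: for $n=1$ the result is $\mu$, and for $n=2$ the standardized value is $1/\sqrt{\pi}$.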

It is at this point that I'm unable to proceed - I don't know how to evaluate the expected value of $Y$, given that it has such a strange pdf. However, I'd advise you to search for "expected value of order statistic" - in particular, I found a thread on this topic on the maths stack exchange site:

EDIT: As pointed out by Khol, thread is for a uniform distribution, not a normal distribution. The uniform is apparently more straightforward to deal with. Apologies for the partial answer!

https://math.stackexchange.com/questions/751229/order-statistics-finding-the-expectation-and-variance-of-the-maximum?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa

notebook
  • Linked thread is for a uniform random sample, which is distinctly simpler than a normal random sample. – khol May 02 '18 at 04:36