
From https://en.wikipedia.org/wiki/Irwin–Hall_distribution:

The generation of pseudo-random numbers having an approximately normal distribution is sometimes accomplished by computing the sum of a number of pseudo-random numbers having a uniform distribution; usually for the sake of simplicity of programming. Rescaling the Irwin–Hall distribution provides the exact distribution of the random variates being generated.

It is the last sentence I do not understand. How would you rescale, and what does “variate” mean here? A normal distribution has infinite support (theoretically if not practically) so it does not seem possible to rescale easily?

mdewey
Single Malt

1 Answer


The normal distribution, with the shape $e^{-x^2}$, has infinite support, but it is also used as a model for distributions that occur in nature (or in statistics, such as the distribution of sample means) and that do not have infinite support.

For instance, one of the first uses of the normal distribution was the approximation of a binomially distributed variable by de Moivre in the 18th century (see also Can a variable be normally distributed on finite interval?).

You could actually see the Irwin-Hall distribution (a sum of uniformly distributed variables) as analogous to the binomial distribution (a sum of Bernoulli distributed variables).

So when you have a sum of such variables, you do not have an exactly normally distributed variable but an approximately normally distributed one.


Another way to see this is that often a normal distribution is not the goal.

The point is that any variable that is an average of several i.i.d.* variables (with bounded support) will approach a normal distribution as the number of terms grows.

* i.i.d. = independent and identically distributed. Note that the statement can be generalized (see the central limit theorem).

It is those empirical distributions that we wish to model. (By empirical distributions I mean distributions that describe things in nature, which are not exactly the same as model distributions such as the normal distribution.)

Since all means of i.i.d. variables gravitate towards a normal distribution, we do not need to use a normal distribution to do the approximation, but instead can use one of those other variables that gravitate towards the normal distribution.


Scaling

The Irwin-Hall distribution has mean $n/2$ and variance $n/12$. If you have a variable $X$ that is distributed according to an Irwin-Hall distribution with parameter $n$, then the shifted and scaled variable $Y=a+b\frac{X-n/2}{\sqrt{n/12}}$ will have mean $a$ and variance $b^2$.
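
As a short check of that claim (using only linearity of expectation and the rule $\operatorname{Var}(cX+d)=c^2\operatorname{Var}(X)$, with $X$, $n$, $a$, $b$ as defined above):

$$
\begin{aligned}
E[Y] &= a + b\,\frac{E[X]-n/2}{\sqrt{n/12}} = a + b\,\frac{n/2-n/2}{\sqrt{n/12}} = a,\\
\operatorname{Var}(Y) &= \frac{b^2}{n/12}\,\operatorname{Var}(X) = \frac{b^2}{n/12}\cdot\frac{n}{12} = b^2 .
\end{aligned}
$$

Since $X$ lies in $[0,n]$, the rescaled $Y$ lies in $[a-|b|\sqrt{3n},\,a+|b|\sqrt{3n}]$. So the support stays finite after rescaling; for example, with $n=12$ the variable cannot fall more than 6 standard deviations from its mean.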

The scaling is done to match the mean and variance of the target distribution.
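
The shift and scale above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original post: the function names `irwin_hall` and `approx_normal`, the choice $n=12$, the target mean 150 and standard deviation 10, and the fixed seed are all made up for the example.

```python
import math
import random

def irwin_hall(n, rng):
    """Sum of n independent Uniform(0, 1) draws (one Irwin-Hall variate)."""
    return sum(rng.random() for _ in range(n))

def approx_normal(a, b, n, rng):
    """Shift and scale an Irwin-Hall draw so it has mean a and std. dev. b."""
    x = irwin_hall(n, rng)
    return a + b * (x - n / 2) / math.sqrt(n / 12)

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
samples = [approx_normal(150.0, 10.0, 12, rng) for _ in range(100_000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
print(mean, var)  # expected to be close to 150 and 100
```

Each value produced this way is one "variate": a single draw whose exact distribution is the rescaled Irwin-Hall distribution, which only approximates the normal distribution with the same mean and variance.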

The Bates distribution is an example of a scaled Irwin-Hall distribution. In this case the scaling (division by $n$, with no shift needed) is done to match the support $[0,1]$ rather than to match a mean and variance.
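
For comparison, a minimal sketch of the Bates case, where the Irwin-Hall sum is divided by $n$ so that the result stays in $[0,1]$. The function name `bates`, the value $n=20$, and the seed are assumptions made for this example only.

```python
import random

def bates(n, rng):
    """Mean of n independent Uniform(0, 1) draws (one Bates variate)."""
    return sum(rng.random() for _ in range(n)) / n

bates_rng = random.Random(1)  # fixed seed, chosen only for reproducibility
draws = [bates(20, bates_rng) for _ in range(50_000)]

print(min(draws), max(draws))   # both lie inside [0, 1]
print(sum(draws) / len(draws))  # expected to be close to 1/2
```

Here the mean is fixed at $1/2$ and the variance at $1/(12n)$, so nothing is left to tune except $n$.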

Sextus Empiricus
  • Brilliant. 1. “The scaling is done to match the mean and variance of the target distribution”. The target is a normal distribution with specific mean and variance? 2. Say you wanted one sample from a normal distribution with mean 150. You could add 300 $Beta(1,1)$ distributions. Or you could add say just 20 and then use the rescaling by $a$ and $b$ to get the mean of 150. Thus this shortcut to getting the desired mean is the reason for the scaling with the bonus of tailoring the variance as well? – Single Malt Sep 23 '20 at 08:28
  • @SingleMalt to be honest, this is actually the first time that I am hearing of generating an (approximately) normally distributed variable in this way. It is a very straightforward approach, but faster (and more accurate) alternatives are the [Box-Muller transform](https://en.m.wikipedia.org/wiki/Box%E2%80%93Muller_transform) or using the [quantile function](https://en.m.wikipedia.org/wiki/Quantile_function#Applications). The Wikipedia article about the Bates distribution mentions that this distribution can be used as a model for a distribution with negative excess kurtosis. – Sextus Empiricus Sep 23 '20 at 08:40