10

I'm interested in the following type of case: there are $n$ continuous random variables which must sum to 1. What then would be the PDF for any one individual such variable? So, if $n=3$, then I am interested in the distribution of $\frac{X_1}{X_1+X_2+X_3}$, where $X_1$, $X_2$, and $X_3$ are all uniformly distributed. The mean in this example is of course $1/3$, as the mean is just $1/n$, and though it is easy to simulate the distribution in R, I do not know what the actual equation for the PDF or CDF is.

This situation is related to the Irwin-Hall distribution (https://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution). However, Irwin-Hall is the distribution of the sum of $n$ uniform random variables, whereas I would like the distribution of one of $n$ uniform r.v.'s divided by the sum of all $n$ variables.
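For readers who want to reproduce the simulation mentioned above, here is a minimal sketch in Python with NumPy rather than R. It assumes the $X_i$ are independent and uniform on $(0,1)$; the function name `simulate_share`, the sample size, and the seed are illustrative choices, not part of the question.

```python
import numpy as np

def simulate_share(n, size=200_000, seed=0):
    """Draw samples of X_1 / (X_1 + ... + X_n), X_i iid Uniform(0, 1) (assumed)."""
    rng = np.random.default_rng(seed)
    x = rng.random((size, n))          # each row holds one draw of (X_1, ..., X_n)
    return x[:, 0] / x.sum(axis=1)     # share of the first variable in the row sum

z = simulate_share(n=3)
print(z.mean())                        # should land near 1/n = 1/3

# A histogram with density=True gives a rough empirical pdf on [0, 1].
density, edges = np.histogram(z, bins=50, range=(0.0, 1.0), density=True)
```

The histogram heights in `density` can then be compared against any candidate closed-form pdf.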

kjetil b halvorsen
user3593717
  • 1
    If the $n$ continuous uniform random variables sum to $1$, then with $n=3$, $X_1+X_2+X_3 = 1$ and so the distribution of $\frac{X_1}{X_1+X_2+X_3} = X_1$ is the same as the distribution of $X_1$, right? – Dilip Sarwate Oct 05 '15 at 04:14
  • 1
    I should correct myself: the N uniform distributions don't sum to 1. I am assuming they are each uniform between 0 and 1, and so their sum may be anything from 0 to N. I am thinking of taking each uniform variable and dividing it by the sum of all N uniform variables to get a set of N random variables which sum to 1 and have expected value 1/N. Note: I removed the word 'uniform' from my first sentence. The distribution I'm looking for isn't uniform, but is derived from dividing one of N uniform variables by the sum of all N uniform variables, somehow. I'm just not sure how. – user3593717 Oct 05 '15 at 04:35
  • Where the $X_i$ are exponentially distributed, the vector of normalised variables has a Dirichlet distribution. This may be of interest in itself, but looked into might also provide tactics for this type of situation. – conjectures Oct 14 '15 at 22:40

4 Answers

5

The breakpoints in the domain make it somewhat messy. A simple but tedious approach is to build up to the final result. For $n=3,$ let $Y=X_2 + X_3,$ $W = {{X_2 + X_3} \over X_1},$ and $T = 1 + W.$ Then $Z = {{1} \over {T}}={{X_1} \over {X_1 + X_2 + X_3}}.$

The breakpoints are at 1 for $Y,$ 1 and 2 for $W,$ 2 and 3 for $T,$ and $1/3$ and $1/2$ for $Z.$ I found the complete pdf to be

$$f(z) = \begin{cases} \ \ \ \ \ {{1} \over {(1-z)^2}} \ , & \text{if} \ {0} \leq z \leq {1/3} \\\\ {{3z^3-9z^2+6z-1} \over {3z^3(1-z)^2}} \ , & \text{if} \ {1/3} \leq z \leq {1/2} \\\\ \ \ \ \ \ \ \ {{1-z} \over {3z^3}} \ , & \text{if} \ {1/2} \leq z \leq {1} \end{cases}$$

The cdf can then be found as $$F(z) = \begin{cases} \ \ \ \ \ \ \ \ \ \ \ {{z} \over {(1-z)}} \ , & \text{if} \ {0} \leq z \leq {1/3} \\\\ {{1} \over {2}}+{{-18z^3+24z^2-9z+1} \over {6z^2(1-z)}} \ , & \text{if} \ {1/3} \leq z \leq {1/2} \\\\ \ \ \ \ \ \ \ \ {{5} \over {6}} + {{2z-1} \over {6z^2}} \ , & \text{if} \ {1/2} \leq z \leq {1} \end{cases}$$
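As a sanity check on the piecewise cdf above, the following sketch compares it with an empirical cdf from simulated draws. It assumes independent $U(0,1)$ variables; `cdf_n3`, the seed, the grid, and the sample size are hypothetical choices made only for this illustration.

```python
import numpy as np

def cdf_n3(z):
    """Piecewise cdf quoted above for Z = X1 / (X1 + X2 + X3)."""
    z = np.asarray(z, dtype=float)
    # All three branches are evaluated at every point, so silence the
    # harmless divide-by-zero warnings that arise at z = 0 and z = 1.
    with np.errstate(divide="ignore", invalid="ignore"):
        low = z / (1 - z)
        mid = 0.5 + (-18 * z**3 + 24 * z**2 - 9 * z + 1) / (6 * z**2 * (1 - z))
        high = 5 / 6 + (2 * z - 1) / (6 * z**2)
    return np.select([z <= 1 / 3, z <= 1 / 2], [low, mid], default=high)

rng = np.random.default_rng(1)
x = rng.random((200_000, 3))
samples = np.sort(x[:, 0] / x.sum(axis=1))

# Empirical cdf on a grid: fraction of samples at or below each grid point.
grid = np.linspace(0.02, 0.98, 49)
empirical = np.searchsorted(samples, grid, side="right") / samples.size
max_gap = np.max(np.abs(empirical - cdf_n3(grid)))
print(max_gap)  # a small value indicates agreement with the formula
```

The three branches also join continuously: the formula gives $F(1/3)=1/2$, $F(1/2)=5/6$, and $F(1)=1$.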

soakley
  • +1 Nice. Also, your density [agrees beautifully](http://i.imgur.com/42IXO8w.png) with simulation. – Glen_b Oct 14 '15 at 23:43
2

Let $Y=\sum_{i=2}^n X_i$. We can find the cdf of $X_1/\sum_{i=1}^n X_i$ by calculating
\begin{align*}
P\left(\frac{X_1}{\sum_{i=1}^n X_i} \leq t\right) &= P\left(X_1 \leq t\sum_{i=1}^n X_i\right) \\
&= P\left((1-t)X_1 \leq t\sum_{i=2}^n X_i\right) \\
&= P\left(X_1 \leq \frac t{1-t}Y\right)\\
&= \int_0^1 P\left(x_1 \leq \frac t{1-t}Y\right) dx_1\\
&= \int_0^1 \left(1-F_Y\left(\frac{1-t}{t}x_1\right)\right) dx_1\\
&= 1-\int_0^1 F_Y\left(\frac{1-t}{t}x_1\right) dx_1
\end{align*}
We then differentiate with respect to $t$ and substitute the Irwin-Hall pdf of $Y$ (a sum of $n-1$ uniform variables) to obtain the desired pdf:
\begin{align*}
f(t) &= \int_0^1 f_Y\left(\frac{1-t}{t}x_1\right)\cdot \frac{x_1}{t^2}\ dx_1\\
&= \frac{1}{t^2}\int_0^{1\wedge \frac{(n-1)t}{1-t}} \sum_{k=0}^{\lfloor \frac{1-t}{t}x_1\rfloor}\frac1{(n-2)!}(-1)^k\binom{n-1}k\left(\frac{1-t}{t}x_1-k\right)^{n-2} x_1\ dx_1
\end{align*}
From here it gets a little messy, but you should be able to interchange the integral and summation and then perform a substitution (e.g., $u=\frac{1-t}{t}x_1-k$) to evaluate the integral and hence obtain an explicit formula for the pdf.
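The integral $f(t) = \frac{1}{t^2}\int_0^1 f_Y\left(\frac{1-t}{t}x_1\right) x_1\, dx_1$ can also be evaluated numerically, which gives a way to check any explicit formula. The sketch below is one possible implementation, assuming $Y$ is a sum of $n-1$ independent $U(0,1)$ variables; the helper names `irwin_hall_pdf` and `ratio_pdf` and the midpoint-rule grid size are choices made for this illustration.

```python
import math
import numpy as np

def irwin_hall_pdf(y, m):
    """Density of the sum of m iid Uniform(0, 1) variables (m >= 1)."""
    y = np.asarray(y, dtype=float)
    total = np.zeros_like(y)
    for k in range(m + 1):
        # Terms with k > y contribute nothing, mirroring the floor in the sum.
        term = np.where(y >= k, np.maximum(y - k, 0.0) ** (m - 1), 0.0)
        total += (-1) ** k * math.comb(m, k) * term
    return total / math.factorial(m - 1)

def ratio_pdf(t, n, grid=20_000):
    """Midpoint-rule estimate of the pdf of X_1 / (X_1 + ... + X_n) at 0 < t < 1."""
    x = (np.arange(grid) + 0.5) / grid           # midpoints of a uniform grid on (0, 1)
    y = (1 - t) / t * x                          # argument passed to f_Y
    integrand = irwin_hall_pdf(y, n - 1) * x     # f_Y((1-t)/t * x) * x
    return integrand.mean() / t**2               # interval length is 1, so mean = integral

for t in (0.25, 0.4, 0.75):
    print(t, ratio_pdf(t, n=3))
```

For $n=3$ these values can be set against a closed-form density; for instance, direct integration at $t=0.25$ gives $16/9$.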

Brent Kerby
1

Assuming

"the N uniform distributions don't sum to 1."

This is how I started (it's incomplete):

Consider $Y = \sum_{i=1}^n X_i$ and let $X=X_i$ by a slight abuse of notation.

Consider $U = \frac{X}{Y}$ and $V = Y$, so that:

$$ X=UV\\ Y=V $$

Then, following the transformation of variables, the Jacobian matrix is:

$$ J = \begin{bmatrix} V & U\\ 0 & 1 \end{bmatrix} $$

The joint probability function of $(U,V)$ is given by:

$f_{U,V}(u,v) = f_{X,Y}(uv,v)|J|$

where $X \sim U(0,1)$ and $Y \sim \text{Irwin-Hall}$:

$$ f_X(x) = \begin{cases} 1 & 0 \leq x\leq 1\\ 0 & \text{otherwise} \end{cases} $$

and $$ f_Y(y) = \frac{1}{2(n-1)!}\sum_{k=0}^n(-1)^k {n\choose k}(y-k)^{n-1} \operatorname{sign}(y-k) $$

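Thus, if $X$ and $Y$ are treated as independent so that $f_{X,Y}=f_X f_Y$ (this is not exact here, since $Y$ contains $X$), and using $|J| = v$, $$ f_{U,V}(u,v) = \begin{cases} \frac{v}{2(n-1)!}\sum_{k=0}^n(-1)^k {n\choose k}(v-k)^{n-1} \operatorname{sign}(v-k) & 0 \leq uv \leq 1\\ 0 & \text{otherwise} \end{cases} $$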

and $f_U(u) = \int f_{U,V}(u,v) dv$

rightskewed
0

Suppose we already know that the sum of $U(0,1)$ variables has an Irwin-Hall distribution. Your question then becomes finding the pdf (or CDF) of $\frac{X}{Y}$ when $X$ has a $U(0,1)$ distribution and $Y$ has an Irwin-Hall distribution.

First we need to find the joint pdf of $X$ and $Y$.

Let $Y_1=X_1$, $Y_2=X_1+X_2$, and $Y_3=X_1+X_2+X_3$.

Then

$X_1=Y_1$, $X_2=Y_2-Y_1$, and $X_3=Y_3-Y_2$.

$\therefore$

$J=\begin{vmatrix} \frac{\partial X_1}{\partial Y_1} & \frac{\partial X_1}{\partial Y_2} &\frac{\partial X_1}{\partial Y_3} \\ \frac{\partial X_2}{\partial Y_1} & \frac{\partial X_2}{\partial Y_2} &\frac{\partial X_2}{\partial Y_3} \\ \frac{\partial X_3}{\partial Y_1} & \frac{\partial X_3}{\partial Y_2} &\frac{\partial X_3}{\partial Y_3} \end{vmatrix}=\begin{vmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{vmatrix}=1$

Since $X_1, X_2, X_3$ are i.i.d. $U(0,1)$, we have $f(x_1,x_2,x_3)=f(x_1)f(x_2)f(x_3)=1$ on the unit cube.

The joint density of $(Y_1,Y_2,Y_3)$ is

$g(y_1,y_2,y_3)=f(x_1,x_2,x_3)|J|=1$

Next, let us integrate out $Y_2$ to get the joint distribution of $Y_1$ and $Y_3$, i.e. the joint distribution of $X_1$ and $X_1+X_2+X_3$.

As suggested by whuber, I have now changed the limits:

$$h(y_1,y_3)=\int_{y_1+1}^{y_3-1} g(y_1,y_2,y_3)dy_2=\int_{y_1+1}^{y_3-1} 1 dy_2=y_3-y_1-2 \tag{1}$$

Now we know that the joint pdf of $X,Y$, i.e. the joint pdf of $X_1$ and $X_1+X_2+X_3$, is $y_3-y_1-2$.

Next, let us find the pdf of $\frac{X}{Y}$.

We need another transformation:

Let $Y_1=X$ and $Y_2=\frac{X}{Y}$.

Then $X=Y_1$ and $Y=\frac{Y_1}{Y_2}$.

Then

$J=\begin{vmatrix} \frac{\partial x}{\partial y_1} & \frac{\partial x}{\partial y_2}\\ \frac{\partial y}{\partial y_1} & \frac{\partial y}{\partial y_2} \end{vmatrix}= \begin{vmatrix} 1 & 0\\ \frac{1}{y_2} & -\frac{y_1}{y_2^2} \end{vmatrix}=-\frac{y_1}{y_2^2}$

We already have the joint distribution of $X,Y$ from the steps above; see (1).

$\therefore$

$g_2(y_1,y_2)=h(y_1,y_3)|J|=(y_3-y_1-2)\frac{y_1}{y_2^2}$

Next, we integrate $y_1$ out to get the pdf of $y_2$, which is the pdf of $\frac{X}{Y}$:

$$h_2(y_2)=\int_0^1(y_3-y_1-2)\frac{y_1}{y_2^2}dy_1=\frac{1}{y_2^2}(\frac{y_3}{2}-\frac{1}{3}-1)\tag{2}$$

This is the pdf of $X/Y$, i.e. $\frac{X_1}{X_1+X_2+X_3}$.

We are not finished yet: what is $y_3$ in (2)?

We know that $Y_3=X_1+X_2+X_3$ from the first transformation.

So at least we know $Y_3$ has an Irwin-Hall distribution.

I wonder whether we can plug the Irwin-Hall pdf for $n=3$ into (2) to get an explicit formula, or whether we can do some simulations from here, as Glen suggested?

Deep North