5

I'm stuck on an exercise (it's not homework, but preparation for finals). It goes like this: $X_1, \dots, X_n$ are iid Exponential($\lambda$) (with parametrization $f(x)=\lambda e^{-\lambda x}$ for $x\ge 0$). What is the pdf $f(x_n\mid Y)$, where $Y=\sum_{i=1}^n X_i$? I know that $Y\sim \mathrm{Gamma}(n, \lambda^{-1})$ (shape $n$, scale $\lambda^{-1}$, i.e. rate $\lambda$) with pdf $$ f(y)=\frac{\lambda^n}{\Gamma(n)}y^{n-1}e^{-y\lambda}. $$

So I have the marginals, but I'm not sure how to proceed. Maybe it's easier to get $f(y|X_n)$ and then multiply by the ratio of the marginals? Any help that puts me in the right direction is appreciated!
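For reference, here is a quick simulation sanity check of that Gamma marginal (a sketch only; $\lambda = 2$ and $n = 5$ are arbitrary values, not part of the exercise):

```python
# Sanity check: the sum of n iid Exponential(lambda) variables should be
# Gamma with shape n and rate lambda (i.e. scale 1/lambda).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lam, n = 2.0, 5  # arbitrary illustrative values

y = rng.exponential(scale=1/lam, size=(100_000, n)).sum(axis=1)
print(stats.kstest(y, stats.gamma(a=n, scale=1/lam).cdf))  # expect a large p-value
```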

Edit: I think I solved it. Write $T_{k}=\sum_{i=1}^kX_i$. Since the $X_i$ are iid, the joint density of $(X_k, T_k)$ factors as $f_{X_k}(x_k)\,f_{T_{k-1}}(t_k-x_k)$, built up step by step. The joint of $(x_2, x_1)$ is just the product of the densities. Then $$ f(x_3, t_3)=f_{X_3}(x_3)f_{T_2}(t_3-x_3) $$ and so on. So finally $$ f(x_n, t_n)=f_{X_n}(x_n)f_{T_{n-1}}(t_n-x_n). $$ The first factor is exponential, the second is gamma, so the conditional is $$ f(x_n\mid t_n)=\frac{f_{X_n}(x_n)f_{T_{n-1}}(t_n-x_n)}{f_{T_n}(t_n)}. $$
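For what it's worth, a quick numerical check of that final expression (a sketch only; $\lambda = 2$, $n = 5$ and $t_n = 3$ are arbitrary illustrative values): the conditional density should integrate to $1$ over $[0, t_n]$, and $\lambda$ should cancel out of the ratio.

```python
# Numerical check of f(x_n | T_n = t) = f_{X_n}(x_n) f_{T_{n-1}}(t - x_n) / f_{T_n}(t).
from scipy import stats, integrate

lam, n, t = 2.0, 5, 3.0  # arbitrary illustrative values

def conditional_pdf(x):
    num = stats.expon.pdf(x, scale=1/lam) * stats.gamma.pdf(t - x, a=n - 1, scale=1/lam)
    den = stats.gamma.pdf(t, a=n, scale=1/lam)
    return num / den

# The conditional density should integrate to 1 over [0, t] ...
print(integrate.quad(conditional_pdf, 0, t)[0])             # ~ 1.0
# ... and lambda cancels: the ratio equals (n - 1)/t * (1 - x/t)**(n - 2).
x = 1.2
print(conditional_pdf(x), (n - 1)/t * (1 - x/t)**(n - 2))   # same value
```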

jacknick
  • 195
  • 1
  • 8
  • A subtle point: it appears that the $X$ rv _is also included_ in $Y$. So the joint is $f(x_n,y) = f(x_n, x_n + \sum^{n-1}x_i)$. If not, you should clarify that. – Alecos Papadopoulos Dec 21 '16 at 15:12
  • 1
    To which of these variables does "$x$" refer in the expression "$f(x|Y)$"? – whuber Dec 21 '16 at 15:28
  • @AlecosPapadopoulos Yes, precisely. But $f(x_n, \sum_{i=1}^nx_i)$ is just the same as $f(x_n, \sum_{i=1}^{n-1}x_i)$; loosely speaking, having the sum up to $n$ shouldn't make a difference as $x_n$ then appears twice. This shouldn't be controversial, but maybe I was vague here. I essentially just used the approach in Sasha's answer here: http://math.stackexchange.com/questions/335894/conditional-distribution-of-random-variable-given-its-sum-with-another-random-va. – jacknick Dec 21 '16 at 19:47
  • @whuber: It refers to any of the $n$ $x$-variables in the sum. They are all iid, so which specific $i$ we consider is of no importance (other than perhaps notational convenience). – jacknick Dec 21 '16 at 19:50
  • @jacknick $x_n$ is independent from $\sum^{n-1}$, but it is not independent of $\sum^{n}$. Also the marginal distribution of $\sum^{n-1}$ is not the same as that of $\sum^{n}$ (they may be of the same family, but they will have different parameter values). In what sense then the two joint densities are "just the same"? I am missing something here. – Alecos Papadopoulos Dec 21 '16 at 19:53
  • I wanted to make sure it didn't actually refer to something else you forgot to mention! It would help to be explicit about the meaning of your notation. BTW, you might consider proving a generalization of this result (which is simple and well known): the ratio $X/(X+Y)$, where $X$ and $Y$ are independent Gamma variables (with the same scale but possibly different shape parameters), is Beta: https://en.wikipedia.org/wiki/Beta_distribution#Derived_from_other_distributions. – whuber Dec 21 '16 at 19:59
  • @AlecosPapadopoulos I think it is easier to grasp for discrete variables. Suppose you want $P(X_1=x_1, \sum_{i=1}^nX_i=t)$. Then this is the same as $P(X_1=x_1, \sum_{i=1}^nX_i-X_1=t-x_1)=P(X_1=x_1, \sum_{i=1}^{n-1}X_i=t-x_1)=P(X_1=x_1)P(\sum_{i=1}^{n-1}X_i=t-x_1)$. I guess that's the idea of what I used above. – jacknick Dec 21 '16 at 21:06
  • I think the point is essentially that if you have the sum (bar $X_1$), but have that separately, then obviously you know the sum (including $X_1$). – jacknick Dec 21 '16 at 21:10

2 Answers

9

It can be instructive and satisfying to work this out using basic statistical knowledge, rather than just doing the integrals. It turns out that no calculations are needed!

Here's the circle of ideas:

  • The $X_i$ can be thought of as waiting times between random events.

  • When the waiting times have independent identical exponential distributions, the random events form a Poisson process.

  • When normalized by the last time (given by $Y=X_1+X_2+\cdots + X_n$), these events therefore look like $n-1$ independent uniform values in $[0,1]$.

  • The values $0 \le X_1/Y \le X_1/Y+X_2/Y \le \cdots \le (X_1/Y+\cdots+X_{n-1}/Y) \le 1$ therefore are the order statistics for $n-1$ iid uniform variables.

  • The $k^\text{th}$ order statistic has a Beta$(k, n-k)$ distribution.

  • The PDF of a Beta$(k,n-k)$ distribution is proportional to $x^{k-1}(1-x)^{n-k-1}$ for $0\le x \le 1$, with constant of proportionality equal to (of course!) the reciprocal of the Beta function value $B(k,n-k)$.

  • Since $Y$ is invariant under any permutation of the $X_i$ and the $X_i$ are exchangeable, all the conditional distributions $f(X_i|Y)$ are the same.

Thus, the distribution of any of the $X_i$ conditional on $Y$ must be $Y$ times a Beta$(1,n-1)$ distribution. Scaling the Beta PDF by $y$ gives the conditional probability element

$$f_{X_1|Y=y}(x)\, \mathrm{d}x = \frac{1}{B(1,n-1)}\left(1-\frac{x}{y}\right)^{n-2}\frac{\mathrm{d}x}{y}$$

for $0 \le x \le y$.

This reasoning further implies the $n$-variate distribution of the $X_i$ conditional on $Y$ is $Y$ times a symmetric Dirichlet distribution.
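Here is a small simulation sketch of this conclusion (not part of the original argument; $\lambda = 2$ and $n = 5$ are arbitrary choices). Because the conditional distribution of $X_1/Y$ given $Y=y$ is Beta$(1, n-1)$ for every $y$, the unconditional distribution of $X_1/Y$ is Beta$(1, n-1)$ as well, which is easy to check by Monte Carlo:

```python
# Simulation sketch: X_1 / (X_1 + ... + X_n) should be Beta(1, n - 1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lam, n, reps = 2.0, 5, 200_000  # arbitrary illustrative values

x = rng.exponential(scale=1/lam, size=(reps, n))
ratio = x[:, 0] / x.sum(axis=1)                        # X_1 / Y

print(ratio.mean(), 1 / n)                             # Beta(1, n-1) has mean 1/n
print(stats.kstest(ratio, stats.beta(1, n - 1).cdf))   # expect a large p-value
```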

Reference

Balakrishnan, N. and A. Clifford Cohen, Order Statistics and Inference. Academic Press, 1991.

whuber
  • 281,159
  • 54
  • 637
  • 1,101
  • I am not sure how the answer concludes that the Beta marginals imply a joint distribution which is a Dirichlet (my main question is: will the joint distribution be unique?), and I am not sure how to relate the Dirichlet distribution to order statistics as mentioned in the answer. In this answer (https://stats.stackexchange.com/questions/36093/construction-of-dirichlet-distribution-with-gamma-distribution), you talk about a similar question, but I can't wrap my head around how marginals are enough to specify the joint distribution here. – BlackHat18 Dec 22 '20 at 14:03
  • @Black You may be overthinking this: set $Y=1$ and immediately draw your conclusion. – whuber Dec 22 '20 at 14:06
  • I guess my question now boils down to this: does the random variable $\frac{X_i}{X_1+\cdots+X_{k+1}}$ have the same distribution as the random variable $X_{i} | X_1+\cdots+X_{k+1} = 1$? It sounds true intuitively, but is there a mathematical way to see this? – BlackHat18 Dec 22 '20 at 14:10
  • 1
    @Black Consider editing the question you posted to focus on this issue (which I think may be the heart of the matter). – whuber Dec 22 '20 at 14:12
  • Edited the question but it is still closed. Should I post a new question? Sincere apologies for not knowing the rule for a closed question. – BlackHat18 Dec 22 '20 at 14:24
  • 1
    @Black Please take a few minutes to review our [help]. It will explain how your question goes into a review queue for the community to vote on. – whuber Dec 22 '20 at 14:35
  • @whuber What is $dx$ in your final equation? – Adrian Keister Jul 15 '21 at 14:47
  • @Adrian It's a differential form. I also should have included it on the left hand side: I'll edit that. – whuber Jul 15 '21 at 14:49
  • I guess I was more asking what its meaning is: why does it show up in the equation at all? – Adrian Keister Jul 15 '21 at 14:52
  • 1
    @Adrian In this case, to emphasize that the variable of integration is $x$ (and not $y$ or $(x,y),$ both of which would otherwise be reasonable interpretations of the notation). In other cases, such as when transforming densities, it ensures the transformation is correctly computed. See our posts referencing [probability elements](https://stats.stackexchange.com/search?tab=votes&q=%22probability%20element%22). – whuber Jul 15 '21 at 14:54
1

This might be the most "textbook" answer for $f_{X_1|Y}(x_1|y)$.

Let $Z = X_2 + ... + X_n$. Then $Y = Z + X_1$.

First, since $Z \sim \mathrm{Gamma}(n-1)$ with rate $\lambda$ is independent of $X_1$, the joint density of $(Z, X_1)$ is $f_{Z,X_1}(z,x_1) = \dfrac{\lambda^{n-1}}{\Gamma(n-1)}\,z^{n-2}\,e^{-z\lambda} \times \lambda\, e^{-x_1 \lambda}$ for $z \ge 0, x_1 \ge 0$.

Next we get the joint distribution of $(Y, X_1)$, which is $f_{Y,X_1}(y,x_1) = f_{Z,X_1}(y - x_1,\,x_1)$ since the absolute determinant of the Jacobian matrix of the transformation is 1 (details below).

Therefore, $f_{Y,X_1}(y,x_1) = \dfrac{\lambda^n}{\Gamma(n-1)}\,(y-x_1)^{n-2}\,e^{-y\lambda}$ for $x_1 \ge 0, y \ge x_1$, and $0$ otherwise.

Finally, $f_{X_1|Y}(x_1|y) = \dfrac{f_{Y,X_1}(y,x_1)}{f_Y(y)}$. The rest follows easily from here.

Edit: I didn't see OP's "edit" part at first, so OP has already solved it in the same "textbook" way. I will leave my post here in case it is useful to others.

Edit 2: Details on the Jacobian.

We are transforming variables from $(Z, X_1)$ to $(Y, X_1)$, defined by $Y=Z+X_1,\ X_1 = X_1$. So the Jacobian matrix is 2-by-2, with the elements (going across rows) $\dfrac{\partial Y}{\partial Z}, \dfrac{\partial Y}{\partial X_1}, \dfrac{\partial X_1}{\partial Z}, \dfrac{\partial X_1}{\partial X_1}$, i.e. $1,1,0,1.$ The absolute determinant of this matrix is $1.$
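As a complement (not from the original answer), the final division can also be carried out symbolically, e.g. with sympy; the names and setup below are illustrative. The Gamma-function ratio $\Gamma(n)/\Gamma(n-1)$ reduces to $n-1$, and $\lambda^n$ and $e^{-y\lambda}$ cancel, leaving $f_{X_1|Y}(x_1|y) = (n-1)(y-x_1)^{n-2}/y^{n-1}$ for $0 \le x_1 \le y$.

```python
# Symbolic sketch of the last step: divide the joint density by the marginal of Y.
import sympy as sp

x1, y, lam = sp.symbols('x_1 y lambda', positive=True)
n = sp.symbols('n', integer=True, positive=True)

joint = lam**n / sp.gamma(n - 1) * (y - x1)**(n - 2) * sp.exp(-lam * y)  # f_{Y,X_1}(y, x_1)
marginal = lam**n / sp.gamma(n) * y**(n - 1) * sp.exp(-lam * y)          # f_Y(y)

conditional = sp.gammasimp(joint / marginal)
print(conditional)  # expected: (n - 1)*(y - x_1)**(n - 2)*y**(1 - n)
```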

Adrian Keister
  • 3,664
  • 5
  • 18
  • 35
Shang Zhang
  • 431
  • 1
  • 6