
In reference to this post, the pdf of the sum $X_1+X_2$ of two dependent random variables is given by:

$$f_{X_1+X_2}(z) = \int_{-\infty}^{\infty} f_{X_1,X_2}(x,z-x) \mathrm dx$$

How does this formula extend to the multivariate case $f_{X_1+...+X_n}(z)$?

emcor
  • It's immediate from $X_1+\cdots+X_{n-1}+X_n = (X_1+\cdots+X_{n-1})+X_n$, leading to a recursion. – whuber Dec 29 '14 at 16:58

1 Answer


Let's follow Dilip Sarwate's instructions in the post you reference:

... the formula is obtained by writing $F_{X_1+X_2}(z)$ as a double integral of the joint density function over the specified region and then "differentiating under the integral sign."

Let the joint density of the random variables $\mathbf X = (X_1, X_2, \ldots, X_n)$ be given by $f$. Then, by definition, the distribution function of the sum $X=s(\mathbf X)=X_1 + X_2 + \cdots + X_n$ is

$$F_X(x) = \Pr(X \le x) = {\int \cdots \int}_{s(\mathbf x) \le x} f(\mathbf x)d\mathbf x.$$

Assuming $F_X$ is differentiable at $x$, it has a density there given by

$$f_X(x) = \frac{d}{dx}F_X(x).$$

To obtain a formula like the quoted one, apply Fubini's Theorem to express $F_X$ as a repeated integral,

$$F_X(x) = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty \int_{-\infty}^{x-(x_2+x_3+\cdots+x_n)}f(x_1,x_2,\ldots,x_n) dx_1 dx_2 \cdots dx_n,$$

differentiate under the integral, and apply the (first) Fundamental Theorem of Calculus to obtain

$$\eqalign{ f_X(x) &= \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty \left(\frac{d}{dx}\int_{-\infty}^{x-(x_2+x_3+\cdots+x_n)}f(x_1,x_2,\ldots,x_n) dx_1\right) dx_2 \cdots dx_n \\ &= \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f(x-(x_2+x_3+\cdots+x_n),x_2,\ldots,x_n) dx_2 \cdots dx_n. }$$

Any of the variables can play the role of $x_1$, yielding $n$ formulas for the sum.
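As a rough numerical sanity check of this formula for $n=3$, here is a minimal sketch. It assumes, purely for illustration, an equicorrelated standard normal vector with correlation $\rho = 0.5$ (this example is not part of the question), for which the sum is known to be Normal with variance $3 + 6\rho$. The helper names `joint_pdf`, `sum_pdf_3`, and `normal_pdf` are hypothetical, and the double integral is approximated by a plain midpoint sum on a truncated grid.

```python
import math

def joint_pdf(xs, rho):
    """Density of an equicorrelated standard normal vector (illustrative assumption)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(v * v for v in xs)
    # Closed forms for Sigma = (1 - rho) I + rho 1 1^T (Sherman-Morrison):
    # quadratic form x^T Sigma^{-1} x and det(Sigma).
    quad = (s2 - rho / (1 + (n - 1) * rho) * s1 * s1) / (1 - rho)
    det = (1 - rho) ** (n - 1) * (1 + (n - 1) * rho)
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** n * det)

def sum_pdf_3(x, rho, half_width=8.0, step=0.05):
    """Approximate the double integral of f(x - x2 - x3, x2, x3) over (x2, x3)
    with a midpoint sum on [-half_width, half_width]^2."""
    m = int(round(2 * half_width / step))
    grid = [-half_width + (k + 0.5) * step for k in range(m)]
    total = 0.0
    for x2 in grid:
        for x3 in grid:
            total += joint_pdf((x - x2 - x3, x2, x3), rho)
    return total * step * step

def normal_pdf(x, var):
    """Density of a centered Normal distribution with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2 * math.pi * var)

rho = 0.5
x = 1.0
numeric = sum_pdf_3(x, rho)
exact = normal_pdf(x, 3 + 6 * rho)  # Var(X1 + X2 + X3) = 3 + 6 * rho
print(numeric, exact)
```

The two printed values should agree closely. Because this particular density is symmetric in its arguments, the check cannot distinguish which variable plays the role of $x_1$; a non-exchangeable joint density would be needed to see that the $n$ formulas coincide.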


Alternatively, define $Y_{i} = X_1 + X_2 + \cdots + X_i$ and apply the two-variable formula (as written just above for the case $n=2$) recursively via the relation

$$Y_i = X_1+\cdots+X_{i-1}+X_i = (X_1+\cdots+X_{i-1})+X_i = Y_{i-1}+X_i$$

for $i=n, n-1, \ldots, 2$ to obtain

$$\eqalign{ f_X(x) &= \int_{-\infty}^\infty f_{Y_{n-1},X_n}(x-x_n, x_n) dx_n \\ &= \int_{-\infty}^\infty \int_{-\infty}^\infty f_{Y_{n-2},X_{n-1},X_n}(x-x_{n}-x_{n-1}, x_{n-1}, x_n) dx_{n-1} dx_n \\ &\cdots \\ &= \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f(x-x_n-x_{n-1}-\cdots-x_2,x_2,\ldots,x_n) dx_2 \cdots dx_n, } $$

giving the same result.
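To make one step of the recursion concrete, here is a sketch of the case $n=3$, with $Y_2 = X_1 + X_2$. The change of variables $(X_1, X_2, X_3) \to (Y_2, X_2, X_3)$ has unit Jacobian, so integrating out $x_2$ gives the joint density of $(Y_2, X_3)$:

$$f_{Y_2,X_3}(y, x_3) = \int_{-\infty}^\infty f(y - x_2, x_2, x_3)\, dx_2.$$

Substituting this into the two-variable formula for $X = Y_2 + X_3$ yields

$$f_X(x) = \int_{-\infty}^\infty f_{Y_2,X_3}(x - x_3, x_3)\, dx_3 = \int_{-\infty}^\infty \int_{-\infty}^\infty f(x - x_3 - x_2, x_2, x_3)\, dx_2\, dx_3,$$

which matches the general expression with $n=3$.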

whuber
  • And what about $Y = X_1 + X_2$ where $X = \{x_1, x_2\}$ i.e. $X$ is a bivariate random variable and we want to compute a distribution of its sum $Y=\{y_1 ,y_2\}$ which is also bivariate? – Confounded Apr 29 '19 at 18:02
  • @Confounded I cannot make sense of that, because I understand the sum of $X$ to be $x_1+x_2,$ which is univariate. – whuber Apr 29 '19 at 19:03
  • Let $\mathbf{X_1} = [X_{1,1}, X_{1,2}]$ be a bivariate RV (e.g., a bivariate normal with covariance $\Sigma$). Let $\mathbf{X_1} \sim \mathbf{X_2}$. How do we calculate the distribution of a random variable $\mathbf{Y}$ which is given by a sum of two bivariate RVs, i.e. $\mathbf{Y} = \mathbf{X_1} + \mathbf{X_2} = [X_{1,1}, X_{1,2}] + [X_{2,1}, X_{2,2}] = [X_{1,1}+X_{2,1}, X_{1,2}+X_{2,2}] = [Y_1, Y_2]?$ What is the formula for the convolution in this case? – Confounded Apr 30 '19 at 11:38
  • @Confounded Because $Y_1=X_{1,1}+X_{2,1}$ (and similarly for $Y_2$) it's the same formula. – whuber Apr 30 '19 at 12:12
  • But what about the joint distribution (pdf) of $[Y_1, Y_2]$? If we calculate separately for $Y_1$ and $Y_2$, we only get their marginals, no? – Confounded Apr 30 '19 at 12:22
  • @Confounded I'm sorry, I had imagined you were asking about expectations and they don't care about joint distributions. To obtain the full distribution you use the multivariate generalization of the convolution. – whuber Apr 30 '19 at 13:36
  • Which is what I thought this thread was about, since the title is "multivariate convolution formula"... – Confounded May 02 '19 at 09:22
  • @Confounded A close look at the question itself reveals that "multivariate" is intended in the sense of finding the distribution of the sum of $n$ *univariate* variables. – whuber May 02 '19 at 12:31