2

I am trying to calculate the expectation of the absolute value of a linear combination of independent Bernoulli random variables, $E(|a_1x_1 + a_2x_2 + ...+ a_nx_n|)$. In particular, $(a_1, ..., a_n)$ are the coordinates of a unit vector.

Of course, if all $a_i = 1$ we get a Binomial distribution, but I am wondering whether, for general coefficients $a_1, ..., a_n$, the sum $a_1x_1 + a_2x_2 + ...+ a_nx_n$ follows any distribution for which I can easily get the expectation of the absolute value.

kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
Jane
  • 23
  • 3
  • 2
    Some closely related posts with relevant information are https://stats.stackexchange.com/questions/5347 and https://stats.stackexchange.com/questions/41247. – whuber Dec 04 '20 at 14:27
  • 1
    They are very informative indeed, thank you! – Jane Dec 04 '20 at 22:58

2 Answers

3

Let $Y = \sum a_iX_i$.

In general the cost of computing $\mathbb{E}(|Y|)$ exactly will likely grow exponentially with $n$: calculating $\mathbb{P}(|Y| = y)$ involves deciding whether there exists a subset of the $a_i$ whose sum is either $y$ or $-y$, which is the NP-complete subset-sum problem.
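For small $n$, though, the expectation can still be computed exactly by enumerating all $2^n$ outcomes. A minimal sketch in Python, assuming the $X_i$ share a common success probability $p$:

```python
from itertools import product

def exact_abs_mean(a, p):
    """E(|sum a_i X_i|) by enumerating all 2^n Bernoulli outcomes.
    Cost is O(2^n), so this is only feasible for small n."""
    total = 0.0
    for x in product([0, 1], repeat=len(a)):
        prob = 1.0
        for xi in x:
            prob *= p if xi else (1 - p)
        total += prob * abs(sum(ai * xi for ai, xi in zip(a, x)))
    return total

# Example: a unit vector with mixed signs; the four equally likely
# outcomes give |Y| in {0, 0.6, 0.8, 0.2}, so the mean is 0.4.
print(exact_abs_mean([0.6, -0.8], 0.5))  # 0.4
```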

There are variants of the Central Limit Theorem for non-identically distributed independent random variables like your $a_iX_i$, such as the Lyapunov CLT. Based on this, for large $n$ you could try approximating $Y$ with a normal distribution with mean $\sum{a_ip}$ and variance $\sum{a_i^2p(1-p)}$, and $|Y|$ with the corresponding folded normal distribution.
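A sketch of this approximation, using the standard closed form for the mean of a folded normal, $\sigma\sqrt{2/\pi}\,e^{-\mu^2/2\sigma^2} + \mu(1 - 2\Phi(-\mu/\sigma))$:

```python
from math import sqrt, pi, exp, erf

def folded_normal_abs_mean(a, p):
    """CLT approximation of E(|sum a_i X_i|): treat Y as
    N(mu, sigma^2), so |Y| is approximately folded normal."""
    mu = p * sum(a)
    sigma = sqrt(p * (1 - p) * sum(ai**2 for ai in a))
    phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF
    return (sigma * sqrt(2 / pi) * exp(-mu**2 / (2 * sigma**2))
            + mu * (1 - 2 * phi(-mu / sigma)))

# Example: a_i = 1/sqrt(n) for n = 100, p = 0.5 gives mu = 5,
# sigma = 0.5, so E|Y| is essentially mu.
print(folded_normal_abs_mean([0.1] * 100, 0.5))  # approximately 5.0
```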

But this will be inaccurate if most of the variance in $Y$ comes from only a few of the $a_i$: for example, if one of the $a_i$ is much larger in magnitude than all the others, the distribution of $Y$ will be bimodal rather than normal. (If your unit vector is uniformly chosen from the $n$-dimensional unit sphere this is unlikely to happen for large $n$. A random unit vector can be chosen by taking coordinates from $n$ i.i.d. normal distributions and normalising the results, and since normal distributions have thin tails, it's unlikely that a small number of these coordinates will be significantly bigger than the rest.)

It might be easiest to just approximate the expectation using Monte Carlo (simulate $|Y|$ many times and take the mean of the simulations).

fblundun
  • 3,732
  • 1
  • 5
  • 18
0

You might try the saddlepoint approximation, see How does saddlepoint approximation work?. For that we need the mgf (moment generating function) and its logarithm, the cgf (cumulant generating function). The mgf of a single Bernoulli($p$) variable is $(1-p)+p e^t$, so the mgf of $a_i X_i$ is $(1-p)+p e^{a_i t}$.

But we are interested in the absolute value of the sum, so let us split the sum according to the sign of the $a_i$'s. Let $I_+, I_-$ be the sets of indices with positive and negative coefficients, and set $Y_+=\sum_{i\in I_+} a_i X_i$ and $Y_- =\sum_{i\in I_-} a_i X_i$, so that $Y = Y_+ + Y_-$ and $$ |Y| \le Y_+ - Y_- = \sum_i |a_i| X_i, $$ with equality unless $Y_+$ and $Y_-$ are both nonzero. We get the mgf's $$ M_+(t) =\prod_{i\in I_+}\left( (1-p)+p e^{a_i t} \right), \qquad M_-(t) =\prod_{i\in I_-}\left( (1-p)+p e^{a_i t} \right). $$ The cgf's $K_\pm(t)=\log M_\pm(t)$ are the logarithms of the mgf's, and since $Y_+$ and $Y_-$ are independent, the cgf of the upper bound $Y_+ - Y_-$ is $$ K_{Y_+-Y_-}(t)= K_+(t) + K_-(-t). $$
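The combination $K_+(t) + K_-(-t)$ can be sanity-checked numerically: it should equal the cgf of $Y_+ - Y_- = \sum_i |a_i| X_i$ computed by brute-force enumeration. A small sketch with hypothetical coefficients:

```python
from math import log, exp
from itertools import product

def cgf_part(a_sub, p, t):
    """cgf of sum_{i in subset} a_i X_i for Bernoulli(p) X_i."""
    return sum(log((1 - p) + p * exp(ai * t)) for ai in a_sub)

def cgf_enumerated(a_plus, a_minus, p, t):
    """cgf of Y_+ - Y_- = sum |a_i| X_i by enumerating all outcomes."""
    a = a_plus + [-ai for ai in a_minus]  # flip signs of negative coefficients
    m = 0.0
    for x in product([0, 1], repeat=len(a)):
        prob = 1.0
        for xi in x:
            prob *= p if xi else (1 - p)
        m += prob * exp(t * sum(ai * xi for ai, xi in zip(a, x)))
    return log(m)

# Hypothetical example: two positive and one negative coefficient
a_plus, a_minus, p, t = [0.5, 0.3], [-0.4], 0.3, 0.7
lhs = cgf_part(a_plus, p, t) + cgf_part(a_minus, p, -t)
rhs = cgf_enumerated(a_plus, a_minus, p, t)
print(abs(lhs - rhs))  # agrees to floating-point precision
```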

kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467