
This question is very similar to: Distribution of the ratio of dependent chi-square random variables

But the big difference is what happens when we don't have standard normal variables.


I want to know the expectation (and potentially other moments/distribution) of:

$$ W = \frac{X_1^2}{X_1^2 + \sum_{i=2}^{n} X_i^2} $$

where $X_i \sim N(0, \sigma_i^2)$ for $i = 1, ..., n$.

If all the variances of the $X$-s are equal, then this reduces to a ratio of chi-square random variables and the distribution of $W$ is a beta distribution. However, when the variances are not the same, I can't see a way to use that result.

Is the answer a simple textbook result, or does it need to be worked out from scratch? If it's the latter, what's the best way to go about it? I haven't done this before.

Another way of phrasing it: what is the distribution of $W = \frac{U}{U+V}$, where $U = X_1^2 \sim \Gamma(\tfrac{1}{2}, 2\sigma_1^2)$, and what is $V$? I guess that's another question: if $V = \sum_i \sigma_i^2 Z_i^2$ is a sum of scaled squared standard normal variables, what is its distribution? Is it some generalised form of the Gamma distribution?


Given that some commenters have suggested this may not have a closed-form solution, here is an alternative question. For large $n$ (under suitable conditions on the $\sigma_i^2$), $\sum_{i=2}^{n} X_i^2$ is approximately normally distributed. So the alternative question is this:

If we have

$$ W_2 = \frac{Y^2}{Y^2 + a + bZ}$$

where $Y \sim N(0, 1)$ and $Z \sim N(0, 1)$ are independent standard normal variables, and $a$ and $b$ are positive constants, does this have a closed-form solution?
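For anyone wanting to experiment, this approximate form is at least easy to probe by simulation. Here is a quick Monte Carlo sketch in Python (not part of the question proper; the values $a=10$, $b=1$ are arbitrary, chosen so that $a + bZ$ stays positive in practice):

```python
import numpy as np

def mean_w2_approx(a, b, n_samples=1_000_000, seed=0):
    """Monte Carlo estimate of E[Y^2 / (Y^2 + a + b*Z)], Y, Z independent N(0, 1)."""
    rng = np.random.default_rng(seed)
    y = rng.normal(size=n_samples)
    z = rng.normal(size=n_samples)
    # With b much smaller than a, the denominator is positive with
    # overwhelming probability, so the ratio is well behaved.
    return (y**2 / (y**2 + a + b * z)).mean()

print(mean_w2_approx(a=10.0, b=1.0))
```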

Marses
  • I do not think there is a closed form solution in the general case. – Xi'an Jul 19 '21 at 09:20
  • @Xi'an Yeah the expectation looks like a complicated integral, hard to put it into a symbolic integrator since it has an arbitrary number of dimensions. I updated my question with a simplified form (if we assume n is large and the second chi-square tends to a normal distribution). Do you have any thoughts on that? – Marses Jul 19 '21 at 12:31
  • Re "with large $n$ ... normally distributed:" Not in general. You need to make assumptions about the $\sigma_i^2.$ – whuber Jul 19 '21 at 12:46
  • Yeah I know. But firstly, if the variances are finite and, say, they all go as $\sigma_i^2 / N$ for some "roughly similar" $\sigma_i$, then as $N \rightarrow \infty$ the CLT works, right? And anyway, this is the only useful approximation I could think of. I tried running the integral in Wolfram Alpha, but it said the compute time was too long and I needed a pro account, so I assume even the approximation probably doesn't have a closed-form solution >.< – Marses Jul 19 '21 at 13:04
  • Where does the correlation come into play? Your initial description of the issue doesn't mention correlation and one might think that all of the $X_i^2$ random variables are independent. – JimB Jul 19 '21 at 15:58
  • The title is bad; I meant correlated in the sense that the denominator and numerator are correlated. But I didn't know what to call the ratio: neither the numerator nor the denominator is really a chi-square variable, they are weighted sums of chi-square variables, but that's a complicated title. – Marses Jul 19 '21 at 16:33

1 Answer


I don't see where you've described any correlation structure (despite the title using the term "correlated"), so maybe starting small with $n=2$ and assuming independence is a good first step.

Using Mathematica (or probably just basic algebra) one can find the distribution of

$$W_2=X_1^2/(X_1^2+X_2^2)$$

where $X_i \sim N(0,\sigma_i^2)$.

dist = TransformedDistribution[x^2, x \[Distributed] NormalDistribution[0, s]];
distw2 = TransformedDistribution[x1/(x1 + x2), 
  {x1 \[Distributed] dist /. s -> \[Sigma][1], 
   x2 \[Distributed] dist /. s -> \[Sigma][2]}];
Mean[distw2]

with the result being

$$\frac{\sigma_1}{\sigma_1+\sigma_2}$$

That suggests that maybe for general $n$, there might be a simple formula for the mean. Either further manipulations or simulations could give support or squash that hope.

The variance of $W_2$ is $\frac{\sigma_1 \sigma_2}{2 (\sigma_1+\sigma_2)^2}$.
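A short Monte Carlo simulation (a Python sketch of my own, independent of the Mathematica derivation above; the function name is mine) reproduces both moments:

```python
import numpy as np

def w2_moments(sigma1, sigma2, n_samples=1_000_000, seed=0):
    """Monte Carlo estimates of the mean and variance of W2 = X1^2/(X1^2 + X2^2)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(0.0, sigma1, n_samples)
    x2 = rng.normal(0.0, sigma2, n_samples)
    w = x1**2 / (x1**2 + x2**2)
    return w.mean(), w.var()

s1, s2 = 1.0, 2.0
mean_hat, var_hat = w2_moments(s1, s2)
print(mean_hat, s1 / (s1 + s2))                 # both near 1/3
print(var_hat, s1 * s2 / (2 * (s1 + s2) ** 2))  # both near 1/9
```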

Addition:

The density of $W_3=X_1^2/(X_1^2+X_2^2+X_3^2)$ is

$$-\frac{\sigma_1^2 \sigma_3 E\left(-\frac{(w-1) \sigma_1^2 \left(\sigma_2^2-\sigma_3^2\right)}{\left((w-1) \sigma_1^2-w \sigma_2^2\right) \sigma_3^2}\right)}{\pi \sqrt{w \left(\sigma_2^2 w-\sigma_1^2 (w-1)\right)} \left(\sigma_1^2 (w-1)-\sigma_3^2 w\right)}$$

where $\sigma_2 \leq \sigma_3$ and $E(\cdot)$ is the complete elliptic integral of the second kind. It was found with the following Mathematica code:

dist2 = TransformedDistribution[x2 + x3, 
  {x2 \[Distributed] GammaDistribution[1/2, 2 s[2]^2], 
   x3 \[Distributed] GammaDistribution[1/2, 2 s[3]^2]},
   Assumptions -> 0 < s[2] <= s[3]];

dist123 = TransformedDistribution[x1/(x1 + x23), 
  {x1 \[Distributed] GammaDistribution[1/2, 2 s[1]^2],
   x23 \[Distributed] dist2}];
Simplify[PDF[dist123, w], Assumptions -> s[1] > 0]

I have not found a closed-form expression for the mean when $n=3$, but when the variances are known, numerical integration should easily give the mean and other desired moments.
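For example, a Monte Carlo estimate of the mean is straightforward (a Python sketch of my own, not part of the derivation above); the equal-variance case, where $W \sim \text{Beta}(1/2, (n-1)/2)$ with mean $1/n$, provides a sanity check:

```python
import numpy as np

def mean_w(sigmas, n_samples=1_000_000, seed=0):
    """Monte Carlo estimate of E[X1^2 / sum(Xi^2)], independent Xi ~ N(0, sigma_i^2)."""
    rng = np.random.default_rng(seed)
    sigmas = np.asarray(sigmas, dtype=float)
    x = rng.normal(0.0, sigmas, size=(n_samples, len(sigmas)))
    sq = x ** 2
    return (sq[:, 0] / sq.sum(axis=1)).mean()

print(mean_w([1.0, 1.0, 1.0]))  # near 1/3 (equal variances: Beta(1/2, 1), mean 1/n)
print(mean_w([1.0, 2.0, 3.0]))  # no simple closed form known
```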

Special case

Suppose $\sigma_1^2=\sigma^2_U$ and $\sigma_i^2=\sigma^2_V$ for $i=2,\ldots,n$. Then $W=U/(U+V)$. The mean of that distribution can be found with

dist = TransformedDistribution[
  u/(u + v), {u \[Distributed] GammaDistribution[1/2, 2 \[Sigma]u^2],
   v \[Distributed] GammaDistribution[(n - 1)/2, 2 \[Sigma]v^2]}]

mean = FullSimplify[Mean[dist], Assumptions -> {\[Sigma]u > 0, \[Sigma]v > 0}]

$$\frac{(n+1) \sigma_u^2 \, _2F_1\left(-\frac{1}{2},\frac{n}{2};\frac{n+2}{2};1-\frac{\sigma_v^2}{\sigma_u^2}\right)-\left((n-1) \sigma_v^2+\sigma_u^2\right) \, _2F_1\left(\frac{1}{2},\frac{n}{2};\frac{n+2}{2};1-\frac{\sigma_v^2}{\sigma_u^2}\right)}{n \sigma_u \sigma_v}$$

where $_2 F_1$ is the Gauss hypergeometric function.

Specifying values of $n$ results in great simplifications:

FullSimplify[mean /. n -> 2, Assumptions -> {\[Sigma]u > 0, \[Sigma]v > 0}]

$$\frac{\sigma_U}{\sigma_U+\sigma_V}$$

FullSimplify[mean /. n -> 6, Assumptions -> {\[Sigma]u > 0, \[Sigma]v > 0}]

$$\frac{\sigma_U^2 (3 \sigma_U+\sigma_V)}{3 (\sigma_U+\sigma_V)^3}$$
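A Monte Carlo check (a Python sketch of my own, drawing $U$ and $V$ as scaled chi-square variables, which matches the gamma parameterization above) agrees with the $n=6$ formula:

```python
import numpy as np

def mean_special(sigma_u, sigma_v, n, n_samples=1_000_000, seed=0):
    """Monte Carlo estimate of E[U/(U+V)], U ~ sigma_u^2 chi2(1), V ~ sigma_v^2 chi2(n-1)."""
    rng = np.random.default_rng(seed)
    u = sigma_u ** 2 * rng.chisquare(1, n_samples)
    v = sigma_v ** 2 * rng.chisquare(n - 1, n_samples)
    return (u / (u + v)).mean()

su, sv = 1.0, 2.0
closed_form = su ** 2 * (3 * su + sv) / (3 * (su + sv) ** 3)
print(mean_special(su, sv, n=6), closed_form)  # both near 5/81
```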

JimB
  • Ah yeah sorry, the title is misleading; I meant correlated in the sense that the denominator as a whole is correlated with the numerator, though really neither the numerator nor the denominator is chi-square distributed. – Marses Jul 19 '21 at 16:21
  • That's interesting and a bit surprising actually, since my first (incorrect) intuition for this problem was that the answer would be the ratio of the variances (i.e. $\frac{\sigma_1^2}{\sigma_1^2 + \sum \sigma_i^2}$). Since asking my question, I ran numerical tests with $\sum_{i=1} \sigma_i^2 = 1$, a fixed $\sigma_1$ and all the other $\sigma_i^2 = \frac{1 - \sigma_1^2}{N-1}$ for $i = 2, ..., N$, and the results "look" like they tend to an asymptote, and numerically it looks like it doesn't depend on N as long as N is large, which is why I thought the second ratio would be easier. – Marses Jul 19 '21 at 16:30
  • By "second ratio" I mean $ W_2 = \frac{Y^2}{Y^2 + a + bZ}$, which is what you get if you assume that $X_i$ all have the same variance for i != 1, and that N is large so the chi-square sum tends toward a normal distribution. – Marses Jul 19 '21 at 16:38
  • Just tried a few simulations for $n=3$ and the hoped for simple formula for the mean doesn't work. Bummer. Maybe some other pattern will appear. – JimB Jul 19 '21 at 16:38
  • For a large $n$, you might consider using a gamma distribution with the same mean and variance as the sum of the $X_i^2$ ($i=2,\ldots,n$) random variables. That will result in an explicit approximation for the mean (as I'm not so sure that you'll get that assuming a normal distribution). – JimB Jul 19 '21 at 18:15
  • Not sure I understood that last part. I thought the Gamma distribution is exactly the same distribution as a sum of $X_i^2$ (when the Gamma shape is $(N-1)/2$ and all the $X_i$ have the same variance). Or do you mean to use a Gamma to approximate $X_i^2$ with different variances? Also what is an explicit approximation? – Marses Jul 20 '21 at 07:06
  • Yes, it is to use a gamma rather than a normal to estimate the sum of gammas with different variances. It would be a gamma that has the same mean and variance as the sum of the gammas with different variances. – JimB Jul 20 '21 at 12:02
  • The special case result looks very promising actually. I'll test it out when I get the time. What is $F_1$ in the result actually? Is it a special beta distribution or another known distribution? – Marses Jul 20 '21 at 12:30
  • Sorry. Just added a definition of the $_2 F_1$ hypergeometric function. – JimB Jul 20 '21 at 12:36
  • Thanks, that's actually something. I'll accept this answer. If the answer to the non-approximated distribution even exists and if it's posted, I'll switch to accepting that one (it seems that's unlikely though). But thanks for taking the time. Also, I compared your analytic result to the numerical test and it looks good (of course). – Marses Jul 20 '21 at 12:42
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/127714/discussion-between-marses-and-jimb). – Marses Jul 20 '21 at 13:42