
Suppose that there are two independent random variables X and Y, both of which are normal: X ~ N(-1, 1) and Y ~ N(1, 1).

The value of a third random variable Z is equal to either X or Y, each with probability 50%. Is Z normally distributed? If it is, what is its variance?

Yujian
  • Related: https://stats.stackexchange.com/questions/16608/what-is-the-variance-of-the-weighted-mixture-of-two-gaussians/16609#16609. For more similar questions, please search our site: https://stats.stackexchange.com/search?tab=votes&q=mixture%20moment. – whuber Jan 08 '18 at 23:29
  • You could plot the density function of $Z$ at several points, e.g., $-3, -2.9, -2.8, ..., 3$ to get an idea of what it looks like. This would provide you with a lot of information! – jbowman Jan 08 '18 at 23:44
  • @whuber Thank you! For my question, can RV Z be expressed as Z = 0.5X + 0.5Y? For all "weighted mixtures" Z, can they be expressed as Z = X*Px + Y*Py? – Yujian Jan 09 '18 at 00:05
  • @jbowman Thank you jbowman! I did plot that with R and it was normal. But I want a mathematical proof. The link above is helpful. – Yujian Jan 09 '18 at 00:07
  • @Glen_b The wikipedia page solved my last confusion. Thank you Glen_b! – Yujian Jan 09 '18 at 00:13

1 Answer


The random variable $Z$ is NOT $0.5X + 0.5Y$.

The expression $0.5X + 0.5Y$ describes averaging the values of $X$ and $Y$, not choosing one of them with probability $\frac12$.
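The difference between the two constructions can be checked with a quick simulation. This is only a sketch using Python's standard library; the helper names `sample_mixture` and `sample_average`, the seed, and the sample size are illustrative choices, not part of the question.

```python
import random
import statistics


def sample_mixture(rng):
    # Z: choose X or Y with probability 1/2 each, then draw from the chosen one.
    mu = -1.0 if rng.random() < 0.5 else 1.0
    return rng.gauss(mu, 1.0)


def sample_average(rng):
    # 0.5*X + 0.5*Y: draw both variables and average their values.
    x = rng.gauss(-1.0, 1.0)
    y = rng.gauss(1.0, 1.0)
    return 0.5 * x + 0.5 * y


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
mixture = [sample_mixture(rng) for _ in range(n)]
average = [sample_average(rng) for _ in range(n)]

var_mixture = statistics.pvariance(mixture)
var_average = statistics.pvariance(average)
print("variance of the mixture Z:  ", round(var_mixture, 3))
print("variance of 0.5*X + 0.5*Y: ", round(var_average, 3))
```

The sample variance of the mixture should come out near 2, while that of the average should come out near 0.5, so the two random variables cannot have the same distribution.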

With mixtures, it is instead the density (and cdf) that are averaged:

$$f_Z(z) = 0.5 f_X(z) + 0.5 f_Y(z)\,.$$

See Wikipedia's page on mixture distributions for basic information on this. For means $\mu_i$ and sd's $\sigma_i$ with probability weights $w_i$, it gives the variance of the mixture as $\sum _{i=1}^n w_i((\mu _i-\mu )^2+\sigma_i^2)$, where $\mu$ is the mean of the mixture. The results for mean and variance there are pretty easy to show.
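Applied to the question, the variance formula can be evaluated directly. The function name `mixture_mean_var` below is a hypothetical helper introduced only for this illustration.

```python
def mixture_mean_var(weights, means, sds):
    """Mean and variance of a finite mixture with the given component
    weights, means, and standard deviations."""
    # Mixture mean: weighted average of the component means.
    mu = sum(w * m for w, m in zip(weights, means))
    # Mixture variance: sum_i w_i * ((mu_i - mu)^2 + sigma_i^2).
    var = sum(w * ((m - mu) ** 2 + s ** 2)
              for w, m, s in zip(weights, means, sds))
    return mu, var


# Components N(-1, 1) and N(1, 1), each with weight 0.5.
mu, var = mixture_mean_var([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
print(mu, var)  # 0.0 2.0
```

So $Z$ has mean 0 and variance $0.5(1+1) + 0.5(1+1) = 2$.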

If you plot the density of Z correctly you will see that it is not normal.

[Figure: the mixture density and a normal density with the same mean and variance. The mixture has a "flat top" with bigger shoulders, i.e. higher density near 1 standard deviation from the mean.]
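The same comparison can be made numerically without a plot. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; `mixture_pdf` is a hypothetical helper, and the evaluation points (the center and one standard deviation, $\sqrt 2$, from it) are chosen only for illustration.

```python
import math
from statistics import NormalDist

comp_x = NormalDist(-1.0, 1.0)            # X ~ N(-1, 1)
comp_y = NormalDist(1.0, 1.0)             # Y ~ N(1, 1)
matched = NormalDist(0.0, math.sqrt(2))   # normal with the mixture's mean 0 and variance 2


def mixture_pdf(z):
    # f_Z(z) = 0.5 f_X(z) + 0.5 f_Y(z)
    return 0.5 * comp_x.pdf(z) + 0.5 * comp_y.pdf(z)


one_sd = math.sqrt(2)
for z in (0.0, one_sd):
    print(f"z = {z:.3f}: mixture = {mixture_pdf(z):.4f}, "
          f"matched normal = {matched.pdf(z):.4f}")
```

At the center the mixture density is lower than the matched normal density, and at one standard deviation from the mean it is higher, which is the "flat top with bigger shoulders" shape.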

Glen_b
  • Distributions like this have two modes. – Michael R. Chernick Jan 09 '18 at 00:45
  • 2
    This one only has one mode but a larger separation in means (ceteris paribus) will be bimodal. – Glen_b Jan 09 '18 at 01:38
  • 1
    @MichaelChernick I simulated this, and it looks normal (granted, I did it once, but with n = 1000): https://gist.github.com/markhwhiteii/201c6bafbec5d54e5081535508bc9c8e If it looks normal, is it still normal, even if we know it comes from a mixture of two similar distributions? Is that too theoretical? If it looks normal, I would say, "It is a normal distribution", but strictly speaking, it is not... since, by definition, it is a mixture. – Mark White Jan 09 '18 at 02:29
  • 1
    @MarkWhite I will draw the density so that you can see how it compares to a normal. – Glen_b Jan 09 '18 at 04:45
  • 1
    @Mark It's there now; you can see it's noticeably different. – Glen_b Jan 09 '18 at 04:56
  • @Glen_b +1, thanks. And I assume the peak is a plateau because it is spreading out evenly across the two means, -1 and 1? And the further away from one another they are, the more bimodal it becomes? – Mark White Jan 09 '18 at 05:05
  • Imagine we put the centers of the two components at $\pm k$. When $k=0$ you of course have a standard normal. In the neighborhood of the peak each component is approximately quadratic, so when you have $k\approx\epsilon$ for some small $\epsilon$ the average of the densities will also be approximately quadratic near the center, but as you move them further apart it becomes flatter and flatter. Now at 1sd from the mean the normal curve is approximately linear (second derivative is 0 there), which, combined with symmetry, tells us that when we get $k=1$ the average of the densities ...ctd – Glen_b Jan 09 '18 at 05:25
  • ctd... will be flat at the center; We can check that it's still actually unimodal by looking at the behaviour of the second derivative. But as soon as $k>1$ it dips down in the center giving a bimodal mixture. We're right at the dividing line between bimodal and unimodal. As $k$ grows further the modes become more distinct and separate. – Glen_b Jan 09 '18 at 05:25
  • @MarkWhite I simulated and plotted this, and it doesn't quite look normal; you can see the rather flat spot at the top if you overlay a normal density function over the histogram. IMO, because we were told the density function, there's no point debating if it's unimodal or bimodal. We know it's bimodal. Maybe a better question is: if we encountered this in practice, without knowing the density function, how would we treat it? A single normal density would probably be acceptable. – Weiwen Ng Jan 09 '18 at 14:25
  • 1
    @Weiwen This distribution is unimodal, not bimodal. Modes refer to properties of the distribution, not to how it is expressed. So although this one is constructed from two distributions that have two separate modes, it nevertheless is unimodal. You can verify this by finding the zeros of the derivative of its density, which is a negative multiple of $x+1+e^{2x}(x-1)$: it has a unique zero at $x=0$. Whether approximating it with a Normal distribution is "acceptable" will depend on the application. – whuber Jan 09 '18 at 15:29
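The unimodality claim in the last comment can be checked numerically by looking at the sign of $x+1+e^{2x}(x-1)$ on a grid. This is a rough sketch, not a proof; the function name `g`, the grid range $[-5, 5]$, and the step 0.01 are arbitrary choices made for illustration.

```python
import math


def g(x):
    # The factor from the comment; the derivative of the mixture density
    # is a nonzero multiple of this, so the two share the same zeros.
    return x + 1 + math.exp(2 * x) * (x - 1)


# Grid points -5.00, -4.99, ..., 5.00, split at zero.
left = [i / 100 for i in range(-500, 0)]
right = [i / 100 for i in range(1, 501)]

all_negative_left = all(g(x) < 0 for x in left)
all_positive_right = all(g(x) > 0 for x in right)
print(g(0.0), all_negative_left, all_positive_right)
```

If the factor is negative everywhere to the left of 0, zero at 0, and positive everywhere to the right, the density has a single stationary point at $x=0$ on this range, consistent with a single mode.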