What is the sampling distribution of the variance of a collection of variables that follow a multivariate normal distribution? Specifically, assume that the $n$-dimensional vector $\boldsymbol{x} \sim \mathcal{N}(\boldsymbol{\mu},\boldsymbol{\Sigma})$, where $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are known. Denote the sample mean by $\bar{x} = \sum_{i=1}^n x_i/n$. What is the distribution of $s^2_x \equiv \sum_{i=1}^n (x_i - \bar{x})^2/(n-1)$?

I know that the sample variance of a collection of independent and identically distributed normal variables follows a scaled chi-squared distribution, but I have been unable to find an extension to the case of correlated normal variables.

I have posed this question more generally, but I am specifically interested in the case of exchangeable variables $x_i$ which marginally have the same variance but are positively correlated with each other. I have simulated the problem with various variance and correlation parameters and suspect that the sample variance is chi-squared in this instance as well, but would like a reliable reference for this result if true.

It seems that a transformation of a multivariate normal distribution would be useful here. As far as I understand, we can write $\boldsymbol{x} = \boldsymbol{\mu} + \boldsymbol{A}\boldsymbol{z}$, where $\boldsymbol{z}$ is an $n$-dimensional vector of independent and identically distributed standard normal variables and $\boldsymbol{A}$ is the Cholesky factor of $\boldsymbol{\Sigma}$, so that $\boldsymbol{A}\boldsymbol{A}' = \boldsymbol{\Sigma}$.
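A minimal sketch of this construction in NumPy may make it concrete. The values of `n`, `sigma2`, and `rho` are illustrative assumptions (an exchangeable covariance matrix), not taken from any particular data set.

```python
import numpy as np

# Assumed illustrative parameters: common variance sigma2, common correlation rho.
n, sigma2, rho = 4, 2.0, 0.3
mu = np.zeros(n)
Sigma = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))

# Lower-triangular Cholesky factor A, satisfying A @ A.T == Sigma.
A = np.linalg.cholesky(Sigma)

rng = np.random.default_rng(0)
z = rng.standard_normal(n)  # i.i.d. standard normal components
x = mu + A @ z              # one draw from N(mu, Sigma)

# Check the defining property of the factor.
factor_ok = np.allclose(A @ A.T, Sigma)
print(factor_ok)
```

The Cholesky factor only exists when $\boldsymbol{\Sigma}$ is positive definite, which holds here because $0 \le \rho < 1$.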

kjetil b halvorsen
Oak Wall
  • Exchangeability implies they all have equal means, so you might as well assume $\mu=0.$ That considerably simplifies the problem. – whuber Aug 30 '18 at 19:01
  • I agree, in the exchangeable case we can set $\boldsymbol{\mu} = 0$ without loss of generality. I figured it would be best to pose the question in its more general form for future people searching for answers. – Oak Wall Aug 31 '18 at 16:26

1 Answer

I assume you are treating each vector $\boldsymbol{x}$ as one sample. For convenience, I write this vector as $x$ below.

$\sum_{i=1}^n (x_i - \bar x)^2 = (x-\frac{1}{n} 1_n1_n'x)'(x-\frac{1}{n} 1_n1_n'x)=x'(I_n-\frac{1}{n} 1_n1_n')(I_n-\frac{1}{n} 1_n1_n')x$, where $1_n$ is the vector of ones of length $n$.

The centering matrix $I_n-\frac{1}{n} 1_n1_n'$ is symmetric and idempotent, so the product of the two factors collapses to a single one.

This gives $\sum_{i=1}^n (x_i - \bar x)^2 =x'(I_n-\frac{1}{n} 1_n1_n')x$.
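The identity is purely algebraic, so it can be checked numerically on any vector. The sketch below uses an arbitrary random `x` (an assumption for illustration only) and also confirms that the centering matrix is symmetric and idempotent with rank $n-1$.

```python
import numpy as np

n = 5
rng = np.random.default_rng(1)
x = rng.standard_normal(n)  # any vector works; the identity does not depend on the distribution

ones = np.ones((n, 1))
C = np.eye(n) - ones @ ones.T / n  # centering matrix I_n - (1/n) 1_n 1_n'

ss_direct = np.sum((x - x.mean()) ** 2)  # sum of squared deviations from the mean
ss_quadratic = x @ C @ x                 # the same quantity as a quadratic form

is_symmetric = np.allclose(C, C.T)
is_idempotent = np.allclose(C @ C, C)
# For an idempotent matrix the rank equals the trace.
rank_C = int(round(np.trace(C)))

print(np.isclose(ss_direct, ss_quadratic), is_symmetric, is_idempotent, rank_C)
```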

Using $\boldsymbol{x} = \boldsymbol{\mu} + \boldsymbol{A}\boldsymbol{z}$ as you suggested (written $x = \mu + Az$ below), the right-hand side of the above equation is

$x'(I_n-\frac{1}{n} 1_n1_n')x=(\mu+Az)'(I_n-\frac{1}{n} 1_n1_n')(\mu+Az)$.

Assuming $\Sigma$ is positive definite, $A$ is nonsingular, so $\mu+Az = A(A^{-1}\mu+z)$ and we get

$(n-1)s_x^2=(A^{-1}\mu+z)'A'(I_n-\frac{1}{n} 1_n1_n')A(A^{-1}\mu+z)$.

This quantity has a noncentral $\chi^2$ distribution with $\operatorname{rank}(I_n-\frac{1}{n} 1_n1_n')=n-1$ degrees of freedom and noncentrality parameter $\lambda=\mu'(I_n-\frac{1}{n} 1_n1_n')\mu$ if and only if $(I_n-\frac{1}{n} 1_n1_n')AA'(I_n-\frac{1}{n} 1_n1_n')=(I_n-\frac{1}{n} 1_n1_n')$.

Edit 1. This follows from a more general result in distribution theory. When $x\sim N(\mu,\Sigma)$ with $\Sigma$ positive definite and $B$ is a symmetric matrix of rank $r$, $x'Bx$ has a noncentral $\chi^2$ distribution with $r$ degrees of freedom and noncentrality parameter $\mu'B\mu$ if and only if $B\Sigma B=B$.
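For the exchangeable case raised in the question, the condition can be checked directly. The sketch below assumes $\Sigma = \sigma^2[(1-\rho)I_n + \rho 1_n1_n']$; the helper names and the values of `n`, `sigma2`, and `rho` are illustrative choices, not part of the original question. Writing $C = I_n-\frac{1}{n} 1_n1_n'$, one finds $C\Sigma C = \sigma^2(1-\rho)C$, so the condition fails for $B = C$ unless $\sigma^2(1-\rho)=1$, but it holds for the rescaled matrix $B = C/(\sigma^2(1-\rho))$.

```python
import numpy as np

def centering_matrix(n):
    """Return I_n - (1/n) 1_n 1_n'."""
    return np.eye(n) - np.ones((n, n)) / n

def exchangeable_cov(n, sigma2, rho):
    """Covariance with common variance sigma2 and common correlation rho (assumed form)."""
    return sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))

n, sigma2, rho = 6, 2.5, 0.4  # illustrative values
C = centering_matrix(n)
Sigma = exchangeable_cov(n, sigma2, rho)
scale = sigma2 * (1 - rho)

# C Sigma C reduces to a multiple of C because C annihilates the vector of ones.
collapses = np.allclose(C @ Sigma @ C, scale * C)

# The condition B Sigma B = B fails for B = C here but holds after rescaling.
unscaled_holds = np.allclose(C @ Sigma @ C, C)
B = C / scale
condition_holds = np.allclose(B @ Sigma @ B, B)

print(collapses, unscaled_holds, condition_holds)
```

Under these assumptions, $(n-1)s_x^2/(\sigma^2(1-\rho))$ satisfies the condition with $r = n-1$, and the noncentrality parameter is zero when all means are equal, since then $C\mu = 0$.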

Oak Wall
KDG
  • Thank you. Do you have a reference for the more general distribution theory you noted in "edit 1"? – Oak Wall Aug 29 '18 at 21:15
  • Hogg, Craig, and McKean, Introduction to Mathematical Statistics. – KDG Aug 29 '18 at 22:52
  • General result discussed at https://stats.stackexchange.com/q/188626/119261. – StubbornAtom Jun 25 '20 at 18:15
  • To complete the answer, are we saying that the expectation of $(n-1)s^2$ is $r+\lambda=n-1+\mu'B\mu$, i.e. the expectation of the sample variance is the sample variance of $\mu$ + 1? How does this relate to the iid case? Also, you write if and only if $B\Sigma B=B$, is this not always the case? Thanks a lot!!! – Matifou Aug 18 '20 at 18:30
  • The link provided by StubbornAtom above is excellent and provides what is needed for the more general answer. – Oak Wall Feb 11 '21 at 13:03