
I want to prove that $(n-1)S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2$ can be written as $\sum_{i=2}^{n} Y_i^2$, where $Y_i \sim N(0,\sigma^2)$ and the $X_i$ and $Y_i$ are i.i.d.

I managed to do it by taking a big detour. I used the fact that $S^2$ and $\bar{X}$ are independent (Basu's theorem), then showed that the moment generating function of $(n-1)S^2/\sigma^2$ is that of a chi-square distribution with $n-1$ degrees of freedom, so it must be expressible as a sum of squares of $n-1$ i.i.d. standard normal variables $Z_i$.

But isn't there a more direct approach where you can actually compute/figure out each of those $Y_i$'s?

J.333

2 Answers


Normal data with $\mu$ and $\sigma$ both unknown. If you know that the population is $\mathsf{Norm}(\mu=0,\sigma),$ then there is no point estimating $\mu$ by $\bar X.$ Furthermore, if you then ignore $\bar X,$ the distribution theory is a little different, using $\mathsf{Chisq}(\nu=n)$ instead of $\mathsf{Chisq}(\nu=n-1).$

Normal data with $\mu$ known and $\sigma$ unknown. In general, for data from $\mathsf{Norm}(\mu=\mu_0,\sigma),$ with $\mu_0$ known, you have $Z_i = \frac{X_i-\mu_0}{\sigma},$ thus $Z_i^2 = \frac{(X_i-\mu_0)^2}{\sigma^2}$ $\sim \mathsf{Chisq}(\nu = 1),$ and $$\sum_{i=1}^n Z_i^2 = \sum_{i=1}^n\frac{(X_i-\mu_0)^2}{\sigma^2} \sim\mathsf{Chisq}(\nu=n).$$

Then with $\mu_0 = 0$ and $V =\frac{1}{n}\sum_{i=1}^n X_i^2,$ you have $\frac{nV}{\sigma^2} \sim\mathsf{Chisq}(\nu=n).$
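As a quick sanity check on this distributional fact (my own sketch, not part of the answer's derivation): with $\mu_0 = 0$ known, $nV/\sigma^2$ should have the mean $n$ and variance $2n$ of $\mathsf{Chisq}(n)$. The answer's examples use R; the sketch below uses Python/NumPy, with the seed, $n$, $\sigma$, and replication count all arbitrary choices:

```python
import numpy as np

# Simulation check: n*V/sigma^2 should behave like Chisq(n) when mu = 0
# is known. Seed and sizes are illustrative, not from the answer.
rng = np.random.default_rng(803)
n, sigma, reps = 10, 5.0, 100_000

x = rng.normal(0, sigma, size=(reps, n))
v = (x**2).mean(axis=1)        # V = mean of squares, one per sample
q = n * v / sigma**2           # should be ~ Chisq(n)

print(q.mean())   # Chisq(n) has mean n = 10
print(q.var())    # Chisq(n) has variance 2n = 20
```

With $10^5$ replications the simulated mean and variance land within sampling error of $10$ and $20$.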

Examples. As a practical application, suppose you have $n=10$ observations from $\mathsf{Norm}(\mu=0,\sigma=5),$ you don't know either $\mu$ or $\sigma,$ and you want a 95% confidence interval for $\sigma^2.$ [Using R.]

set.seed(803)
x = rnorm(10, 0, 5)
a = mean(x);  s = sd(x)
a;  s
[1] -0.460746
[1] 6.010824

If you don't know either $\mu$ or $\sigma,$ then a 95% CI for $\sigma^2$ is of the form $\left(\frac{9S^2}{U},\, \frac{9S^2}{L}\right) =$ $(17.1,\, 120.4),$ where $L$ and $U$ cut 2.5% of the probability from the lower and upper tails, respectively, of $\mathsf{Chisq}(9).$ The corresponding 95% CI for $\sigma$ is $(4.13, 10.97),$ which does happen to include $\sigma = 5.$

CI = 9*s^2/qchisq(c(.975,.025), 9); CI
[1]  17.09373 120.41598
sqrt(CI)
[1]  4.134456 10.973422

However, if you know that $\mu = 0$ and want a 95% CI for $\sigma^2,$ then the CI is of the form $\left(\frac{10V}{U},\, \frac{10V}{L}\right) =$ $(16.0,100.8),$ where $L$ and $U$ cut 2.5% of the probability from the lower and upper tails, respectively, of $\mathsf{Chisq}(10).$ The corresponding 95% CI for $\sigma$ is $(4.00, 10.04).$ Notice that knowing $\mu=0$ provides relevant information, so that this CI is a little shorter than the one for unknown $\mu.$

v = sum(x^2)/10;  v
[1] 32.7293
CI = 10*v/qchisq(c(.975,.025), 10);  CI
[1]  15.97862 100.79941
sqrt(CI)
[1]  3.997327 10.039891

Simulations. In the upper plot below, the density function for $\mathsf{Chisq}(9)$ fits the histogram of simulated values of $9S^2/\sigma^2.$ In the lower plot, the density of $\mathsf{Chisq}(10)$ fits the histogram of simulated values of $10V/\sigma^2,$ but the density of $\mathsf{Chisq}(9)$ does not. [R code below figure.]

[Figure: histograms of simulated $9S^2/\sigma^2$ (top, labeled CHISQ(9)) and $10V/\sigma^2$ (bottom, labeled CHISQ(10)), each with the matching chi-square density overlaid; the bottom panel also shows the mismatched $\mathsf{Chisq}(9)$ density dotted in red.]

set.seed(2020)
s = replicate(10^5, sd(rnorm(10,0,5)))
q.9 = 9*s^2/25
v = replicate(10^5, mean(rnorm(10,0,5)^2))
q.10 = 10*v/25

mx = max(q.10,q.9)
par(mfrow=c(2,1))
 hist(q.9, prob=T, br=30, xlim=c(0,mx), col="skyblue", main="CHISQ(9)")
  curve(dchisq(x,9), add=T, lwd=2)
 hist(q.10, prob=T, br=30, xlim=c(0,mx), col="skyblue", main="CHISQ(10)")
  curve(dchisq(x,10), add=T, lwd=2)
  curve(dchisq(x,9), add=T, col="red", lwd=3, lty="dotted")
par(mfrow=c(1,1))
BruceET

You could look at your problem from a slightly different perspective.

Theorem. Let $y\sim N_n(0,\sigma^2 I_n)$ and let $Q=\sigma^{-2}y′Ay$ for a symmetric matrix $A$ of rank $r$. Then if $A$ is idempotent, $Q$ has a $\chi^2(r)$ distribution.

Proof. See Mathai & Provost, Quadratic Forms in Random Variables, New York: Marcel Dekker, 1992, p. 196, or the accepted answer to "Distribution of a quadratic form, non-central chi-squared distribution."
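Before applying the theorem, here is a numerical illustration of it (my own sketch, not from the cited proof), using the centering matrix $A = I - J/n$, which is symmetric and idempotent with rank $n-1$: the simulated quadratic form $\sigma^{-2}y'Ay$ should show the mean $r$ and variance $2r$ of $\chi^2(r)$. All parameter values below are arbitrary choices.

```python
import numpy as np

# Check the quadratic-form theorem on A = I - J/n: A is symmetric
# idempotent with rank r = trace(A) = n - 1, so y'Ay / sigma^2
# should behave like Chisq(r). Seed and sizes are illustrative.
rng = np.random.default_rng(2020)
n, sigma, reps = 5, 2.0, 100_000
A = np.eye(n) - np.ones((n, n)) / n

assert np.allclose(A @ A, A)            # idempotent
r = round(np.trace(A))                  # rank = trace = n - 1 = 4
eigvals = np.linalg.eigvalsh(A)
assert np.all(np.isclose(eigvals, 0) | np.isclose(eigvals, 1))

y = rng.normal(0, sigma, size=(reps, n))
Q = np.einsum('ij,jk,ik->i', y, A, y) / sigma**2   # y'Ay / sigma^2

print(Q.mean())   # Chisq(r) has mean r = 4
print(Q.var())    # Chisq(r) has variance 2r = 8
```

The eigenvalue check reflects why the theorem holds: an idempotent $A$ has eigenvalues in $\{0,1\}$, so after rotation the quadratic form is a sum of $r$ squared standard normals.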

Now: $$\sum_{i=1}^n\frac{(X_i-\overline{X}_n)^2}{\sigma^2}=\sum_{i=1}^n\frac{((X_i-\mu)-(\overline{X}_n-\mu))^2}{\sigma^2}=\sum_{i=1}^n(Y_i-\overline{Y}_n)^2$$ where $Y_i=(X_i-\mu)/\sigma\sim N(0,1)$.

Let $\mathbf{1}=(1,1,\dots,1)'$, $\mathbf{J}=\mathbf{11}'$. \begin{align*} \sum_{i=1}^n(Y_i-\overline{Y}_n)^2&=\sum_{i=1}^n Y_i^2-\frac1n\left(\sum_{i=1}^nY_i\right)^2\\ &=\mathbf{Y}'\mathbf{IY}-\frac1n \mathbf{Y}'\mathbf{JY}=\mathbf{Y}'\left(\mathbf{I}-\frac1n \mathbf{J}\right)\mathbf{Y} \end{align*} $\mathbf{D}=\mathbf{I}-\frac1n \mathbf{J}$ is a symmetric and idempotent matrix, and its diagonal elements are $(n-1)/n$, for example:

sage: One = matrix([[1],[1],[1]]); One                                                                                              
[1]
[1]
[1]
sage: J = One * One.transpose(); J                                                                                                  
[1 1 1]
[1 1 1]
[1 1 1]
sage: I = identity_matrix(3)                                                                                                        
sage: D = I - J/3; D                                                                                                                
[ 2/3 -1/3 -1/3]
[-1/3  2/3 -1/3]
[-1/3 -1/3  2/3]
sage: D * D                                                                                                                         
[ 2/3 -1/3 -1/3]
[-1/3  2/3 -1/3]
[-1/3 -1/3  2/3]

therefore $\text{rank}(\mathbf{D})=\text{trace}(\mathbf{D})=n-1$, and $\sum_i(Y_i-\overline{Y}_n)^2=\sigma^{-2}\sum_i(X_i-\overline{X}_n)^2\sim\chi^2(n-1)$.
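This route also produces the explicit $Y_i$'s the question asked for. Since $\mathbf{D}$ is symmetric and idempotent with rank $n-1$, it factors as $\mathbf{D} = \mathbf{P}\mathbf{P}'$, where the $n-1$ columns of $\mathbf{P}$ are orthonormal eigenvectors of $\mathbf{D}$ for eigenvalue $1$ (all orthogonal to $\mathbf{1}$, since $\mathbf{D1}=\mathbf{0}$). Then $\mathbf{Y} = \mathbf{P}'\mathbf{X}$ has $n-1$ i.i.d. $N(0,\sigma^2)$ coordinates, even for $\mu \neq 0$, and $\sum_i Y_i^2 = \mathbf{X}'\mathbf{DX} = (n-1)S^2$ holds exactly, sample by sample. A hedged numerical sketch (the eigendecomposition route and all parameter values are my choices; one standard explicit choice of $\mathbf{P}$ comes from the Helmert matrix):

```python
import numpy as np

# Explicit construction: D = P P', with the columns of P orthonormal
# eigenvectors of D for eigenvalue 1. Then Y = P'X gives n-1 variables
# whose squares sum to (n-1)S^2 exactly. Parameters are illustrative.
rng = np.random.default_rng(0)
n, mu, sigma = 10, 3.0, 5.0
D = np.eye(n) - np.ones((n, n)) / n

eigvals, eigvecs = np.linalg.eigh(D)
P = eigvecs[:, np.isclose(eigvals, 1)]   # n x (n-1), orthonormal columns

x = rng.normal(mu, sigma, size=n)        # nonzero mean: P'1 = 0 removes it
y = P.T @ x                              # the n-1 explicit Y_i's
lhs = (y**2).sum()                       # sum of Y_i^2
rhs = ((x - x.mean())**2).sum()          # (n-1) * S^2
print(np.isclose(lhs, rhs))             # True: the identity is exact
```

Note that the identity is algebraic, not merely distributional: it holds for every realization of the data, which answers the question of actually computing the $Y_i$'s.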

Sergio