4

Suppose we have a $p$-dimensional Gaussian distribution, and we take $n$ observations from that distribution.

This answer states that when $p > n$, the sample variance-covariance matrix is singular and has rank $\leq n-1$. Does this also hold when $p = n$? This question on PCA seems to imply that even when $p = n$, the variance-covariance matrix cannot have rank higher than $n-1$.

tchainzzz
Yandle
  • 2
    The rank of the covariance matrix is the dimension of the *affine* subspace generated by the data columns. Euclid informs us that this dimension cannot exceed $n-1,$ *QED.* – whuber Jun 14 '20 at 16:11

1 Answer

8

Yes, the sample variance-covariance matrix is singular if the dimension $p$ is equal to sample size $n$.

Proof:

Let $x_1, \dots, x_n$ be $n$ vectors in $\mathbb{R}^p$, with $n = p$. Denote their mean vector by $\bar{x} = \frac{1}{n}\sum_i x_i$.

Define their sample variance-covariance matrix $$S = \frac{1}{n}\sum_i (x_i - \bar{x}) (x_i - \bar{x})^T = \frac{1}{n} U U^T$$ where $U$ is the (square) $p \times n$ matrix whose columns are $x_i - \bar{x}$.

Since $U$ is square, $$\det(S) = \frac{1}{n^p}\det(U)^2$$ so if $U$ is singular, so is $S$. And $U$ is indeed singular, because its columns sum to the zero vector: $$U \begin{pmatrix} 1\\ \vdots\\ 1 \end{pmatrix} = \begin{pmatrix} \sum_i (x_{i,1} - \bar{x}_1)\\ \vdots\\ \sum_i (x_{i,p} - \bar{x}_p) \end{pmatrix} = \begin{pmatrix} 0\\ \vdots\\ 0 \end{pmatrix}$$
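For readers who want to check the argument numerically, here is a minimal sketch using NumPy. It is an illustration, not part of the proof: the helper names `centered_data_matrix` and `sample_covariance`, the choice $p = n = 5$, and the random seed are all assumptions made for the example. Observations are stored as columns so that $U$ matches the matrix in the proof.

```python
import numpy as np


def centered_data_matrix(X):
    """Return U (p x n), whose columns are x_i - x_bar.

    X is assumed to hold one observation per column.
    """
    return X - X.mean(axis=1, keepdims=True)


def sample_covariance(X):
    """Sample covariance with the 1/n normalization used in the proof."""
    n = X.shape[1]
    U = centered_data_matrix(X)
    return U @ U.T / n


# Illustrative setup (assumed values): p = n = 5 standard Gaussian draws.
rng = np.random.default_rng(0)
p = n = 5
X = rng.standard_normal((p, n))

U = centered_data_matrix(X)
S = sample_covariance(X)

# The all-ones vector lies in the null space of U, up to rounding error.
print("U @ 1 is numerically zero:", np.allclose(U @ np.ones(n), 0.0))

# Hence rank(S) = rank(U) is at most n - 1 and det(S) is zero up to rounding.
print("rank(S):", np.linalg.matrix_rank(S))
print("det(S):", np.linalg.det(S))
```

Replacing the divisor $n$ by $n-1$ only rescales $S$, so the conclusion about singularity does not depend on that choice.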

Pohoua
  • 1
    +1 Very clear and direct. – whuber Jun 14 '20 at 16:11
  • 1
    So to show that the rank of $S$ is at most $n-1$, can I argue that $\operatorname{rank}(S) = \operatorname{rank}(UU^T) = \operatorname{rank}(U)$ and that $U$ only has $n-1$ linearly independent columns because we subtracted $\bar x$ from each $x_i$ (which forces one of the columns of $U$ to be a linear combination of the others)? Is the rank equal to $n-1$ if $x_1, \dots, x_p$ are linearly independent, and otherwise equal to the number of linearly independent predictors? – Yandle Jun 14 '20 at 18:32