
In a book I'm reading (Probabilistic Machine Learning: An Introduction), the author suggests that in high dimensions the MLE of the covariance matrix of a multivariate Gaussian is often poorly conditioned.

I'm trying to understand: is there a mathematical explanation for why the MLE of the covariance matrix of a high-dimensional multivariate Gaussian is likely to be ill-conditioned? Is this even the case?

I couldn't find any evidence for this online other than people encountering ill-conditioned matrices while fitting a multivariate Gaussian.

  • 3
    The simplest explanation is that the number of parameters needed to specify the (full) covariance of a Gaussian is $p(p+1) / 2$, i.e. order $p^2$, where $p$ is the dimension; so if the number of observations $n$ is of the same order as (or smaller than) $p$, you end up with a close to singular (or singular) empirical covariance matrix. – dr.ivanova Jan 12 '22 at 13:27
  • @dr.ivanova This looks like an answer rather than a comment. – Christian Hennig Jan 12 '22 at 14:33
  • @dr.ivanova The author said something similar in the book but I still don't see it! Why does the fact that there are many parameters mean the matrix will be close to singular? – user346500 Jan 12 '22 at 16:54

1 Answer


A covariance matrix is ill-conditioned when it is singular or near-singular.
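To make "near-singular" concrete, here is the usual 2-norm condition number for a symmetric positive semi-definite matrix. This is the standard definition and I am assuming it is the one the book has in mind; $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest eigenvalues of $\hat{\Sigma}$:

$$\kappa(\hat{\Sigma})=\frac{\lambda_{\max}(\hat{\Sigma})}{\lambda_{\min}(\hat{\Sigma})}$$

So $\kappa$ grows without bound as $\lambda_{\min}\to 0$, and is infinite when the matrix is exactly singular.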

Suppose you have data $X\in\mathbb{R}^{n\times p}$ which you wish to model as multivariate Gaussian of $p$ dimensions; $n$ here is the number of observations.

A full covariance matrix has $p(p+1)/2$ free parameters, so if the number of observations $n$ is of the same order as (or smaller than) $p$, you end up with a close to singular (or singular) empirical covariance matrix.

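To understand why, suppose without loss of generality that the columns of $X$ have mean 0. Then the empirical covariance matrix is $\hat{\Sigma}=\frac{1}{n}X^TX\in\mathbb{R}^{p\times p}$. The rank of the empirical covariance matrix is at most $\min(n, p)$, because $\operatorname{rank}(\hat{\Sigma})=\operatorname{rank}(X^TX)=\operatorname{rank}(X)\leq \min(n, p)$. Hence if $n<p$, the empirical covariance matrix has rank less than $p$ and is singular. When $n$ is only slightly larger than $p$, it is typically invertible but has some eigenvalues very close to 0, which makes it near-singular.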
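If it helps, the rank argument can be checked numerically. The following is only a minimal sketch, assuming NumPy and simulated standard-normal data (so the true covariance is the identity and perfectly conditioned); the helper name `empirical_covariance`, the dimension `p = 50` and the sample sizes are my own choices for illustration. Note that when the mean is estimated from the data, centering costs one more degree of freedom, so the rank is at most $n-1$.

```python
import numpy as np

def empirical_covariance(X):
    """MLE covariance (divides by n) of the rows of X, shape (n, p)."""
    Xc = X - X.mean(axis=0)        # centre each column
    return Xc.T @ Xc / X.shape[0]  # p x p matrix

rng = np.random.default_rng(0)
p = 50
results = {}
for n in (20, 60, 5000):
    X = rng.standard_normal((n, p))  # true covariance is the identity
    S = empirical_covariance(X)
    # store (numerical rank, 2-norm condition number) for this sample size
    results[n] = (np.linalg.matrix_rank(S), np.linalg.cond(S))
    print(f"n={n:5d}  rank={results[n][0]:3d}  cond={results[n][1]:.3g}")
```

With $n<p$ the rank falls short of $p$ and the condition number is astronomically large; with $n$ slightly above $p$ the matrix is full rank but still much worse conditioned than with $n\gg p$, even though the true covariance here has condition number 1.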

I found this related question with a very detailed answer which you will find helpful.

dr.ivanova