
I want to sample from a multivariate normal distribution and have this code, where the mean is zero and the covariance matrix has all entries equal to 1, implying that all the random variables are perfectly correlated.

import numpy as np

# rank-1 covariance matrix: every entry equals 1
samples = np.random.multivariate_normal([0, 0, 0], [[1, 1, 1], [1, 1, 1], [1, 1, 1]])
print(samples)

[-0.89635305 -0.89635305 -0.89635305]

My question: to sample from this trivariate normal distribution, a Cholesky decomposition of the covariance matrix has to be performed, but here the rank of the matrix is 1. Why does the code work and not throw an error?
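Indeed, a direct Cholesky factorization of this covariance matrix does fail; a minimal check with `np.linalg.cholesky`:

import numpy as np

cov = np.ones((3, 3))        # the rank-1 covariance matrix from above
try:
    np.linalg.cholesky(cov)  # plain Cholesky requires positive definiteness
except np.linalg.LinAlgError as err:
    print(err)               # "Matrix is not positive definite"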

Yet `np.random.multivariate_normal` only gives a warning if the covariance matrix is the following:

[[1, 0, 1], [0, 1, 0], [1, 1, 1]]

Any explanation for this?

GENIVI-LEARNER
  • If $x \sim \mathcal{N}(0,1^2)$ and $y$ is perfectly correlated with $x$, what value do you suppose that $y_1$ takes when you know $x_1=-0.896$? How can you use this information to construct a sampler for $y$? – Sycorax Feb 06 '20 at 16:15
  • Both values will be the same; the sampler for $y$ will be the same as for $x$. But that's why I am confused: according to the formulation, the sampler has to perform a Cholesky decomposition of a matrix with rank 1, which should throw an error. [Reference](https://juanitorduz.github.io/multivariate_normal/) – GENIVI-LEARNER Feb 06 '20 at 16:21
  • Even by the density $\exp(-\frac12 \vec x^T\Sigma^{-1}\vec x)$, we can't compute the inverse of a rank-1 covariance matrix. – GENIVI-LEARNER Feb 06 '20 at 16:23
  • BTW, I had a typo in my post: instead of "it only gives a warning" I had written "it also gives a warning". – GENIVI-LEARNER Feb 06 '20 at 16:25
  • The recipe you're using assumes that your covariance matrix is full rank. Cholesky decomposition is only unique if $\Sigma$ is positive definite. And your comment about inverting a rank-deficient matrix should make it clear why this is a degenerate case. I'm not sure exactly what `numpy` is doing, but clearly it's smart enough to detect that $\Sigma$ is not full rank and use an alternative strategy. You can read the source to find out how `numpy` works. – Sycorax Feb 06 '20 at 16:26
  • Conceptually it works, and so does the code. But the formulation of the multivariate normal distribution implies it won't or shouldn't work (as it requires the matrix to be invertible); that is what I am trying to understand. – GENIVI-LEARNER Feb 06 '20 at 16:30
  • Well, now this makes good sense. So you are implying that we don't define a distribution over random vectors if everything is fully correlated. The distribution is "only" defined if the vectors are correlated to some degree, such as ```[[1,0.5],[0.5,1]]```, right? I think this is the last hurdle for me to fully understand what you meant. – GENIVI-LEARNER Feb 06 '20 at 17:01
  • Technically, when $\Sigma$ is singular, you don't define a distribution over random vectors. Perfect correlation implies the vectors lie in a lower-dimensional subspace (here a *line*), which has volume 0, i.e. its $\mathcal{L}^3$ measure is 0, so it's not a distribution on $\mathbb{R}^3$. So your intuition about it "not working" because $\Sigma$ is singular is correct. The `numpy` implementation is playing fast-and-loose with the definition of "random vector" and inferring that, loosely, you want 3 values that are perfectly correlated. This is well-defined in 1 dimension, so it's just replicating the single value 3 times. – Sycorax Feb 06 '20 at 17:21
  • Correcting my previous comment. Your comment is correct: we can have correlated vectors up to a certain degree, but if the vectors are so correlated that $\Sigma$ is singular, then we don't have a random vector in the technical sense. – Sycorax Feb 06 '20 at 17:23
  • Well now it makes perfect sense! This could have been an accepted answer if posted as an answer :) Cheers! – GENIVI-LEARNER Feb 06 '20 at 17:39
  • Does this answer your question? [Generate normally distributed random numbers with non positive-definite covariance matrix](https://stats.stackexchange.com/questions/63817/generate-normally-distributed-random-numbers-with-non-positive-definite-covarian) (I probably should have searched for duplicates before writing my answer, but here we are...) – Sycorax Feb 06 '20 at 17:45
  • Well, your contribution was quite helpful; it is good to have multiple unbiased views. As far as the linked post is concerned, yes, it does provide additional context. You explained the concern well in its entirety. – GENIVI-LEARNER Feb 06 '20 at 17:50

2 Answers


The recipe you're using assumes that your covariance matrix is full rank. Cholesky decomposition is only unique if $\Sigma$ is positive definite. And your comment about inverting a rank-deficient matrix should make it clear why this is a degenerate case. I'm not sure exactly what `numpy` is doing, but clearly it's smart enough to detect that $\Sigma$ is not full rank and use an alternative strategy. You can read the source to find out how `numpy` works.
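For the curious, here is a sketch of one alternative strategy that never fails on a positive semi-definite matrix: factor $\Sigma$ with an SVD instead of a Cholesky decomposition. (As far as I can tell this is what `numpy`'s sampler does internally, but the code below is only illustrative, not a copy of its implementation.)

import numpy as np

def svd_sample(mean, cov):
    # cov = u @ diag(s) @ vt; for a symmetric PSD matrix, u equals vt.T
    u, s, vt = np.linalg.svd(cov)
    # A = u @ diag(sqrt(s)) satisfies A @ A.T == cov, so with z ~ N(0, I)
    # the vector A @ z has covariance cov; no positive definiteness needed
    z = np.random.standard_normal(len(s))
    return np.asarray(mean) + (u * np.sqrt(s)) @ z

print(svd_sample([0, 0, 0], np.ones((3, 3))))  # three identical values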

Technically, when $\Sigma$ is singular, you don't define a distribution over random vectors. A singular $\Sigma$ means the vectors lie in a lower-dimensional subspace of $\mathbb{R}^n$, which has volume 0, i.e. its $\mathcal{L}^n$ measure is 0, so it's not a distribution on $\mathbb{R}^n$. So your intuition about it "not working" because $\Sigma$ is singular is correct. The `numpy` implementation is playing fast-and-loose with the definition of "random vector" and inferring that, loosely, you want 3 values that are perfectly correlated. This is well-defined in 1 dimension, so it's just replicating the single value 3 times.

Putting this all together, if the vectors are so correlated that $\Sigma$ is singular, then we don't have a random vector in the technical sense, but it's possible to write software to "side-step" the technical issue so that it "works" in a certain sense.
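To see the side-stepping (and the warning mentioned in the question) explicitly, you can control the validity check with the `check_valid` keyword of `np.random.multivariate_normal`:

import numpy as np

ones = np.ones((3, 3))  # singular, but a genuine PSD matrix: sampling proceeds
print(np.random.multivariate_normal([0, 0, 0], ones, check_valid='raise'))

bad = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]  # not symmetric, hence not PSD
np.random.multivariate_normal([0, 0, 0], bad)  # default 'warn' emits a RuntimeWarning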

Sycorax
  • The software would have to perform lots of checks, e.g. in a Gaussian-process scenario where we take, say, 3000 samples and $x_{200}$ is perfectly correlated with $x_{1500}$, and all the permutations of that concept! Since you clarified some doubts, I shall look into how this is dealt with in GPs. – GENIVI-LEARNER Feb 06 '20 at 17:55
  • I'm not sure how hard it is to check this. It probably checks whether the matrix is full rank and, if it's not, falls back to an alternative strategy. Factorizing a very large matrix is expensive, but... that's the nature of working with large matrices. – Sycorax Feb 06 '20 at 17:57
  • It is not that hard; see my answer. – marnix Apr 29 '20 at 21:44
  • @marnix cool! Looks like you implemented what my comment describes. – Sycorax Apr 29 '20 at 21:50

+1 for the accepted answer. For reference, here is some code that implements drawing samples from a multivariate normal distribution with a possibly rank-deficient covariance matrix:

import numpy as np

def eigh_sample(mean, cov, small=1e-9):
    # eigendecomposition of the symmetric covariance: cov = v @ diag(s) @ v.T
    s, v = np.linalg.eigh(cov)
    # zero out eigenvalues that differ from 0 only by floating-point noise
    s[abs(s) < small] = 0
    # (v * sqrt(s)) @ z, with z ~ N(0, I), has covariance v @ diag(s) @ v.T = cov
    return mean + v * np.sqrt(s) @ np.random.standard_normal(len(mean))

>>> eigh_sample([0,0,0], [[1,1,1],[1,1,1],[1,1,1]])
array([-0.57289804, -0.57289804, -0.57289804])

This is a simple modification of this snippet.
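As a quick sanity check (illustrative only), the empirical covariance of many draws from `eigh_sample` should recover the rank-1 all-ones target:

draws = np.array([eigh_sample(np.zeros(3), np.ones((3, 3))) for _ in range(100000)])
print(np.cov(draws, rowvar=False))  # approximately the all-ones matrix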

marnix