
A sample $X$ from a multivariate normal distribution can be constructed to have covariance $C$, even when $C$ is only positive semi-definite, according to a technique involving the SVD (sketched below). Furthermore, $C$ can be constructed from a correlation matrix $R$ and a diagonal matrix $Q$ with the corresponding standard deviations on the diagonal, via $C = QRQ$.
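
For concreteness, a minimal sketch of that technique (illustrative numbers; the correlation block is deliberately singular, so a Cholesky factorization would fail):

import numpy as np

# Build C = Q R Q from standard deviations Q and a PSD correlation matrix R.
Q = np.diag([2.0, 1.0, 0.5])
R = np.asarray([[1.0, 1.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
C = np.dot(np.dot(Q, R), Q)

# For a symmetric PSD matrix the SVD is an eigendecomposition, C = V D V^T.
V, d, _ = np.linalg.svd(C)
L = np.dot(V, np.diag(np.sqrt(d)))    # L L^T = C

# x = L z with z ~ N(0, I) then has covariance L L^T = C.
z = np.random.standard_normal((3, 100000))
x = np.dot(L, z)
print(np.cov(x))                      # close to C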

This technique should extend readily to the matrix-variate case: e.g. the Cholesky factors of the sample covariance $S = AA^{T}$ and the feature covariance $F = B^{T}B$ are applied from the left and right, respectively, to yield a sample $Y = AXB$. Extrapolating this to the SVD-based approach mentioned above, and keeping its notation, would give something along the lines of $Y = (\mathbb{V}_{S}\mathbb{D}_{S})\, X\, (\mathbb{V}_{F}\mathbb{D}_{F})$?
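
For reference, a minimal sketch of the Cholesky route in the strictly positive definite case (shapes and values illustrative):

import numpy as np

# Positive definite sample (row) and feature (column) covariances.
S = np.asarray([[2.0, 0.5],
                [0.5, 1.0]])
F = np.asarray([[1.0, 0.3, 0.0],
                [0.3, 1.0, 0.0],
                [0.0, 0.0, 2.0]])

A = np.linalg.cholesky(S)       # lower triangular, A A^T = S
B = np.linalg.cholesky(F).T     # upper triangular, B^T B = F

# Y = A X B with X ~ MN(0, I, I): row covariance parameter S,
# column covariance parameter F.
X = np.random.standard_normal((2, 3))
Y = np.dot(np.dot(A, X), B)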

However, when estimating and checking the standard deviations of each component for the SVD-based approach, they are not close to those specified in $Q_{S}$ or $Q_{F}$. They are fine for the plain multivariate normal case, e.g. $\mathbb{V}_{S}\mathbb{D}_{S}X$.


Example:

import numpy as np

stdsA = np.diag([4.0, 2.5, 1.5, 1.0])
stdsB = np.diag([1.0, 0.9, 0.8, 0.7, 1.0])

corrA = np.asarray([[1.0, -1, 0, 0],
                    [-1, 1, 0, 0],
                    [0, 0, 1, 1],
                    [0, 0, 1, 1]])

corrB = np.asarray([[1.0, 1, 1, 0, 0],
                    [1, 1, 1, 0, 0],
                    [1, 1, 1, 0, 0],
                    [0, 0, 0, 1, -1],
                    [0, 0, 0, -1, 1]])

# Covariance matrices C = Q R Q; both are singular (positive semi-definite only).
AAt = np.dot(np.dot(stdsA, corrA), stdsA)
BtB = np.dot(np.dot(stdsB, corrB), stdsB)

# For symmetric PSD matrices the SVD doubles as an eigendecomposition.
uA, sA, vtA = np.linalg.svd(AAt)
uB, sB, vtB = np.linalg.svd(BtB)

# Clip numerically tiny singular values to exactly zero.
sA[sA < 1.0e-8] = 0.0
sB[sB < 1.0e-8] = 0.0

# Draw samples: Y1 applies only the left (sample) factor, Y2 additionally the right one.
Y1_, Y2_ = [], []
for i in range(10000):
    X = np.random.standard_normal((4, 5))
    Y1 = np.dot(np.dot(uA, np.diag(np.sqrt(sA))), X)
    Y2 = np.dot(Y1, np.dot(uB, np.diag(np.sqrt(sB))))
    Y1_.append(Y1)
    Y2_.append(Y2)
Y1 = np.asarray(Y1_)
Y2 = np.asarray(Y2_)

# Pooled std per row (over samples and columns) and per column (over samples and rows).
est_stdsA_Y1 = Y1.std(axis=(0, 2))
est_stdsA_Y2 = Y2.std(axis=(0, 2))
est_stdsB_Y2 = Y2.std(axis=(0, 1))

print(est_stdsA_Y1)
print(np.diag(stdsA))
print()
print(est_stdsA_Y2)
print(np.diag(stdsA))
print()
print(est_stdsB_Y2)
print(np.diag(stdsB))

Output:

[ 3.99356564  2.49597853  1.50897736  1.00598491]
[ 4.   2.5  1.5  1. ]

[ 6.16868733  3.85542958  0.84674523  0.56449682]
[ 4.   2.5  1.5  1. ]

[  8.21223336   8.16629974   0.00000000   0.00000000   0.00000000]
[ 1.   0.9  0.8  0.7  1. ]
baf84b4c

2 Answers


I am not exactly sure what your desired properties are. However, below is an outline of how to get specified expected inner and outer products.

From your Wikipedia link, we have $$X\sim\mathrm{MN}_{ n\times p}[0,A,B]\implies \mathbb{E}\left[XX^T\right]=\beta A \text{ , } \mathbb{E}\left[X^TX\right]=\alpha B \text{ , } \alpha=\mathrm{tr}[A] \text{ , } \beta=\mathrm{tr}[B]$$ where $\mathrm{tr}[]$ is the trace operator, which transforms a matrix to a scalar.
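
The trace factors can be sanity-checked in the simplest case $Z\sim\mathrm{MN}_{n\times p}[0,I,I]$, where they reduce to $\mathbb{E}\left[ZZ^T\right]=p\,I_n$. A quick Monte Carlo sketch (sizes arbitrary):

import numpy as np

n, p, m = 4, 5, 20000
acc = np.zeros((n, n))
for _ in range(m):
    Z = np.random.standard_normal((n, p))
    acc += np.dot(Z, Z.T)
# E[Z Z^T] = tr(I_p) I_n = p I_n -- note the factor p, not 1.
print(acc / m)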

So if $X$ has a (reduced) SVD of $$X=USV^T \text{ , } U^TU=V^TV=I \text{ , } S=\mathrm{diag}[\sigma]$$ then $$\mathbb{E}\left[US^2U^T\right]=\beta A \text{ , } \mathbb{E}\left[VS^2V^T\right]=\alpha B$$

If we fix the SVD first, and define $$A=\tfrac{1}{\|\sigma\|}US^2U^T \text{ , } B=\tfrac{1}{\|\sigma\|}VS^2V^T$$ then by design we have $$\alpha=\beta=\|\sigma\|$$ and $$X=USV^T \implies XX^T=\beta A \text{ , } X^TX=\alpha B$$

For the matrix square roots, we have $$\mathsf{A}=U\mathsf{S} \text{ , } \mathsf{B}=\mathsf{S}V^T \text{ , } \mathsf{S}=\tfrac{1}{\sqrt{\|\sigma\|}}S \implies A=\mathsf{A}\mathsf{A}^T \text{ , } B=\mathsf{B}^T\mathsf{B} $$

So in terms of your formula for generating samples $$X=\mathsf{A}Z\mathsf{B} \text{ , }Z\sim\mathrm{MN}_{ n\times p}[0,I,I] \implies X\sim\mathrm{MN}_{ n\times p}[0,A,B]$$
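
Here is a minimal numerical sketch of this construction (the helper names like A_half are mine; note that with a reduced rank-$r$ SVD the latent $Z$ is $r\times r$):

import numpy as np

n, p, r = 4, 5, 3                      # deliberately rank-deficient

# Fix the SVD first: orthonormal U (n x r), V (p x r) and singular values sigma.
U, _ = np.linalg.qr(np.random.standard_normal((n, r)))
V, _ = np.linalg.qr(np.random.standard_normal((p, r)))
sigma = np.asarray([3.0, 2.0, 1.0])
nrm = np.linalg.norm(sigma)            # ||sigma||, so alpha = beta = ||sigma||

Sqrt = np.diag(sigma) / np.sqrt(nrm)   # the square-root factor S / sqrt(||sigma||)
A_half = np.dot(U, Sqrt)               # A = A_half A_half^T
B_half = np.dot(Sqrt, V.T)             # B = B_half^T B_half
A = np.dot(A_half, A_half.T)
B = np.dot(B_half.T, B_half)
beta = np.trace(B)                     # equals ||sigma|| by design

# Monte Carlo check of E[X X^T] = beta * A.
m, acc = 20000, np.zeros((n, n))
for _ in range(m):
    Z = np.random.standard_normal((r, r))
    X = np.dot(np.dot(A_half, Z), B_half)
    acc += np.dot(X, X.T)
print(np.abs(acc / m - beta * A).max())   # should be small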


Note that from the above, the SVDs of $A$ and $B$ are given by $$ A \equiv U_A S_A V_A^T = U\mathsf{S}^2U^T \text{ , } B \equiv U_B S_B V_B^T = V\mathsf{S}^2V^T $$ In particular, note that (see the numerical check after this list):

  • both decompositions are symmetric: $V_A=U_A=U$ and $V_B=U_B=V$
  • their singular values are identical: $S_A=S_B=\mathsf{S}^2$
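
Continuing the sketch above, both points are easy to confirm numerically:

uA_, sA_, vtA_ = np.linalg.svd(A)
uB_, sB_, vtB_ = np.linalg.svd(B)
print(np.allclose(sA_[:r], sB_[:r]))              # shared singular values
print(np.allclose(A, np.dot(uA_ * sA_, uA_.T)))   # U_A = V_A since A is symmetric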

So, to the extent that I understand your desired properties, I believe you may have fewer degrees of freedom than you think you do.

GeoMatt22
  • I'll have to think about your post. See here for a somewhat more concise exposition of the problem: http://math.stackexchange.com/questions/1958330/how-to-sample-a-singular-matrix-variate-normal-distribution – baf84b4c Oct 07 '16 at 18:08
  • I may be wrong, as I played a little loose with the expectations in my answer. However at the very least, from the Wikipedia article you link to, the [expected $X$ covariances](https://en.wikipedia.org/wiki/Matrix_normal_distribution#Expected_values) are not the same as the "covariance matrices" that parameterize the PDF. Note the trace factors that multiply the $X$ covariances. – GeoMatt22 Oct 07 '16 at 18:18

This seems to be expected behavior. As mentioned by @GeoMatt22, due to inherent constraints, the expected standard deviations are not necessarily equal to those put in via $Q_{S}$ and $Q_{F}$. See the section of the Wikipedia article on the matrix-variate normal distribution about expected values, specifically the products with $\mathrm{tr}(S)$ and $\mathrm{tr}(F)$ respectively.

baf84b4c