Ways to modify data minimally so that the variables follow the desired covariances

Question

Let $\bf X$ be the p-variate dataset. I want to modify the data to p-variate data $\bf Y$ so that its variables satisfy precisely a given SSCP (or covariance, or correlation) matrix $\bf R$: $\bf Y'Y=R$, and the requirement is that the modification of values is as small as possible (in terms of squared error): $\mathbf {\|Y-X\|}^2=\text{min}$.

What methods can you recommend?

I customarily use the following method (which I discovered myself years ago):

1. Compute eigenvalues and eigenvectors of $\bf R$ and obtain PCA loadings from them (a loading vector is an eigenvector scaled by the square root of the respective eigenvalue): $\bf A$, so that $\bf AA'=R$.
2. Likewise obtain loadings from $\bf X'X$: $\bf A_x$.
3. Procrustes-rotate $\bf A$ to $\bf A_x$: $\text{svd}\bf(A_x'A) = USV'$; $\bf Q=VU'$ (orthogonal rotation matrix); $\bf A_q=AQ$ is the rotated $\bf A$.
4. Compute unit-scaled principal component scores of $\bf X$: $\bf U_x=X(A_x')^{-1}$.
5. Get $\bf Y$ the way we restore data "back" from components in PCA, but use $\bf A_q$ in place of $\bf A_x$: $\bf Y=U_xA_q'$.
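The steps above can be sketched in NumPy as follows. This is only an illustration, not the asker's own code: the function names `loadings` and `match_sscp_via_loadings` and the example matrices are made up here, the last step is read as $\bf Y=U_xA_q'$ (so that $\bf Y'Y=A_qA_q'=R$), and $\bf X'X$ is assumed to be nonsingular.

```python
import numpy as np

def loadings(S):
    """Loadings of a symmetric PSD matrix S: eigenvectors times sqrt(eigenvalues)."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)        # guard against tiny negative round-off
    return V * np.sqrt(w)            # column j is v_j * sqrt(w_j), so A @ A.T == S

def match_sscp_via_loadings(X, R):
    A = loadings(R)                  # A A' = R
    A_x = loadings(X.T @ X)          # A_x A_x' = X'X
    U, _, Vt = np.linalg.svd(A_x.T @ A)   # svd(A_x' A) = U S V'
    Q = Vt.T @ U.T                   # Procrustes rotation Q = V U'
    A_q = A @ Q                      # rotated target loadings
    U_x = X @ np.linalg.inv(A_x.T)   # unit-scaled scores (needs X'X nonsingular)
    return U_x @ A_q.T               # Y = U_x A_q'

# Hypothetical example data: 50 cases, 3 variables, an arbitrary PD target SSCP.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
R = np.array([[50.0, 10.0,  5.0],
              [10.0, 40.0, -8.0],
              [ 5.0, -8.0, 60.0]])
Y = match_sscp_via_loadings(X, R)
```

Checking `np.allclose(Y.T @ Y, R)` afterwards is a quick way to confirm that the target SSCP is reproduced.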

The idea behind it is simple: orthogonally rotated loadings restore the target matrix as well as the unrotated ones do: $\bf A_qA_q'=AA'=R$, but $\bf A_q$ is closer to $\bf A_x$ than $\bf A$ is. Hence, $\bf Y$ is "rather close" to $\bf X$ (while $\bf Y'Y=R$). If the sums of squares in the columns of $\bf X$ are initially already close to the diagonal of $\bf R$, this method looks especially handsome.

Remembering that unit-scaled principal components are just left singular vectors, the whole idea is shorter to express via the SVD: $\bf X= U_x[S_xV_x']$ where the bracketed term is $\bf A_x'$. And we compute $\bf Y= U_xQ'[S_rV_r']$ where the bracketed term is $\bf A'$ from decomposing $\bf R$, and $\bf Q$ is an orthogonal rotation (Procrustes in this instance).

Does the method above really attain $\mathbf {\|Y-X\|}^2=\text{min}$ among all possible solutions? I don't know.

Can you suggest a method that is a stronger minimizer? Or a computationally more efficient method? Or one that is interesting from some other point of view? Whichever you prefer. Linear as well as nonlinear/iterative approaches are welcome. If you have something to suggest, please show the algorithm or link to where it is described.

See also theoretically partly related, later question raised by @amoeba: https://stats.stackexchange.com/q/333493/3277. — ttnphns, Mar 16 '18 at 09:40
A reader might want also to look in https://stats.stackexchange.com/q/15011/3277, a little bit related topic. It is about how to generate or to transform _one_ variable for it to show the specified covariances/correlations with a set of _other_ variables. — ttnphns, Mar 16 '18 at 10:03

score 2 · Answer 1 · answered Mar 16 '18 at 10:05

This is a form of Procrustes problem and can be solved as follows.

Given $\mathbf X$ that is $n\times p$ with $n\ge p$ and positive semi-definite $\mathbf R$ that is $p\times p$, you want to find $\mathbf Y$ minimizing $$\|\mathbf X-\mathbf Y\|^2\:\:\text{s.t.}\:\:\mathbf Y^\top\mathbf Y = \mathbf R.$$

Any $\mathbf Y$ such that $\mathbf Y^\top\mathbf Y = \mathbf R$ can be written as $\mathbf Y=\mathbf Z\mathbf R^{1/2}$ where $\mathbf Z$ has orthonormal columns, i.e. $\mathbf Z^\top\mathbf Z=\mathbf I$. Indeed, then $$\mathbf Y^\top\mathbf Y = \mathbf R^{1/2}\mathbf Z^\top \mathbf Z \mathbf R^{1/2} = \mathbf R^{1/2} \mathbf R^{1/2} = \mathbf R.$$
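A small numerical check of this parametrization, as a sketch only: the helper name `sym_sqrt`, the example $\mathbf R$, and the random $\mathbf Z$ are illustrative assumptions. Here $\mathbf R^{1/2}$ is taken to be the symmetric square root obtained from the eigendecomposition.

```python
import numpy as np

def sym_sqrt(R):
    """Symmetric PSD square root of R via its eigendecomposition."""
    w, V = np.linalg.eigh(R)
    w = np.clip(w, 0.0, None)            # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.T

rng = np.random.default_rng(1)
R = np.array([[4.0, 1.0],
              [1.0, 3.0]])               # arbitrary positive definite target
R_half = sym_sqrt(R)

# Reduced QR of a random 10x2 matrix gives some Z with orthonormal columns.
Z, _ = np.linalg.qr(rng.standard_normal((10, 2)))
Y = Z @ R_half                           # then Y'Y = R^{1/2} Z'Z R^{1/2} = R
```

Any other $\mathbf Z$ with orthonormal columns would satisfy the constraint just as well; the remaining task is to pick the one that brings $\mathbf Y$ closest to $\mathbf X$.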

So we can re-write the problem as follows: minimize $$\|\mathbf X-\mathbf Z\mathbf R^{1/2}\|^2\:\:\text{s.t.}\:\:\mathbf Z^\top\mathbf Z = \mathbf I.$$

Now writing the squared norm as the trace, we get: \begin{align} \|\mathbf X-\mathbf Z\mathbf R^{1/2}\|^2 &= \operatorname{tr}(\mathbf X-\mathbf Z\mathbf R^{1/2})^\top(\mathbf X-\mathbf Z\mathbf R^{1/2}) \\ &= \|\mathbf X\|^2 + \operatorname{tr}(\mathbf R) - 2\operatorname{tr}(\mathbf X^\top\mathbf Z\mathbf R^{1/2}) \\ &= \mathrm{const} - 2\operatorname{tr}(\mathbf Z\mathbf R^{1/2}\mathbf X^\top). \end{align}
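For completeness, the two facts used in the expansion above are $\mathbf Z^\top\mathbf Z=\mathbf I$ and the invariance of the trace under transposition and cyclic permutation:

$$\operatorname{tr}\big(\mathbf R^{1/2}\mathbf Z^\top\mathbf Z\mathbf R^{1/2}\big) = \operatorname{tr}(\mathbf R), \qquad \operatorname{tr}\big(\mathbf R^{1/2}\mathbf Z^\top\mathbf X\big) = \operatorname{tr}\big(\mathbf X^\top\mathbf Z\mathbf R^{1/2}\big) = \operatorname{tr}\big(\mathbf Z\mathbf R^{1/2}\mathbf X^\top\big).$$

The first identity gives the $\operatorname{tr}(\mathbf R)$ term, and the second shows that the two cross terms are equal, which is where the factor of 2 comes from.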

This reduces the problem to maximizing $$\operatorname{tr}(\mathbf Z\mathbf R^{1/2}\mathbf X^\top)\:\:\text{s.t.}\:\:\mathbf Z^\top\mathbf Z = \mathbf I,$$ which is solved in "Find a matrix with orthonormal columns with minimum Frobenius distance to the given matrix". The solution is to take the thin SVD $\mathbf X \mathbf R^{1/2} = \mathbf{USV}^\top$ (with $\mathbf U$ of size $n\times p$) and then set $\mathbf Z = \mathbf{UV}^\top$. The final answer is $$\mathbf Y = \mathbf{UV}^\top \mathbf R^{1/2}.$$
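A minimal NumPy sketch of this closed-form solution, under the assumption that $\mathbf R^{1/2}$ is the symmetric square root. The function names `sym_sqrt` and `closest_with_sscp` and the example data are illustrative, not part of the derivation.

```python
import numpy as np

def sym_sqrt(R):
    """Symmetric PSD square root of R via its eigendecomposition."""
    w, V = np.linalg.eigh(R)
    w = np.clip(w, 0.0, None)            # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.T

def closest_with_sscp(X, R):
    """Y = U V' R^{1/2}, where X R^{1/2} = U S V' is the thin SVD."""
    R_half = sym_sqrt(R)
    U, _, Vt = np.linalg.svd(X @ R_half, full_matrices=False)
    return U @ Vt @ R_half

# Hypothetical example data: 50 cases, 3 variables, an arbitrary PD target SSCP.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
R = np.array([[50.0, 10.0,  5.0],
              [10.0, 40.0, -8.0],
              [ 5.0, -8.0, 60.0]])
Y = closest_with_sscp(X, R)

# A random feasible competitor (same SSCP, arbitrary orthonormal Z) for comparison.
Z_rand, _ = np.linalg.qr(rng.standard_normal((50, 3)))
Y_other = Z_rand @ sym_sqrt(R)
```

Both `Y` and `Y_other` satisfy the constraint, but `np.linalg.norm(X - Y)` should not exceed `np.linalg.norm(X - Y_other)`, in line with the optimality argument above.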

I did not check if your proposed solution is equivalent to this.

Yes, it is the same as my solution (which is also openly procrustes-based, only that it was discovered conceptually - through the notion of loadings - rather than algebraically [I'm a poor mathematician]). Intuitively I feel that your algebra must be connected with the fact that PCA whitening with subsequent procrustes rotation towards the initial data is the operation identical with ZCA whitening, isn't it? — ttnphns, Mar 16 '18 at 10:46
Well, if it is the same then you can take my answer as a proof that you have been doing everything correctly :-) Regarding PCA/ZCA whitening, yes; ZCA whitening is basically a certain form of Procrustes. — amoeba, Mar 16 '18 at 10:50
I'm completely satisfied with the proof that among linear tricks that is the best method, and am gladly upvoting. Thank you! I'm not "accepting the answer" in the wait that other people maybe come with other proposals (maybe interesting nonlinear ones) as well. — ttnphns, Mar 16 '18 at 10:57
I'm actually not sure what you mean by "linear/nonlinear" in this case. If your problem is formulated as it currently is, then there is 1 unique minimizer. Unless you modify the problem in some way, there is no other "nonlinear" answer. — amoeba, Mar 16 '18 at 11:00
