Let $\bf X$ be a $p$-variate dataset. I want to modify the data into $p$-variate data $\bf Y$ so that its variables satisfy exactly a given SSCP (or covariance, or correlation) matrix $\bf R$: $\bf Y'Y=R$, with the requirement that the modification of values be as small as possible in terms of squared error: $\mathbf {\|Y-X\|}^2=\text{min}$.
What methods can you recommend?
I customarily use the following approach (which I worked out myself years ago):
Compute the eigenvalues and eigenvectors of $\bf R$ and obtain PCA loadings from them (a loading vector is an eigenvector scaled by the square root of the respective eigenvalue, so that its sum of squares equals that eigenvalue): $\bf A$.
Likewise obtain loadings from $\bf X'X$: $\bf A_x$.
Procrustes-rotate $\bf A$ to $\bf A_x$: $\text{svd}\bf(A_x'A) = USV'$; $\bf Q=VU'$ (orthogonal rotation matrix); $\bf A_q=AQ$ is the rotated $\bf A$.
Compute unit-scaled principal component scores of $\bf X$: $\bf U_x=X(A_x')^{-1}$.
Get $\bf Y$ the way we restore data "back" from components in PCA (where $\bf X=U_xA_x'$), but use $\bf A_q$ in place of $\bf A$: $\bf Y=U_xA_q'$.
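The steps above can be sketched in NumPy as follows. This is only a minimal illustration under stated assumptions: loadings are stored as a variables-by-components matrix (so that $\bf AA'=R$ and $\bf X=U_xA_x'$), $\bf X$ has full column rank, and $\bf R$ is positive semidefinite. The function names `loadings` and `match_sscp` and the example data are hypothetical, chosen just for the demonstration.

```python
import numpy as np

def loadings(M):
    """Loadings of a symmetric PSD matrix M: eigenvectors scaled by
    the square roots of their eigenvalues, so that A @ A.T equals M."""
    w, V = np.linalg.eigh(M)
    order = np.argsort(w)[::-1]            # largest eigenvalue first
    w = np.clip(w[order], 0.0, None)       # guard against tiny negative values
    return V[:, order] * np.sqrt(w)        # columns are loading vectors

def match_sscp(X, R):
    """Hypothetical helper: modify X into Y with Y.T @ Y == R."""
    A = loadings(R)                        # target loadings
    Ax = loadings(X.T @ X)                 # loadings of the data
    U, _, Vt = np.linalg.svd(Ax.T @ A)     # svd(A_x' A) = U S V'
    Q = Vt.T @ U.T                         # Procrustes rotation Q = V U'
    Aq = A @ Q                             # rotated target loadings
    Ux = X @ np.linalg.inv(Ax.T)           # unit-scaled component scores
    return Ux @ Aq.T                       # restore data from rotated loadings

# Illustrative data (assumed, not from the question)
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
R = np.array([[50.0, 20.0, 10.0],
              [20.0, 50.0,  5.0],
              [10.0,  5.0, 50.0]])

Y = match_sscp(X, R)
print(np.allclose(Y.T @ Y, R))             # does Y reproduce the target SSCP?
print(np.linalg.norm(Y - X) ** 2)          # squared distance from X
```

Inverting $\bf A_x'$ explicitly mirrors the formula for $\bf U_x$; with ill-conditioned data a least-squares solve would probably be a safer choice.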
The idea behind it is simple: orthogonally rotated loadings restore the target matrix just as well as the unrotated ones do, $\bf A_qA_q'=AA'=R$, but $\bf A_q$ is closer to $\bf A_x$ than $\bf A$ is. Hence $\bf Y$ is "rather close" to $\bf X$ (while $\bf Y'Y=R$). If the column sums of squares of $\bf X$ are already close to the diagonal of $\bf R$, this method looks especially attractive.
Recalling that unit-scaled principal components are just the left singular vectors, the whole idea is shorter to express via the SVD: $\bf X= U_x[S_xV_x']$, where the bracketed term is $\bf A_x'$. We then compute $\bf Y= U_xQ'[S_rV_r']$, where the bracketed term is $\bf A'$ from decomposing $\bf R$ (with $\bf S_r$ holding the square roots of its eigenvalues), and $\bf Q$ is an orthogonal rotation (the Procrustes rotation in this instance).
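The SVD formulation can be sketched the same way. The snippet below is an illustration only, assuming $n \ge p$, a positive semidefinite $\bf R$, and loadings stored variables-by-components; the function name `match_sscp_svd` and the example data are hypothetical. It takes $\bf U_x$ directly from the thin SVD of $\bf X$, so no matrix inverse is needed.

```python
import numpy as np

def match_sscp_svd(X, R):
    """Hypothetical helper: SVD-based version, returns Y with Y.T @ Y == R."""
    # Thin SVD of the data: X = Ux @ diag(sx) @ Vxt
    Ux, sx, Vxt = np.linalg.svd(X, full_matrices=False)
    Ax = Vxt.T * sx                        # data loadings, Ax @ Ax.T == X.T @ X

    # Eigendecomposition of the target: R = Vr @ diag(w) @ Vr.T
    w, Vr = np.linalg.eigh(R)
    sr = np.sqrt(np.clip(w, 0.0, None))    # square roots of the eigenvalues
    A = Vr * sr                            # target loadings, A @ A.T == R

    # Procrustes rotation of A towards Ax: svd(Ax' A) = U S V', Q = V U'
    U, _, Vt = np.linalg.svd(Ax.T @ A)
    Q = Vt.T @ U.T

    return Ux @ (A @ Q).T                  # Y = Ux Q' A'

# Illustrative data (assumed, not from the question)
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 3))
R = np.array([[40.0, 12.0,  8.0],
              [12.0, 40.0,  4.0],
              [ 8.0,  4.0, 40.0]])

Y = match_sscp_svd(X, R)
print(np.allclose(Y.T @ Y, R))             # target SSCP reproduced?
print(np.linalg.norm(Y - X) ** 2)          # squared distance from X
```

The ordering of the eigenvalues of $\bf R$ does not need to match that of the singular values of $\bf X$ here, since any permutation of components is absorbed by the rotation $\bf Q$.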
Does the method above really attain the minimum of $\mathbf {\|Y-X\|}^2$ over all $\bf Y$ with $\bf Y'Y=R$? I don't know.
Can you suggest a method that is a stronger minimizer? Or one that is computationally more efficient? Or one that is interesting from some other point of view? Whichever you prefer. Linear as well as nonlinear or iterative approaches are welcome. If you have something to suggest, please show the algorithm or link to where it is described.