Suppose I have a centered $n \times d$ matrix $X$ with $n$ data points in $d$ dimensions. I wish to transform it into $Y$ such that
- $Y$ has identity covariance
- $\|X - Y \|^2$ is minimized
According to this post and section 6.1 of this paper, I should use the ZCA transformation. The second condition can be rephrased as minimizing the quantity $c$ subject to $Y$ having identity covariance, where
$c=\sum_{i=1}^n \sum_{j=1}^d {\left( x_{ij} - y_{ij} \right) }^2$
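For concreteness, here is a minimal NumPy sketch of the ZCA step as I understand it. It assumes the sample covariance is $X^\top X/(n-1)$ and is positive definite; the function name `zca_whiten` and the toy data are my own, purely for illustration.

```python
import numpy as np

def zca_whiten(X):
    """Return (Y, W) with Y = X @ W and W = S^{-1/2}, S the sample covariance.

    Assumes X is already centered and S is positive definite.
    """
    n = X.shape[0]
    S = X.T @ X / (n - 1)                # sample covariance of centered X
    evals, evecs = np.linalg.eigh(S)     # S is symmetric, so eigh applies
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T  # symmetric inverse square root
    return X @ W, W

# Toy data (hypothetical): correlated 3-d points, then centered.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(500, 3)) @ A
X = X - X.mean(axis=0)

Y, W = zca_whiten(X)
cov_Y = Y.T @ Y / (X.shape[0] - 1)       # should be close to the identity
c = np.sum((X - Y) ** 2)                 # the quantity c defined above
```

The symmetric choice $W = S^{-1/2}$ is what distinguishes ZCA from other whitening matrices such as PCA whitening, which differ from it by a rotation.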
The procedure can be 'reversed': given a whitened $Y$, applying the inverse of a ZCA transformation produces a matrix $Z$ with any desired covariance $R$ that minimizes $\|Y - Z \|^2$. (However, I am unsure whether this $Z$ also minimizes $\|X - Z \|^2$, compared with transforming $X$ to $Z$ directly in one step.)
My main question is how to use ZCA in a slightly generalized fashion. The optimization problem I need to solve is equivalent to transforming $X$ into $Z$ subject to
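A sketch of this two-step route, under the same assumptions as before (covariance taken as $X^\top X/(n-1)$, both covariances positive definite). The helper `sym_power`, the target `R`, and the data are hypothetical. The code only verifies that $Z$ has covariance $R$ and computes both costs; it does not settle whether $\|X - Z\|^2$ is minimal.

```python
import numpy as np

def sym_power(A, p):
    """Symmetric matrix power of a symmetric positive definite matrix."""
    evals, evecs = np.linalg.eigh(A)
    return evecs @ np.diag(evals ** p) @ evecs.T

# Toy centered data (hypothetical).
rng = np.random.default_rng(1)
n = 400
A = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(n, 3)) @ A
X = X - X.mean(axis=0)

# Step 1: ZCA whitening, Y = X S^{-1/2}.
S = X.T @ X / (n - 1)
Y = X @ sym_power(S, -0.5)

# Step 2: "inverse ZCA" toward a target covariance R, Z = Y R^{1/2}.
R = np.array([[1.0, 0.3, 0.1],
              [0.3, 2.0, 0.4],
              [0.1, 0.4, 0.5]])          # assumed target, positive definite
Z = Y @ sym_power(R, 0.5)

cov_Z = Z.T @ Z / (n - 1)                # should be close to R
cost_YZ = np.sum((Y - Z) ** 2)           # distance from the whitened data
cost_XZ = np.sum((X - Z) ** 2)           # distance from the original data
```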
- $Z$ has given covariance $R$
- $c^*=\|X - Z \|_W^2$ is minimized
where $c^*$ is a weighted sum of squares,
$c^*=\sum_{i=1}^n \sum_{j=1}^d \frac{{\left( x_{ij} - z_{ij} \right) }^2}{s_{ij}}$
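To make the objective concrete, this is how I would evaluate $c^*$ numerically. It assumes the $s_{ij}$ are stored in an array of the same shape as $X$ with strictly positive entries; the name `weighted_cost` and the numbers are made up for illustration.

```python
import numpy as np

def weighted_cost(X, Z, S_w):
    """Elementwise weighted sum of squares: sum_ij (x_ij - z_ij)^2 / s_ij."""
    return np.sum((X - Z) ** 2 / S_w)

# Tiny made-up example with n = 2, d = 2.
X = np.array([[0.2, 0.5],
              [0.4, 0.1]])
Z = np.array([[0.1, 0.5],
              [0.4, 0.3]])
S_w = np.array([[1.0, 2.0],
                [4.0, 0.5]])

c_star = weighted_cost(X, Z, S_w)
c_plain = weighted_cost(X, Z, np.ones_like(X))  # all s_ij = 1 recovers the unweighted c
```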
I see how to use the two-step procedure above to get $Z$ with covariance $R$ by going through the whitened $Y$, but I am unsure whether that minimizes $\|X - Z \|^2$. In any case, it does not satisfy the weighted least squares criterion.
(Technically, my problem is harder: $X$ has mean $u_x$, and I want $Z$ to have mean $u_z$ and covariance $R$ while minimizing the weighted sum of squares. I believe the means do not matter in the unweighted case. However, they may matter here, because the weighting terms are not only non-uniform but also depend on the individual $x_{ij}$: $s_{ij}=2x_{ij}(1-x_{ij})$.)