I am working through the derivations for solving CCA in A Tutorial on Canonical Correlation Methods. Right now, I am trying to solve CCA using SVD (bottom of page 95:7). For completeness, I include the calculations here:
Let $C_{aa}$ and $C_{bb}$ be the covariance matrices and $C_{ab}$ be the cross-covariance matrix for data $X_a \in \mathbb{R}^{n \times p}$ and $X_b \in \mathbb{R}^{n \times q}$. Perform SVD s.t.:
$$ C_{aa}^{-1/2} C_{ab} C_{bb}^{-1/2} = U^{\top} SV $$
Then the linear projections of our data are:
$$ W_a = C_{aa}^{-1/2} U \qquad W_b C_{bb}^{-1/2} V $$
Despite all my calculations looking correct to me, I am not getting canonical correlations that match my expectations. Importantly, I expect:
- The singular values in $S$ to be the canonical correlations, e.g. $S_{ii} = Z_b^i \cdot Z_a^i$ where $Z$ are the images or embeddings of the data.
- I expect each embedding $Z_{a,b}^i \in \mathbb{R}^{\min(p, q)}$ to have unit length.
- I expect the embeddings to be orthogonal.
Here is my code. I have sprinkled asserts throughout to verify my expectations, but the results do not hold (see asserts at the very end).
def fit(self, Xa, Xb):
"""Fits CCA model parameters using SVD.
:param Xa: Observations with shape (p_dim, n_samps).
:param Xb: Observations with shape (q_dim, n_samps).
:return: None.
"""
N, p = Xa.shape
N, q = Xb.shape
# Uurtio: "Throughout this tutorial, we assume that the variables are
# standardised to zero mean and unit variance.
Xa -= Xa.mean(axis=0)
Xa /= Xa.std(axis=0)
Xb -= Xb.mean(axis=0)
Xb /= Xb.std(axis=0)
# Compute sample covariance matrices. See Uurtio, eq. 2.
Caa = np.cov(Xa.T)
Cbb = np.cov(Xb.T)
assert Caa.shape == (p, p)
assert Cbb.shape == (q, q)
Caa_sqrt = np.linalg.cholesky(Caa)
Cbb_sqrt = np.linalg.cholesky(Cbb)
assert np.allclose(Caa, np.matmul(Caa_sqrt, Caa_sqrt.T))
assert np.allclose(Cbb, np.matmul(Cbb_sqrt, Cbb_sqrt.T))
Caa_sqrt_inv = np.linalg.inv(Caa_sqrt)
Cbb_sqrt_inv = np.linalg.inv(Cbb_sqrt)
assert np.allclose(np.eye(p), np.matmul(Caa_sqrt, Caa_sqrt_inv))
assert np.allclose(np.eye(q), np.matmul(Cbb_sqrt, Cbb_sqrt_inv))
Cab_block = np.cov(Xa.T, Xb.T)
Cab = Cab_block[:p, p:]
assert Cab.shape == (p, q)
# See Uurtio, eq. 12.
A = np.matmul(np.matmul(Caa_sqrt_inv, Cab), Cbb_sqrt_inv)
U, S, Vh = np.linalg.svd(A, full_matrices=False)
assert np.allclose(A, np.dot(U[:, :A.shape[1]] * S, Vh))
# See Uurtio, eq. after eq. 12 (not numbered).
Wa = np.matmul(Caa_sqrt_inv, U)
Wb = np.matmul(Cbb_sqrt_inv, Vh)
assert Wa.shape[0] == p
assert Wb.shape[0] == q
Za = np.matmul(Xa, Wa)
Zb = np.matmul(Xb, Wb)
assert Za.shape[0] == N
assert Za.shape == Zb.shape
# WHERE THE CALCULATIONS DO NOT BEHAVE AS EXPECTED:
# ================================================
# These should be unit vectors.
assert np.allclose(1, np.linalg.norm(Za, 2, axis=0))
assert np.allclose(1, np.linalg.norm(Zb, 2, axis=0))
# I expect the singular values of the matrix S to correspond to the
# canonical correlations.
assert np.isclose(S[0], np.dot(Za[:, 0], Zb[:, 0]))
# I expect the canonical projections to be orthogonal.
assert np.isclose(0, np.dot(Za[:, 0], Zb[:, 1]))