3

I am working through the derivations for solving CCA in A Tutorial on Canonical Correlation Methods. Right now, I am trying to solve CCA using SVD (bottom of page 95:7). For completeness, I include the calculations here:

Let $C_{aa}$ and $C_{bb}$ be the covariance matrices and $C_{ab}$ be the cross-covariance matrix for data $X_a \in \mathbb{R}^{n \times p}$ and $X_b \in \mathbb{R}^{n \times q}$. Perform SVD s.t.:

$$ C_{aa}^{-1/2} C_{ab} C_{bb}^{-1/2} = U^{\top} SV $$

Then the linear projections of our data are:

$$ W_a = C_{aa}^{-1/2} U \qquad W_b C_{bb}^{-1/2} V $$

Despite all my calculations looking correct to me, I am not getting canonical correlations that match my expectations. Importantly, I expect:

  1. The singular values in $S$ to be the canonical correlations, e.g. $S_{ii} = Z_b^i \cdot Z_a^i$ where $Z$ are the images or embeddings of the data.
  2. I expect each embedding $Z_{a,b}^i \in \mathbb{R}^{\min(p, q)}$ to have unit length.
  3. I expect the embeddings to be orthogonal.

Here is my code. I have sprinkled asserts throughout to verify my expectations, but the results do not hold (see asserts at the very end).

def fit(self, Xa, Xb):
    """Fits CCA model parameters using SVD.

    :param Xa: Observations with shape (p_dim, n_samps).
    :param Xb: Observations with shape (q_dim, n_samps).
    :return:   None.
    """
    N, p = Xa.shape
    N, q = Xb.shape

    # Uurtio: "Throughout this tutorial, we assume that the variables are
    # standardised to zero mean and unit variance.
    Xa -= Xa.mean(axis=0)
    Xa /= Xa.std(axis=0)
    Xb -= Xb.mean(axis=0)
    Xb /= Xb.std(axis=0)

    # Compute sample covariance matrices. See Uurtio, eq. 2.
    Caa = np.cov(Xa.T)
    Cbb = np.cov(Xb.T)
    assert Caa.shape == (p, p)
    assert Cbb.shape == (q, q)

    Caa_sqrt = np.linalg.cholesky(Caa)
    Cbb_sqrt = np.linalg.cholesky(Cbb)
    assert np.allclose(Caa, np.matmul(Caa_sqrt, Caa_sqrt.T))
    assert np.allclose(Cbb, np.matmul(Cbb_sqrt, Cbb_sqrt.T))

    Caa_sqrt_inv = np.linalg.inv(Caa_sqrt)
    Cbb_sqrt_inv = np.linalg.inv(Cbb_sqrt)
    assert np.allclose(np.eye(p), np.matmul(Caa_sqrt, Caa_sqrt_inv))
    assert np.allclose(np.eye(q), np.matmul(Cbb_sqrt, Cbb_sqrt_inv))

    Cab_block = np.cov(Xa.T, Xb.T)
    Cab = Cab_block[:p, p:]
    assert Cab.shape == (p, q)

    # See Uurtio, eq. 12.
    A = np.matmul(np.matmul(Caa_sqrt_inv, Cab), Cbb_sqrt_inv)

    U, S, Vh = np.linalg.svd(A, full_matrices=False)
    assert np.allclose(A, np.dot(U[:, :A.shape[1]] * S, Vh))

    # See Uurtio, eq. after eq. 12 (not numbered).
    Wa = np.matmul(Caa_sqrt_inv, U)
    Wb = np.matmul(Cbb_sqrt_inv, Vh)
    assert Wa.shape[0] == p
    assert Wb.shape[0] == q

    Za = np.matmul(Xa, Wa)
    Zb = np.matmul(Xb, Wb)
    assert Za.shape[0] == N
    assert Za.shape == Zb.shape

    # WHERE THE CALCULATIONS DO NOT BEHAVE AS EXPECTED:
    # ================================================

    # These should be unit vectors.
    assert np.allclose(1, np.linalg.norm(Za, 2, axis=0))
    assert np.allclose(1, np.linalg.norm(Zb, 2, axis=0))

    # I expect the singular values of the matrix S to correspond to the
    # canonical correlations.
    assert np.isclose(S[0], np.dot(Za[:, 0], Zb[:, 0]))

    # I expect the canonical projections to be orthogonal.
    assert np.isclose(0, np.dot(Za[:, 0], Zb[:, 1]))  
kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
jds
  • 1,402
  • 1
  • 13
  • 24
  • 1
    Did you look in answers about CCA found on this site? There are several giving all the necessary formulas (including answers by user amoeba and by myself). – ttnphns Jun 25 '18 at 19:40
  • Yes. I have not seen an _implementation_ of the SVD method on this site. If you can point me to a full implementation, I'm happy to close this. – jds Jun 25 '18 at 19:54
  • 3
    Asking for code in a specific language/program is off-topic here. Algebra of one way (quite efficient) to do CCA is here: https://stats.stackexchange.com/a/77309/3277 (I implemented it following the formulas and were having completely correct results). There are other theads, too. – ttnphns Jun 25 '18 at 20:04
  • Thanks. I'll take a look at your answer. Can you point me to the rules that say that asking for code is off topic here? I see code as a useful learning modality. – jds Jun 25 '18 at 20:12
  • @gwg https://stats.stackexchange.com/help/on-topic – Sycorax Oct 23 '20 at 02:05

0 Answers0