I am used to thinking of correspondence analysis (CA) as dissecting the weighted departure from independence through a singular value decomposition, but I cannot relate this view to constrained correspondence analysis.
Say I want to analyse an n-by-p count matrix Y with samples in the rows and variables in the columns. Let R be the diagonal matrix with the row sums of Y on the diagonal, and K the diagonal matrix with the column sums of Y on the diagonal. Then $E = R11^TK/q$, with $1$ a properly sized vector of ones and $q$ the total sum of Y, represents the expected counts of Y under row-column independence. I would then proceed with the singular value decomposition $$R^{-1/2}(Y-E)K^{-1/2} = U \Sigma V^T,$$ which gives me a decomposition of how Y departs from E, weighted appropriately.
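To make this concrete, here is a minimal NumPy sketch of the computation I have in mind. The count matrix `Y` is made up purely for illustration, and the variable names are my own.

```python
import numpy as np

# Hypothetical 4-by-3 count matrix (samples in rows, variables in columns).
Y = np.array([[10, 2, 3],
              [4, 8, 1],
              [6, 5, 9],
              [1, 7, 12]], dtype=float)

r = Y.sum(axis=1)          # row sums (diagonal of R)
k = Y.sum(axis=0)          # column sums (diagonal of K)
q = Y.sum()                # grand total

# Expected counts under row-column independence: E = R 1 1^T K / q
E = np.outer(r, k) / q

# Weighted departure from independence: R^{-1/2} (Y - E) K^{-1/2}
resid = (Y - E) / np.sqrt(np.outer(r, k))

# Singular value decomposition resid = U diag(sigma) V^T
U, sigma, Vt = np.linalg.svd(resid, full_matrices=False)

# Pearson chi-square statistic; with this scaling the squared singular
# values sum to chi2 / q.
chi2 = ((Y - E) ** 2 / E).sum()
```

The last singular value is numerically zero here, because the rows and columns of $Y-E$ each sum to zero.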
However, I want to perform constrained or canonical correspondence analysis (CCA), with an n-by-d constraining matrix Z of environmental variables. Based on verbal descriptions, I would expect that I have to regress $R^{-1/2}(Y-E)K^{-1/2}$ on the columns of Z, with weights equal to the row totals of Y. I would then use the fitted values $$F = Z (Z^T R Z)^{-1}Z^TRR^{-1/2}(Y-E)K^{-1/2}$$ to obtain the part of the departure from independence that can be explained by Z. I suppose the weighting occurs because samples with more counts are supposed to carry more information. I would expect to use the singular value decomposition of F directly to make a biplot that represents these departures in few dimensions. However, "Simple and Canonical Correspondence Analysis Using the R Package anacor" and "History of canonical correspondence analysis" do not use this $R^{-1/2}(Y-E)K^{-1/2}$ matrix but work directly on Y. They decompose $$F' = (Z^T R Z)^{-1/2}Z^TYK^{-1/2} = P \Sigma Q^T$$ and set the column scores to $S = K^{-1/2}Q$ and the variable scores to $B = (Z^T R Z)^{-1/2}P\Sigma$.
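For reference, this is a rough NumPy sketch of both computations side by side, reading my regression as the usual weighted least-squares fit with hat matrix $Z(Z^TRZ)^{-1}Z^TR$. The matrices `Y` and `Z` are invented for illustration, and taking the inverse square root of $Z^TRZ$ through an eigendecomposition is my own assumption about how to compute it, not something taken from the papers.

```python
import numpy as np

# Hypothetical data: 5 samples, 3 count variables, 2 environmental variables.
Y = np.array([[10, 2, 3],
              [4, 8, 1],
              [6, 5, 9],
              [1, 7, 12],
              [3, 3, 4]], dtype=float)
Z = np.array([[0.5, 1.0],
              [1.5, 0.2],
              [2.0, 2.5],
              [3.5, 1.1],
              [0.1, 3.0]])

r, k, q = Y.sum(axis=1), Y.sum(axis=0), Y.sum()
R, K = np.diag(r), np.diag(k)
E = np.outer(r, k) / q

# Weighted departure from independence: R^{-1/2} (Y - E) K^{-1/2}
resid = (Y - E) / np.sqrt(np.outer(r, k))

# What I expected: fitted values of the row-weighted regression of resid on Z,
# F = Z (Z^T R Z)^{-1} Z^T R resid
ZtRZ = Z.T @ R @ Z
F = Z @ np.linalg.solve(ZtRZ, Z.T @ R @ resid)

# Inverse symmetric square root of Z^T R Z via its eigendecomposition
w, V = np.linalg.eigh(ZtRZ)
ZtRZ_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

# What the papers decompose: F' = (Z^T R Z)^{-1/2} Z^T Y K^{-1/2} = P Sigma Q^T
F_prime = ZtRZ_inv_sqrt @ Z.T @ Y @ np.diag(k ** -0.5)
P, sigma, Qt = np.linalg.svd(F_prime, full_matrices=False)

col_scores = np.diag(k ** -0.5) @ Qt.T            # S = K^{-1/2} Q
var_scores = ZtRZ_inv_sqrt @ P @ np.diag(sigma)   # B = (Z^T R Z)^{-1/2} P Sigma
```

Note that `F` is n-by-p while `F_prime` is d-by-p, so the two matrices are not even of the same size, which is part of why I struggle to relate them.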
There are many things I do not really understand about CCA, but my main question is:
"Why is there no departure from independence term (in the trend of $X-E$) present in this formula anymore?"
It seems to be analyzing the count matrix Y directly rather than the departure from independence. How are the row and column scores then to be interpreted? I am looking for an intuitive explanation, supported by matrix equations.