Given two sets of variables and the goal of finding correlations between variables in the two sets, are there any simple examples or explanations, suitable for a group of biologists who know only basic statistics, that illustrate the benefit of canonical correlation analysis (CCA) over Pearson's correlation between pairs of variables from the two sets?
-
Could help https://stats.stackexchange.com/q/65692/3277 – ttnphns Jan 05 '20 at 08:10
-
Thanks @ttnphns! While that's a great answer to the `what` and `how` of CCA, I'm looking for the `why`, especially with respect to the objective in my question: something similar to Section 5.1 of this [article](https://www.cs.cmu.edu/~tom/10701_sp11/slides/CCA_tutorial.pdf). – blueskyddd Jan 07 '20 at 14:39
-
In your article, the example in sec. 5.1 is not very instructive because it is degenerate: the canonical correlation is 1. Instead, I would recommend turning to my answer with the pics, specifically the pic with multiple regression. Multiple regression _is_ CCA in which one of the two sets (X and Y), the Y set, consists of just _one_ variable; the X set consists of two (X1, X2). – ttnphns Jan 07 '20 at 15:39
-
(cont.) You see that, _as long as we agree to dismiss the individual variables' identities_ and instead agree to have their _linear constructs_ as their representatives or proxies, the correlation between Y and Y' (the latter being a linear construct of X1 and X2, since it lies in their plane) _is "better"_ than the two separate correlations: Y with X1 and Y with X2. First, it is one value instead of two; second, it is at least as high (in absolute value) as either of the two. Y' is the "best" proxy of the pair X1, X2 in the sense that it has the maximal possible correlation with Y. – ttnphns Jan 07 '20 at 15:39
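
A minimal sketch of the multiple-regression point above, assuming synthetic data (the variables `x1`, `x2`, `y` and the noise levels are made up for illustration and do not come from the linked posts). It compares the two separate Pearson correlations with the correlation between Y and its fitted linear construct Y':

```python
import numpy as np

# Hypothetical synthetic data: Y depends on both X1 and X2 plus noise.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)

# Two separate Pearson correlations: Y with X1 and Y with X2.
r1 = np.corrcoef(y, x1)[0, 1]
r2 = np.corrcoef(y, x2)[0, 1]

# Y' is the least-squares linear construct of X1 and X2 (with intercept).
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

# Multiple correlation: the correlation between Y and Y'.
R = np.corrcoef(y, y_hat)[0, 1]

print(f"corr(Y, X1) = {r1:.3f}")
print(f"corr(Y, X2) = {r2:.3f}")
print(f"corr(Y, Y') = {R:.3f}")
```

With ordinary least squares and an intercept, `R` cannot be smaller than either `|r1|` or `|r2|`, which is the sense in which Y' is the "best" proxy for the pair.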
-
(cont.) It is thus the maximal possible relation that can be found between set Y and set X. All of the above holds exactly in general CCA, where both sets X and Y consist of more than one variable. If you add Y2 to the Y set (so that Y1 and Y2 now form set Y), two constructs will be found, Vx (representing X) and Vy (representing Y), the correlation between which will be the maximal possible (see pic 3 in my post). – ttnphns Jan 07 '20 at 16:01
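
A rough sketch of the same idea for two multi-variable sets, again on made-up data. It assumes a hypothetical shared latent factor `z` that each measured variable reflects only weakly, so every pairwise Pearson correlation is small while the first canonical correlation is noticeably larger. The helper `canonical_correlations` is an illustrative name, not a library function; it uses the standard QR-plus-SVD route rather than any particular package's CCA implementation:

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations of two data matrices (rows = samples)."""
    # Center each variable, then get orthonormal bases of both column spaces.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations (descending).
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# Hypothetical setup: one latent factor z, measured noisily by 4 + 4 variables.
rng = np.random.default_rng(1)
n = 1000
z = rng.normal(size=n)
X = z[:, None] + 2.0 * rng.normal(size=(n, 4))
Y = z[:, None] + 2.0 * rng.normal(size=(n, 4))

# All pairwise Pearson correlations between a variable in X and one in Y.
p = X.shape[1]
cross = np.corrcoef(X, Y, rowvar=False)[:p, p:]
max_pairwise = np.abs(cross).max()

rho = canonical_correlations(X, Y)

print(f"largest pairwise |r|        = {max_pairwise:.3f}")
print(f"first canonical correlation = {rho[0]:.3f}")
```

Because each single variable is itself a (trivial) linear construct of its set, the first canonical correlation can never fall below the largest pairwise `|r|`; in settings like this one, where the signal is spread thinly over many variables, it can be substantially higher.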
-
Thanks, @ttnphns! It helps a lot! – blueskyddd Jan 09 '20 at 15:25
-
Well it ain't pretty, but it is [understandable](https://www.statisticssolutions.com/canonical-correlation/). – Carl Feb 25 '20 at 05:28