I want to perform regularized canonical correlation analysis between two matrices that have more variables than observations (measured on the same subjects), one of which is very large (~18000 columns). The only R package that could handle matrices of this size was PMA (I also tried mixOmics, RGCCA...).
The problem is that the output of the CCA function gives only the canonical variates and correlations, together with the canonical weights. For a good interpretation of the results the canonical scores are necessary, but these are (weirdly) not returned by the CCA function. I am not very proficient with matrix algebra (or statistics in general) and I am not sure how to compute them.
Example from the package documentation:
### load the package and create matrices
> library(PMA)
> u <- matrix(c(rep(1,25),rep(0,75)),ncol=1)
> v1 <- matrix(c(rep(1,50),rep(0,450)),ncol=1)
> v2 <- matrix(c(rep(0,50),rep(1,50),rep(0,900)),ncol=1)
> x <- u%*%t(v1) + matrix(rnorm(100*500),ncol=500)
> z <- u%*%t(v2) + matrix(rnorm(100*1000),ncol=1000)
### perform canonical correlation (3 canonical variates)
> out <- CCA(x,z,typex="standard",typez="standard",K=3)
> print(out,verbose=TRUE)
Call: CCA(x = x, z = z, typex = "standard", typez = "standard", K = 3)
Num non-zeros u's: 59 88 75
Num non-zeros v's: 180 154 164
Type of x: standard
Type of z: standard
Penalty for x: L1 bound is 0.3
Penalty for z: L1 bound is 0.3
Cor(Xu,Zv): 0.9578624 0.93371 0.9418701
Component 1 :
  Row Feature Name Row Feature Weight
1                1              0.112
2                2              0.080
3                3              0.124
4                4              0.165
5                5              0.087
........
........
  Column Feature Name Column Feature Weight
1                  10                 0.006
2                  15                -0.027
3                  25                -0.025
4                  28                 0.030
5                  35                 0.035
........
........
The output is similar for Components 2 and 3.
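Going by the `Cor(Xu,Zv)` line in the output, my best guess is that the scores are simply each data matrix multiplied by its weight matrix. Below is a minimal sketch of that guess, not something taken from the PMA documentation. It assumes that `out$u` and `out$v` hold the weights for `x` and `z`, and that `CCA` centres and scales the columns before fitting (I believe this is its default `standardize=TRUE` behaviour, but I have not verified it against the source). `canonical_scores` is a name I made up, not a PMA function.

```r
# Hypothetical helper (not part of PMA): project each data matrix onto its
# weight matrix to obtain one score per subject and per component.
canonical_scores <- function(x, z, u, v, standardize = TRUE) {
  if (standardize) {
    # Assumption: mimic the column centring/scaling done inside CCA.
    # Constant columns would give NaN here and would need to be dropped first.
    x <- scale(x)
    z <- scale(z)
  }
  x_scores <- x %*% u   # n x K matrix of scores for the x side
  z_scores <- z %*% v   # n x K matrix of scores for the z side
  list(
    x_scores = x_scores,
    z_scores = z_scores,
    # Correlation between matching components, to compare with Cor(Xu,Zv)
    cors = diag(cor(x_scores, z_scores))
  )
}
```

With the example above this would be called as `scores <- canonical_scores(x, z, out$u, out$v)`. If the guess is right, `scores$cors` should reproduce the `Cor(Xu,Zv)` values that `print(out)` reports.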
My other doubt is about the significance of the results: have the non-zero weights in the output already been selected as "significantly different from 0"? I ask because the function never requests a significance threshold.