
Background

My system tries to classify among three classes. At first, my labeling for CCA had a single dimension, {1, 2, 4}, but then I found out that to get more components I need more dimensions in Y: with dim Y = 1, I could only set n_components = 1.

So, I switched to OneHot labeling instead {[0 0 1],[0 1 0],[1 0 0]} (dim Y = 3) and the CCA works fine with n_components <= 3.
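
For concreteness, here is a minimal sketch of that setup (the feature matrix, sample size, and random labels are hypothetical), building the one-hot targets by hand and fitting scikit-learn's CCA:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 10))               # hypothetical feature matrix
labels = rng.choice([1, 2, 4], size=90)     # original single-dimension labels

# One-hot encode the three classes, so dim Y = 3
classes = np.array([1, 2, 4])
Y = (labels[:, None] == classes[None, :]).astype(float)

# With three target columns, n_components up to 3 passes the validation
# (min(n_features, n_samples, n_targets) = 3); 2 is used here for illustration.
cca = CCA(n_components=2).fit(X, Y)
X_scores, Y_scores = cca.transform(X, Y)
```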

To improve my results (pretty mediocre right now), I tried increasing the number of components to at least the number of classes + 1. So I changed dim Y to 4: {[0 0 0 1], [0 0 1 0], [0 1 0 0]}. Now I intermittently get an error pointing at this line:

y_score = next(col for col in Y.T if np.any(np.abs(col) > eps))

which probably means that Yk is full of zeros, i.e. that Yk was successively deflated down to a matrix of rank 0... which suggests that we asked for too many components, maybe?
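
For what it's worth, that line appears to come from the power-method initialization inside scikit-learn's PLS/CCA code; a paraphrased sketch of the check (not the exact source) shows why a fully deflated Y triggers it:

```python
import numpy as np

eps = np.finfo(float).eps
Yk = np.zeros((100, 4))  # hypothetical target matrix after too many deflation steps

try:
    # Pick the first column of the deflated Y that is not numerically zero
    y_score = next(col for col in Yk.T if np.any(np.abs(col) > eps))
except StopIteration:
    print("every column of the deflated Y is ~0: too many components were requested")
```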

Question

Overall, I want to know how the number of classes relates to the number of components. Do I need to have n_components <= n_classes? Can I increase the number of components without "deflating the matrix to rank 0"?

mgmussi

1 Answer


CCA can give you only as many components as the number of variables in X or Y (whichever is smaller), basically for the same reason that PCA will give you only as many components as you have variables and not more. By adding zero columns or using one-hot encoding, you are only adding collinear columns, so you are not actually increasing the rank of your matrix, and the CCA solution is not unique.
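
As a small illustration of that point (with hypothetical labels): padding the one-hot targets with an all-zero fourth column leaves the rank of Y unchanged, so it cannot buy an extra canonical component.

```python
import numpy as np

labels = np.array([1, 2, 4, 1, 2, 4, 1, 2])
classes = np.array([1, 2, 4])

Y3 = (labels[:, None] == classes[None, :]).astype(float)   # one-hot, dim Y = 3
Y4 = np.hstack([np.zeros((len(labels), 1)), Y3])            # zero-padded, dim Y = 4

print(np.linalg.matrix_rank(Y3))  # 3
print(np.linalg.matrix_rank(Y4))  # still 3: the extra column adds no new direction
```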

rep_ho
  • But how can I improve my labeling without using OneHot encoding, then? If I just use the unidimensional {1, 2, 4} it returns `FutureWarning: As of version 0.24, n_components(3) should be in [1, min(n_features, n_samples, n_targets)] = [1, 1]. n_components=1 will be used instead. In version 1.1 (renaming of 0.26), an error will be raised.` (I'm already using v0.24, so it simply crashes) – mgmussi Apr 23 '21 at 17:13