How are row principal and supplementary coordinates calculated in correspondence analysis (CA)? Specifically, I am looking for a simple example of how to derive them using linear combinations of the row profiles, masses, and/or contribution statistics. Everything that I have read has been written in matrix algebra, which I do not understand. The matrix notation can be found at the bottom of the second page here.
For example, in principal components analysis, it's easy to see that the first principal component of a set of features $X_1, X_2, \dots, X_p$ is the normalized linear combination of the features $Z_1 = a_{11}X_1 + a_{21}X_2 + \dots + a_{p1}X_p$. Normalization means that the sum of the squares of the $a_{j1}$ is equal to $1$, i.e. $\sum_{j=1}^{p} a_{j1}^2 = 1$.
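To make the form I have in mind concrete, here is a minimal sketch in Python. The loadings and the observation are made-up numbers, not the result of an actual PCA; the sketch only shows the shape of the linear combination and the normalization constraint (squared loadings summing to 1) as I understand it.

```python
import math

# Hypothetical, made-up loadings for p = 3 features (not from a real PCA fit).
raw = [2.0, 1.0, 2.0]

# Rescale so that the squared loadings sum to 1 (the normalization constraint).
norm = math.sqrt(sum(a * a for a in raw))
loadings = [a / norm for a in raw]

# One hypothetical centered observation (X_1, X_2, X_3).
x = [0.5, -1.0, 0.25]

# Score on the first component: Z_1 = a_11*X_1 + a_21*X_2 + a_31*X_3
z1 = sum(a * xi for a, xi in zip(loadings, x))

sum_sq = sum(a * a for a in loadings)
print("loadings:", loadings)
print("sum of squared loadings:", sum_sq)
print("Z_1:", z1)
```

This is the kind of "weights times variables" expression I am hoping exists for the CA row coordinates.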
I believe the $a_{j1}$ may be analogous to the relative or absolute contributions, and the $X_j$ to the centered (demeaned) row profiles weighted by their masses in CA, but I am not sure.
I ran correspondence analysis on the matrix below using the ca package in R. Rows 1-9 are supplementary points. Rows 10-12 are the contingency table.
Image1 Image2 Image3 Image4 Image5
1_B1 1 0 1 1 0
1_B2 1 1 0 0 1
1_B3 1 0 0 1 1
2_B1 1 0 1 1 1
2_B2 1 1 1 1 1
2_B3 0 0 1 1 0
3_B1 1 1 1 0 0
3_B2 0 0 1 1 1
3_B3 0 0 1 0 0
B1 3 1 3 2 1
B2 2 2 2 2 3
B3 1 0 2 2 1
which produced the results below:
Principal inertias (eigenvalues):
dim value % cum% scree plot
1 0.088763 71.6 71.6 ******************
2 0.035277 28.4 100.0 *******
-------- -----
Total: 0.124040 100.0
Rows:
name mass qlt inr k=1 cor ctr k=2 cor ctr
1 | (*)1_B1 | <NA> 925 <NA> | 608 863 <NA> | 162 62 <NA> |
2 | (*)1_B2 | <NA> 960 <NA> | -952 824 <NA> | 387 136 <NA> |
3 | (*)1_B3 | <NA> 128 <NA> | -60 6 <NA> | -271 122 <NA> |
4 | (*)2_B1 | <NA> 430 <NA> | 166 195 <NA> | -182 235 <NA> |
5 | (*)2_B2 | <NA> 825 <NA> | -267 787 <NA> | 59 39 <NA> |
6 | (*)2_B3 | <NA> 705 <NA> | 762 533 <NA> | -433 172 <NA> |
7 | (*)3_B1 | <NA> 811 <NA> | -284 87 <NA> | 820 725 <NA> |
8 | (*)3_B2 | <NA> 939 <NA> | 121 28 <NA> | -694 911 <NA> |
9 | (*)3_B3 | <NA> 252 <NA> | 845 250 <NA> | 84 2 <NA> |
10 | S_B1 | 370 1000 227 | 164 352 112 | 222 648 518 |
11 | S_B2 | 407 1000 408 | -348 974 555 | -57 26 37 |
12 | S_B3 | 222 1000 365 | 365 653 333 | -266 347 445 |
Columns:
name mass qlt inr k=1 cor ctr k=2 cor ctr
1 | Idl1 | 222 1000 130 | 90 111 20 | 254 889 407 |
2 | Idl2 | 111 1000 350 | -595 906 443 | 192 94 116 |
3 | Idl3 | 259 1000 133 | 252 996 185 | 16 4 2 |
4 | Idl4 | 222 1000 130 | 202 562 102 | -179 438 201 |
5 | Idl5 | 185 1000 256 | -346 696 249 | -228 304 274 |
The rows and columns table of results is given in a standard format, where quantities are either multiplied by 1000 or expressed in permills (thousandths):
- the mass (mass) of each point (x1000),
- the quality (qlt) of display in the solution subspace of nd dimensions (x1000),
- the inertia (inr) of the point (in permills of the total inertia), and
- then, for each dimension (k=1 or 2) of the solution, the principal coordinate (x1000),
- the (relative) contribution (cor) of the principal axis to the point inertia (x1000), and
- the (absolute) contribution (ctr) of the point to the inertia of the axis (in permills of the principal inertia).
- For supplementary points, masses, inertias, and absolute contributions (ctr) are not applicable, but the relative contributions (cor) are valid, as is their sum over the set of chosen nd dimensions (qlt).
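To check that I am reading these definitions correctly, here is a rough sketch in Python (not the ca package's own code) that recomputes inr, cor, and ctr for the three active rows from the printed masses, principal coordinates, and eigenvalues. The formulas are my understanding of the standard definitions: cor is the squared coordinate over the squared distance to the centroid, ctr is mass times squared coordinate over the principal inertia, and inr is mass times squared distance over the total inertia. I am assuming the two dimensions capture each row's full distance, which the qlt of 1000 suggests.

```python
# Values read off the ca output above, divided by 1000 where the output is x1000.
eigenvalues = [0.088763, 0.035277]
total_inertia = sum(eigenvalues)

rows = {
    "S_B1": {"mass": 0.370, "coords": [0.164, 0.222]},
    "S_B2": {"mass": 0.407, "coords": [-0.348, -0.057]},
    "S_B3": {"mass": 0.222, "coords": [0.365, -0.266]},
}

results = {}
for name, r in rows.items():
    sq = [f * f for f in r["coords"]]
    # Squared distance to the centroid; complete here because only 2 dimensions exist.
    dist2 = sum(sq)
    cor = [round(1000 * s / dist2) for s in sq]
    ctr = [round(1000 * r["mass"] * s / lam) for s, lam in zip(sq, eigenvalues)]
    inr = round(1000 * r["mass"] * dist2 / total_inertia)
    results[name] = {"inr": inr, "cor": cor, "ctr": ctr}
    print(name, "inr", inr, "cor", cor, "ctr", ctr)
```

Because the inputs are already rounded to three decimals, the recomputed numbers can differ from the printed table by a unit or so in the last digit. This still goes from coordinates to contributions, though, which is the opposite direction from what I am asking.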
Is there a simple linear combination of these metrics and the row profiles that gives the row principal coordinates?
Just in case:
Row profile table:
Image1 Image2 Image3 Image4 Image5 Total
B1 0.30 0.10 0.30 0.20 0.10 1
B2 0.18 0.18 0.18 0.18 0.27 1
B3 0.17 0.00 0.33 0.33 0.17 1
Ave 0.22 0.11 0.26 0.22 0.19 1
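For reference, this is how I obtained the row profile table. It is a small Python sketch rather than R, using only the contingency table from rows 10-12 above; the variable names are my own.

```python
# Contingency table (rows B1-B3, columns Image1-Image5) from the question.
table = {
    "B1": [3, 1, 3, 2, 1],
    "B2": [2, 2, 2, 2, 3],
    "B3": [1, 0, 2, 2, 1],
}
grand_total = sum(sum(row) for row in table.values())

# Row profile: each row divided by its own total.
profiles = {name: [x / sum(row) for x in row] for name, row in table.items()}

# Row mass: row total divided by the grand total.
masses = {name: sum(row) / grand_total for name, row in table.items()}

# Average profile: column totals divided by the grand total (the column masses).
n_cols = len(next(iter(table.values())))
average = [
    sum(row[j] for row in table.values()) / grand_total for j in range(n_cols)
]

for name in table:
    print(name, [round(p, 2) for p in profiles[name]], "mass:", round(masses[name], 3))
print("Ave", [round(a, 2) for a in average])
```

The rounded masses agree with the mass column of the ca output (370, 407, 222 after multiplying by 1000), and the average profile agrees with the column masses.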