I have a cross-correlation matrix, $C_{nm}$, between two sets of variables, and I would like to establish the correspondence between the row and the column variables.

My current approach is to convert the correlation matrix to a connectivity matrix by setting $$ M_{nm} = \begin{cases} 1, & \text{if } C_{nm} \geq f,\\ 0, & \text{if } C_{nm} < f, \end{cases} $$ where $f$ is a cutoff. I then use a home-made algorithm to find all the connected clusters, which fall into the following categories:

  1. row or column variables not connected to anything
  2. pairs of row and column variables with one-to-one correspondence
  3. a row (or column) variable corresponding to several column (or row) variables
  4. clusters relating several row and column variables.
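For concreteness, the procedure described above can be sketched as follows. This is only an illustration, not my actual implementation: it treats rows and columns as the two sides of a bipartite graph and uses a union-find structure to collect connected components. The function name `classify_clusters` and the category labels are made up for this example.

```python
from collections import defaultdict


def classify_clusters(C, f):
    """Threshold C at f and group row/column variables into connected clusters."""
    n_rows, n_cols = len(C), len(C[0])
    # Union-find over n_rows + n_cols nodes; column m is node n_rows + m.
    parent = list(range(n_rows + n_cols))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Every entry with C[n][m] >= f (i.e. M_nm = 1) links row n to column m.
    for n in range(n_rows):
        for m in range(n_cols):
            if C[n][m] >= f:
                parent[find(n)] = find(n_rows + m)

    # Collect the row and column indices belonging to each component.
    groups = defaultdict(lambda: ([], []))
    for node in range(n_rows + n_cols):
        rows, cols = groups[find(node)]
        if node < n_rows:
            rows.append(node)
        else:
            cols.append(node - n_rows)

    # Label each component with one of the four categories.
    clusters = []
    for rows, cols in groups.values():
        if not rows or not cols:
            kind = "isolated"          # category 1
        elif len(rows) == 1 and len(cols) == 1:
            kind = "one-to-one"        # category 2
        elif len(rows) == 1 or len(cols) == 1:
            kind = "one-to-many"       # category 3
        else:
            kind = "many-to-many"      # category 4
        clusters.append((kind, rows, cols))
    return clusters


example = [[0.9, 0.1, 0.0],
           [0.2, 0.8, 0.7],
           [0.1, 0.0, 0.1]]
for kind, rows, cols in classify_clusters(example, 0.5):
    print(kind, rows, cols)
```

In this toy matrix, row 0 pairs with column 0, row 1 connects to columns 1 and 2, and row 2 is left unconnected.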

I am looking for:

  • A standard algorithm for performing such a decomposition. (I encountered the Dulmage-Mendelsohn decomposition in one of the answers to this question, but I am not sure whether it does exactly what I need. Overall, this question seems similar to what I would like to do, except that it deals with auto-correlation.)
  • A way to choose the cutoff $f$ so as to optimize the splitting into the categories listed above, i.e., to minimize misclassification due to statistical errors (the correlation matrix originates from averaging over a hundred samples)
  • Perhaps a better approach to analyzing (and visualizing) the relations between the two sets of variables.
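
To illustrate what I mean by a statistically motivated cutoff, here is one possible baseline. It rests on assumptions that may not hold for my data: each $C_{nm}$ is a Pearson correlation estimated from $N$ independent, roughly bivariate-normal samples, so that under zero true correlation the Fisher transform $\operatorname{atanh}(C_{nm})$ is approximately normal with standard deviation $1/\sqrt{N-3}$. The function name `null_cutoff` and its parameters are hypothetical. It only controls false links and says nothing about missed ones, so it is not the optimal $f$ I am asking for.

```python
import math
from statistics import NormalDist


def null_cutoff(n_samples, n_tests, alpha=0.05):
    """Smallest correlation unlikely to arise by chance in any of n_tests entries."""
    # Bonferroni-corrected one-sided level, since the rule is C_nm >= f.
    z = NormalDist().inv_cdf(1.0 - alpha / n_tests)
    # Under the null, atanh(r) is approximately N(0, 1 / (n_samples - 3)).
    return math.tanh(z / math.sqrt(n_samples - 3))


# Cutoff for a hypothetical 20 x 30 matrix estimated from 100 samples.
print(null_cutoff(n_samples=100, n_tests=20 * 30))
```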
Roger Vadim
