
I am applying Ward hierarchical clustering to a data set for which I have pairwise similarities. Since hierarchical clustering needs a dissimilarity matrix, I am trying to convert my similarity matrix into a dissimilarity matrix. In addition, Ward's algorithm needs a Euclidean dissimilarity, so I tried several conversions, such as the one proposed in "Warning: ward's linkage specified with non-Euclidean dissimilarity matrix".

However, when I compute the cophenetic correlation, I get a poor value (around 0.39, smaller than all the values I previously got).

How can I convert my Spearman similarity matrix into a Euclidean dissimilarity matrix?

bigTree
  • It is geometrically correct to convert angular similarity (such as Pearson correlation) to Euclidean distance by the law of cosines, as your link shows. But Spearman rho is Pearson r for _ranks_, so the Euclidean distance will also express ranked data, not the original data. BTW, rho's traditional computational [formula](http://stats.stackexchange.com/q/89121/3277) is explicitly a conversion of the distance back into the similarity. – ttnphns Apr 03 '14 at 13:20
  • Saying a cophenetic r of 0.39 is "poor" is just because fear has big eyes. Your clustering structure isn't too strong this time. So what? Keep on living. You may try another clustering method or another clustering criterion. – ttnphns Apr 03 '14 at 13:26
  • Is this conversion (distance back into similarity) valid only when there are no ties between X and Y? (the distance D appears in the second formula given in the link) – bigTree Apr 03 '14 at 13:30
  • A more general formula for rho is [here, page 641](ftp://public.dhe.ibm.com/software/analytics/spss/documentation/statistics/20.0/en/client/Manuals/IBM_SPSS_Statistics_Algorithms.pdf). But computing rho as Pearson r, after ranking, is also a valid, equivalent formula. Its conversion to Euclidean d by the "law of cosines" is also always valid. – ttnphns Apr 03 '14 at 13:50
  • @ttnphns: do you want to post your comment(s) as an answer? [Better to have a short answer than no answer at all.](https://stats.meta.stackexchange.com/a/5326/) Anyone who has a better answer can post it. – Sycorax Aug 12 '18 at 03:23
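
The conversion discussed in the comments can be sketched as follows. For unit-length vectors the law of cosines gives d = sqrt(2(1 - r)), and since Spearman rho is Pearson r on ranks, the same mapping applies to rho. This is only a minimal sketch, not the asker's actual setup: the use of Python with SciPy, the random `data` array, and the helper name `spearman_to_euclidean` are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import spearmanr, rankdata
from scipy.spatial.distance import squareform, pdist
from scipy.cluster.hierarchy import linkage, cophenet


def spearman_to_euclidean(rho):
    """Hypothetical helper: map correlations to distances, d = sqrt(2 * (1 - rho))."""
    rho = np.asarray(rho, dtype=float)
    # Clip guards against tiny negative values caused by rounding.
    d = np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))
    np.fill_diagonal(d, 0.0)
    return d


# Hypothetical data: 6 items (columns) observed on 50 cases (rows).
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 6))

rho, _ = spearmanr(data)            # 6 x 6 matrix of pairwise Spearman rho
dist = spearman_to_euclidean(rho)   # 6 x 6 dissimilarity matrix

# Ward linkage on the condensed (upper-triangle) form of the distances,
# followed by the cophenetic correlation of the resulting tree.
condensed = squareform(dist, checks=False)
tree = linkage(condensed, method="ward")
coph_r, _ = cophenet(tree, condensed)
print("cophenetic correlation:", coph_r)

# Geometry check: the same distances arise as ordinary Euclidean distances
# between rank vectors that are centered and scaled to unit length.
ranks = np.apply_along_axis(rankdata, 0, data)
z = ranks - ranks.mean(axis=0)
z /= np.linalg.norm(z, axis=0)
print(np.allclose(pdist(z.T), condensed))
```

The final check illustrates the point made above: the resulting distances describe the ranked data, not the original values, so the clustering is a clustering of rank profiles.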
