We are trying to interpret a heatmap that looks like this:

enter image description here

... and plotted with this plotting code:

heatmap(ourdata,
    col = cluster_colors,
    distfun = function(x) dist(x, method = "euclidean"),
    hclustfun = function(x) hclust(x, method = "complete"),
    margins = c(8, 18)
)

Visually, it seems quite clear that there are different "heights" between the branches in the dendrogram, which we assume would correspond to relative distances between the columns and/or clusters.

Question 1: Is this correct: do the heights correspond to clustering distances?

Question 2: Where can we find evidence of this? We haven't been able to find the answer in help(heatmap) or help(hclust) (or in help(dist), although we would not expect it there).

Nick Cox
Samuel Lampa
    `do the heights correspond to clustering distances?` They should. Normally they correspond to the colligation coefficient, i.e. the linkage distance computed by the chosen linkage method (complete linkage in your example). Read [this](http://stats.stackexchange.com/a/217742/3277), especially the "Dendrogram" paragraph. However, what your specific function plots should be stated in its documentation; _always_ read the documentation attentively. – ttnphns Sep 07 '16 at 13:31
    `We haven't been able to find the answer from...` If documentation is scarce, try running the same hierarchical clustering with another, better-documented function or package and compare the dendrograms; the relative branch levels are expected to be the same (the left-to-right order of objects, however, might vary a bit). – ttnphns Sep 07 '16 at 13:35
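
The comparison suggested in the comment above can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the matrix `m` is made-up toy data standing in for `ourdata`, and the names `hm`, `hc_cols`, `root_from_heatmap` and `root_from_hclust` are hypothetical. It relies on `heatmap()` returning the dendrograms it drew when called with `keep.dendro = TRUE`, and on the column dendrogram being built from the transposed matrix.

```r
# Toy matrix (made up for illustration; stands in for `ourdata`)
set.seed(42)
m <- matrix(rnorm(24), nrow = 6,
            dimnames = list(paste0("r", 1:6), paste0("c", 1:4)))

# Same distance and linkage choices as in the question
distfun   <- function(x) dist(x, method = "euclidean")
hclustfun <- function(x) hclust(x, method = "complete")

# keep.dendro = TRUE makes heatmap() return the dendrograms it drew
hm <- heatmap(m, distfun = distfun, hclustfun = hclustfun, keep.dendro = TRUE)

# Recompute the column clustering directly; columns are clustered on t(m)
hc_cols <- hclustfun(distfun(t(m)))

root_from_heatmap <- attr(hm$Colv, "height")  # height of the top column branch
root_from_hclust  <- max(hc_cols$height)      # height of the last merge
print(c(heatmap = root_from_heatmap, hclust = root_from_hclust))
```

If the two printed values agree, the branch heights drawn by `heatmap()` are the merge heights reported by `hclust()`.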

1 Answer

With complete linkage, the height of a merge is the maximum of the pairwise distances between the members of the two clusters being joined.

With single linkage, it would be the minimum of the pairwise distances between the two clusters.
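
One way to check this yourself is to recompute a merge height by hand and compare it with what `hclust` stores in `$height`. The sketch below is only an illustration: the data in `m` are made up, and the helper `top_merge_height` is a hypothetical name. It splits the tree into its two top-level clusters with `cutree(hc, k = 2)`, takes the distances between them, and summarises those with `max` (complete linkage) or `min` (single linkage).

```r
# Toy data (made up for illustration): 6 observations, 2 variables
set.seed(1)
m <- matrix(rnorm(12), nrow = 6)
d <- dist(m, method = "euclidean")
D <- as.matrix(d)

# Compare the height of the final merge reported by hclust() with the
# value recomputed by hand from the two clusters that were joined last.
top_merge_height <- function(linkage, summarise) {
  hc <- hclust(d, method = linkage)
  groups <- cutree(hc, k = 2)                          # the two top-level clusters
  cross <- D[groups == 1, groups == 2, drop = FALSE]   # distances between them
  c(hclust = max(hc$height), by_hand = summarise(cross))
}

complete_check <- top_merge_height("complete", max)
single_check   <- top_merge_height("single", min)
print(complete_check)
print(single_check)
```

In each printed pair, the `hclust` and `by_hand` entries should coincide, which is the sense in which the dendrogram heights are the linkage distances.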

Has QUIT--Anony-Mousse