I am a biology student investigating a new method of creating a dichotomous identification key. I have created a dendrogram from survey data in which people rated how similar pictures of plant leaves are. I used Ward's method to link the clusters. In the resulting dendrogram, the y-axis ranges from 0 to about 50. I know that this axis represents the height at which objects are joined into a cluster, and thus how far they are from other objects, but what exactly does the numeric value represent?

gung - Reinstate Monica

radellin
- Do you have a reference for this graph? – Andy Jul 31 '15 at 17:24
- Please read the documentation for the clustering program you used. – ttnphns Aug 01 '15 at 14:57
1 Answer
I'm going to, ahem, go out on a limb here, ahem, and guess that you built your tree via the `hclust` function in base R with `method = "ward.D2"`, which is Ward's original method. If you type `?hclust` and look for `height` in the Value (output) section, it says "The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration." In this case, Ward's criterion is the total within-cluster error sum of squares, which increases as you go up the tree and make the clusters bigger. Each merge height therefore reflects how much that sum of squares grows when the two clusters are joined, so a tall branch means the merge combined two quite dissimilar groups.
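To make that concrete, here is a minimal sketch on a toy data set. The four points and the helper `ess` are made up for illustration; they are not your survey data. It also assumes Euclidean distances from `dist` and `method = "ward.D2"`, which is a guess about your setup. For this example, the reported heights work out to the square root of twice the increase in the within-cluster error sum of squares at each merge, so they are on the scale of the original distances rather than squared distances.

```r
# Four hypothetical points on a line (illustration only)
x <- c(a = 0, b = 1, c = 5, d = 7)

# Euclidean distances, then Ward clustering as implemented by "ward.D2"
hc <- hclust(dist(x), method = "ward.D2")

# Heights at which the merges happen: this is the dendrogram's y-axis
print(hc$height)

# Hypothetical helper: error sum of squares of one cluster
ess <- function(v) sum((v - mean(v))^2)

# Increase in total within-cluster ESS caused by each merge, in merge order:
# a joins b, then c joins d, then {a, b} joins {c, d}
delta <- c(
  ess(x[c("a", "b")]),
  ess(x[c("c", "d")]),
  ess(x) - ess(x[c("a", "b")]) - ess(x[c("c", "d")])
)

# Compare the reported heights with sqrt(2 * increase in ESS)
print(all.equal(hc$height, sqrt(2 * delta)))
```

If you used a different `method` or a different clustering program, the scale of the axis can differ (for example, squared distances), so it is worth running the same kind of check on a few points you can work out by hand.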

eric_kernfeld