A simple example:
plot(hclust(dist(c(1:3)),method = "ward"))
I would like to know which calculations (in R) reproduce the value 1.67 shown as the distance between 3 and the cluster {1,2}.
Thanks.
In words, it is the absolute distance from the extreme value to the overall mean, plus two times the absolute distance from the mean of the two moderate values to the overall mean, minus a third of the absolute distance from one of the moderate values to the mean of the two moderate values, minus a third of the absolute distance from the other moderate value to the mean of the two moderate values.
Try this with
plot(hclust(dist(c(0,18,126)),method = "ward"))
and the absolute distance from 126 to 48, plus twice the absolute distance from 9 to 48, minus a third of the absolute distance from 18 to 9, minus a third of the absolute distance from 0 to 9, gives $78 + 2\times 39 - 9/3 -9/3 =150$.
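The recipe above can be written as a short R function. This is only a sketch for exactly three numbers: the helper name `ward_height_in_words` is made up for illustration, and it assumes the two closest values are the "moderate" pair that gets merged first.

```r
# Hypothetical helper (not part of base R): evaluates the verbal recipe
# for a vector of exactly three numbers.
ward_height_in_words <- function(x) {
  stopifnot(length(x) == 3)
  x <- sort(x)
  gaps <- diff(x)
  # Assumption: the two closest values form the first cluster ("moderate"
  # values); the remaining value is the "extreme" one.
  if (gaps[1] <= gaps[2]) {
    moderate <- x[1:2]
    extreme  <- x[3]
  } else {
    moderate <- x[2:3]
    extreme  <- x[1]
  }
  overall_mean  <- mean(x)
  moderate_mean <- mean(moderate)
  abs(extreme - overall_mean) +
    2 * abs(moderate_mean - overall_mean) -
    sum(abs(moderate - moderate_mean)) / 3
}

ward_height_in_words(c(0, 18, 126))  # 78 + 2 * 39 - 9/3 - 9/3 = 150
ward_height_in_words(1:3)            # 1 + 2 * 0.5 - 0.5/3 - 0.5/3 = 1.666667
```

The results can be compared with the second entry of `hclust(dist(x), method = "ward.D")$height`; `"ward.D"` is the name recent R versions use for what older versions called `"ward"`.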
The distance between two clusters is calculated using the Lance-Williams update formula (see the Wikipedia entry). For this example, where 3 is joined to the cluster {1,2}, it gives: $$ \tfrac{2}{3}\,|2-3| + \tfrac{2}{3}\,|1-3| - \tfrac{1}{3}\,|1-2| = \tfrac{5}{3} \approx 1.666667 $$
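As a minimal sketch, the same update can be coded directly in R. The function name `lance_williams_ward` and its argument names are illustrative, not part of any package; the coefficients are the Ward coefficients of the Lance-Williams formula, which reduce to 2/3, 2/3 and -1/3 when all three clusters are singletons.

```r
# Hypothetical helper: Lance-Williams update with Ward coefficients.
# Clusters i and j (sizes n_i, n_j) are merged; k (size n_k) is the other cluster.
# d_ki, d_kj, d_ij are the current distances between the clusters.
lance_williams_ward <- function(d_ki, d_kj, d_ij, n_i = 1, n_j = 1, n_k = 1) {
  n <- n_i + n_j + n_k
  ((n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij) / n
}

# i = 1, j = 2, k = 3, all singletons:
lance_williams_ward(d_ki = abs(3 - 1), d_kj = abs(3 - 2), d_ij = abs(2 - 1))
# 2/3 * 2 + 2/3 * 1 - 1/3 * 1 = 1.666667
```

With the distances 126, 108 and 18 from the other example (points 0, 18, 126), the same call gives 150.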