Questions tagged [dendrogram]

A dendrogram (or tree diagram) is a graph used to represent relationships in hierarchical clustering.

61 questions
73
votes
7 answers

Where to cut a dendrogram?

Hierarchical clustering can be represented by a dendrogram. Cutting a dendrogram at a certain level gives a set of clusters. Cutting at another level gives another set of clusters. How would you pick where to cut the dendrogram? Is there something…
Eduardas
33
votes
3 answers

How to interpret the dendrogram of a hierarchical cluster analysis

Consider the R example below: plot(hclust(dist(USArrests), "ave")). What exactly does the y-axis "Height" mean? Look at North Carolina and California (toward the left of the plot). Is California "closer" to North Carolina than Arizona? Can I make this…
Richi W
33
votes
1 answer

Comparing hierarchical clustering dendrograms obtained by different distances & methods

[The initial title "Measurement of similarity for hierarchical clustering trees" was later changed by @ttnphns to better reflect the topic] I am performing a number of hierarchical cluster analyses on a dataframe of patient records (e.g. similar to…
Wouter
9
votes
3 answers

How to plot a fan (Polar) Dendrogram in R?

I'm referring to something like this: Suggested dataset for showing a solution: data(mtcars); plot(hclust(dist(mtcars)))
Tal Galili
7
votes
1 answer

A measure to describe the distribution of a dendrogram

Could anyone suggest some statistical measures to describe the distribution of a dendrogram? If I have two dendrograms, how can I quantify their structural differences?
Eduardas
6
votes
4 answers

Anyone know of a simple dendrogram visualizer?

I've written a small hierarchical clustering algorithm (for better or for worse). I'd like a quick way of visualizing it. Any tooling ideas?
Hans
5
votes
1 answer

Purpose of dendrogram and hierarchical clustering

This is likely a very naive question. I've lately been reading about hierarchical clustering algorithms, and various discussions about how to interpret dendrograms or find optimal heights for cutting a dendrogram. I've also played around with some…
siler
4
votes
0 answers

How to randomize the tips of a functional trait dendrogram?

I have generated a functional trait dendrogram using species x trait and plots x species matrices through dbFD in package FD. I want to randomize the tips of the trait dendrogram to check if community assembly of the trees in my study area is random…
4
votes
1 answer

How to convert a dendrogram back into a distance matrix?

Example code: our_dist <- dist(USArrests[1:4,]); dend <- as.dendrogram(hclust(our_dist, "ave")); plot(dend). I would now like to have a "dend2dist" function that turns dend back into our_dist. Of course, this is not possible in full (AFAIK),…
Tal Galili
4
votes
1 answer

Cluster analysis in R produces reversals on dendrogram

I'm attempting to perform hierarchical agglomerative cluster analysis in R. However, when I use particular clustering methods, I get reversals (upward branching) in the resulting tree, which violates the ultrametric property. The two methods are:…
jslefche
4
votes
2 answers

Applying Ward's method for calculating linkage

For an assignment, I used IPython to create the dendrogram below, using Ward's method and Euclidean distance, from the following data: $$a=(0,0)$$ $$b=(1,2)$$ $$c=(3,4)$$ $$d=(4,1)$$ $$e=(2,2)$$ where dist({a},{b,e}) = 2.88, and…
Harr
4
votes
1 answer

Community detection and modularity

I am reading the book "Network Science" by Barabási, in particular the chapter on community detection. If I understand correctly, modularity is a measure of the goodness of a partition produced by a given algorithm: the greater the value of modularity…
marielle
4
votes
0 answers

A high cophenetic correlation coefficient but dendrogram seems bad

I have two results for the same dataset. The first is hierarchical clustering using Ward's method, which gave a cophenetic correlation coefficient of 0.75. The second uses the average method, which gave a cophenetic correlation coefficient of 0.91. I used "euclidean distance"…
Emrah Bilgiç
4
votes
1 answer

Get k most diverse objects from dendrogram (hierarchical clustering)

I have a dendrogram which groups similar objects in hierarchical order. The problem I am trying to solve is how to get the k most diverse objects based on a dendrogram. E.g., we start with some random (?) object; the next object we choose is the one with…
Sebastian Widz
3
votes
1 answer

Which similarity coefficient should I use with Ward linkage?

I just attempted implementations of Ward linkage and UPGMA linkage, as well as Pearson and Euclidean similarity coefficients. To my surprise, both similarity coefficients gave the same clustering with the Ward linkage. Should this be the case? Is…