I have made a heatmap based on a regular data matrix in R, using the pheatmap package. Regular clustering of my samples is performed by the distfun function within the package.

Now I want to attach a precomputed distance matrix (generated by Unifrac) to my previously generated matrix/heatmap. Is this possible?

chl
Lars Behrendt
  • @Lars What do you want to modify: the heatmap or the dendrogram? – chl May 05 '11 at 09:50
  • I would like to modify the heatmap in such a way that my columns get clustered according to the pre-computed distance matrix – Lars Behrendt May 05 '11 at 09:56
  • @Lars I don't see any way to have a different clustering scheme (distance + method) for rows and columns, after a quick look at the code. Some suggestions were provided on this [related question](http://stats.stackexchange.com/questions/6890/plotting-a-heatmap-given-a-dendrogram-and-a-distance-matrix-in-r); otherwise, the simplest solution is to play with the code and add your own modifications. I can show you how to do it if you want. – chl May 05 '11 at 10:03
  • If you could help me with this I would be thrilled! – Lars Behrendt May 05 '11 at 10:07
  • Would it be easier to use the mixOmics package instead? I have it installed, but its plots do not look as nice as those from the pheatmap package – Lars Behrendt May 05 '11 at 10:09
  • @Lars `pheatmap` relies on `grid` plotting functionalities, which is the reason why it looks so nice. – chl May 05 '11 at 11:05

2 Answers

Ok, so you can just look at the code by typing the name of the function at the R prompt, or use edit(pheatmap) to see it in your default editor.

Around lines 14 and 23, you'll see that another function is called to cluster the rows and the columns, given a distance function (R dist) and a method (compatible with hclust for hierarchical clustering in R). What does this function do? Use getAnywhere("cluster_mat") to print it on screen, and you will soon notice that it does nothing more than return an hclust object, that is, your dendrogram computed from the specified distance and linkage options.
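To make that concrete, here is a minimal sketch of what such a helper boils down to. The function name `cluster_mat_sketch` and the toy matrix are made up for illustration; this is not the package's actual source, only the dist-then-hclust pattern described above, in base R.

```r
# Hypothetical stand-in for the internal helper; NOT the package's real code
cluster_mat_sketch <- function(mat, distance = "euclidean", method = "complete") {
  d <- dist(mat, method = distance)  # pairwise distances between the rows of mat
  hclust(d, method = method)         # dendrogram returned as an hclust object
}

set.seed(42)
# Toy data: 5 rows x 4 columns
m <- matrix(rnorm(20), nrow = 5,
            dimnames = list(paste0("r", 1:5), paste0("c", 1:4)))

tree_row <- cluster_mat_sketch(m)     # clusters the rows
tree_col <- cluster_mat_sketch(t(m))  # clusters the columns, via the transpose
class(tree_row)
```

Clustering the columns only requires transposing the matrix first, which is why one helper can serve both calls.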

So, if you already have your distance matrix, change line 14 (rows) or 23 (columns) so that it reads, e.g.

tree_row = hclust(my.dist.mat, method="complete")

where my.dist.mat is your own distance matrix (a dist object, e.g. the result of as.dist()), and complete is one of the many linkage methods available in hclust (see help(hclust)). Here, it is important to use fix(pheatmap) and not edit(pheatmap); otherwise, the edited function will not be callable in the correct environment/namespace.

This is a quick and dirty hack that I would not recommend with a larger package. It seems to work for me at least; that is, I can use a custom distance matrix with complete linkage for the rows.
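Before editing the function, it may help to check that the precomputed matrix gives a sensible tree on its own. The following is a minimal sketch in base R; the 4×4 matrix and the sample names are invented for illustration and merely stand in for real UniFrac output.

```r
# Made-up symmetric distances between four samples (stand-in for UniFrac output)
dd <- matrix(c(0.0, 0.2, 0.7, 0.8,
               0.2, 0.0, 0.6, 0.9,
               0.7, 0.6, 0.0, 0.3,
               0.8, 0.9, 0.3, 0.0),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("s", 1:4), paste0("s", 1:4)))

# hclust() expects a "dist" object, not a plain square matrix
my.dist.mat <- as.dist(dd)

tree <- hclust(my.dist.mat, method = "complete")
groups <- cutree(tree, k = 2)  # split the dendrogram into two clusters
plot(tree)                     # quick visual check of the dendrogram
```

If the dendrogram looks right here, the same object can be passed to hclust inside the edited function.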

In sum, assuming your distance matrix is stored in a variable named dd,

library(pheatmap)
fix(pheatmap)
# 1. change the function as you see fit
# 2. save and go back to R
# 3. if your custom distance matrix was simply read as a matrix, make sure
#    it is converted to a distance (dist) object
my.dist.mat <- as.dist(dd)  # as.dist() also accepts an existing dist object

Then you can call pheatmap as you did before, but it will now use the result of hclust applied to my.dist.mat with complete linkage. Just make sure that cluster_rows=TRUE (which is the default). You can also change

  • the linkage method
  • whether the custom distance is applied to rows or columns

by editing the package function appropriately.

chl
  • I will try this at once and let you know how it went. Thanks for your huge effort, I greatly appreciate it. – Lars Behrendt May 05 '11 at 14:15
  • +1 awesome answer and great reference for how to bend source code to suit your needs. This type of thing comes up on SO all the time, I wish there was a way to have this available in the R-tag on SO as a FAQ. – Chase May 05 '11 at 14:21

As of version 1.0.10, the input parameters clustering_distance_rows and clustering_distance_cols can each take a dist object, so that clustering uses the precomputed distances.
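A minimal sketch of that approach, assuming pheatmap 1.0.10 or later is installed. The data matrix and the distance matrix below are toy stand-ins (the Manhattan distances merely play the role of a precomputed UniFrac matrix), and the object names are arbitrary.

```r
library(pheatmap)

set.seed(1)
# Toy data: 10 features (rows) x 6 samples (columns)
mat <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("f", 1:10), paste0("s", 1:6)))

# Stand-in for a precomputed sample-by-sample distance matrix; its
# rows/columns must be in the same order as the columns of mat
d_mat <- as.matrix(dist(t(mat), method = "manhattan"))
col_dist <- as.dist(d_mat)

# Columns are clustered from col_dist; rows keep the default distance
ph <- pheatmap(mat,
               clustering_distance_cols = col_dist,
               clustering_method = "complete")
```

Note that clustering_method sets the linkage for both rows and columns, so only the distance differs between the two here.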

foehn