8

I have a graph instance with weighted directed edges which values can be in range [-1,1]. I need to do clustering on this graph, in order to find out groups in which vertices are more correlated.

I searched for several clustering or community detection graph based algorithms, but most of them don't work because the negative weights. Up to now I have applied spinglass (it is so called in igraph library, it is an algorithm based on Potts model) algorithm which seems to work with both positive and negative weights.

Are there any other algorithms for doing clustering or community detection on graphs which have negative and positive edge weights?

Update: the edge weights represent correlations, 1 means that two vertices are strongly correlated, -1 that are inversely correlated and 0 means that are indipendent.

Ewybe
  • 181
  • 1
  • 4
  • What do the weights represent? – eliasah Oct 18 '15 at 16:43
  • @eliasah I did an update in order to explain that – Ewybe Oct 18 '15 at 16:54
  • Did you try using another scale? That might be a good solution using a regular clustering method based on betweenness centrality algorithm per example. – eliasah Oct 18 '15 at 17:05
  • @eliasah Scaling these data imay be not so easy because I am interested in preserve the meaning of the correlation – Ewybe Oct 18 '15 at 17:39
  • Unless you are using a log scale or a log-log scale, i don't see how you are actually loosing the meaning of your correlations, but ok. – eliasah Oct 18 '15 at 17:41
  • 2
    For the clustering, is the *sign* of the correlation really needed? Inverse correlation is a pretty *strong* relationship, too. See my answer below. – Has QUIT--Anony-Mousse Oct 18 '15 at 22:24
  • What do you want to achieve? Suppose you have four nodes A B C D with these correlations. A-B 1, C-D 1, all other correlations -1 (A B are perfectly correlated, so are C and D, all other pairs are anti-correlated). How is the clustering supposed to work? – micans Oct 19 '15 at 15:52
  • I think that your example is a border line case, because the correlation have the same value for all vertices. However if I have A-B = 1 and C-D: -1 and for all other correlation an abs value minor that 1, I would like to get two clusters: {A,B}, {C,D}. Instead if I have A-B = 1 and C-D: 1 and for all other correlation an abs value minor that 1, I would like to get one clusters: {A,B,C,D}. – Ewybe Oct 20 '15 at 12:41

5 Answers5

2

Have you tried mapping the values to [0;2]?

Then many algorithms may work.

Consider e.g. Dijkstra: it requires non-negative edge weights, but if you know the minimum a of the edges, you can run it on x-a and get the shortest cycle-free path.

Update: for correlation values, you may either be interested in the absolute values abs(x) (which is the strength of the correlation!) or you may want to break the graph into two temporarily: first cluster on the positive correlations only, then on the negative correlations only if the sign is that important for clustering & the previous approaches don't work.

Has QUIT--Anony-Mousse
  • 39,639
  • 7
  • 61
  • 96
1

Yes, there is an algorithm called 'Affinity Propagation' that works with negative weights; I believe this is implemented in sklearn (see the documentation here). A reference for what is going on behind the scenes can be found here.

Hope that's what you're looking for!

squattyroo
  • 111
  • 3
  • I did not know this algorithm, it seems to be a good solution. I certainly will try it. Thank you. – Ewybe Oct 18 '15 at 17:40
  • As far as I know Affinity Propagation clustering will not be able to simultaneously consider positive and negative correlations whilst also separating them. For that matter, I think the task is contradictory. – micans Oct 19 '15 at 15:47
0

It seems to me the problem you describe is known as the Correlation Clustering Problem. This information should help you find some implementations, such as:

Note some community detection algorithms have also been modified in order to process signed networks, e.g. Amelio'13, Sharma'12, Anchuri'12, etc. However, I couldn't find any publicly available implementation.

Vincent Labatut
  • 599
  • 1
  • 6
  • 14
0

Take a look at this code, it is quite scalable, works with positive and negative edges, and solves Correlation Clustering (CC) as a special case (r = 0). However, for the case of CC (maximizing positive links and minimizing negative links inside clusters), I would suggest other methods that are specialized in solving this objective.

To illustrate, Correlation Clustering (unlike what Community Detection literature pursues) does not take the positive density of clusters into account, so when a network has no or few negative ties (most real-world cases), all the network is put into one big cluster.

Esmailian
  • 556
  • 2
  • 6
0

We sometimes call these graphs 'signed graphs' or 'signed networks' as links can be positive and negative (or even weighted, as in your case), where a negative link means the nodes are anti-correlated and thus should be repulsed iso attracted. In a social network setting for example a negative link could model a situation of two users that are enemies.

Take a look at Traag's Leidenalg, who integrated repulsion between negative ties in various graph clustering (community detection) algorithms, all code available through the python leidenalg module, easy to use.

https://leidenalg.readthedocs.io/en/stable/multiplex.html#negative-links https://www.nature.com/articles/s41598-019-41695-z

some other useful refs (using the algorithm)