Quantcast
Channel: V.A. Traag
Viewing all articles
Browse latest Browse all 10

Easy, flexible and fast framework for community detection.

$
0
0

\(\)
Good news! In an earlier post, I talked about my flexible implementation of the Louvain algorithm in Python. Now, it is also available in a C++ implementation, making it run much (really much) faster. Whereas the previous version already began to stagger for a thousand nodes, this implementation can easily be applied to graphs of millions of nodes (although you still might need to be patient when your graph is really that big). Even better, you can now simply install it using pip: sudo pip install louvain. For Windows users, you can refer to the binary installer at the PyPi repository. The source code is also available from GitHub.

Using it is really straightforward. To start, make sure to import the packages:

import louvain
import igraph as ig

We’ll create a random graph for testing purposes:

G = ig.Graph.Erdos_Renyi(100, 0.1);

For simply finding a partition use:

part = louvain.find_partition(G, method='Modularity');

Notice that part now contains an additional variable, part.quality which stores the quality of the partition as calculated by the used method. You can always get the quality of the partition using another method by calling

part.significance = louvain.quality(G, partition, method='Significance');

You can also find partitions for multiplex graphs. For each layer you then specify the objective function, and the overall objective function is simply the sum over all layers, weighted by some weight. If we denote by \(q_k\) the quality of layer \(k\) and the weight by \(w_k\), the overall quality is then \(q = \sum_k w_k q_k\). This can also be useful in case you have negative links . In principle, this could also be used to detect temporal communities in a dynamic setting .

For example, assuming you have a graph with positive weights G_positive and a graph with negative weights G_negative, and you want to use Modularity for finding a partition, you can use

membership, quality = louvain.find_partition_multiplex([
louvain.Layer(graph=G_positive, method='Modularity', layer_weight=1.0),
louvain.Layer(graph=G_negative, method='Modularity', layer_weight=-1.0)])

Notice the negative layer weight is -1.0 for the negative graph, since we want those edges to fall between communities rather than within. One particular problem when using negative links, is that the optimal community is no longer guaranteed to be connected (it may be a multipartite partition). You may therefore need the option consider_comms=ALL_COMMS to improve the quality of the partition. Notice that this runs much slower than only considering neighbouring communities (which is the default).

Various methods, such as the Potts model from , or CPM support a (linear) resolution parameter, which can be effectively bisected . You can do this by calling:

res_parts = louvain.bisect(G, method='CPM', resolution_range=[0,1]);

Notice this may take some time to run, as it effectively calls louvain.find_partition for various resolution parameters (depending on the
settings possibly hundreds of times).

Then res_parts is a dictionary containing as keys the resolution, and as values a NamedTuple with variables partition and bisect_value, which contains the partition and the value at which the resolution was bisected (the value of the bisect_func of the bisect function). You could for example plot the bisection value of all the found partitions by using:

import pandas as pd
import matplotlib.pyplot as plt
res_df = pd.DataFrame({
         'resolution': res_parts.keys(),
         'bisect_value': [bisect.bisect_value for bisect in res_parts.values()]});
plt.step(res_df['resolution'], res_df['bisect_value']);
plt.xscale('log');

The result should look something like this for a random graph.
bisect
















Mucha, Peter J, Thomas Richardson, Kevin Macon, Mason A Porter, and Jukka-Pekka Onnela. 2010. “Community Structure in Time-Dependent, Multiscale, and Multiplex Networks.” Science (New York, N.Y.) 328 (5980): 876–78. https://doi.org/10.1126/science.1184819. Cite
Reichardt, Jörg, and Stefan Bornholdt. 2006. “Statistical Mechanics of Community Detection.” Physical Review E 74 (1): 016110+. https://doi.org/10.1103/PhysRevE.74.016110. Cite
Traag, V. A., and Jeroen Bruggeman. 2009. “Community Detection in Networks with Positive and Negative Links.” Physical Review E 80 (3): 036115. https://doi.org/10.1103/PhysRevE.80.036115. Cite
Traag, V. A., Gautier Krings, and Paul Van Dooren. 2013. “Significant Scales in Community Structure.” Scientific Reports 3: 2930. https://doi.org/10.1038/srep02930. Cite
Traag, V. A., P. Van Dooren, and Y. Nesterov. 2011. “Narrow Scope for Resolution-Limit-Free Community Detection.” Physical Review E 84 (1): 016114. https://doi.org/10.1103/PhysRevE.84.016114. Cite

Share


Viewing all articles
Browse latest Browse all 10

Trending Articles