## Detecting highly overlapping community structure by greedy clique expansion (2010)

Citations: 33 (6 self)

### Citations

3607 |
Social Network Analysis: Methods and Applications. Cambridge Univ
- Wasserman, Faust
- 1994
Citation Context ...h refer to maximal cliques simply as cliques. This choice of seeds is motivated by the observation that, on the one hand, cliques are one of the characteristic structures contained within communities =-=[18]-=-, while on the other hand—if one discards smaller cliques that are highly embedded in larger cliques— they are rare structures. We note that other CAAs exploit these properties of cliques [9, 16, 19],... |

3316 |
Collective dynamics of “small-world” networks
- Watts, Strogatz
- 1998
Citation Context ...of triangles and cliques that have been observed in empirical graphs. Empirical networks show a strong tendency for transitivity, i.e., for two neighbors of a given node to be connected to each other =-=[20, 21]-=-, a process which leads to higher clustering coefficients and more cliques than one would expect to find in correspondingly sparse Erdős-Rényi graphs. Thus, the fact that GCE performs well on these s... |
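
The transitivity described in this excerpt is commonly quantified per node by the local clustering coefficient. The following is a minimal sketch, assuming an undirected graph stored as a dict of neighbour sets; the function name `local_clustering` and the toy graph `adj` are illustrative and do not come from the cited papers.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        # With fewer than two neighbours there are no pairs to close.
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle 0-1-2 plus a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
for node in adj:
    print(node, local_clustering(adj, node))
```

Nodes inside the triangle score high, while the pendant node scores zero, which is the kind of signal that makes triangles and cliques plausible indicators of community structure.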

1510 |
Community structure in social and biological networks. The National Academy of Sciences of the United
- Girvan, Newman
Citation Context ...e one hand, there is a lack of large empirical datasets where the a priori or ground truth communities are known; and on the other hand, most synthetic data— especially the most popular, the GN model =-=[2]-=-—is overly simplistic and unrealistic, lacking key topological features such as a heterogeneous degree distribution, varied community sizes, and triadic closure, while also requiring that every node b... |

1475 |
Finding and evaluating community structure in networks
- Newman, Girvan
Citation Context ...ructure is among the best, even when compared to non-overlapping CAAs, which specialize in this task. More specifically, GCE clearly outperforms the classic divisive GN algorithm of Newman and Girvan =-=[26]-=-, a similar divisive algorithm by Radicchi et al. [27], the EM method of Newman and Leicht [28], the Markov clustering algorithm (MCL) of Van Dongen [29], an information theoretic approach by Rosvall ... |

813 | Community detection in graphs
- Fortunato
- 2010
Citation Context ...algorithms (CAAs) have been suggested, as computer scientists and physicists have taken on the problem of algorithmically finding communities (for an excellent recent review of the field, see Fortunato =-=[1]-=-). ... |

600 |
Uncovering the overlapping community structure of complex networks in nature and society
- Palla, Derényi, et al.
- 2005
Citation Context .... Similarly, in complex networks of interactions between proteins, it has been claimed that many proteins belong to multiple communities, each of which in turn corresponds to some biological function =-=[8, 9]-=-. Since 2005, the year in which Palla et al. [arXiv:1002.1827v2 [physics.data-an] 15 Jun 2010] Figure 1: The ego-centric network of a Facebook user [7]. Note that this user belongs to se... |

586 | Fast unfolding of communities in large networks
- Blondel, Guillaume, et al.
- 2008
Citation Context ...d annealing by Guimera and Amaral [32] in all cases except where the graph size was small and the community size large. GCE performs similarly to the modularity maximizing algorithm of Blondel et al. =-=[33]-=-, which was among the CAAs that Lancichinetti and Fortunato identified as a top performer, and slightly worse than the other two top performers: another information theoretic algorithm from Rosvall an... |

353 |
Algorithm 457: finding all cliques of an undirected graph
- Bron, Kerbosch
- 1973
Citation Context ...l of the cliques in a graph is generally computationally expensive, cliques can be found quickly in graphs that are sufficiently sparse. To this end, our implementation makes use of the Bron-Kerbosch =-=[22]-=- clique enumeration algorithm to efficiently find the maximal cliques that form seeds. In the large synthetic and empirical networks that we analyze in section 4 and section 5, the computation require... |
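
To make the seed-finding step concrete, here is a minimal sketch of Bron-Kerbosch maximal clique enumeration with a pivot, filtered to cliques of at least `k` nodes. It is not the authors' implementation: the pivot rule (the vertex with the most neighbours among the remaining candidates), the dict-of-sets graph representation, and the names `maximal_cliques` and `adj` are assumptions of this sketch.

```python
def maximal_cliques(adj, k=3):
    """Yield maximal cliques with at least k nodes (Bron-Kerbosch with a pivot)."""

    def expand(r, p, x):
        # r: current clique, p: candidates that extend r, x: already-explored nodes.
        if not p and not x:
            if len(r) >= k:
                yield frozenset(r)
            return
        # Assumed pivot rule: the vertex of p | x with the most neighbours in p.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(sorted(sorted(c) for c in maximal_cliques(adj, k=3)))
```

With `k=3` the bridging edge 2-3 is discarded as a seed, which mirrors the role the minimum clique size plays in the excerpt.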

285 | Comparing community structure identification
- Danon, Diaz-Guilera, et al.
- 2005
Citation Context ...tiontheoretic similarity measure. This measure is normalized such that the NMI of two sets of communities is 1 if they are identical, and 0 if they are totally independent of each other. Danon et al. =-=[25]-=- first applied NMI to the problem of evaluating the similarity of two sets of communities, but defined the measure only for partitions. In our benchmarks, we employ a variant of NMI introduced by Lanc... |
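
For partitions, the normalization described in this excerpt can be sketched as twice the mutual information divided by the sum of the two entropies. The code below covers only this non-overlapping case; the overlapping variant mentioned at the end of the excerpt is not reproduced here. The function name `nmi` and the label-list input format are assumptions of this sketch.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information of two partitions given as label lists."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum(c / n * log(c / n) for c in counts.values())

    h_a, h_b = entropy(count_a), entropy(count_b)
    if h_a + h_b == 0:
        # Both partitions place every node in a single community.
        return 1.0
    mutual = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    return 2.0 * mutual / (h_a + h_b)


print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # same grouping, different label names
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))  # statistically independent groupings
```

The two calls illustrate the endpoints named in the excerpt: identical community assignments and fully independent ones.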

265 | Functional cartography of complex metabolic networks. Nature
- Guimera, Amaral
- 2005
Citation Context ...al algorithm by Donetti and Munoz [31]. Against other algorithms, results were mixed. GCE performed better than a method based on modularity optimization via simulated annealing by Guimera and Amaral =-=[32]-=- in all cases except where the graph size was small and the community size large. GCE performs similarly to the modularity maximizing algorithm of Blondel et al. [33], which was among the CAAs that La... |

240 |
Defining and identifying communities in networks
- Radicchi, Castellano, et al.
- 2004
Citation Context ...verlapping CAAs, which specialize in this task. More specifically, GCE clearly outperforms the classic divisive GN algorithm of Newman and Girvan [26], a similar divisive algorithm by Radicchi et al. =-=[27]-=-, the EM method of Newman and Leicht [28], the Markov clustering algorithm (MCL) of Van Dongen [29], an information theoretic approach by Rosvall and Bergstrom [30], and a spectral algorithm by Donett... |

239 | Maps of random walks on complex networks reveal community structure
- Rosvall, Bergstrom
- 2008
Citation Context ...visive algorithm by Radicchi et al. [27], the EM method of Newman and Leicht [28], the Markov clustering algorithm (MCL) of Van Dongen [29], an information theoretic approach by Rosvall and Bergstrom =-=[30]-=-, and a spectral algorithm by Donetti and Munoz [31]. Against other algorithms, results were mixed. GCE performed better than a method based on modularity optimization via simulated annealing by Guime... |

183 |
Community detection algorithms: A comparative analysis. arXiv:0908.1062
- Lancichinetti, Fortunato
- 2009
Citation Context ... specification (called LFR), they and others have subsequently discovered—with a level of subtlety previously unattained— under what topological conditions a wide range of CAAs perform well or poorly =-=[4, 5]-=-. One surprising result revealed by this recent benchmarking is the poor performance of many CAAs when it comes to detecting moderately overlapping community structure. It is intuitive from our knowle... |

147 | Detecting the overlapping and hierarchical community structure in complex networks
- Lancichinetti, Fortunato, et al.
- 2009
Citation Context ...ne cannot argue that any given fitness function will be appropriate for all types of network data. Nevertheless, in our experiments, we found that the fitness function defined by Lancichinetti et al. =-=[14]-=- provided good results on a wide range of synthetic and empirical data. They define the fitness of a community S in terms of S’s internal degree kSin and external degree kSout. kSin is equal to twice ... |
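
The excerpt is cut off before the formula. As the fitness function of Lancichinetti et al. is usually stated, with a resolution parameter written here as $\alpha$ (the superscript placement of $S$ is a notational choice of this note), it reads:

```latex
F_S = \frac{k^{S}_{\mathrm{in}}}{\left(k^{S}_{\mathrm{in}} + k^{S}_{\mathrm{out}}\right)^{\alpha}}
```

Here $k^{S}_{\mathrm{in}}$ is twice the number of edges with both endpoints in $S$, and $k^{S}_{\mathrm{out}}$ is the number of edges with exactly one endpoint in $S$. For example, a triangle attached to the rest of the graph by a single edge has $k^{S}_{\mathrm{in}} = 6$ and $k^{S}_{\mathrm{out}} = 1$, so $F_S = 6/7$ when $\alpha = 1$.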

137 | Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae
- Collins, Kemmeren, et al.
Citation Context ...d can thus be detected by CAAs. We use a set of known protein complexes as an approximate ground truth. To construct the PPI network, we used the interaction data found in the Combined-AP/MS network =-=[35]-=-, which contains 1622 proteins and 9070 interactions. For the ground truth communities, we used the complexes listed in the CYC dataset of known complexes, selecting only those complexes that were ... |

119 |
Finding local community structure in networks
- Clauset
- 2005
Citation Context ...revious work in community assignment suggests that, by utilizing a community fitness function such as in eq. (1), S can be efficiently expanded into C through a technique of greedy local optimization =-=[10, 14, 15, 17]-=-. This technique can be varied, but can generally be summarized in the following steps: 1. For each node v in the frontier of S (e.g., the red nodes in fig. 2), calculate v’s node fitness, i.e., how mu... |
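
The excerpt breaks off after the first step, so the following is only a sketch of one plausible greedy expansion loop, not the procedure as the authors define it. It assumes a fitness of the form k_in / (k_in + k_out)^alpha, adds the frontier node that yields the highest resulting fitness, and stops when no addition improves the fitness; the stopping rule, the tie-breaking, and the names `fitness`, `greedy_expand` and `adj` are all assumptions.

```python
def fitness(adj, s, alpha=1.0):
    """Assumed community fitness: k_in / (k_in + k_out) ** alpha."""
    k_in = sum(1 for u in s for w in adj[u] if w in s)  # each internal edge counted twice
    k_out = sum(1 for u in s for w in adj[u] if w not in s)
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def greedy_expand(adj, seed, alpha=1.0):
    """Grow a seed by repeatedly adding the frontier node that most improves fitness."""
    s = set(seed)
    while True:
        frontier = {w for u in s for w in adj[u]} - s
        if not frontier:
            return s
        current = fitness(adj, s, alpha)
        best = max(frontier, key=lambda v: fitness(adj, s | {v}, alpha))
        if fitness(adj, s | {best}, alpha) <= current:
            # Assumed stopping rule: no frontier node improves the fitness.
            return s
        s.add(best)


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(sorted(greedy_expand(adj, {0, 1})))
```

On the toy graph the seed {0, 1} absorbs node 2 and then stops, because crossing the bridge to node 3 would lower the fitness.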

98 |
Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities
- Lancichinetti, Fortunato
Citation Context ...serious limit of the field. Because of that, it is still impossible to state which method (or subset of methods) is the most reliable in applications...” In the last year, Lancichinetti and Fortunato =-=[3]-=- have addressed this uncertainty by specifying a means of creating more realistic synthetic benchmark graphs, which have scale-free degree and community size distributions as well as overlapping commu... |

89 |
Mixture models and exploratory analysis in networks
- Newman, Leicht
- 2007
Citation Context ... task. More specifically, GCE clearly outperforms the classic divisive GN algorithm of Newman and Girvan [26], a similar divisive algorithm by Radicchi et al. [27], the EM method of Newman and Leicht =-=[28]-=-, the Markov clustering algorithm (MCL) of Van Dongen [29], an information theoretic approach by Rosvall and Bergstrom [30], and a spectral algorithm by Donetti and Munoz [31]. Against other algorithm... |

77 | A new cluster algorithm for graphs.
- Dongen
- 1998
Citation Context ...sic divisive GN algorithm of Newman and Girvan [26], a similar divisive algorithm by Radicchi et al. [27], the EM method of Newman and Leicht [28], the Markov clustering algorithm (MCL) of Van Dongen =-=[29]-=-, an information theoretic approach by Rosvall and Bergstrom [30], and a spectral algorithm by Donetti and Munoz [31]. Against other algorithms, results were mixed. GCE performed better than a method ... |

77 |
Detecting network communities: a new systematic and efficient algorithm
- Donetti, Munoz
- 2004
Citation Context ...hod of Newman and Leicht [28], the Markov clustering algorithm (MCL) of Van Dongen [29], an information theoretic approach by Rosvall and Bergstrom [30], and a spectral algorithm by Donetti and Munoz =-=[31]-=-. Against other algorithms, results were mixed. GCE performed better than a method based on modularity optimization via simulated annealing by Guimera and Amaral [32] in all cases except where the gra... |

74 |
Finding overlapping communities in networks by label propagation
- Gregory
Citation Context ...hmarking of CAAs has been on graphs with non-overlapping communities. In particular, Lancichinetti and Fortunato [4] have benchmarked a wide variety of CAAs on a particular set of LFR graphs. Gregory =-=[5]-=- has recently followed suit and benchmarked more algorithms on this set of graphs, so we continue in this vein and benchmark GCE and a number of other CAAs to see how they perform on this emerging st... |

69 | An algorithm to find overlapping community structure - Gregory |

54 |
Detect overlapping and hierarchical community structure in networks
- Shen, Cheng, et al.
- 2008
Citation Context ...unities [18], while on the other hand—if one discards smaller cliques that are highly embedded in larger cliques— they are rare structures. We note that other CAAs exploit these properties of cliques =-=[9, 16, 19]-=-, but none of them utilize cliques as seeds in the greedy expansion strategy mentioned above. One of the key parameters of our algorithm, k, is the minimum number of nodes that a clique must contain i... |

39 | A local method for detecting communities
- Bagrow, Bollt, et al.
- 2005
Citation Context ...revious work in community assignment suggests that, by utilizing a community fitness function such as in eq. (1), S can be efficiently expanded into C through a technique of greedy local optimization =-=[10, 14, 15, 17]-=-. This technique can be varied, but can generally be summarized in the following steps: 1. For each node v in the frontier of S (e.g., the red nodes in fig. 2), calculate v’s node fitness, i.e., how mu... |

33 |
Finding communities by clustering a graph into overlapping subgraphs
- Baumes, Goldberg, et al.
- 2005
Citation Context ..., which utilizes a label propagation technique [5]; abchampions, which finds all regions of the graph with a certain difference between internal density and external sparsity [13]; and Iterative Scan =-=[15]-=-, which we have described in section 2. All implementations we used were from the authors. Just as GCE’s parameters were fixed at default values, we left the parameters of the other algorithms set to ... |

29 |
Multiresolution community detection for megascale networks by information-based replica correlations.
- Ronhovde, Nussinov
- 2009
Citation Context ... as a top performer, and slightly worse than the other two top performers: another information theoretic algorithm from Rosvall and Bergstrom [30], and a Potts model approach by Ronhovde and Nussinov =-=[34]-=-. 5. EMPIRICAL BENCHMARKS In this section we strive to demonstrate GCE’s ability to identify meaningful communities in the context of non-trivial empirical networks, for which ground-truths are availa... |

25 |
A sequential algorithm for fast clique percolation. Physical Review E
- Kumpula, Kivela, et al.
- 2008
Citation Context ...ies, we removed all complexes with fewer than four proteins from the ground truth. Consequently, we use the value of 3 as the minimum clique size for all clique-based algorithms. Note also that we use SCP =-=[36]-=- instead of CFinder here, as the latter fails to terminate on this dataset. The resulting ground truth contains 880 proteins; 136 of these belong to more than one complex. As the ground truth contains... |

14 | Finding overlapping communities using disjoint community detection algorithms. - Gregory - 2009 |

14 |
A scalable, parallel algorithm for maximal clique enumeration
- Schmidt, Samatova, et al.
- 2009
Citation Context ...me, compared to the computation required to expand seeds and check for near duplicates. To further support the claim that finding cliques in sparse graphs is scalable, we point out that Schmidt et al. =-=[23]-=- have recently introduced a parallel variant of the Bron-Kerbosch algorithm, which they demonstrate can achieve a linear parallel speedup even when using 2048 processors. Greedy Expansion. Greedy seed... |

8 |
Clustering in complex networks
- Serrano, Boguñá
- 2006
Citation Context ...of triangles and cliques that have been observed in empirical graphs. Empirical networks show a strong tendency for transitivity, i.e., for two neighbors of a given node to be connected to each other =-=[20, 21]-=-, a process which leads to higher clustering coefficients and more cliques than one would expect to find in correspondingly sparse Erdős-Rényi graphs. Thus, the fact that GCE performs well on these s... |

6 |
Detection of node group membership in networks with group overlap
- Sawardecker, Sales-Pardo, et al.
Citation Context .... Similarly, in complex networks of interactions between proteins, it has been claimed that many proteins belong to multiple communities, each of which in turn corresponds to some biological function =-=[8, 9]-=-. Since 2005, the year in which Palla et al. Figure 1: The ego-centric network of a Facebook user [7]. Note that this user belongs to se... |

4 | Detecting communities in networks by merging cliques
- Yan, Gregory
- 2009
Citation Context ...unities [18], while on the other hand—if one discards smaller cliques that are highly embedded in larger cliques— they are rare structures. We note that other CAAs exploit these properties of cliques =-=[9, 16, 19]-=-, but none of them utilize cliques as seeds in the greedy expansion strategy mentioned above. One of the key parameters of our algorithm, k, is the minimum number of nodes that a clique must contain i... |

2 |
Clustering social networks. Lecture notes in computer science
- Mishra, Schreiber, et al.
- 2007
Citation Context ...eeds randomly [14]; COPRA, which utilizes a label propagation technique [5]; abchampions, which finds all regions of the graph with a certain difference between internal density and external sparsity =-=[13]-=-; and Iterative Scan [15], which we have described in section 2. All implementations we used were from the authors. Just as GCE’s parameters were fixed at default values, we left the parameters of the... |

1 |
Facebook statistics
- Facebook Press Room
- 2009
Citation Context ...erlap, potentially to a high degree. Consider, for example, a social network site like Facebook. On average, a Facebook user has 130 “friends,” who typically belong to multiple distinct social groups =-=[6]-=-. These groups may correspond to ties formed in high-school, college, professional settings, and family. Figure 1, which depicts the ego-centric network of a Facebook user, demonstrates this tendency ... |