Efficient Identification of Overlapping Communities
- In IEEE International Conference on Intelligence and Security Informatics (ISI), 2005
Cited by 31 (17 self)
In this paper, we present an efficient algorithm for finding overlapping communities in social networks. Our algorithm does not rely on the contents of the messages and uses the communication graph only.
An Algorithm to Find Overlapping Community Structure in Networks
Cited by 20 (2 self)
Recent years have seen the development of many graph clustering algorithms, which can identify community structure in networks. The vast majority of these only find disjoint communities, but in many real-world networks communities overlap to some extent. We present a new algorithm for discovering overlapping communities in networks, by extending Girvan and Newman’s well-known algorithm based on the betweenness centrality measure. Like the original algorithm, ours performs hierarchical clustering (partitioning a network into any desired number of clusters) but allows the clusters to overlap. Experiments confirm good performance on randomly generated networks based on a known overlapping community structure, and interesting results have also been obtained on a range of real-world networks.
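As background for the abstract above, the base step of Girvan and Newman's original (non-overlapping) algorithm is to repeatedly remove the edge with the highest edge betweenness. The following is a minimal pure-Python sketch of that base step only, for a simple unweighted, undirected graph; it is not the overlapping extension the paper proposes. The function names, the adjacency-dict input format, and the toy graph are assumptions made for illustration.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for a simple unweighted, undirected graph.

    adj maps each node to the set of its neighbours (assumed input format).
    """
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered node pair was counted once from each endpoint.
    return {edge: total / 2.0 for edge, total in score.items()}


def remove_top_edge(adj):
    """Remove (in place) the edge with the highest betweenness and return it."""
    scores = edge_betweenness(adj)
    top = max(scores, key=scores.get)
    u, v = tuple(top)
    adj[u].discard(v)
    adj[v].discard(u)
    return top


def connected_components(adj):
    """Return the connected components of the graph as a list of node sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts


# Toy graph (an assumption for illustration): two triangles joined by the bridge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
removed = remove_top_edge(graph)
print(sorted(removed), connected_components(graph))
```

In the toy graph every shortest path between the two triangles crosses the bridge, so the bridge is removed first and the graph splits into the two triangles. Repeating the step yields a hierarchy of disjoint clusters; allowing clusters to overlap requires the additional machinery described in the paper.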
Cluster ranking with an application to mining mailbox networks
- In ICDM ’06: Proceedings of the Sixth International Conference on Data Mining, 2006
Cited by 4 (0 self)
We initiate the study of a new clustering framework, called cluster ranking. Rather than simply partitioning a network into clusters, a cluster ranking algorithm also orders the clusters by their strength. To this end, we introduce a novel strength measure for clusters, the integrated cohesion, which is applicable to arbitrary weighted networks. We then present C-Rank: a new cluster ranking algorithm. Given a network with arbitrary pairwise similarity weights, C-Rank creates a list of overlapping clusters and ranks them by their integrated cohesion. We provide extensive theoretical and empirical analysis of C-Rank and show that it is likely to have high precision and recall. A main component of C-Rank is a heuristic algorithm for finding sparse vertex separators. At the core of this algorithm is a new connection between the well-known measure of vertex betweenness and multicommodity flow. Our experiments focus on mining mailbox networks. A mailbox network is an egocentric social network, consisting of contacts with whom an individual exchanges email. Ties among contacts are represented by the frequency of their co-occurrence on message headers. C-Rank is well suited to mining such networks, since they abound with overlapping communities of highly variable strengths. We demonstrate the effectiveness of C-Rank on the Enron data set, consisting of 130 mailbox networks.
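The mailbox-network construction described above (ties weighted by how often two contacts co-occur on message headers) can be sketched as follows. This is a minimal illustration, not the paper's preprocessing: using raw co-occurrence counts as weights, the function name `mailbox_network`, and the sample headers are all assumptions.

```python
from collections import Counter
from itertools import combinations


def mailbox_network(headers, owner):
    """Build co-occurrence edge weights for one mailbox.

    headers: iterable of address lists, one list per message header
             (hypothetical input format).
    owner:   the mailbox owner, excluded because the network is egocentric.
    Returns a Counter mapping sorted contact pairs to co-occurrence counts.
    """
    weights = Counter()
    for addresses in headers:
        contacts = sorted(set(addresses) - {owner})
        for a, b in combinations(contacts, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical sample data: three message headers from one mailbox.
sample_headers = [
    ["me", "ann", "bob"],
    ["me", "ann", "bob", "cy"],
    ["me", "cy"],
]
network = mailbox_network(sample_headers, owner="me")
for pair, weight in sorted(network.items()):
    print(pair, weight)
```

The resulting weighted edge list is the kind of input a cluster ranking algorithm would take; any normalization of the counts is left open here because the abstract does not specify one.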
Finding Overlapping Communities in Social Networks
Cited by 3 (1 self)
Increasingly, methods to identify community structure in networks have been proposed which allow groups to overlap. These methods have taken a variety of forms, resulting in a lack of consensus as to what characteristics overlapping communities should have. Furthermore, overlapping community detection algorithms have been justified using intuitive arguments, rather than quantitative observations. This lack of consensus and empirical justification has limited the adoption of methods which identify overlapping communities. In this text, we distil from previous literature a minimal set of axioms which overlapping communities should satisfy. Additionally, we modify a previously published algorithm, Iterative Scan, to ensure that these properties are met. By analyzing the community structure of a large blog network, we present both structural and attribute-based verification that overlapping communities naturally and frequently occur. Keywords: social network analysis, community detection, overlapping groups
Fast Overlapping Clustering of Networks Using Sampled Spectral Distance Embedding and GMMs
Cited by 3 (1 self)
Clustering social networks is vital to understanding online interactions and influence. This task becomes more difficult when communities overlap, and when the social networks become extremely large. We present an efficient algorithm for constructing overlapping clusters, roughly linear in the size of the network. The algorithm first embeds the graph and then performs a metric clustering using a Gaussian Mixture Model (GMM). We evaluate the algorithm on the DBLP paper-paper network which consists of about 1 million nodes and over 30 million edges; we can cluster this network in under 20 minutes on a modest single CPU machine.
Spatial Scan Statistics for Graph Clustering
Cited by 2 (1 self)
In this paper, we present a measure associated with detection and inference of statistically anomalous clusters of a graph based on the likelihood test of observed and expected edges in a subgraph. This measure is adapted from spatial scan statistics for point sets and provides quantitative assessment for clusters. We discuss some important properties of this statistic and its relation to modularity and Bregman divergences. We apply a simple clustering algorithm to find clusters with large values of this measure in a variety of real-world data sets, and we illustrate its ability to identify statistically significant clusters of selected granularity. 1 Introduction. Numerous techniques have been proposed for identifying clusters in large networks, but it has proven difficult to ...
Overlapping Communities in Social Networks
Cited by 2 (2 self)
Identifying communities is essential for understanding the dynamics of a social network. The prevailing approach to the problem of community discovery is to partition the network into disjoint groups of members that exhibit a high degree of internal communication. This approach ignores the possibility that an individual may belong to two or more groups. Increasingly, researchers have begun to explore new methods which allow groups to overlap. One problem with existing approaches is that the definition of a community comes as the result of a particular algorithm. Such an approach to "defining" communities has been extended to overlapping communities with some success. Our goals in this paper are twofold: first, to present an axiomatic approach to defining overlapping communities in terms of the properties a group should satisfy to be a community; and second, to justify the existence of overlapping in the structure of social communities experimentally using LiveJournal Blog data. Historically, the justification for overlapping groups has been primarily intuitive rather than quantitative. We present a heuristic algorithm which outputs a collection of communities that satisfy the required minimal properties and demonstrate that, in real-life social networks, a large number of individuals are members of communities which have non-trivial overlap with other communities. Keywords: social network analysis, community detection, overlapping groups
SLPA: Uncovering Overlapping Communities in Social Networks via A Speaker-listener Interaction Dynamic Process
Cited by 2 (2 self)
Overlap is one of the characteristics of social networks, in which a person may belong to more than one social group. For this reason, discovering overlapping structures is necessary for realistic social analysis. In this paper, we present a novel, general framework to detect and analyze both individual overlapping nodes and entire communities. In this framework, nodes exchange labels according to dynamic interaction rules. A specific implementation called Speaker-listener Label Propagation Algorithm (SLPA) demonstrates excellent performance in identifying both overlapping nodes and overlapping communities with different degrees of diversity. Keywords: social network; overlapping community detection; label propagation; dynamic interaction; algorithm
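The label-exchange idea in the abstract above can be illustrated with a small speaker-listener sketch. The abstract does not spell out the interaction rules, so everything specific here is an assumption: each speaker sends a label drawn from its memory in proportion to how often it has stored that label, each listener stores the most common label it hears, and a node is assigned to every label whose share of its memory reaches a threshold. The function name `slpa` and its parameters are hypothetical.

```python
import random
from collections import Counter


def slpa(adj, iterations=20, threshold=0.1, seed=0):
    """Speaker-listener label propagation sketch.

    adj maps each node to the set of its neighbours (assumed input format).
    Returns a dict mapping each surviving label to the set of nodes holding it;
    a node that appears under several labels is an overlapping node.
    """
    rng = random.Random(seed)
    # Every node starts with a memory containing only its own label.
    memory = {v: Counter({v: 1}) for v in adj}
    nodes = list(adj)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            heard = Counter()
            for speaker in sorted(adj[listener]):
                labels = list(memory[speaker])
                counts = list(memory[speaker].values())
                # Assumed speaking rule: sample a label by memory frequency.
                heard[rng.choices(labels, weights=counts)[0]] += 1
            if heard:
                # Assumed listening rule: keep the most common label heard,
                # breaking ties at random.
                best = max(heard.values())
                winners = sorted(l for l, c in heard.items() if c == best)
                memory[listener][rng.choice(winners)] += 1
    # Post-processing: keep labels whose share of a node's memory is large enough.
    communities = {}
    for v, mem in memory.items():
        total = sum(mem.values())
        for label, count in mem.items():
            if count / total >= threshold:
                communities.setdefault(label, set()).add(v)
    return communities


# Toy graph (an assumption for illustration): two triangles sharing node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
for label, members in sorted(slpa(graph).items()):
    print(label, sorted(members))
```

Because the process is randomized, the communities found depend on the seed; fixing the seed makes a run reproducible. Raising the threshold toward 0.5 pushes the result toward disjoint communities, while lower values admit more overlap.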
Dynamic Network Evolution: Models, Clustering, Anomaly Detection
Cited by 1 (0 self)
Traditionally, research on graph theory focused on studying graphs that are static. However, almost all real networks are dynamic in nature and large in size. Quite recently, the topology, evolution, and applications of complex evolving networks, and the processes occurring in and governing them, have attracted attention from researchers. In this work, we review the significant contributions in the literature on complex evolving networks: metrics ranging from degree distribution to spectral graph analysis, real-world applications from biology to the social sciences, and problem domains from anomaly detection and dynamic graph clustering to community detection.
Separating Features from Noise with Persistence and Statistics
In this thesis, we explore techniques in statistics and persistent homology, which detect features among data sets such as graphs, triangulations and point clouds. We accompany our theorems with algorithms and experiments, to demonstrate their effectiveness in practice. We start with the derivation of graph scan statistics, a measure useful to assess the statistical significance of a subgraph in terms of edge density. We cluster graphs into densely connected subgraphs based on this measure. We give algorithms for finding such clusterings and experiment on real-world data. We next study statistics on persistence, for piecewise-linear functions defined on the triangulations of topological spaces. We derive persistence pairing probabilities among vertices in the triangulation. We also provide upper bounds for total persistence in expectation. We continue by examining the elevation function defined on the triangulation of a surface. Its local maxima obtained by persistence pairing are useful in describing features of the triangulations of protein surfaces. We describe an algorithm to compute these local maxima, with a run-time ten thousand times faster in practice than the previous method. We connect this improvement with the total Gaussian curvature of the surfaces.

