Results 1 - 2 of 2
Overlapping Community Detection Using Seed Set Expansion
Abstract

Cited by 8 (3 self)
Community detection is an important task in network analysis. A community (also referred to as a cluster) is a set of cohesive vertices that have more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex in a graph corresponds to an individual who usually participates in multiple communities. One of the most successful techniques for finding overlapping communities is based on local optimization and expansion of a community metric around a seed set of vertices. In this paper, we propose an efficient overlapping community detection algorithm using a seed set expansion approach. In particular, we develop new seeding strategies for a personalized PageRank scheme that optimizes the conductance community score. The key idea of our algorithm is to find good seeds, and then expand these seed sets using the personalized PageRank clustering procedure. Experimental results show that this seed set expansion approach outperforms other state-of-the-art overlapping community detection methods. We also show that our new seeding strategies are better than previous strategies, and are thus effective in finding good overlapping clusters in a graph.
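The expansion step described in this abstract can be illustrated with a minimal sketch. It assumes a plain power-iteration personalized PageRank followed by a sweep cut that keeps the prefix with the lowest conductance; the paper's seeding strategies and its actual PageRank procedure are not reproduced here. All function names, the restart parameter `alpha`, and the toy graph in the usage note are illustrative assumptions.

```python
from collections import defaultdict

def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

def conductance(adj, community):
    """cut(S) / min(vol(S), vol(V minus S)); lower means a tighter community."""
    inside = set(community)
    vol_s = sum(len(adj[u]) for u in inside)
    vol_rest = sum(len(nbrs) for nbrs in adj.values()) - vol_s
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    denom = min(vol_s, vol_rest)
    return cut / denom if denom > 0 else 1.0

def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration with restarts to the seed set (an assumed simplification)."""
    restart = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in adj}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {u: (1 - alpha) * restart[u] for u in adj}
        for u, nbrs in adj.items():
            share = alpha * rank[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        rank = nxt
    return rank

def expand_seed(adj, seeds, alpha=0.85):
    """Sweep cut: order vertices by degree-normalized score, keep the best prefix."""
    rank = personalized_pagerank(adj, set(seeds), alpha)
    order = sorted(adj, key=lambda u: rank[u] / len(adj[u]), reverse=True)
    best, best_phi = None, float("inf")
    for k in range(1, len(order)):  # skip the trivial prefix containing every vertex
        phi = conductance(adj, order[:k])
        if phi < best_phi:
            best, best_phi = set(order[:k]), phi
    return best, best_phi

# Toy graph (hypothetical): two triangles joined by the bridge edge (2, 3).
toy = build_graph([(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)])
community, phi = expand_seed(toy, seeds=[0])
```

On this toy graph, expanding from vertex 0 recovers the triangle {0, 1, 2}, whose single cut edge gives a conductance of 1/7. Running the expansion from several seeds and keeping each result is what allows the returned communities to overlap.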
Provably Fast Inference of Latent Features from Networks with Applications to Learning Social Circles and Multilabel Classification
Abstract
A well-known phenomenon in social networks is homophily, the tendency of agents to connect with similar agents. A derivative of this phenomenon is the emergence of communities. Another phenomenon observed in numerous networks is the existence of certain agents that belong simultaneously to multiple communities. An understanding of these phenomena constitutes a central research topic of network science. In this work we focus on a fundamental theoretical question related to the above phenomena with various applications: given an undirected graph G, can we efficiently infer the latent vertex features which explain the observed network structure under the assumption of a generative model that exhibits homophily? We propose a probabilistic generative model with the property that the probability of an edge between two vertices is a nondecreasing function of the common features they possess. This property holds for many real-world networks and, surprisingly, is ignored by many popular overlapping community detection methods, as shown recently by the empirical work of Yang and Leskovec [44]. Our main theoretical contribution is the first provably rapidly mixing Markov chain for inferring latent features. On the experimental side, we verify the efficiency of our method in terms of run times, where we observe that it significantly outperforms state-of-the-art methods. Our method is more than 2,400 times faster than a state-of-the-art machine learning method [37] and typically provides nontrivial speedups compared to BigClam [43]. Furthermore, we verify on real data with ground truth available that our method efficiently learns high-quality labelings. We use our method to learn social circles from Twitter ego-networks and perform multilabel classification.
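The homophily property stated in this abstract, that edge probability is nondecreasing in the number of shared features, can be made concrete with a small sampling sketch. The link function below, its parameters `base` and `per_feature`, and the example feature sets are hypothetical choices for illustration; the abstract does not specify the paper's actual model, and the Markov chain inference is not shown.

```python
import random
from itertools import combinations

def edge_probability(shared, base=0.02, per_feature=0.5):
    """Hypothetical link function, nondecreasing in the number of shared features."""
    return 1.0 - (1.0 - base) * (1.0 - per_feature) ** shared

def sample_graph(features, rng):
    """Draw each edge independently, based on how many features the endpoints share."""
    edges = []
    for u, v in combinations(sorted(features), 2):
        shared = len(features[u] & features[v])
        if rng.random() < edge_probability(shared):
            edges.append((u, v))
    return edges

# Hypothetical latent features: vertex "c" belongs to both groups.
features = {
    "a": {"chess"},
    "b": {"chess"},
    "c": {"chess", "rowing"},
    "d": {"rowing"},
    "e": {"rowing"},
}
edges = sample_graph(features, random.Random(0))
```

With these assumed parameters, two vertices with no common feature are linked with probability 0.02, one shared feature raises that to 0.51, and two shared features raise it to 0.755, so vertices with overlapping memberships are the most densely connected.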