Statistical properties of community structure in large social and information networks
Cited by 246 (14 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
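The conductance measure named in this abstract is commonly defined, for a node set S, as the number of edges leaving S divided by the smaller of the total degree inside S and the total degree outside it. The sketch below is a minimal illustration under that standard definition, not the authors' code: the adjacency-dict representation and the names `conductance` and `ncp_bruteforce` are assumptions, and the exhaustive search merely stands in for the approximation algorithms used in practice, so it is only usable on toy graphs.

```python
from itertools import combinations


def conductance(adj, nodes):
    """Conductance of `nodes` in an undirected graph given as {node: set(neighbors)}."""
    s = set(nodes)
    # Edges with exactly one endpoint inside the set.
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    # Volume = sum of degrees on each side of the split.
    vol_s = sum(len(adj[u]) for u in s)
    vol_rest = sum(len(adj[u]) for u in adj if u not in s)
    denom = min(vol_s, vol_rest)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / denom


def ncp_bruteforce(adj, max_size):
    """Lowest conductance over all node sets of each size k = 1..max_size (toy graphs only)."""
    return {
        k: min(conductance(adj, subset) for subset in combinations(adj, k))
        for k in range(1, max_size + 1)
    }


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
profile = ncp_bruteforce(toy, max_size=3)
```

On this toy graph each triangle is a lowest-conductance set of size 3, with one cut edge over a volume of 7. Plotting `profile` against set size gives a miniature version of a community profile plot.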
Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
, 2008
Cited by 208 (17 self)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as “real” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales. We study over 100 large real-world networks, ranging from traditional and online social networks, to technological and information networks and
Multiplicative Attribute Graph Model of Real-World Networks
, 1009
Cited by 46 (4 self)
Large-scale real-world network data, such as social networks, Internet and Web graphs, are ubiquitous. The study of such social and information networks seeks to find patterns and explain their emergence through tractable models. In most networks, especially in social networks, nodes have a rich set of attributes (e.g., age, gender) associated with them. However, many existing network models focus on modeling the network structure while ignoring the features of the nodes. Here we present a model that we refer to as the Multiplicative Attribute Graphs (MAG), which naturally captures the interactions between the network structure and node attributes. We consider a model where each node has a vector of categorical latent attributes associated with it. The probability of an edge between a pair of nodes then depends on the product of individual attribute-attribute similarities. This model lends itself to mathematical analysis and we derive thresholds for the connectivity and the emergence of the giant connected component, and show that the model gives rise to graphs with a constant diameter. We analyze the degree distribution to show that the model can produce networks with either log-normal or power-law degree distribution depending on certain conditions.
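The edge rule described in this abstract, a product of per-attribute similarities, can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: attributes are given as integer categories, each attribute has a hypothetical similarity table `theta[a_u][a_v]` with entries in [0, 1], and edges are sampled independently and treated as undirected. The names `edge_probability` and `sample_mag_graph` are made up for the example.

```python
import random


def edge_probability(attrs_u, attrs_v, affinities):
    """Product over attributes of the similarity between the two nodes' categories."""
    p = 1.0
    for a_u, a_v, theta in zip(attrs_u, attrs_v, affinities):
        p *= theta[a_u][a_v]
    return p


def sample_mag_graph(attrs, affinities, seed=0):
    """Sample an undirected graph; each pair is linked independently (assumption)."""
    rng = random.Random(seed)
    n = len(attrs)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < edge_probability(attrs[u], attrs[v], affinities):
                edges.append((u, v))
    return edges


# Hypothetical setup: two binary attributes sharing one symmetric similarity table.
theta = [[0.9, 0.2],
         [0.2, 0.5]]
affinities = [theta, theta]
attrs = [(0, 0), (0, 1), (1, 0), (1, 1)]
edges = sample_mag_graph(attrs, affinities)
```

Because the probability is a product, a single dissimilar attribute can sharply reduce the chance of a link even when all other attributes match.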
Dynamics of Large Networks
, 2008
Cited by 33 (0 self)
A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found very interesting and counterintuitive patterns for time-evolving networks, which change some of the basic assumptions that were made in the past. We then develop models that explain processes which govern the network evolution, fit such models to real networks, and use them to generate realistic graphs or give formal explanations about their properties. In addition, our work has a wide range of applications: it can help us spot anomalous graphs and outliers, forecast future graph structure and run simulations of network evolution. Another important aspect of our research is the study of “local” patterns and structures of propagation in networks. We aim to identify building blocks of the networks and find the patterns of influence that these blocks have on information or virus propagation over the network. Our recent work included the study of the spread of influence in a large person-to-person
Random dot product graph models for social networks
 OF LECTURE NOTES IN COMPUTER SCIENCE
, 2007
Cited by 26 (2 self)
Inspired by the recent interest in combining geometry with random graph models, we explore in this paper two generalizations of the random dot product graph model proposed by Kraetzl, Nickel and Scheinerman, and Tucker [1, 2]. In particular, we consider the properties of clustering, diameter and degree distribution with respect to these models. Additionally, we explore the conductance of these models and show that, in a geometric sense, the conductance is constant.
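In the basic random dot product graph model that this paper generalizes, each node carries a latent vector and two nodes are joined with probability equal to the dot product of their vectors. A minimal sketch of that base model (not of the two generalizations studied in the paper) follows; the function names, the fixed seed, and the way vectors are drawn are assumptions, chosen so that every dot product stays in [0, 1].

```python
import math
import random


def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def sample_rdpg(n, d, seed=0):
    """Draw n latent vectors in d dimensions, then link pairs with probability x_u . x_v."""
    rng = random.Random(seed)
    # Entries in [0, 1/sqrt(d)] keep every dot product between 0 and 1 (assumed sampling scheme).
    bound = 1.0 / math.sqrt(d)
    vectors = [[rng.uniform(0.0, bound) for _ in range(d)] for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < dot(vectors[u], vectors[v]):
                edges.append((u, v))
    return vectors, edges


vectors, edges = sample_rdpg(n=20, d=3)
```

Nodes whose vectors point in similar directions and have large norms are the most likely to be linked, which is the geometric intuition the model builds on.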
Editorial: The future of power law research
 Internet Mathematics
Cited by 24 (1 self)
Abstract. I argue that power law research must move from focusing on observation, interpretation, and modeling of power law behavior to instead considering the challenging problems of validation of models and control of systems.
1. The Problem with Power Law Research
To begin, I would like to recall a humorous insight from the paper of Fabrikant, Koutsoupias, and Papadimitriou [Fabrikant et al. 01], consisting of this quote and the following footnote. “Power laws... have been termed ‘the signature of human activity’...”¹ The study of power laws, especially in networks, has clearly exploded over the last decade, with seemingly innumerable papers and even popular books, such as Barabási’s Linked [Barabási 02] and Watts’ Six Degrees [Watts 03]. Power laws are, indeed, everywhere. Despite this remarkable success, I believe that research into power laws in computer networks (and networks more generally) suffers from glaring deficiencies that need to be addressed by the community. Coping with these deficiencies should lead to another great burst of exciting and compelling research. To explain the problem, I would like to make an analogy to the area of string theory. String theory is incredibly rich and beautiful mathematically, with a simple and compelling basic starting assumption: the universe’s building blocks do not really correspond to (zero-dimensional) points, but to small
¹ “They are certainly the product of one particular kind of human activity: looking for power laws...” [Fabrikant et al. 01]
Expansion and lack thereof in randomly perturbed graphs
 Internet Math
Cited by 16 (3 self)
Developing models of complex networks has been a major industry in the fields of physics, mathematics, and computer science during the last decade. Empirical study of numerous large networks harvested from the real world has revealed that, unlike the classical models of random graphs developed
A spatial web graph model with local influence regions
 Internet Mathematics
Cited by 16 (11 self)
Abstract. We present a new stochastic model for complex networks, based on a spatial embedding of the nodes, called the Spatial Preferred Attachment (SPA) model. In the SPA model, nodes have influence regions of varying size, and new nodes may only link to a node if they fall within its influence region. The spatial embedding of the nodes models the background knowledge or identity of the node, which will influence its link environment. In our model, nodes can determine their link environment based only on local knowledge of the network. We prove that our model gives a power law in-degree distribution, with exponent in [2, ∞) depending on the parameters, and with concentration for a wide range of in-degree values. We show that the model allows for edges that span a large distance in the underlying space, modelling a feature often observed in real-world complex networks.
Geometric denoising of protein-protein interaction networks
 PLoS Comput. Biol
, 2009
Cited by 14 (3 self)
Understanding complex networks of protein-protein interactions (PPIs) is one of the foremost challenges of the post-genomic era. Due to the recent advances in experimental biotechnology, including yeast-2-hybrid (Y2H), tandem affinity purification (TAP) and other high-throughput methods for protein-protein interaction (PPI) detection, huge amounts of PPI network data are becoming available. Of major concern, however, are the levels of noise and incompleteness. For example, for Y2H screens, it is thought that the false positive rate could be as high as 64%, and the false negative rate may range from 43% to 71%. TAP experiments are believed to have comparable levels of noise. We present a novel technique to assess the confidence levels of interactions in PPI networks obtained from experimental studies. We use it for predicting new interactions and thus for guiding future biological experiments. This technique is the first to utilize what is currently the best-fitting network model for PPI networks, geometric graphs. Our approach achieves specificity of 85% and sensitivity of 90%. We use it to assign confidence scores to physical protein-protein interactions in the human PPI network downloaded from BioGRID. Using our approach, we predict 251 interactions in the human PPI network, a statistically significant fraction of which correspond to protein pairs sharing common GO terms. Moreover, we validate a statistically significant portion of our predicted interactions in the HPRD database and the newer release of BioGRID. The data and Matlab code implementing the