Time-varying graphs and dynamic networks
- International Journal of Parallel, Emergent and Distributed Systems
Cited by 61 (21 self)
The past few years have seen intensive research efforts carried out in some apparently unrelated areas of dynamic systems – delay-tolerant networks, opportunistic-mobility networks, social networks – obtaining closely related insights. Indeed, the concepts discovered in these investigations can be viewed as parts of the same conceptual universe; and the formal models proposed so far to express some specific concepts are components of a larger formal description of this universe. The main contribution of this paper is to integrate the vast collection of concepts, formalisms, and results found in the literature into a unified framework, which we call TVG (for time-varying graphs). Using this framework, it is possible to express directly in the same formalism not only the concepts common to all those different areas, but also those specific to each. Based on this definitional work, employing both existing results and original observations, we present a hierarchical classification of TVGs; each class corresponds to a significant property examined in the distributed computing literature. We then examine how TVGs can be used to study the evolution of network properties, and propose different techniques, depending on whether the indicators for these properties are a-temporal (as in the majority of existing studies) or temporal. Finally, we briefly discuss the introduction of randomness in TVGs.
Uncovering the temporal dynamics of diffusion networks
- in Proc. of the 28th Int. Conf. on Machine Learning (ICML’11)
, 2011
Cited by 56 (11 self)
Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected – but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferring the underlying dynamics is of outstanding interest since it enables forecasting, influencing and retarding infections, broadly construed. To this end, we model diffusion processes as discrete networks of continuous temporal processes occurring at different rates. Given cascade data – observed infection times of nodes – we infer the edges of the global diffusion network and estimate the transmission rates of each edge that best explain the observed data. The optimization problem is convex. The model naturally (without heuristics) imposes sparse solutions and requires no parameter tuning. The problem decouples into a collection of independent smaller problems, thus scaling easily to networks on the order of hundreds of thousands of nodes. Experiments on real and synthetic data show that our algorithm both recovers the edges of diffusion networks and accurately estimates their transmission rates from cascade data.
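To make the inference objective more concrete, here is a minimal sketch of a per-cascade log-likelihood. It assumes an exponential transmission model with one rate per directed edge, which the abstract does not state; the function name `cascade_log_likelihood`, the `horizon` parameter, and the toy data are hypothetical.

```python
import math
from typing import Dict, Tuple


def cascade_log_likelihood(
    infection_times: Dict[str, float],
    rates: Dict[Tuple[str, str], float],
    horizon: float,
) -> float:
    """Log-likelihood of one cascade, ASSUMING exponential transmission times.

    infection_times maps each infected node to its time in [0, horizon];
    nodes absent from the dict were not infected during the window.
    rates maps a directed edge (src, dst) to a non-negative rate.
    """
    if not infection_times:
        return 0.0
    log_lik = 0.0
    hazard = {node: 0.0 for node in infection_times}
    for (src, dst), rate in rates.items():
        t_src = infection_times.get(src)
        if t_src is None:
            continue  # an uninfected source cannot transmit
        t_dst = infection_times.get(dst)
        if t_dst is None:
            # dst survived exposure from src until the end of the window
            log_lik -= rate * (horizon - t_src)
        elif t_dst > t_src:
            # dst survived exposure from src until its own infection time
            log_lik -= rate * (t_dst - t_src)
            hazard[dst] += rate  # src is a candidate parent of dst
    first = min(infection_times.values())
    for node, t in infection_times.items():
        if t == first:
            continue  # the earliest node(s) are treated as cascade sources
        if hazard[node] <= 0.0:
            return float("-inf")  # infection with no possible parent
        log_lik += math.log(hazard[node])
    return log_lik


# Toy cascade: a infects b at time 1.0, c stays uninfected until the horizon.
toy_times = {"a": 0.0, "b": 1.0}
toy_rates = {("a", "b"): 2.0, ("a", "c"): 0.5, ("b", "c"): 0.5}
toy_value = cascade_log_likelihood(toy_times, toy_rates, horizon=3.0)
```

Under this assumed model the survival terms are linear in the rates and the hazard term is the logarithm of a sum of rates, so the log-likelihood is concave in the rates, which is consistent with the convexity claim above.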
Streaming graph partitioning for large distributed graphs
Cited by 52 (2 self)
Extracting knowledge by performing computations on graphs is becoming increasingly challenging as graphs grow in size. A standard approach distributes the graph over a cluster of nodes, but performing computations on a distributed graph is expensive if large amounts of data have to be moved. Without partitioning the graph, communication quickly becomes a limiting factor in scaling the system up. Existing graph partitioning heuristics incur high computation and communication cost on large graphs, sometimes as high as the future computation itself. Observing that the graph has to be loaded into the cluster, we ask if the partitioning can be done at the same time with a lightweight streaming algorithm. We propose natural, simple heuristics and compare their performance to hashing and METIS, a fast, offline heuristic. We show on a large collection of graph datasets that our heuristics are a significant improvement, with the best obtaining an average gain of 76%. The heuristics are scalable in the size of the graphs and the number of partitions. Using our streaming partitioning methods, we are able to speed up PageRank computations on Spark [32], a distributed computation system, by 18% to 39% for large social networks.
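The abstract does not spell out the heuristics, so the following is only a sketch of one plausible one-pass rule: each arriving vertex goes to the partition that already holds most of its neighbors, discounted by how full that partition is. The scoring rule, the function name `stream_partition`, and the `capacity` parameter are assumptions for illustration.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


def stream_partition(
    stream: Iterable[Tuple[int, Sequence[int]]],
    num_parts: int,
    capacity: int,
) -> Dict[int, int]:
    """Assign each streamed vertex to a partition in a single pass.

    stream yields (vertex, neighbor_list) pairs in arrival order.
    The score used here (neighbors already placed, weighted by remaining
    capacity) is an assumed heuristic, not taken from the abstract.
    """
    assignment: Dict[int, int] = {}
    sizes: List[int] = [0] * num_parts
    for vertex, neighbors in stream:
        # Count how many already-placed neighbors sit in each partition.
        placed = [0] * num_parts
        for neighbor in neighbors:
            part = assignment.get(neighbor)
            if part is not None:
                placed[part] += 1
        best_part: Optional[int] = None
        best_key: Optional[Tuple[float, int]] = None
        for part in range(num_parts):
            if sizes[part] >= capacity:
                continue  # never exceed the per-partition capacity
            # Ties on the score are broken in favor of the emptier partition.
            key = (placed[part] * (1.0 - sizes[part] / capacity), -sizes[part])
            if best_key is None or key > best_key:
                best_part, best_key = part, key
        if best_part is None:
            raise ValueError("all partitions are full")
        assignment[vertex] = best_part
        sizes[best_part] += 1
    return assignment


# Two triangles joined by the single edge 2-3, streamed in vertex order.
toy_stream = [
    (0, [1, 2]), (1, [0, 2]), (2, [0, 1, 3]),
    (3, [2, 4, 5]), (4, [3, 5]), (5, [3, 4]),
]
toy_assignment = stream_partition(toy_stream, num_parts=2, capacity=3)
```

A hashing baseline would ignore the neighbor lists entirely; the point of a rule like this one is that it uses only information available at load time, so partitioning adds no extra pass over the graph.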
Multiplicative Attribute Graph Model of Real-World Networks
, 1009
Cited by 46 (4 self)
Large-scale real-world network data, such as social networks, Internet and Web graphs, are ubiquitous. The study of such social and information networks seeks to find patterns and explain their emergence through tractable models. In most networks, especially in social networks, nodes have a rich set of attributes (e.g., age, gender) associated with them. However, many existing network models focus on modeling the network structure while ignoring the features of the nodes. Here we present a model, which we refer to as the Multiplicative Attribute Graph (MAG) model, that naturally captures the interactions between the network structure and node attributes. We consider a model where each node has a vector of categorical latent attributes associated with it. The probability of an edge between a pair of nodes then depends on the product of individual attribute-attribute similarities. This model lends itself to mathematical analysis and we derive thresholds for the connectivity and the emergence of the giant connected component, and show that the model gives rise to graphs with a constant diameter. We analyze the degree distribution to show that the model can produce networks with either log-normal or power-law degree distribution depending on certain conditions.
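The product rule for edge probabilities can be illustrated with a short sketch. The binary attributes, the per-attribute affinity matrices, and the function name `mag_edge_probability` are assumptions made for the example; the abstract only states that the probability depends on a product of attribute-attribute similarities.

```python
from typing import Sequence


def mag_edge_probability(
    attrs_u: Sequence[int],
    attrs_v: Sequence[int],
    affinities: Sequence[Sequence[Sequence[float]]],
) -> float:
    """Multiply one affinity per attribute: affinities[k][a_u][a_v].

    attrs_u and attrs_v hold the categorical attribute values of the two
    nodes; affinities[k] is the (assumed) affinity matrix of attribute k.
    """
    if not (len(attrs_u) == len(attrs_v) == len(affinities)):
        raise ValueError("attribute vectors and affinity list must align")
    probability = 1.0
    for a_u, a_v, theta in zip(attrs_u, attrs_v, affinities):
        probability *= theta[a_u][a_v]
    return probability


# Hypothetical affinity matrix, reused for both attributes.
toy_theta = [[0.9, 0.4], [0.4, 0.2]]
toy_probability = mag_edge_probability([0, 1], [0, 0], [toy_theta, toy_theta])
```

In this toy case the two nodes agree on the first attribute (affinity 0.9) and differ on the second (affinity 0.4), so the edge probability is the product of those two entries.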
The Network Completion Problem: Inferring Missing Nodes and Edges in Networks
Cited by 31 (4 self)
While social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete, with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM algorithm with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state-of-the-art stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.
Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+
, 2012
Cited by 23 (6 self)
Understanding social network structure and evolution has important implications for many aspects of network and system design including provisioning, bootstrapping trust and reputation systems via social networks, and defenses against Sybil attacks. Several recent results suggest that augmenting the social network structure with user attributes (e.g., location, employer, communities of interest) can provide a more fine-grained understanding of social networks. However, there have been few studies to provide a systematic understanding of these effects at scale. We bridge this gap using a unique dataset collected as the Google+ social network grew over time since its release in late June 2011. We observe novel phenomena with respect to both standard social network metrics and new attribute-related metrics (that we define). We also observe interesting evolutionary patterns as Google+ went from a bootstrap phase to a steady invitation-only stage before a public release. Based on our empirical observations, we develop a new generative model to jointly reproduce the social structure and the node attributes. Using theoretical analysis and empirical evaluations, we show that our model can accurately reproduce the social and attribute structure of real social networks. We also demonstrate that our model provides more accurate predictions for practical application contexts.
Mizan: A system for dynamic load balancing in large-scale graph processing
- In EuroSys ’13
, 2013
Cited by 21 (0 self)
Pregel [23] was recently introduced as a scalable graph mining system that can provide significant performance improvements over traditional MapReduce implementations. Existing implementations focus primarily on graph partitioning as a preprocessing step to balance computation across compute nodes. In this paper, we examine the runtime characteristics of a Pregel system. We show that graph partitioning alone is insufficient for minimizing end-to-end computation. Especially where data is very large or the runtime behavior of the algorithm is unknown, an adaptive approach is needed. To this end, we introduce Mizan, a Pregel system that achieves efficient load balancing to better adapt to changes in computing needs. Unlike known implementations of Pregel, Mizan does not assume any a priori knowledge of the structure of the graph or behavior of the algorithm. Instead, it monitors the runtime characteristics of the system. Mizan then performs efficient fine-grained vertex migration to balance computation and communication. We have fully implemented Mizan; using extensive evaluation we show that, especially for highly dynamic workloads, Mizan provides up to 84% improvement over techniques leveraging static graph pre-partitioning.
Learning Networks of Heterogeneous Influence
- In NIPS, 2012
Cited by 19 (7 self)
Information, disease, and influence diffuse over networks of entities in both natural systems and human society. Analyzing these transmission networks plays an important role in understanding the diffusion processes and predicting future events. However, the underlying transmission networks are often hidden and incomplete, and we observe only the time stamps when cascades of events happen. In this paper, we address the challenging problem of uncovering the hidden network only from the cascades. The structure discovery problem is complicated by the fact that the influence between networked entities is heterogeneous, which cannot be described by a simple parametric model. Therefore, we propose a kernel-based method which can capture a diverse range of different types of influence without any prior assumption. In both synthetic and real cascade data, we show that our model can better recover the underlying diffusion network and drastically improve the estimation of the transmission functions among networked entities.
Modeling social networks with node attributes using the multiplicative attribute graph model
- In UAI
, 2011
Cited by 18 (3 self)
Networks arising from social, technological and natural domains exhibit rich connectivity patterns and nodes in such networks are often labeled with attributes or features. We address the question of modeling the structure of networks where nodes have attribute information. We present a Multiplicative Attribute Graph (MAG) model that considers nodes with categorical attributes and models the probability of an edge as the product of individual attribute link formation affinities. We develop a scalable variational expectation maximization parameter estimation method. Experiments show that the MAG model reliably captures network connectivity and provides insights into how different attributes shape the network structure.
Latent multi-group membership graph model
Cited by 15 (3 self)
We develop the Latent Multi-group Membership Graph (LMMG) model, a model of networks with rich node feature structure. In the LMMG model, each node belongs to multiple groups and each latent group models the occurrence of links as well as the node feature structure. The LMMG can be used to summarize the network structure, to predict links between the nodes, and to predict missing features of a node. We derive efficient inference and learning algorithms and evaluate the predictive performance of the LMMG on several social and document network datasets.