Results 1–10 of 13
From "Think Like a Vertex " to "Think Like a Graph"
"... To meet the challenge of processing rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions, and employ a “think like a vertex ” programmin ..."
Abstract

Cited by 25 (0 self)
To meet the challenge of processing rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions and employ a “think like a vertex” programming model to support iterative graph computation. This vertex-centric model is easy to program and has proved useful for many graph algorithms. However, it hides the partitioning information from the users and thus prevents many algorithm-specific optimizations. This often results in longer execution time due to excessive network messages (e.g. in Pregel) or heavy scheduling overhead to ensure data consistency (e.g. in GraphLab). To address this limitation, we propose a new “think like a graph” programming paradigm. Under this graph-centric model, the partition structure is opened up to the users and can be utilized so that communication within a partition can bypass the heavy message-passing or scheduling machinery. We implemented this model in a new system, called Giraph++, based on Apache Giraph, an open-source implementation of Pregel. We explore the applicability of the graph-centric model to three categories of graph algorithms and demonstrate its flexibility and superior performance, especially on well-partitioned data. For example, on a web graph with 118 million vertices and 855 million edges, the graph-centric version of the connected component detection algorithm runs 63X faster and uses 204X fewer network messages than its vertex-centric counterpart.
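To make the contrast concrete, here is a minimal Python sketch of the graph-centric idea for connected components (illustrative only, with assumed data structures; it is not the Giraph++ API). Each partition converges internally within a superstep, so only boundary labels ever cross the network:

```python
# Sketch: graph-centric connected components. Each partition runs local
# label propagation to a fixed point before any cross-partition exchange.
def graph_centric_cc(partitions, boundary_edges):
    # partitions: list of {vertex: [neighbors inside the same partition]}
    # boundary_edges: (u, v) pairs whose endpoints lie in different partitions
    labels = {u: u for part in partitions for u in part}
    changed = True
    while changed:                          # one outer pass per "superstep"
        changed = False
        for part in partitions:             # purely local work, no messages
            stable = False
            while not stable:
                stable = True
                for u, nbrs in part.items():
                    m = min((labels[v] for v in nbrs), default=labels[u])
                    if m < labels[u]:
                        labels[u] = m
                        stable = False
        for u, v in boundary_edges:         # the only "network" traffic
            lo = min(labels[u], labels[v])
            if labels[u] != lo or labels[v] != lo:
                labels[u] = labels[v] = lo
                changed = True
    return labels
```

A vertex-centric version would instead propagate a label one hop per superstep, which is where the excess network messages come from on well-partitioned graphs.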
Fast Iterative Graph Computation: A Path Centric Approach
In SC, 2014
"... Abstract—Large scale graph processing represents an interesting systems challenge due to the lack of locality. This paper presents PathGraph, a system for improving iterative graph computation on graphs with billions of edges. Our system design has three unique features: First, we model a large gra ..."
Abstract

Cited by 4 (1 self)
Large-scale graph processing represents an interesting systems challenge due to the lack of locality. This paper presents PathGraph, a system for improving iterative graph computation on graphs with billions of edges. Our system design has three unique features. First, we model a large graph using a collection of tree-based partitions and use path-centric computation rather than vertex-centric or edge-centric computation. Our path-centric graph-parallel computation model significantly improves the memory and disk locality of iterative computation algorithms on large graphs. Second, we design a compact storage that is optimized for iterative graph-parallel computation. Concretely, we use delta-compression, partition a large graph into tree-based partitions, and store trees in DFS order. By clustering highly correlated paths together, we further maximize sequential access and minimize random access on storage media. Third but not least, we implement the path-centric computation model using a scatter/gather programming model, which parallelizes the iterative computation at the partition-tree level and performs sequential local updates for vertices in each tree partition to improve the convergence speed. We compare PathGraph to recent alternative graph processing systems such as GraphChi and X-Stream, and show that the path-centric approach outperforms vertex-centric and edge-centric systems on a number of graph algorithms for both in-memory and out-of-core graphs.
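As a rough illustration of the storage idea (an assumed layout, not PathGraph's actual format), the following Python sketch lays a tree partition out in DFS order so that an update sweep touches memory sequentially:

```python
# Sketch: store a rooted tree partition in DFS (preorder) sequence, then
# push values down every root-to-leaf path with one sequential pass.
def dfs_order(children, root):
    order, stack, seen = [], [root], {root}
    while stack:
        u = stack.pop()
        order.append(u)
        for v in reversed(children.get(u, [])):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return order

def sweep(children, order, value, update):
    for u in order:                      # parents precede their children,
        for v in children.get(u, []):    # so each path is updated in order
            value[v] = update(value[u], value[v])
```

Because every child appears after its parent in `order`, one forward pass propagates values along all paths in the partition, which hints at why a DFS layout favors sequential access.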
Maiter: An Asynchronous Graph Processing Framework for Delta-based Accumulative Iterative Computation
"... Myriad of graphbased algorithms in machine learning and data mining require parsing relational data iteratively. These algorithms are implemented in a largescale distributed environment in order to scale to massive data sets. To accelerate these largescale graphbased iterative computations, we ..."
Abstract

Cited by 2 (0 self)
Myriad graph-based algorithms in machine learning and data mining require processing relational data iteratively. These algorithms are implemented in a large-scale distributed environment in order to scale to massive data sets. To accelerate these large-scale graph-based iterative computations, we propose delta-based accumulative iterative computation (DAIC). Different from traditional iterative computations, which iteratively update the result based on the result from the previous iteration, DAIC updates the result by accumulating the “changes” between iterations. With DAIC, we can process only the “changes” and avoid negligible updates. Furthermore, we can perform DAIC asynchronously to bypass the high-cost synchronous barriers in heterogeneous distributed environments. Based on the DAIC model, we design and implement an asynchronous graph processing framework, Maiter. We evaluate Maiter on a local cluster as well as on the Amazon EC2 cloud. The results show that Maiter achieves as much as 60x speedup over Hadoop and outperforms other state-of-the-art frameworks.
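The delta-based idea is easiest to see on PageRank. Below is a synchronous toy version (Maiter itself runs asynchronously, and all names here are illustrative): each vertex accumulates only the change it receives and forwards the change onward, so near-zero deltas can simply be skipped:

```python
# Sketch: delta-based accumulative PageRank (DAIC style, synchronous toy).
def daic_pagerank(out_edges, d=0.85, eps=1e-9):
    # out_edges: {vertex: [out-neighbors]}; every vertex must appear as a key.
    value = {u: 0.0 for u in out_edges}
    delta = {u: 1.0 - d for u in out_edges}        # initial "change"
    while any(x > eps for x in delta.values()):
        new_delta = {u: 0.0 for u in out_edges}
        for u, nbrs in out_edges.items():
            if delta[u] <= eps:                    # negligible update: skip
                continue
            value[u] += delta[u]                   # accumulate the change
            if nbrs:
                share = d * delta[u] / len(nbrs)
                for v in nbrs:
                    new_delta[v] += share          # forward only the change
        delta = new_delta
    return value
```

Since the deltas here are non-negative and shrink geometrically, the loop terminates, and `value` approaches the fixed point without ever recomputing a full rank from scratch.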
One Trillion Edges: Graph Processing at Facebook-Scale
"... ABSTRACT Analyzing large graphs provides valuable insights for social networking and web companies in content ranking and recommendations. While numerous graph processing systems have been developed and evaluated on available benchmark graphs of up to 6.6B edges, they often face significant difficu ..."
Abstract
Analyzing large graphs provides valuable insights for social networking and web companies in content ranking and recommendations. While numerous graph processing systems have been developed and evaluated on available benchmark graphs of up to 6.6B edges, they often face significant difficulties in scaling to much larger graphs. Industry graphs can be two orders of magnitude larger: hundreds of billions or even one trillion edges. In addition to scalability challenges, real-world applications often require much more complex graph processing workflows than previously evaluated. In this paper, we describe the usability, performance, and scalability improvements we made to Apache Giraph, an open-source graph processing system, in order to use it on Facebook-scale graphs of up to one trillion edges. We also describe several key extensions to the original Pregel model that make it possible to develop a broader range of production graph applications and workflows as well as improve code reuse. Finally, we report on real-world operations as well as performance characteristics of several large-scale production applications.
Signal/Collect: Processing Large Graphs in Seconds
"... Both researchers and industry are confronted with the need to process increasingly large amounts of data, much of which has a natural graph representation. Some use MapReduce for scalable processing, but this abstraction is not designed for graphs and has shortcomings when it comes to both iterative ..."
Abstract
Both researchers and industry are confronted with the need to process increasingly large amounts of data, much of which has a natural graph representation. Some use MapReduce for scalable processing, but this abstraction is not designed for graphs and has shortcomings when it comes to both iterative and asynchronous processing, which are particularly important for graph algorithms. This paper presents the Signal/Collect programming model for scalable synchronous and asynchronous graph processing. We show that this abstraction can capture the essence of many algorithms on graphs in a concise and elegant way by giving Signal/Collect adaptations of algorithms that solve tasks as varied as clustering, inferencing, ranking, classification, constraint optimisation, and even query processing. Furthermore, we built and evaluated a parallel and distributed framework that executes algorithms in our programming model. We empirically show that our framework efficiently and scalably parallelises and distributes algorithms that are expressed in the programming model. We also show that asynchronicity can speed up execution times. Our framework can compute PageRank on a large (>1.4 billion vertices, >6.6 billion edges) real-world graph in 112 seconds on eight machines, which is competitive with other graph processing approaches.
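A toy synchronous rendition of the two primitives (assumed names; the actual framework is Scala-based) shows how PageRank fits the model: edges signal values outward, and vertices collect incoming signals into a new state:

```python
# Sketch: Signal/Collect-style PageRank, synchronous and single-threaded.
class Vertex:
    def __init__(self, vid, d=0.85):
        self.id, self.d = vid, d
        self.state = 1.0 - d
        self.out = []                    # ids of signal targets

    def signal(self):
        # Value sent along each outgoing edge.
        return self.d * self.state / max(len(self.out), 1)

    def collect(self, signals):
        # Fold incoming signals into the new state.
        self.state = (1.0 - self.d) + sum(signals)

def step(vertices):
    inbox = {v.id: [] for v in vertices}
    for v in vertices:                   # signal phase
        for target in v.out:
            inbox[target].append(v.signal())
    for v in vertices:                   # collect phase
        v.collect(inbox[v.id])
```

The asynchronous mode the paper evaluates would instead let vertices signal and collect as soon as their neighborhood changes, rather than in lockstep phases.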
Iterative Graph Computation in the Big Data Era
2015
"... Iterative graph computation is a key component in many realworld applications, as the graph data model naturally captures complex relationships between entities. The big data era has seen the rise of several new challenges to this classic computation model. In this dissertation we describe three p ..."
Abstract
Iterative graph computation is a key component in many real-world applications, as the graph data model naturally captures complex relationships between entities. The big data era has seen the rise of several new challenges to this classic computation model. In this dissertation we describe three projects that address different aspects of these challenges. First, because of the increasing volume of data, it is increasingly important to scale iterative graph computation to large graphs. We observe that an important class of graph applications that perform little computation per vertex scales poorly when running on multiple cores. These computationally light applications are limited by memory access rates and cannot fully utilize the benefits of multiple cores. We propose a new block-oriented computation model which creates two levels of iterative computation. On each processor, a small block of highly connected vertices is iterated locally, while the blocks are updated iteratively at the global level. We show that block-oriented execution reduces the communication-to-computation ratio and significantly improves the performance …
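Schematically (our reconstruction, not code from the dissertation), the two-level loop looks like this: each core sweeps its cache-sized block several times per global round, trading cheap local iterations for fewer global exchanges:

```python
# Sketch: two-level block-oriented iteration.
def two_level_iterate(blocks, local_sweep, exchange, inner=8, outer=50):
    # blocks: per-core mutable state; local_sweep(block) runs one local
    # update pass; exchange(blocks) reconciles values across block borders.
    for _ in range(outer):               # global rounds (synchronized)
        for block in blocks:             # conceptually one core per block
            for _ in range(inner):       # local iterations reuse hot cache
                local_sweep(block)
        exchange(blocks)                 # the only cross-core communication
```

With `inner` local sweeps per global round, the communication-to-computation ratio drops roughly by that factor, which is the effect the abstract describes.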
My Weak Consistency is Strong When Bad Things Do Not Come in Threes
"... ABSTRACT It is expensive to maintain strong data consistency during concurrent execution. However, weak consistency levels, which are considered harmful, have been widely applied in analytical jobs. Their success challenges our belief: data consistency, which is believed to be an essential to preci ..."
Abstract
It is expensive to maintain strong data consistency during concurrent execution. However, weak consistency levels, though often considered harmful, have been widely applied in analytical jobs. Their success challenges our belief: data consistency, which is believed to be essential to precise computing, does not always need to be preserved. In this paper, we tackle one of the core questions related to the application of weak consistency: when does weak consistency work well? We propose an effective explanation for the success of weak consistency. We name it “bad things do not come in threes,” or BN3. It is based on the observation that the volume of data is far larger than the number of workers. If all workers are operating concurrently, the probability that two workers access the same data at the same time is relatively low. Although it is not small enough to be neglected, the chance that three or more workers access the same data at the same time is even lower. Based on the BN3 conjecture, we analyze different consistency levels. We show that a weak consistency level in transaction processing is equivalent to snapshot isolation (SI) under reasonable assumptions. Although BN3 is an oversimplification of real scenarios, it explains why weak consistency often achieves results that are accurate enough. It also serves as a quality promise for the future wide application of weak consistency in analytical tasks. We verify our results in experimental studies.
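A back-of-the-envelope check of the intuition, using our own toy model rather than the paper's analysis: if each of W workers touches one of D items uniformly at random, the number of workers on a fixed item is Binomial(W, 1/D), and three-way collisions are orders of magnitude rarer than two-way ones:

```python
# Sketch: probability that at least k workers hit the same fixed item.
from math import comb

def p_at_least(k, W, D):
    p = 1.0 / D
    # Sum the binomial tail directly to avoid catastrophic cancellation.
    return sum(comb(W, i) * p**i * (1 - p)**(W - i) for i in range(k, W + 1))

W, D = 64, 10_000_000                 # 64 workers, 10 million items
print(p_at_least(2, W, D))            # ~2.0e-11: two-way collision
print(p_at_least(3, W, D))            # ~4.2e-17: three-way, far rarer
```

Under these (admittedly idealized) uniform-access assumptions, the gap between the two probabilities is what the BN3 conjecture exploits.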
Frog: Asynchronous Graph Processing on GPU with Hybrid Coloring Model
"... Abstract—GPUs have been increasingly used to accelerate graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynchronous computing model to accelerate the iterative convergence. Unfortunately, the consistent asynchronous computing ..."
Abstract
GPUs have been increasingly used to accelerate graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynchronous computing model to accelerate the iterative convergence. Unfortunately, consistent asynchronous computing requires locking or atomic operations, leading to significant penalties/overheads when implemented on GPUs. To this end, a coloring algorithm is adopted to separate the vertices with potential updating conflicts, guaranteeing the consistency/correctness of the parallel processing. Common coloring algorithms, however, may suffer from low parallelism because of the large number of colors generally required for processing a large-scale graph with billions of vertices. We propose a lightweight asynchronous processing framework called Frog with a hybrid coloring model. The fundamental idea is based on the Pareto principle (or 80-20 rule) applied to coloring algorithms, as we observed across a large number of real graph coloring cases. We find that the majority of vertices (about 80%) are colored with only a few colors, such that they can be read and updated with a very high degree of parallelism without violating sequential consistency. Accordingly, our solution separates the processing of the vertices based on the distribution of colors. In this work, we mainly answer four questions: (1) how to partition the vertices in a sparse graph with maximized parallelism, (2) how to process large-scale graphs that do not fit in GPU memory, (3) how to reduce the overhead of data transfers on PCIe while processing each partition, and (4) how to customize the data structure that is particularly suitable for GPU-based graph processing. Experiments based on real-world data show that our asynchronous GPU graph processing engine …
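As an illustration of the hybrid split (details assumed; this is not Frog's implementation), one can greedily color an undirected adjacency map, process each of the few large color classes lock-free, and funnel the long tail of rare colors into a single pass that uses atomics:

```python
# Sketch: greedy coloring plus an 80-20 style split of the color classes.
from collections import Counter

def greedy_color(adj):
    # adj: undirected adjacency dict {u: iterable of neighbors}.
    color = {}
    for u in adj:                              # any vertex order
        used = {color[v] for v in adj[u] if v in color}
        c = 0
        while c in used:
            c += 1
        color[u] = c
    return color

def hybrid_split(adj, tail_colors=0.2):
    color = greedy_color(adj)
    by_size = [c for c, _ in Counter(color.values()).most_common()]
    n_tail = max(1, int(len(by_size) * tail_colors))
    big = by_size[:-n_tail] if len(by_size) > n_tail else []
    # Each big color class is conflict-free: no two neighbors share a color.
    lock_free = [[u for u in adj if color[u] == c] for c in big]
    tail = [u for u in adj if color[u] not in big]   # handled with atomics
    return lock_free, tail
```

Within a single color class no two adjacent vertices ever update each other's data, which is what makes the lock-free passes safe.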
Dynamic Interaction Graphs with Probabilistic Edge Decay
"... Abstract—A large scale network of social interactions, such as mentions in Twitter, can often be modeled as a “dynamic interaction graph ” in which new interactions (edges) are continually added over time. Existing systems for extracting timely insights from such graphs are based on either a cumula ..."
Abstract
A large-scale network of social interactions, such as mentions in Twitter, can often be modeled as a “dynamic interaction graph” in which new interactions (edges) are continually added over time. Existing systems for extracting timely insights from such graphs are based on either a cumulative “snapshot” model or a “sliding window” model. The former model does not sufficiently emphasize recent interactions. The latter model abruptly forgets past interactions, leading to discontinuities in which, e.g., the graph analysis completely ignores historically important influencers who have temporarily gone dormant. We introduce TIDE, a distributed system for analyzing dynamic graphs that employs a new “probabilistic edge decay” (PED) model. In this model, the graph analysis algorithm of interest is applied at each time step to one or more graphs obtained as samples from the current “snapshot” graph that comprises all interactions that have occurred so far. The probability that a given edge of the snapshot graph is included in a sample decays over time according to a user-specified decay function. The PED model allows controlled tradeoffs between recency and continuity, and allows existing analysis algorithms for static graphs to be applied to dynamic graphs essentially without change. For the important class of exponential decay functions, we provide efficient methods that leverage past samples to incrementally generate new samples as time advances. We also exploit the large degree of overlap between samples to reduce memory consumption from O(N) to O(log N) when maintaining N sample graphs. Finally, we provide bulk-execution methods for applying graph algorithms to multiple sample graphs simultaneously without requiring any changes to existing graph-processing APIs. Experiments on a real Twitter dataset demonstrate the effectiveness and efficiency of our TIDE prototype, which is built on top of the Spark distributed computing framework.
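For the exponential-decay case, the incremental trick is simple enough to sketch (our reconstruction of the idea, not TIDE's API): since the inclusion probability is decay**age, advancing a sample by one time step only requires flipping one coin per currently retained edge:

```python
# Sketch: advancing one PED sample by a single time step.
import random

def advance(sample, new_edges, decay=0.9):
    # An edge survives each step independently with probability `decay`,
    # so Pr[edge still sampled at age a] = decay ** a, as PED requires.
    kept = {e for e in sample if random.random() < decay}
    kept.update(new_edges)           # fresh edges enter with probability 1
    return kept
```

Edges that have already been dropped never need to be revisited, which is why past samples can be reused as time advances.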