Results 1–10 of 81
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
, 2010
"... ..."
(Show Context)
Pregel: A system for large-scale graph processing
In SIGMOD
, 2010
"... Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs—in some cases billions of vertices, trillions of edges—poses challenges to their efficient processing. In this paper we present a computational model ..."
Abstract

Cited by 496 (0 self)
Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs—in some cases billions of vertices, trillions of edges—poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.
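The vertex-centric, superstep-based model described above can be illustrated with a small sketch. This is a minimal single-machine sketch under names of our own choosing (run_supersteps, compute, send); it is not Pregel's actual API, and real Pregel stores a vertex's outgoing edges in its state rather than in a global structure.

from collections import defaultdict

def run_supersteps(vertices, compute, max_supersteps=30):
    # vertices: vertex id -> mutable state dict.
    # compute(vid, state, messages, send) is the user program; it returns
    # True while the vertex wants to remain active (otherwise it votes to
    # halt and is woken again only by an incoming message).
    inbox = defaultdict(list)
    active = set(vertices)
    for _ in range(max_supersteps):
        if not active:
            break
        outbox = defaultdict(list)
        send = lambda dst, msg: outbox[dst].append(msg)
        next_active = set()
        for vid in active | set(inbox):       # messages wake halted vertices
            if compute(vid, vertices[vid], inbox.get(vid, []), send):
                next_active.add(vid)
        inbox, active = outbox, next_active
    return vertices

# Example: propagate the maximum value over a triangle (a classic Pregel demo).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}     # toy topology for the example
state = {0: {"value": 3}, 1: {"value": 6}, 2: {"value": 2}}

def max_value(vid, st, messages, send):
    new_val = max([st["value"]] + messages)
    changed = (new_val != st["value"]) or not messages   # always send in superstep 0
    st["value"] = new_val
    if changed:
        for nbr in graph[vid]:
            send(nbr, new_val)
    return changed

run_supersteps(state, max_value)
print(state)   # every vertex converges to value 6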
Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud
, 2012
"... While highlevel data parallel frameworks, like MapReduce, simplify the design and implementation of largescale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill ..."
Abstract

Cited by 141 (2 self)
While high-level data-parallel frameworks like MapReduce simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction, which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations.
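In contrast to the superstep model above, the GraphLab abstraction schedules individual vertex updates asynchronously and dynamically. Below is a rough single-threaded sketch of that scheduling pattern, using names of our own (graphlab_style_loop, pr_update); it ignores the distributed setting, consistency models, and pipelined locking that the paper is actually about.

from collections import deque

def graphlab_style_loop(graph, data, update, initial):
    # Vertices are updated one at a time from a scheduler queue, see the
    # latest values of their neighbors immediately (no supersteps), and may
    # request that other vertices be rescheduled.
    scheduled = deque(initial)
    queued = set(initial)
    while scheduled:
        v = scheduled.popleft()
        queued.discard(v)
        for w in update(v, graph, data):   # update returns vertices to reschedule
            if w not in queued:
                queued.add(w)
                scheduled.append(w)
    return data

# Example: dynamic PageRank that only reschedules neighbors after a large change.
def pr_update(v, graph, data, d=0.85, tol=1e-3):
    old = data[v]
    incoming = sum(data[u] / len(graph[u]) for u in graph if v in graph[u])
    data[v] = (1 - d) + d * incoming
    return graph[v] if abs(data[v] - old) > tol else []

graph = {0: [1], 1: [2], 2: [0]}
data = {0: 1.0, 1: 0.5, 2: 1.5}
print(graphlab_style_loop(graph, data, pr_update, list(graph)))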
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
"... Largescale graphstructured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graphparallel abstractions including Pregel and GraphLab. However, the natural graphs commonly found in the realworld have highly ..."
Abstract

Cited by 128 (4 self)
Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly skewed power-law degree distributions, which challenge the assumptions made by these abstractions, limiting performance and scalability. In this paper, we characterize the challenges of computation on natural graphs in the context of existing graph-parallel abstractions. We then introduce the PowerGraph abstraction which exploits the internal structure of graph programs to address these challenges. Leveraging the PowerGraph abstraction we introduce a new approach to distributed graph placement and representation that exploits the structure of power-law graphs. We provide a detailed analysis and experimental evaluation comparing PowerGraph to two popular graph-parallel systems. Finally, we describe three different implementation strategies for PowerGraph and discuss their relative merits with empirical evaluations on large-scale real-world problems demonstrating order of magnitude gains.
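For context, the PowerGraph abstraction factors a vertex program into gather, apply, and scatter phases so that the gather work of a high-degree vertex can be split across the machines holding its edges and combined as a sum. The following is a minimal single-machine sketch of that decomposition for PageRank, with function names of our own; it is not PowerGraph's API and omits vertex-cut placement entirely.

def pagerank_gas(out_edges, num_iters=20, d=0.85):
    # out_edges: vertex -> list of out-neighbors. Every vertex is assumed
    # to have at least one out-edge in this toy example.
    vertices = list(out_edges)
    in_edges = {v: [] for v in vertices}
    for u, nbrs in out_edges.items():
        for v in nbrs:
            in_edges[v].append(u)
    rank = {v: 1.0 for v in vertices}

    def gather(v):
        # Commutative/associative sum over in-edges: the part PowerGraph can
        # compute as partial sums on several machines and then merge.
        return sum(rank[u] / len(out_edges[u]) for u in in_edges[v])

    def apply(v, acc):
        rank[v] = (1 - d) + d * acc

    # A real scatter phase would activate only neighbors whose input changed;
    # this sketch simply re-runs every vertex each iteration.
    for _ in range(num_iters):
        acc = {v: gather(v) for v in vertices}
        for v in vertices:
            apply(v, acc[v])
    return rank

print(pagerank_gas({0: [1, 2], 1: [2], 2: [0]}))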
Scalable SPARQL Querying of Large RDF Graphs
"... The generation of RDF data has accelerated to the point where many data sets need to be partitioned across multiple machines in order to achieve reasonable performance when querying the data. Although tremendous progress has been made in the Semantic Web community for achieving high performance data ..."
Abstract

Cited by 71 (1 self)
The generation of RDF data has accelerated to the point where many data sets need to be partitioned across multiple machines in order to achieve reasonable performance when querying the data. Although tremendous progress has been made in the Semantic Web community for achieving high performance data management on a single node, current solutions that allow the data to be partitioned across multiple machines are highly inefficient. In this paper, we introduce a scalable RDF data management system that is up to three orders of magnitude more efficient than popular multi-node RDF data management systems. In so doing, we introduce techniques for (1) leveraging state-of-the-art single-node RDF-store technology, (2) partitioning the data across nodes in a manner that helps accelerate query processing through locality optimizations, and (3) decomposing SPARQL queries into high-performance fragments that take advantage of how data is partitioned in a cluster.
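A heavily simplified sketch of the partitioning idea: co-locate all triples about a subject on one node, so subject-centric query fragments need no cross-node joins. The function name and the hash-by-subject scheme are ours for illustration; the paper's actual method uses graph partitioning plus n-hop triple replication, which this does not capture.

def partition_triples(triples, num_nodes):
    # Assign each (subject, predicate, object) triple to a node by hashing
    # the subject. Python salts str hashes per process, so the concrete
    # assignment differs between runs; any stable hash would do in practice.
    parts = [[] for _ in range(num_nodes)]
    for s, p, o in triples:
        parts[hash(s) % num_nodes].append((s, p, o))
    return parts

triples = [
    ("alice", "knows", "bob"),       # hypothetical toy data
    ("alice", "worksAt", "acme"),
    ("bob", "knows", "carol"),
]
for node, part in enumerate(partition_triples(triples, 2)):
    print(node, part)
# All triples with subject "alice" land on the same node, so a query
# fragment over that subject can be answered locally.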
GPS: A Graph Processing System
"... GPS (for Graph Processing System) is a complete opensource system we developed for scalable, faulttolerant, and easytoprogram execution of algorithms on extremely large graphs. GPS is similar to Google’s proprietary Pregel system [MAB+ 11], with some useful additional functionality described in ..."
Abstract

Cited by 68 (3 self)
GPS (for Graph Processing System) is a complete open-source system we developed for scalable, fault-tolerant, and easy-to-program execution of algorithms on extremely large graphs. GPS is similar to Google’s proprietary Pregel system [MAB+ 11], with some useful additional functionality described in the paper. In distributed graph processing systems like GPS and Pregel, graph partitioning is the problem of deciding which vertices of the graph are assigned to which compute nodes. In addition to presenting the GPS system itself, we describe how we have used GPS to study the effects of different graph partitioning schemes. We present our experiments on the performance of GPS under different static partitioning schemes—assigning vertices to workers “intelligently” before the computation starts—and with GPS’s dynamic repartitioning feature, which reassigns vertices to different compute nodes during the computation by observing their message sending patterns.
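The dynamic repartitioning feature mentioned above can be illustrated with a toy heuristic: after observing recent message traffic, move a vertex to the worker that receives most of its messages. The sketch uses names and a threshold of our own choosing; GPS's actual policy also balances load and ships vertex state between workers.

from collections import Counter

def repartition_by_messages(assignment, message_log, threshold=0.5):
    # assignment: vertex -> worker id. message_log: iterable of
    # (src_vertex, dst_vertex) pairs observed during recent supersteps.
    per_vertex = {}
    for src, dst in message_log:
        per_vertex.setdefault(src, Counter())[assignment[dst]] += 1
    new_assignment = dict(assignment)
    for v, counts in per_vertex.items():
        worker, hits = counts.most_common(1)[0]
        if hits / sum(counts.values()) > threshold:
            new_assignment[v] = worker   # most of v's messages go to this worker
    return new_assignment

# Example: vertex 0 sends all of its messages to vertices on worker 1,
# so it is moved from worker 0 to worker 1.
assignment = {0: 0, 1: 1, 2: 1}
msgs = [(0, 1), (0, 2), (0, 1)]
print(repartition_by_messages(assignment, msgs))   # {0: 1, 1: 1, 2: 1}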
The Combinatorial BLAS: Design, Implementation, and Applications
, 2010
"... This paper presents a scalable highperformance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of highperformance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse mat ..."
Abstract

Cited by 58 (10 self)
This paper presents a scalable high-performance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse-grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the Parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extendible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and ease of use. The scalability and raw performance of the example applications, using the Combinatorial BLAS, are unprecedented on distributed memory clusters.
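The library's core idea, expressing graph traversal through sparse linear algebra primitives, can be sketched in a few lines. Below is a level-synchronous BFS written as repeated sparse matrix-vector products, using SciPy as a stand-in for the library; the function name and setup are ours, and the real Combinatorial BLAS runs such primitives on distributed sparse matrices over user-defined semirings.

import numpy as np
import scipy.sparse as sp

def bfs_spmv(adj, source):
    # adj is an n x n sparse adjacency matrix with adj[i, j] != 0 iff there
    # is an edge i -> j. Each BFS level is one matrix-vector product.
    n = adj.shape[0]
    levels = np.full(n, -1)              # -1 marks unreached vertices
    frontier = np.zeros(n, dtype=bool)
    frontier[source] = True
    level = 0
    while frontier.any():
        levels[frontier] = level
        reached = adj.T.dot(frontier).astype(bool)   # neighbors of the frontier
        frontier = reached & (levels == -1)          # drop already-visited vertices
        level += 1
    return levels

# Tiny example: the path graph 0 -> 1 -> 2.
adj = sp.csr_matrix(np.array([[0, 1, 0],
                              [0, 0, 1],
                              [0, 0, 0]]))
print(bfs_spmv(adj, 0))   # [0 1 2]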
Trinity: A Distributed Graph Engine on a Memory Cloud
 In SIGMOD
, 2013
"... Computationsperformedbygraphalgorithmsaredatadriven, and require a high degree of random data access. Despite the great progresses made in disk technology, it still cannot providethelevelofefficient randomaccess requiredbygraph computation. Ontheotherhand, memorybasedapproaches usually do not scale ..."
Abstract

Cited by 50 (0 self)
Computations performed by graph algorithms are data-driven and require a high degree of random data access. Despite the great progress made in disk technology, it still cannot provide the level of efficient random access required by graph computation. On the other hand, memory-based approaches usually do not scale due to the capacity limit of single machines. In this paper, we introduce Trinity, a general-purpose graph engine over a distributed memory cloud. Through optimized memory management and network communication, Trinity supports fast graph exploration as well as efficient parallel computing. In particular, Trinity leverages graph access patterns in both online and offline computation to optimize memory and communication for best performance. These optimizations enable Trinity to support efficient online query processing and offline analytics on large graphs with just a few commodity machines. Furthermore, Trinity provides a high-level specification language called TSL for users to declare data schemas and communication protocols, which brings great ease of use for general-purpose graph management and computing. Our experiments show Trinity’s performance in both low-latency graph queries and high-throughput graph analytics on web-scale, billion-node graphs.
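A toy sketch of the memory-cloud idea: the graph lives in RAM as key-addressable cells spread over machines, and a vertex's cell is located by hashing its id, so neighbor expansion costs memory accesses rather than disk seeks. The class and method names below are ours for illustration; Trinity's actual memory cloud, TSL-defined cell schemas, and messaging layer are far richer.

class MemoryCloudSketch:
    # Toy stand-in for a distributed in-memory key-value layer: each
    # "machine" is just a local dict, and vertex ids are hashed to machines.
    def __init__(self, num_machines=4):
        self.machines = [dict() for _ in range(num_machines)]

    def _machine_for(self, vertex_id):
        return self.machines[hash(vertex_id) % len(self.machines)]

    def put_vertex(self, vertex_id, adjacency):
        self._machine_for(vertex_id)[vertex_id] = list(adjacency)

    def neighbors(self, vertex_id):
        # One random access: a local or remote memory read rather than a disk seek.
        return self._machine_for(vertex_id).get(vertex_id, [])

cloud = MemoryCloudSketch()
cloud.put_vertex("v1", ["v2", "v3"])
print(cloud.neighbors("v1"))   # ['v2', 'v3']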
Challenges in Parallel Graph Processing
In Parallel Processing Letters
, 2006
"... Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining and other domains. As these problems grow in scale, parallel computing resources are required to meet their computational and memory requirements. Unfortunately, the algorithms, softwa ..."
Abstract

Cited by 45 (4 self)
Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining and other domains. As these problems grow in scale, parallel computing resources are required to meet their computational and memory requirements. Unfortunately, the algorithms, software, and hardware that have worked well for developing mainstream parallel scientific applications are not necessarily effective for large-scale graph problems. In this paper we present the interrelationships between graph problems, software, and parallel hardware in the current state of the art and discuss how those issues present inherent challenges in solving large-scale graph problems. The range of these challenges suggests a research agenda for the development of scalable high-performance software for graph problems.
Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory
"... Processing large graphs is becoming increasingly important for many computational domains. Unfortunately, many algorithms and implementations do not scale with the demand for increasing graph sizes. As a result, researchers have attempted to meet the growing data demands using parallel and external ..."
Abstract

Cited by 34 (3 self)
Processing large graphs is becoming increasingly important for many computational domains. Unfortunately, many algorithms and implementations do not scale with the demand for increasing graph sizes. As a result, researchers have attempted to meet the growing data demands using parallel and external memory techniques. Our work, targeted to chip multiprocessors, takes a highly parallel asynchronous approach to hide the high data latency due to both poor locality and delays in the underlying graph data storage. We present a novel asynchronous approach to compute Breadth-First Search (BFS), Single-Source Shortest Path (SSSP), and Connected Components (CC) for large graphs in shared memory. We present an experimental study applying our technique to both in-memory (IM) and semi-external memory (SEM) graphs utilizing multicore processors and solid-state memory devices. Our experiments using both synthetic and real-world datasets show that our asynchronous approach is able to overcome data latencies and provide significant speedup over alternative approaches.
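The asynchronous idea, letting threads relax vertices as soon as work is available instead of synchronizing level by level, can be sketched as a label-correcting BFS over a shared work queue. This is a minimal in-memory sketch with names of our choosing; the paper's implementation targets chip multiprocessors and semi-external memory and is considerably more sophisticated.

import queue
import threading

def async_bfs(adj, source, num_threads=4):
    # adj: vertex -> list of neighbors. Worker threads pull vertices from a
    # shared queue and relax their neighbors' levels; a level may be written
    # more than once, but it only decreases and converges to the BFS distance.
    level = {source: 0}
    lock = threading.Lock()
    work = queue.Queue()
    work.put(source)

    def worker():
        while True:
            try:
                u = work.get(timeout=0.1)   # idle threads drain out and exit
            except queue.Empty:
                return
            d = level[u]
            for v in adj.get(u, ()):
                with lock:
                    if v not in level or level[v] > d + 1:
                        level[v] = d + 1
                        work.put(v)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return level

print(async_bfs({0: [1, 2], 1: [3], 2: [3], 3: []}, 0))
# {0: 0, 1: 1, 2: 1, 3: 2}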