## Pregel: A system for large-scale graph processing (2010)

Venue: | IN SIGMOD |

Citations: | 466 - 0 self |

### Citations

4583 | The anatomy of a large-scale hypertextual web search engine
- Brin, Page
- 1998
Citation Context ...gorithms developed by Pregel users to solve real problems: Page Rank, Shortest Paths, Bipartite Matching, and a Semi-Clustering algorithm. 5.1 PageRank A Pregel implementation of a PageRank algorithm [7] is shown in Figure 4. The PageRankVertex class inherits from Vertex. Its vertex value type is double to store a tentative PageRank, and its message type is double to carry PageRank fractions, while t... |
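The vertex-centric PageRank described in this context can be sketched in a few lines. The Python below is a single-process illustration under stated assumptions, not the paper's Figure 4: the function name `pagerank`, the adjacency-dict input, the damping factor of 0.85, and the default of 30 supersteps are all chosen here for illustration, and every vertex is assumed to have at least one out-edge.

```python
def pagerank(out_edges, supersteps=30, damping=0.85):
    """Vertex-centric PageRank sketch.

    out_edges maps every vertex to the list of vertices it links to.
    Each vertex keeps a tentative rank (a float) and sends rank
    fractions (floats) along its out-edges, one round per superstep.
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}   # initial tentative rank
    inbox = {v: [] for v in out_edges}       # messages from the previous superstep

    for step in range(supersteps + 1):
        outbox = {v: [] for v in out_edges}
        for v, targets in out_edges.items():
            if step >= 1:
                # Combine the rank fractions received in the previous superstep.
                rank[v] = (1.0 - damping) / n + damping * sum(inbox[v])
            if step < supersteps and targets:
                # Split the tentative rank evenly over the out-edges.
                share = rank[v] / len(targets)
                for t in targets:
                    outbox[t].append(share)
        inbox = outbox                        # synchronous message delivery
    return rank


if __name__ == "__main__":
    print(pagerank({"a": ["b"], "b": ["c"], "c": ["a"]}))
```

Messages produced in one superstep are only read in the next one, which is what keeps the loop equivalent to running all vertices in parallel.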

2330 | A note on two problems in connexion with graphs. Numerische Mathematik
- Dijkstra
- 1959
Citation Context ...ances, modifying it to compute the shortest paths tree as well is quite straightforward. This algorithm may perform many more comparisons than sequential counterparts such as Dijkstra or Bellman-Ford [5, 15, 17, 24], but it is able to solve the shortest paths problem at a scale that is infeasible with any single-machine implementation. More advanced parallel algorithms exist, e.g., Thorup [44] or the ∆-stepping ... |
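To illustrate why the message-passing formulation can perform more comparisons than Dijkstra's algorithm, here is a small Python sketch of single-source shortest paths written in that style. It is an assumption-laden illustration rather than the paper's code: the name `shortest_paths`, the `(target, weight)` edge lists, and the requirement that weights be non-negative are all choices made here.

```python
import math


def shortest_paths(edges, source):
    """Message-passing single-source shortest paths (illustrative sketch).

    edges maps every vertex to a list of (target, weight) pairs, with
    non-negative weights. A vertex that learns of a shorter distance
    updates its value and offers improved distances to its neighbors.
    """
    dist = {v: math.inf for v in edges}
    inbox = {source: [0.0]}              # the source starts at distance 0

    while inbox:                         # stop once no messages are in flight
        outbox = {}
        for v, candidates in inbox.items():
            best = min(candidates)
            if best < dist[v]:
                dist[v] = best
                for target, weight in edges[v]:
                    outbox.setdefault(target, []).append(best + weight)
        inbox = outbox                   # deliver at the superstep boundary
    return dist


if __name__ == "__main__":
    graph = {"s": [("a", 4), ("b", 1)], "a": [("t", 1)],
             "b": [("a", 2)], "t": [], "u": []}
    print(shortest_paths(graph, "s"))
```

A vertex may be relaxed several times as better distances arrive in later rounds (vertex `t` above is updated twice), which is the extra work the context refers to.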

1460 | The Google file system
- Ghemawat, Gobioff, et al.
Citation Context ...name service, so that instances can be referred to by logical names independent of their current binding to a physical machine. Persistent data is stored as files on a distributed storage system, GFS [19], or in Bigtable [9], and temporary data such as buffered messages on local disk. 4.1 Basic architecture The Pregel library divides a graph into partitions, each consisting of a set of vertices and al... |

1353 | A bridging model for parallel computation
- Valiant
- 1990
Citation Context ...his paper describes the resulting system, called Pregel, and reports our experience with it. The high-level organization of Pregel programs is inspired by Valiant’s Bulk Synchronous Parallel model [45]. Pregel computations consist of a sequence of iterations, called supersteps. During a superstep the framework invokes a user-defined function for each vertex, conceptually in parallel. The function sp... |
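The superstep structure mentioned in this context can be illustrated with a toy driver loop. The Python below is a sketch under assumptions, not Pregel's API: `run_supersteps`, `propagate_max`, the convention that a vertex runs only in superstep 0 or when it has incoming messages, and the use of `None` to mean "send nothing" are all hypothetical choices made for this example.

```python
def run_supersteps(values, out_edges, compute):
    """Toy bulk-synchronous driver (illustrative only).

    values    maps vertex -> current value (updated in place)
    out_edges maps vertex -> list of neighbor vertices
    compute   is the user-defined per-vertex function; it receives
              (superstep, value, messages) and returns
              (new_value, message_or_None).
    """
    inbox = {v: [] for v in values}
    superstep = 0
    while superstep == 0 or any(inbox.values()):
        outbox = {v: [] for v in values}
        for v in values:
            if superstep > 0 and not inbox[v]:
                continue                      # no messages: vertex stays idle
            values[v], message = compute(superstep, values[v], inbox[v])
            if message is not None:
                for neighbor in out_edges[v]:
                    outbox[neighbor].append(message)
        inbox = outbox                        # barrier: deliver all messages at once
        superstep += 1
    return values


def propagate_max(superstep, value, messages):
    """Example vertex function: spread the largest value through the graph."""
    best = max([value] + messages)
    changed = best > value
    return best, (best if superstep == 0 or changed else None)


if __name__ == "__main__":
    vals = {"a": 3, "b": 6, "c": 2, "d": 1}
    links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(run_supersteps(vals, links, propagate_max))
```

The per-vertex calls inside one superstep never see each other's output, so they could run in any order or in parallel without changing the result.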

1078 | Flows in networks
- Ford, Fulkerson
- 1962
Citation Context ...ances, modifying it to compute the shortest paths tree as well is quite straightforward. This algorithm may perform many more comparisons than sequential counterparts such as Dijkstra or Bellman-Ford [5, 15, 17, 24], but it is able to solve the shortest paths problem at a scale that is infeasible with any single-machine implementation. More advanced parallel algorithms exist, e.g., Thorup [44] or the ∆-stepping ... |

990 | Bigtable: A distributed storage system for structured data
- Chang, Dean, et al.
- 2008
Citation Context ... an easier sequential programming semantics. 3.5 Input and output There are many possible file formats for graphs, such as a text file, a set of vertices in a relational database, or rows in Bigtable [9]. To avoid imposing a specific choice of file format, Pregel decouples the task of interpreting an input file as a graph from the task of graph computation. Similarly, output can be generated in an ar... |

722 | Dryad: distributed data-parallel programs from sequential building blocks
- Isard, Budiu, et al.
- 2007
Citation Context ...fficient support for iterative computations over the graph. This graph focus also distinguishes it from other frameworks that hide distribution details such as Sawzall [41], Pig Latin [40], and Dryad [27, 47]. Pregel is also different because it implements a stateful model where long-lived processes compute, communicate, and modify local state, rather than a dataflow model where any process computes solel... |

707 | The LEDA Platform of Combinatorial and Geometric Computing
- Mehlhorn, Näher
- 1999
Citation Context ...ies [40, 47], but these extensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [... |

608 | MapReduce: Simplified data processing on large clusters
- Dean, Ghemawat
Citation Context ... implementation effort that must be repeated for each new algorithm or graph representation. 2. Relying on an existing distributed computing platform, often ill-suited for graph processing. MapReduce [14], for example, is a very good fit for a wide array of large-scale computing problems. It is sometimes used to mine large graphs [11, 30], but this can lead to suboptimal performance and usability issue... |

576 | Pig Latin: A not-so-foreign language for data processing
- Olston, Reed, et al.
- 2008
Citation Context ...ne large graphs [11, 30], but this can lead to suboptimal performance and usability issues. The basic models for processing data have been extended to facilitate aggregation [41] and SQL-like queries [40, 47], but these extensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], Networ... |

554 | The Grid 2: Blueprint for a New Computing Infrastructure
- Foster, Kesselman (editors)
- 2004
Citation Context ...es to Pregel are the Parallel Boost Graph Library and CGMgraph. The Parallel BGL [22, 23] specifies several key generic concepts for defining distributed graphs, provides implementations based on MPI [18], and implements a number of algorithms based on them. It attempts to maintain compatibility with the (sequential) BGL [43] to facilitate porting algorithms. It implements property maps to hold inform... |

482 | On a routing problem
- Bellman
- 1958
Citation Context ...ances, modifying it to compute the shortest paths tree as well is quite straightforward. This algorithm may perform many more comparisons than sequential counterparts such as Dijkstra or Bellman-Ford [5, 15, 17, 24], but it is able to solve the shortest paths problem at a scale that is infeasible with any single-machine implementation. More advanced parallel algorithms exist, e.g., Thorup [44] or the ∆-stepping ... |

266 | Interpreting the data: Parallel analysis with Sawzall
- Pike, Dorward, et al.
Citation Context ...It is sometimes used to mine large graphs [11, 30], but this can lead to suboptimal performance and usability issues. The basic models for processing data have been extended to facilitate aggregation [41] and SQL-like queries [40, 47], but these extensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such ... |

212 | The Stanford GraphBase: a platform for combinatorial computing
- Knuth
- 1993
Citation Context ...ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [8] libraries address parallel graph algorithms, but... |

165 | High-speed switch scheduling for local-area networks
- Anderson, Owicki, et al.
- 1993
Citation Context ...t is a subset of edges with no common endpoints. A maximal matching is one to which no additional edge can be added without sharing an endpoint. We implemented a randomized maximal matching algorithm [1] and a maximum-weight bipartite matching algorithm [4]; we describe the former here. In the Pregel implementation of this algorithm the vertex value is a tuple of two values: a flag indicating which s... |

165 | Exploring network structure, dynamics, and function using NetworkX
- Hagberg, Schult, et al.
- 2008
Citation Context ...ut these extensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [8] libraries ad... |

148 | The Boost Graph Library: User guide and reference manual
- Siek, Lee, et al.
- 2002
Citation Context ...L-like queries [40, 47], but these extensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and... |

122 | Pegasus: A peta-scale graph mining system implementation and observations
- Kang, Tsourakakis, et al.
- 2009
Citation Context ...computing platform, often ill-suited for graph processing. MapReduce [14], for example, is a very good fit for a wide array of large-scale computing problems. It is sometimes used to mine large graphs [11, 30], but this can lead to suboptimal performance and usability issues. The basic models for processing data have been extended to facilitate aggregation [41] and SQL-like queries [40, 47], but these exte... |

121 | A higher order estimate of the optimum checkpoint interval for restart dumps
- Daly
Citation Context ...han the latest superstep S ′ completed by any partition before the failure, requiring that recovery repeat the missing supersteps. We select checkpoint frequency based on a mean time to failure model [13], balancing checkpoint cost against expected recovery cost. Confined recovery is under development to improve the cost and latency of recovery. In addition to the basic checkpoints, the workers also l... |
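The trade-off between checkpoint cost and expected recovery cost can be made concrete with a small calculation. The sketch below does not implement the higher-order estimate of the cited work [13], and the excerpt does not say which formula Pregel uses. As a stand-in it applies Young's simpler first-order approximation, interval ≈ sqrt(2 · C · M), where C is the time to write one checkpoint and M is the mean time to failure. The function name and the sample figures are assumptions made for illustration.

```python
import math


def young_checkpoint_interval(checkpoint_cost, mean_time_to_failure):
    """First-order (Young) estimate of the compute time between checkpoints.

    Both arguments must use the same time unit; the result is in that unit.
    This is a stand-in for the higher-order model cited in the text.
    """
    if checkpoint_cost <= 0 or mean_time_to_failure <= 0:
        raise ValueError("both arguments must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mean_time_to_failure)


if __name__ == "__main__":
    # Hypothetical figures: a 60-second checkpoint and one failure per day.
    seconds = young_checkpoint_interval(60.0, 24 * 3600.0)
    print(f"checkpoint roughly every {seconds / 60.0:.1f} minutes")
```

Cheaper checkpoints or less reliable machines both shorten the interval, which matches the balance described in the context.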

121 | Undirected Single-Source Shortest Paths with Positive Integer Weights in Linear Time
- Thorup
- 1999
Citation Context ...an-Ford [5, 15, 17, 24], but it is able to solve the shortest paths problem at a scale that is infeasible with any single-machine implementation. More advanced parallel algorithms exist, e.g., Thorup [44] or the ∆-stepping method [37], and have been used as the basis for special-purpose parallel shortest paths implementations [12, 32]. Such advanced algorithms can also be expressed in the Pregel frame... |

75 | The parallel BGL: A generic library for distributed graph computations
- Gregor, Lumsdaine
- 2005
Citation Context ...BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [8] libraries address parallel graph algorithms, but do not address fault tolerance or other issues that are important for very large scale distributed systems. None of these alternative... |

62 | Maximum weight matching via max-product belief propagation. International Symposium of Information Theory
- Bayati, Shah, et al.
- 2005
Citation Context ...imal matching is one to which no additional edge can be added without sharing an endpoint. We implemented a randomized maximal matching algorithm [1] and a maximum-weight bipartite matching algorithm [4]; we describe the former here. In the Pregel implementation of this algorithm the vertex value is a tuple of two values: a flag indicating which set the vertex is in (L or R), and the name of its matc... |

60 | A Library for Bulk Synchronous Parallel Programming
- Miller
- 1993
Citation Context ...rallel model [45], which provides its synchronous superstep model of computation and communication. There have been a number of general BSP library implementations, for example the Oxford BSP Library [38], Green BSP library [21], BSPlib [26] and Paderborn University BSP library [6]. They vary in the set of communication primitives provided, and in how they deal with distribution issues such as reliabi... |

55 | Fault-Tolerant Parallel Computation
- Kanellakis, Shvartsman
- 1997
Citation Context ...ly free of deadlocks and data races common in asynchronous systems. In principle the performance of Pregel programs should be competitive with that of asynchronous systems given enough parallel slack [28, 34]. Because typical graph computations have many more vertices than machines, one should be able to balance the machine loads so that the synchronization between supersteps does not add excessive latenc... |

51 | Graph twiddling in a MapReduce world
- Cohen
Citation Context ...computing platform, often ill-suited for graph processing. MapReduce [14], for example, is a very good fit for a wide array of large-scale computing problems. It is sometimes used to mine large graphs [11, 30], but this can lead to suboptimal performance and usability issues. The basic models for processing data have been extended to facilitate aggregation [41] and SQL-like queries [40, 47], but these exte... |

49 | A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets
- Madduri, Ediger, et al.
Citation Context ...ults on the scale of our 1 billion vertex and 127.1 billion edge log-normal graph. Another line of research has tackled use of external disk memory to handle huge problems with single machines, e.g., [33, 36], but these implementations require hours for graphs of a billion vertices. 8. CONCLUSIONS AND FUTURE WORK The contribution of this paper is a model suitable for large-scale graph computing and a desc... |

32 | DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language
- Yu, Isard, et al.
- 2008
Citation Context ...ne large graphs [11, 30], but this can lead to suboptimal performance and usability issues. The basic models for processing data have been extended to facilitate aggregation [41] and SQL-like queries [40, 47], but these extensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], Networ... |

28 | ∆-stepping: a parallelizable shortest path algorithm
- Meyer, Sanders
Citation Context ...ted only when edges are added or removed. More advanced uses are possible. For example, an aggregator can be used to implement a distributed priority queue for the ∆-stepping shortest paths algorithm [37]. Each vertex is assigned to a priority bucket based on its tentative distance. In one superstep, the vertices contribute their indices to a min aggregator. The minimum is broadcast to all workers in ... |

25 | Inductive graphs and functional graph algorithms
- Erwig
- 2001
Citation Context ...ph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [8] libraries address parallel graph algorithms, but do not addre... |

24 | CGMgraph/CGMlib: Implementing and testing CGM graph algorithms on PC clusters
- Chan, Dehne
- 2003
Citation Context ...], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [8] libraries address parallel graph algorithms, but do not address fault tolerance or other issues that are important for very large scale distributed systems. None of these alternatives fit our purpose... |

21 | Data structures and algorithms
- Goodrich, Tamassia
- 2001
Citation Context ...tensions are usually not ideal for graph algorithms that often better fit a message passing model. 3. Using a single-computer graph algorithm library, such as BGL [43], LEDA [35], NetworkX [25], JDSL [20], Stanford GraphBase [29], or FGL [16], limiting the scale of problems that can be addressed. 4. Using an existing parallel graph system. The Parallel BGL [22] and CGMgraph [8] libraries address paral... |

14 | Shortest paths algorithms: Theory and experimental evaluation
- Cherkassky, Goldberg, Radzik
- 1996
Citation Context ...ators would be useful for detecting the convergence condition. 5.2 Shortest Paths Shortest paths problems are among the best known problems in graph theory and arise in a wide variety of applications [10, 24], with several important variants. The single-source shortest paths problem requires finding a shortest path between a single source vertex and every other vertex in the graph. The s-t shortest path pr... |

14 | Parallel shortest path algorithms for solving large-scale instances. In 9th DIMACS Implementation Challenge
- Madduri, Bader, et al.
- 2006

14 | A scalable distributed parallel breadth-first search algorithm on BlueGene/L
- Yoo, Chow, et al.
- 2005
Citation Context ...xperimental results for graphs at the scale of billions of vertices. The largest have reported results from custom implementations of s-t shortest path, rather than from general frameworks. Yoo et al. [46] report on a BlueGene/L implementation of breadth-first search (s-t shortest path) on 32,768 PowerPC processors with a high-performance torus network, achieving 1.5 seconds for a Poisson distributed ra... |

10 | Web search for a planet: The Google cluster architecture
- Barroso, Dean, Hölzle
- 2003
Citation Context ...nusual needs can write their own by subclassing the abstract base classes Reader and Writer. 4. IMPLEMENTATION Pregel was designed for the Google cluster architecture, which is described in detail in [3]. Each cluster consists of thousands of commodity PCs organized into racks with high intra-rack bandwidth. Clusters are interconnected but distributed geographically. Our applications typically execut... |

5 | A work-optimal deterministic algorithm for the certified write-all problem with a nontrivial number of asynchronous processors
- Malewicz
Citation Context ...ly free of deadlocks and data races common in asynchronous systems. In principle the performance of Pregel programs should be competitive with that of asynchronous systems given enough parallel slack [28, 34]. Because typical graph computations have many more vertices than machines, one should be able to balance the machine loads so that the synchronization between supersteps does not add excessive latenc... |

4 | Advanced shortest paths algorithms on a massively-multithreaded architecture
- Crobak, Berry, et al.
- 2007
Citation Context ...e implementation. More advanced parallel algorithms exist, e.g., Thorup [44] or the ∆-stepping method [37], and have been used as the basis for special-purpose parallel shortest paths implementations [12, 32]. Such advanced algorithms can also be expressed in the Pregel framework. The simplicity of the implementation in Figure 5, however, together with the already acceptable performance (see Section 6), m... |

2 | The Paderborn University BSP (PUB) library
- Bonorden, Juurlink, et al.
- 2003
Citation Context ...n and communication. There have been a number of general BSP library implementations, for example the Oxford BSP Library [38], Green BSP library [21], BSPlib [26] and Paderborn University BSP library [6]. They vary in the set of communication primitives provided, and in how they deal with distribution issues such as reliability (machine failure), load balancing, and synchronization. To our knowledge,... |

2 | Portable and Efficient Parallel Computing Using the BSP Model
- Goudreau, Lang, et al.
- 1999
Citation Context ... provides its synchronous superstep model of computation and communication. There have been a number of general BSP library implementations, for example the Oxford BSP Library [38], Green BSP library [21], BSPlib [26] and Paderborn University BSP library [6]. They vary in the set of communication primitives provided, and in how they deal with distribution issues such as reliability (machine failure), ... |

2 | Design and Implementation of a Practical I/O-efficient Shortest Paths Algorithm
- Meyer, Osipov
- 2009
Citation Context ...ults on the scale of our 1 billion vertex and 127.1 billion edge log-normal graph. Another line of research has tackled use of external disk memory to handle huge problems with single machines, e.g., [33, 36], but these implementations require hours for graphs of a billion vertices. 8. CONCLUSIONS AND FUTURE WORK The contribution of this paper is a model suitable for large-scale graph computing and a desc... |

1 | Designing multithreaded algorithms for breadth-first search and st-connectivity on the Cray MTA-2
- Bader, Madduri
- 2006
Citation Context ...path) on 32,768 PowerPC processors with a high-performance torus network, achieving 1.5 seconds for a Poisson distributed random graph with 3.2 billion vertices and 32 billion edges. Bader and Madduri [2] report on a Cray MTA-2 implementation of a similar problem on a 10-node, highly multithreaded system, achieving 0.43 seconds for a scale-free R-MAT random graph with 134 million vertices and 805 millio... |