Results 1-10 of 71
Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud
Abstract

Cited by 129 (2 self)
While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction, which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations.
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Abstract

Cited by 117 (4 self)
Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions, including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly skewed power-law degree distributions, which challenge the assumptions made by these abstractions, limiting performance and scalability. In this paper, we characterize the challenges of computation on natural graphs in the context of existing graph-parallel abstractions. We then introduce the PowerGraph abstraction, which exploits the internal structure of graph programs to address these challenges. Leveraging the PowerGraph abstraction, we introduce a new approach to distributed graph placement and representation that exploits the structure of power-law graphs. We provide a detailed analysis and experimental evaluation comparing PowerGraph to two popular graph-parallel systems. Finally, we describe three different implementation strategies for PowerGraph and discuss their relative merits with empirical evaluations on large-scale real-world problems demonstrating order of magnitude gains.
GraphChi: Large-Scale Graph Computation on Just a PC
In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI '12, 2012
Abstract

Cited by 109 (6 self)
Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to non-experts. In this work, we present GraphChi, a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel parallel sliding windows method, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumer-level computer. We further extend GraphChi to support graphs that evolve over time, and demonstrate that, on a single computer, GraphChi can process over one hundred thousand graph updates per second, while simultaneously performing computation. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives. By repeating experiments reported for existing distributed systems, we show that, with only a fraction of the resources, GraphChi can solve the same problems in very reasonable time. Our work makes large-scale graph computation available to anyone with a modern PC.
Colorful Triangle Counting and a MapReduce Implementation
Abstract

Cited by 28 (5 self)
In this note we introduce a new randomized algorithm for counting triangles in graphs. We show that under mild conditions, the estimate of our algorithm is strongly concentrated around the true number of triangles. Specifically, let G be a graph with n vertices and t triangles, and let ∆ be the maximum number of triangles an edge of G is contained in. Also, let N = 1/p be the number of colors we use in our randomized algorithm. We show that if p ≥ max(∆ log n ..., log n ...
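The abstract names only the number of colors N = 1/p, so the following is a minimal sketch of one way a color-based estimator can work, not the authors' stated procedure. It assumes each vertex receives a uniformly random color, only edges whose endpoints share a color are kept, and the triangle count of the sparsified graph is scaled by N², since a given triangle is monochromatic with probability 1/N². The function name and parameters are hypothetical.

```python
import random


def colorful_triangle_estimate(edges, num_colors, seed=None):
    """Estimate the triangle count of an undirected graph by color sampling.

    Assumption (not stated in the abstract): vertices are colored uniformly
    at random, only monochromatic edges are kept, and the triangle count of
    the kept graph is scaled by num_colors ** 2.
    """
    rng = random.Random(seed)
    color = {}
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        for x in (u, v):
            if x not in color:
                color[x] = rng.randrange(num_colors)
        if color[u] == color[v]:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    # Each triangle is seen once per ordered edge (u, v): 3 edges x 2 orders.
    mono = sum(len(adj[u] & adj[v]) for u in adj for v in adj[u]) // 6
    return mono * num_colors ** 2
```

With `num_colors=1` every edge is kept and the function returns the exact count, which is a convenient sanity check; larger values trade accuracy for a smaller sampled graph.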
Truss Decomposition in Massive Networks
Abstract

Cited by 21 (5 self)
The k-truss is a type of cohesive subgraph proposed recently for the study of networks. While the problem of computing most cohesive subgraphs is NP-hard, there exists a polynomial-time algorithm for computing the k-truss. Compared with the k-core, which is also efficient to compute, the k-truss represents the “core” of a k-core that keeps the key information of, while filtering out less important information from, the k-core. However, existing algorithms for computing the k-truss are inefficient for handling today’s massive networks. We first improve the existing in-memory algorithm for computing the k-truss in networks of moderate size. Then, we propose two I/O-efficient algorithms to handle massive networks that cannot fit in main memory. Our experiments on real datasets verify the efficiency of our algorithms and the value of the k-truss.
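The abstract does not define the k-truss, so the sketch below assumes the commonly used definition: the largest subgraph in which every edge lies in at least k − 2 triangles of that subgraph. It is a naive in-memory peeling loop for illustration only, not the improved or I/O-efficient algorithms the paper proposes; the function name is hypothetical.

```python
def k_truss(edges, k):
    """Return the edge set of the k-truss of an undirected graph.

    Assumed definition: every remaining edge must be contained in at
    least k - 2 triangles formed by remaining edges.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    changed = True
    while changed:  # repeat until no edge violates the support condition
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbors of u and v = triangles through edge (u, v).
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

For example, on a 4-clique with an extra triangle attached at one vertex, `k_truss(edges, 4)` keeps only the six clique edges, because each edge of the attached triangle lies in a single triangle.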
Upper and Lower Bounds on the Cost of a MapReduce Computation
Abstract

Cited by 20 (1 self)
In this paper we study the tradeoff between parallelism and communication cost in a map-reduce computation. For any problem that is not “embarrassingly parallel,” the finer we partition the work of the reducers so that more parallelism can be extracted, the greater will be the total communication between mappers and reducers. We introduce a model of problems that can be solved in a single round of map-reduce computation. This model enables a generic recipe for discovering lower bounds on communication cost as a function of the maximum number of inputs that can be assigned to one reducer. We use the model to analyze the tradeoff for three problems: finding pairs of strings at Hamming distance d, finding triangles and other patterns in a larger graph, and matrix multiplication. For finding strings of Hamming distance 1, we have upper and lower bounds that match exactly. For triangles and many other graphs, we have upper and lower bounds that are the same to within a constant factor. For the problem of matrix multiplication, we have matching upper and lower bounds for one-round map-reduce algorithms. We are also able to explore two-round map-reduce algorithms for matrix multiplication and show that these never have more communication, for a given reducer size, than the best one-round algorithm, and often have significantly less.
Communication steps for parallel query processing
In PODS, 2013
Abstract

Cited by 19 (4 self)
We consider the problem of computing a relational query q on a large input database of size n, using a large number p of servers. The computation is performed in rounds, and each server can receive only O(n/p^(1−ε)) bits of data, where ε ∈ [0, 1] is a parameter that controls replication. We examine how many global communication steps are needed to compute q. We establish both lower and upper bounds, in two settings. For a single round of communication, we give lower bounds in the strongest possible model, where arbitrary bits may be exchanged; we show that any algorithm requires ε ≥ 1 − 1/τ*, where τ* is the fractional vertex cover of the hypergraph of q. We also give an algorithm that matches the lower bound for a specific class of databases. For multiple rounds of communication, we present lower bounds in a model where routing decisions for a tuple are tuple-based. We show that for the class of tree-like queries there exists a tradeoff between the number of rounds and the space exponent ε. The lower bounds for multiple rounds are the first of their kind. Our results also imply that transitive closure cannot be computed in O(1) rounds of communication.
Densest Subgraph in Streaming and MapReduce
Abstract

Cited by 16 (3 self)
The problem of finding locally dense components of a graph is an important primitive in data analysis, with wide-ranging applications from community mining to spam detection and the discovery of biological network modules. In this paper we present new algorithms for finding the densest subgraph in the streaming model. For any ε > 0, our algorithms make O(log_{1+ε} n) passes over the input and find a subgraph whose density is guaranteed to be within a factor 2(1 + ε) of the optimum. Our algorithms are also easily parallelizable and we illustrate this by realizing them in the MapReduce model. In addition, we perform extensive experimental evaluation on massive real-world graphs showing the performance and scalability of our algorithms in practice.
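The abstract states the pass count and approximation factor but not the procedure itself. The sketch below shows one peeling scheme consistent with those bounds, under two assumptions that are not in the abstract: density is measured as edges divided by nodes, and each pass removes every node whose current degree is at most 2(1 + ε) times the current density. The function name is hypothetical, and the graph is held in memory rather than streamed.

```python
def densest_subgraph(edges, eps=0.1):
    """Approximate the densest subgraph by batched peeling.

    Assumptions (not from the abstract): density = |E(S)| / |S|, and each
    pass drops all nodes with degree <= 2 * (1 + eps) * density.
    Returns (best_node_set, best_density).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    m = sum(deg.values()) // 2  # number of edges among alive nodes
    best_set, best_density = set(), 0.0

    while alive:
        density = m / len(alive)
        if density > best_density:
            best_set, best_density = set(alive), density
        threshold = 2 * (1 + eps) * density
        # The average degree is 2 * density, so at least one node qualifies.
        for v in [x for x in alive if deg[x] <= threshold]:
            alive.discard(v)
            for u in adj[v]:
                if u in alive:
                    deg[u] -= 1
                    m -= 1
    return best_set, best_density
```

Because a constant fraction of the remaining nodes is removed in every pass, the loop runs for a logarithmic number of passes, which is where a bound of the form O(log_{1+ε} n) comes from.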
Patric: A parallel algorithm for counting triangles and computing clustering coefficients in massive networks
2012
Abstract

Cited by 16 (5 self)
We present MPI-based parallel algorithms for counting triangles and computing clustering coefficients in massive networks. A triangle in a graph G(V, E) is a set of three nodes u, v, w ∈ V such that there is an edge between each pair of nodes. The number of triangles incident on node v, with adjacency list N(v), is defined as |{(u, w) ∈ E : u, w ∈ N(v)}|. Counting triangles is important in the analysis of various networks, e.g., social, biological, and web networks. Emerging massive networks do not fit in the main memory of a single machine and are very challenging to work with. Our distributed-memory parallel algorithm allows us to deal with such massive networks in a time- and space-efficient manner. We were able to count triangles in a graph with 2 billion nodes and 50 billion edges in 10 minutes. The clustering coefficient (CC) of a node v ∈ V with degree d_v is defined as,
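The per-node triangle count defined above can be illustrated with a short sequential sketch. This is a single-machine illustration of the definition only, not the MPI-based parallel algorithm the abstract describes; the function names are hypothetical.

```python
from itertools import combinations


def triangles_per_node(edges):
    """For each node v, count the edges (u, w) with both u and w in N(v)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {}
    for v, nbrs in adj.items():
        # Every unordered neighbor pair joined by an edge closes a triangle.
        counts[v] = sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
    return counts


def total_triangles(counts):
    """Each triangle is counted once at each of its three nodes."""
    return sum(counts.values()) // 3
```

For a triangle on nodes 1, 2, 3 with an extra edge (3, 4), nodes 1, 2, and 3 each get a count of 1, node 4 gets 0, and the total is 1.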
Enumerating Subgraph Instances Using Map-Reduce
2012
Abstract

Cited by 15 (2 self)
The theme of this paper is how to find all instances of a given “sample” graph in a larger “data graph,” using a single round of map-reduce. For the simplest sample graph, the triangle, we improve upon the best known such algorithm. We then examine the general case, considering both the communication cost between mappers and reducers and the total computation cost at the reducers. To minimize communication cost, we exploit the techniques of [2] for computing multiway joins (evaluating conjunctive queries) in a single map-reduce round. Several methods are shown for translating sample graphs into a union of conjunctive queries with as few queries as possible. We also address the matter of optimizing computation cost. Many serial algorithms are shown to be “convertible,” in the sense that it is possible to partition the data graph, explore each partition in a separate reducer, and have the total work at the reducers be of the same order as the work of the serial algorithm. For data graphs of unrestricted degree, we show that there are convertible algorithms whose running time is of the same order as the lower bounds on the number of occurrences of the sample graph that were provided by [4]. We also offer better convertible algorithms when the degree of nodes in a data graph of m nodes is limited to √m.