Results

**1 - 5** of **5**

### Distributed-Memory Breadth-First Search on Massive Graphs

In this chapter, we study the problem of traversing large graphs. A traversal, a systematic method of exploring all the vertices and edges in a graph, can be done in many different orders. A traversal in “breadth-first” order, a breadth-first search (BFS), is important because it serves as a building block for many graph algorithms. Parallel graph algorithms increasingly rely on BFS for exploring all
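As a point of reference for the traversal order described in this abstract, here is a minimal serial sketch of a level-by-level BFS. It is not the chapter's distributed-memory algorithm; the function name `bfs_levels` and the dictionary-based adjacency format are assumptions made for illustration.

```python
def bfs_levels(adj, source):
    """Return {vertex: hop distance from source} for every reachable vertex.

    `adj` maps each vertex to an iterable of its neighbors (assumed format).
    """
    dist = {source: 0}
    frontier = [source]
    while frontier:
        next_frontier = []
        # Expand the whole current level before moving to the next one.
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in dist:  # first visit fixes the BFS level
                    dist[v] = dist[u] + 1
                    next_frontier.append(v)
        frontier = next_frontier
    return dist


# Small example graph: 0 -> {1, 2}, 1 -> 3, 2 -> 3, 3 -> 4
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
levels = bfs_levels(graph, 0)
```

Processing one frontier at a time is the structure that parallel variants usually build on, since the vertices within a level can be expanded concurrently.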

### Parallel Processing of Filtered Queries in Attributed Semantic Graphs

Execution of complex analytic queries on massive semantic graphs is a challenging problem in big-data analytics that requires high-performance parallel computing. In a semantic graph, vertices and edges carry attributes of various types, and analytic queries typically depend on the values of these attributes. Thus, the computation must view the graph through a filter that passes only those individual vertices and edges of interest. Previous investigations have developed the Knowledge Discovery Toolbox (KDT), a sophisticated Python library for parallel graph computations. In KDT, the user can write custom graph algorithms by specifying operations between edges and vertices (semiring operations). The user can also customize existing graph algorithms by writing filters. Although the high-level language for this customization enables domain scientists to productively express their graph analytics requirements, the customized queries perform poorly due to the overhead of having to call into the Python virtual machine for each vertex and edge. In this work, we use the Selective Embedded Just-In-Time Specialization (SEJITS) approach to automatically translate semiring operations and filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance Combinatorial BLAS engine and show that our approach combines the benefits of programming in a high-
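To make the idea of viewing a graph through a filter concrete, the sketch below runs a BFS that consults a user-supplied predicate on every edge it touches. This is plain Python and does not use KDT's actual API; `filtered_bfs`, the attribute dictionaries, and the `kind` field are hypothetical names chosen for illustration.

```python
def filtered_bfs(adj, source, edge_filter):
    """BFS that only follows edges whose attributes pass `edge_filter`.

    `adj` maps a vertex to a list of (neighbor, attributes) pairs (assumed
    format). Returns a parent map describing the filtered BFS tree.
    """
    parent = {source: source}
    frontier = [source]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, attrs in adj.get(u, ()):
                # The predicate is called once per examined edge; this
                # per-edge callback is the kind of overhead the abstract
                # attributes to calling into the Python virtual machine.
                if not edge_filter(attrs):
                    continue
                if v not in parent:
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent


# Hypothetical attributed graph: only "follows" edges should be traversed.
semantic_graph = {
    "a": [("b", {"kind": "follows"}), ("c", {"kind": "retweets"})],
    "b": [("d", {"kind": "follows"})],
    "c": [("d", {"kind": "retweets"})],
    "d": [],
}
tree = filtered_bfs(semantic_graph, "a", lambda attrs: attrs["kind"] == "follows")
```

Applying the filter during traversal, rather than materializing a filtered copy of the graph first, keeps the original graph intact at the cost of one predicate call per edge.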

### A Parallel Tree Grafting Algorithm for Maximum Cardinality Matching in Bipartite Graphs

In computing matchings in graphs on parallel processors, it is challenging to achieve high performance because these algorithms rely on searching for paths in the graph, and when these paths become long, there is little concurrency. We present a new algorithm and its shared-memory parallelization for computing maximum cardinality matchings in bipartite graphs. Our algorithm searches for augmenting paths via specialized breadth-first searches (BFS) from multiple source vertices, hence creating more parallelism than single-source algorithms. Unfortunately, algorithms that employ multiple-source searches cannot discard a search tree once no augmenting path is discovered from the tree, unlike algorithms that rely on single-source searches. We describe a novel tree-grafting method that eliminates most of the redundant edge traversals resulting from this property of multiple-source searches. We also employ the recent direction-optimizing BFS algorithm as a subroutine to discover augmenting paths faster. Our algorithm compares favorably with the current best algorithms in terms of the number of edges traversed, the average augmenting path length, and the number of iterations. We provide a proof of correctness for our algorithm. Our NUMA-aware implementation is scalable to 80 threads of an Intel multiprocessor. On average, our parallel algorithm runs an order of magnitude faster than the fastest algorithms available. The performance improvement is more significant on graphs with a small matching number.
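For readers unfamiliar with augmenting paths, the following sketch shows the simpler single-source baseline that the abstract contrasts against: one BFS per unmatched left vertex, followed by flipping the edges along the path found. It is not the paper's multiple-source tree-grafting algorithm, and the function name and list-based graph format are assumptions.

```python
from collections import deque


def max_bipartite_matching(adj, n_left, n_right):
    """Single-source BFS augmenting-path matching (baseline sketch).

    `adj[u]` lists the right-side neighbors of left vertex u.
    Returns match_l, where match_l[u] is u's partner or -1 if unmatched.
    """
    match_l = [-1] * n_left
    match_r = [-1] * n_right
    for s in range(n_left):
        parent = {}          # right vertex -> left vertex that reached it
        queue = deque([s])
        end = -1             # unmatched right vertex ending the path
        while queue and end == -1:
            u = queue.popleft()
            for v in adj[u]:
                if v in parent:
                    continue
                parent[v] = u
                if match_r[v] == -1:
                    end = v  # augmenting path found
                    break
                # Otherwise continue the search from v's current partner.
                queue.append(match_r[v])
        # Flip matched and unmatched edges along the path back to s.
        v = end
        while v != -1:
            u = parent[v]
            previous = match_l[u]
            match_l[u], match_r[v] = v, u
            v = previous
    return match_l


# Left vertices 0..2, right vertices 0..2.
matching = max_bipartite_matching([[0, 1], [0], [1, 2]], 3, 3)
```

Each search here starts from a single source, which is why long augmenting paths leave little work to share among threads; the abstract's multiple-source searches aim to expose more of it.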

### Parallel Distributed Breadth First Search on the Kepler Architecture

We present the results obtained by using an evolution of our CUDA-based solution for the exploration, via a Breadth First Search, of large graphs. This latest version exploits the features of the Kepler architecture to the fullest and relies on a 2D decomposition of the adjacency matrix to reduce the number of communications among the GPUs. The final result is a code that can visit 400 billion edges per second on a cluster equipped with 4096 Tesla K20X GPUs.
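The 2D decomposition mentioned in this abstract can be illustrated with a small host-side sketch that assigns each edge (u, v), that is, each nonzero of the adjacency matrix, to a block of a processor grid. This is only a plain-Python illustration under assumed names (`block_owner`, `partition_edges`) and an assumed uniform block layout; it is not the paper's CUDA code.

```python
import math


def block_owner(u, v, n, rows, cols):
    """Grid coordinates (i, j) of the block holding adjacency entry (u, v).

    The n x n adjacency matrix is cut into `rows` x `cols` blocks of
    near-equal size (an assumed layout).
    """
    row_block = math.ceil(n / rows)
    col_block = math.ceil(n / cols)
    return (u // row_block, v // col_block)


def partition_edges(edges, n, rows, cols):
    """Group the edge list by owning block of the processor grid."""
    blocks = {(i, j): [] for i in range(rows) for j in range(cols)}
    for u, v in edges:
        blocks[block_owner(u, v, n, rows, cols)].append((u, v))
    return blocks


# 8 vertices on a 2 x 2 grid: each block covers a 4 x 4 submatrix.
blocks = partition_edges([(0, 1), (0, 5), (6, 2), (7, 7)], n=8, rows=2, cols=2)
```

With such a layout, the data a process needs during a BFS step is typically held by processes in the same grid row or column, rather than by all processes, which is the usual motivation for preferring 2D over 1D partitioning.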

### Branch-Avoiding Graph Algorithms
