The Combinatorial BLAS: Design, Implementation, and Applications
2010
Cited by 58 (10 self)
This paper presents a scalable high-performance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse-grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the Parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extendible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and ease-of-use. The scalability and raw performance of the example applications, using the Combinatorial BLAS, are unprecedented on distributed memory clusters.
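To illustrate what a linear algebra primitive for graphs can look like, the following minimal Python sketch expresses breadth-first search as repeated sparse matrix-vector products over a Boolean semiring. It is not the Combinatorial BLAS API: the function name `bfs_levels` and the dict-of-sets adjacency format are assumptions made purely for illustration.

```python
def bfs_levels(adj, source):
    """Breadth-first search written as repeated sparse matrix-vector products.

    adj is a hypothetical sparse adjacency structure: a dict mapping each
    vertex to the set of its out-neighbours (one sparse row per vertex).
    Each loop iteration computes next = A^T * frontier over the Boolean
    (OR, AND) semiring, masked by the set of already visited vertices.
    """
    levels = {source: 0}
    frontier = {source}          # sparse Boolean vector of the current level
    depth = 0
    while frontier:
        depth += 1
        next_frontier = set()
        for u in frontier:       # nonzeros of the input vector
            for v in adj.get(u, ()):   # nonzeros of row u
                if v not in levels:    # mask: skip visited vertices
                    next_frontier.add(v)
        for v in next_frontier:
            levels[v] = depth
        frontier = next_frontier
    return levels


if __name__ == "__main__":
    graph = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}
    print(bfs_levels(graph, 0))
```

Each level of the search touches only the nonzeros of the current frontier, which is the kind of coarse-grained, data-parallel step the abstract refers to.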
A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets
2009
Cited by 53 (10 self)
We present a new lock-free parallel algorithm for computing betweenness centrality of massive complex networks that achieves better spatial locality compared with previous approaches. Betweenness centrality is a key kernel in analyzing the importance of vertices (or edges) in applications ranging from social networks, to power grids, to the influence of jazz musicians, and is also incorporated into the DARPA HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph analytics. We design an optimized implementation of betweenness centrality for the massively multithreaded Cray XMT system with the Threadstorm processor. For a small-world network of 268 million vertices and 2.147 billion edges, the 16-processor XMT system achieves a TEPS rate (an algorithmic performance count for the number of edges traversed per second) of 160 million per second, which corresponds to more than a 2× performance improvement over the previous parallel implementation. We demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for the large IMDb movie-actor network.
Ligra: A Lightweight Graph Processing Framework for Shared Memory
Cited by 40 (3 self)
There has been significant recent interest in parallel frameworks for processing graphs due to their applicability in studying social networks, the Web graph, networks in biology, and unstructured meshes in scientific simulation. Due to the desire to process large graphs, these systems have emphasized the ability to run on distributed memory machines. Today, however, a single multicore server can support more than a terabyte of memory, which can fit graphs with tens or even hundreds of billions of edges. Furthermore, for graph algorithms, shared-memory multicores are generally significantly more efficient on a per core, per dollar, and per joule basis than distributed memory systems, and shared-memory algorithms tend to be simpler than their distributed counterparts. In this paper, we present a lightweight graph processing framework that is specific for shared-memory parallel/multicore machines,
Visualization of social and other scale-free networks
In Proc. of IEEE InfoVis, 2008
Cited by 26 (1 self)
This paper proposes novel methods for visualizing specifically the large power-law graphs that arise in sociology and the sciences. In such cases a large portion of edges can be shown to be less important and removed while preserving component connectedness and other features (e.g. cliques) to more clearly reveal the network’s underlying connection pathways. This simplification approach deterministically filters (instead of clustering) the graph to retain important node and edge semantics, and works both automatically and interactively. The improved graph filtering and layout is combined with a novel computer graphics anisotropic shading of the dense criss-crossing array of edges to yield a full social network and scale-free graph visualization system. Both quantitative analysis and visual results demonstrate the effectiveness of this approach.
Massive Social Network Analysis: Mining Twitter for Social Good
39th International Conference on Parallel Processing, 2010
Cited by 25 (4 self)
Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) 537 million vertex, 8.6 billion edge graph in 55 minutes and a real-world graph (Kwak et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter’s message connections appear primarily tree-structured as a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
A Flexible Open-Source Toolbox for Scalable Complex Graph Analysis
2011
Cited by 21 (3 self)
The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a high-level language without grappling with the difficulties of writing parallel code, calling parallel libraries, or becoming a graph expert. KDT provides a flexible Python interface to a small set of high-level graph operations; composing a few of these operations is often sufficient for a specific analysis. Scalability and performance are delivered by linking to a state-of-the-art back-end compute engine that scales from laptops to large HPC clusters. KDT delivers very competitive performance from a general-purpose, reusable library for graphs on the order of 10 billion edges and greater. We demonstrate speedup of 1 and 2 orders of magnitude over PBGL and Pegasus, respectively, on some tasks. Examples from simple use cases and key graph-analytic benchmarks illustrate the productivity and performance realized by KDT users. Semantic graph abstractions provide both flexibility and high performance for real-world use cases. Graph-algorithm researchers benefit from the ability to develop algorithms quickly using KDT’s graph and underlying matrix abstractions for distributed memory. KDT is available as open-source code to foster experimentation.
QUBE: a Quick algorithm for Updating BEtweenness centrality
2012
Cited by 19 (0 self)
The betweenness centrality of a vertex in a graph is a measure of the participation of the vertex in the shortest paths in the graph. Betweenness centrality is widely used in network analyses. In social networks especially, the recursive computation of the betweenness centralities of vertices is performed for community detection and for finding influential users in the network. Since a social network graph is frequently updated, it is necessary to update the betweenness centrality efficiently. When a graph is changed, the betweenness centralities of all the vertices should be recomputed from scratch using all the vertices in the graph. To the best of our knowledge, this is the first work that proposes an efficient algorithm which handles the update of the betweenness centralities of vertices in a graph. In
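For reference, the "recompute from scratch" baseline that the abstract contrasts with can be sketched with Brandes' well-known sequential algorithm for unweighted graphs. This is not the QUBE update algorithm, which the truncated abstract does not describe; the function name and the dict-based graph format are assumptions for illustration.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted graphs (full recomputation).

    adj is a hypothetical input format: a dict mapping every vertex to an
    iterable of its neighbours. For an undirected graph, list each edge in
    both directions; every vertex pair is then counted twice, so halve the
    scores if the undirected convention is wanted.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []                      # vertices in non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:       # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


if __name__ == "__main__":
    path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(betweenness_centrality(path))
```

Every edge insertion or deletion would require rerunning the outer loop over all source vertices, which is the cost an incremental update algorithm aims to avoid.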
SNAP, Small-world Network Analysis and Partitioning: an open-source parallel graph framework for the exploration of large-scale networks
Cited by 16 (0 self)
We present SNAP (Small-world Network Analysis and Partitioning), an open-source graph framework for exploratory study and partitioning of large-scale networks. To illustrate the capability of SNAP, we discuss the design, implementation, and performance of three novel parallel community detection algorithms that optimize modularity, a popular measure for clustering quality in social network analysis. In order to achieve scalable parallel performance, we exploit typical network characteristics of small-world networks, such as the low graph diameter, sparse connectivity, and skewed degree distribution. We conduct an extensive experimental study on real-world graph instances and demonstrate that our parallel schemes, coupled with aggressive algorithm engineering for small-world networks, give significant running time improvements over existing modularity-based clustering heuristics, with little or no loss in clustering quality. For instance, our divisive clustering approach based on approximate edge betweenness centrality is more than two orders of magnitude faster than a competing greedy approach, for a variety of large graph instances on the Sun Fire T2000 multicore system. SNAP also contains parallel implementations of fundamental graph-theoretic kernels and topological analysis metrics (e.g., breadth-first search, connected components, vertex and edge centrality) that are optimized for small-world networks. The SNAP framework is extensible; the graph kernels are modular, portable across shared memory multicore and symmetric multiprocessor systems, and simplify the design of high-level domain-specific applications.
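As a point of reference for the modularity measure mentioned above, the following minimal Python sketch computes Newman's modularity Q for an undirected, unweighted graph and a given community assignment. It is not code from SNAP; the function name, the edge-list input, and the `community` mapping are assumptions for illustration.

```python
def modularity(edges, community):
    """Newman modularity Q = sum over communities c of
    (intra_c / m) - (degree_c / (2 m))^2.

    edges: list of undirected edges (u, v), each listed once (hypothetical
    input format). community: dict mapping each vertex to a community label.
    """
    m = len(edges)
    intra = {}     # number of edges with both endpoints in community c
    degree = {}    # total degree of the vertices in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


if __name__ == "__main__":
    edges = [(0, 1), (2, 3)]
    print(modularity(edges, {0: "a", 1: "a", 2: "b", 3: "b"}))
```

A clustering heuristic that optimizes modularity repeatedly evaluates how Q changes as vertices or edges are moved between communities; the sketch only evaluates Q for a fixed assignment.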
A space-efficient parallel algorithm for computing betweenness centrality in distributed memory
In Proc. Int’l. Conf. on High Performance Computing (HiPC 2010), 2010
Cited by 14 (0 self)
Betweenness centrality is a measure based on shortest paths that attempts to quantify the relative importance of nodes in a network. As computation of betweenness centrality becomes increasingly important in areas such as social network analysis, networks of interest are becoming too large to fit in the memory of a single processing unit, making parallel execution a necessity. Parallelization over the vertex set of the standard algorithm, with a final reduction of the centrality for each vertex, is straightforward but requires Ω(|V|^2) storage. In this paper we present a new parallelizable algorithm with low spatial complexity that is based on the best known sequential algorithm. Our algorithm requires O(|V| + |E|) storage and enables efficient parallel execution. Our algorithm is especially well suited to distributed memory processing because it can be implemented using coarse-grained parallelism. The presented time bounds for parallel execution of our algorithm on CRCW PRAM and on distributed memory systems both show good asymptotic performance. Experimental results with a distributed memory computer show the practical applicability of our algorithm.