Optimization of Sparse Matrix-vector Multiplication on Emerging Multicore Platforms
- In Proc. SC2007: High performance computing, networking, and storage conference
, 2007
Cited by 54 (15 self)
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore-specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) – one of the most heavily used kernels in scientific computing – across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientific study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.
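For orientation, the SpMV kernel named in this abstract computes y = Ax for a sparse A. The following is a minimal reference sketch, assuming compressed sparse row (CSR) storage; the abstract does not name a storage format, and the function and variable names here are illustrative, not taken from the paper. It shows only the unoptimized serial loop, not any of the multicore optimizations the paper studies.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]].
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
y = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

The indirect access `x[col_idx[k]]` performs one multiply-add per loaded matrix entry, which is why the kernel is typically limited by memory traffic rather than arithmetic.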
Graph Sandwich Problems
, 1994
Cited by 45 (8 self)
The graph sandwich problem for property $\Pi$ is defined as follows: Given two graphs $G^1 = (V, E^1)$ and $G^2 = (V, E^2)$ such that $E^1 \subseteq E^2$, is there a graph $G = (V, E)$ such that $E^1 \subseteq E \subseteq E^2$ which satisfies property $\Pi$? Such problems generalize recognition problems and arise in various applications. Concentrating mainly on properties characterizing subfamilies of perfect graphs, we give polynomial algorithms for several properties and prove the NP-completeness of others. We describe
Robust Ordering of Sparse Matrices using Multisection
- Department of Computer Science, York University
, 1996
Cited by 44 (2 self)
In this paper we provide a robust reordering scheme for sparse matrices. The scheme relies on the notion of multisection, a generalization of bisection. The reordering strategy is demonstrated to have consistently good performance in terms of fill reduction when compared with multiple minimum degree and generalized nested dissection. Experimental results show that by using multisection, we obtain an ordering which is consistently as good as or better than both for a wide spectrum of sparse problems.
1 Introduction
It is well recognized that finding a fill-reducing ordering is crucial in the success of the numerical solution of sparse linear systems. For symmetric positive-definite systems, the minimum degree [38] and the nested dissection [11] orderings are perhaps the most popular ordering schemes. They represent two opposite approaches to the ordering problem. However, they share a common undesirable characteristic. Both schemes produce generally good orderings, but the ordering qua...
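To make "fill" concrete, here is a small sketch of the standard elimination-game model of symmetric factorization: eliminating a vertex connects all of its remaining neighbours, and each edge added this way is a fill entry. This is a generic illustration, not the multisection scheme of the paper; the function name and the star-graph example are assumptions chosen for clarity.

```python
def fill_edges(adjacency, order):
    """Return the set of fill edges created by eliminating vertices in `order`."""
    # Work on a copy so the caller's graph is not modified.
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = set()
    for v in order:
        nbrs = list(adj[v])
        # Eliminating v makes its remaining neighbours pairwise adjacent.
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill.add(frozenset((a, b)))
        # Remove v from the remaining graph.
        for u in nbrs:
            adj[u].discard(v)
        del adj[v]
    return fill


# Star graph: vertex 0 is the hub, vertices 1..4 are leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
hub_first = fill_edges(star, [0, 1, 2, 3, 4])
hub_last = fill_edges(star, [1, 2, 3, 4, 0])
```

Eliminating the hub first joins every pair of leaves, while eliminating it last creates no fill at all, which is the kind of difference a fill-reducing ordering is meant to exploit.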
An Algorithm for Coarsening Unstructured Meshes
- Numer. Math
, 1996
Cited by 40 (5 self)
We develop and analyze a procedure for creating a hierarchical basis of continuous piecewise linear polynomials on an arbitrary, unstructured, nonuniform triangular mesh. Using these hierarchical basis functions, we are able to define and analyze corresponding iterative methods for solving the linear systems arising from finite element discretizations of elliptic partial differential equations. We show that such iterative methods perform as well as those developed for the usual case of structured, locally refined meshes. In particular, we show that the generalized condition numbers for such iterative methods are of order $J^2$, where $J$ is the number of hierarchical basis levels.
Key words. Finite element, hierarchical basis, multigrid, unstructured mesh.
AMS subject classifications. 65F10, 65N20
1. Introduction. Iterative methods using the hierarchical basis decomposition have proved to be among the most robust for solving broad classes of elliptic partial differential equations, ...
Highly Parallel Sparse Cholesky Factorization
- SIAM Journal on Scientific and Statistical Computing
, 1992
Cited by 36 (1 self)
We develop and compare several fine-grained parallel algorithms to compute the Cholesky factorization of a sparse matrix. Our experimental implementations are on the Connection Machine, a distributed-memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special-purpose algorithms in which the matrix structure conforms to the connection structure of the machine, our focus is on matrices with arbitrary sparsity structure.
Predicting Structure In Sparse Matrix Computations
- SIAM J. Matrix Anal. Appl
, 1994
Cited by 34 (4 self)
Many sparse matrix algorithms (for example, solving a sparse system of linear equations) begin by predicting the nonzero structure of the output of a matrix computation from the nonzero structure of its input. This paper is a catalog of ways to predict nonzero structure. It contains known results for problems including various matrix factorizations, and new results for problems including some eigenvector computations.
Key words. sparse matrix algorithms, graph theory, matrix factorization, systems of linear equations, eigenvectors
AMS(MOS) subject classifications. 15A18, 15A23, 65F50, 68R10
1. Introduction. A sparse matrix algorithm is an algorithm that performs a matrix computation in such a way as to take advantage of the zero/nonzero structure of the matrices involved. Usually this means not explicitly storing or manipulating some or all of the zero elements; sometimes sparsity can also be exploited to work on different parts of a matrix problem in parallel. Large sparse matr...
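As a simple instance of structure prediction, the sketch below derives the nonzero pattern of a product C = AB from the patterns of A and B alone. It is an illustrative example, not a result taken from the paper; the row-set representation and the function name are assumptions, and the prediction is an upper bound that is exact only if no numerical cancellation occurs.

```python
def product_structure(a_rows, b_rows):
    """Predict the nonzero column set of each row of C = A @ B.

    a_rows[i] is the set of column indices holding nonzeros in row i of A,
    and b_rows[k] likewise for B. Assumes no exact numerical cancellation.
    """
    c_rows = []
    for cols in a_rows:
        pattern = set()
        # Row i of C is a combination of the rows of B selected by row i of A.
        for k in cols:
            pattern |= b_rows[k]
        c_rows.append(pattern)
    return c_rows


# Patterns only; no numerical values are needed.
a = [{0, 2}, {1}, set()]
b = [{1}, {0, 2}, {2}]
c = product_structure(a, b)
```

Knowing the pattern in advance lets an implementation allocate storage for C once, before any floating-point work is done.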
Complexity classification of some edge modification problems
, 2001
Cited by 33 (2 self)
In an edge modification problem one has to change the edge set of a given graph as little as possible so as to satisfy a certain property. We prove the NP-hardness of a variety of edge modification problems with respect to some well-studied classes of graphs. These include perfect, chordal, chain, comparability, split and asteroidal triple free. We show that some of these problems become polynomial when the input graph has bounded degree. We also give a general constant factor approximation algorithm for deletion and editing problems on bounded degree graphs with respect to properties that can be characterized by a finite set of forbidden induced subgraphs.
Tractability of Parameterized Completion Problems on Chordal, Strongly Chordal and Proper Interval Graphs
, 1994
Cited by 33 (5 self)
We study the parameterized complexity of three NP-hard graph completion problems. The MINIMUM FILL-IN problem is to decide if a graph can be triangulated by adding at most $k$ edges. We develop $O(c^k m)$ and $O(k^2 mn + f(k))$ algorithms for this problem on a graph with $n$ vertices and $m$ edges. Here $f(k)$ is exponential in $k$ and the constants hidden by the big-O notation are small and do not depend on $k$. In particular, this implies that the problem is fixed-parameter tractable (FPT). The PROPER
The Hierarchical Basis Multigrid Method And Incomplete LU Decomposition
- In Seventh International Symposium on Domain Decomposition Methods for Partial Differential Equations
, 1994
Cited by 27 (7 self)
A new multigrid or incomplete LU technique is developed in this paper for solving large sparse algebraic systems from discretizing partial differential equations. By exploring some deep connection between the hierarchical basis method and incomplete LU decomposition, the resulting algorithm can be effectively applied to problems discretized on completely unstructured grids. Numerical experiments demonstrating the efficiency of the method are also reported.
Key words. Finite element, hierarchical basis, multigrid, incomplete LU.
AMS(MOS) subject classifications. 65F10, 65N20
1. Introduction. In this work, we explore the connection between the methods of sparse Gaussian elimination [8][13], incomplete LU (ILU) decomposition [9][10] and the hierarchical basis multigrid (HBMG) [16][4]. Hierarchical basis methods have proved to be one of the more robust classes of methods for solving broad classes of elliptic partial differential equations, especially the large systems arising in conju...
Subexponential Parameterized Algorithms on Graphs of Bounded Genus and H-Minor-Free Graphs
, 2003
Cited by 27 (9 self)
We introduce a new framework for designing fixed-parameter algorithms with subexponential running time, $2^{O(\sqrt{k})} n^{O(1)}$. Our results apply to a broad family of graph problems, called bidimensional problems, which includes many domination and covering problems such as vertex cover, feedback vertex set, minimum maximal matching, dominating set, edge dominating set, clique-transversal set, and many others restricted to bounded genus graphs. Furthermore, it is fairly straightforward to prove that a problem is bidimensional. In particular, our framework includes as special cases all previously known problems to have such subexponential algorithms. Previously, these algorithms applied to planar graphs, single-crossing-minor-free graphs, and/or map graphs; we extend these results to apply to bounded-genus graphs as well. In a parallel development of combinatorial results, we establish an upper bound on the treewidth (or branchwidth) of a bounded-genus graph that excludes some planar graph $H$ as a minor. This bound depends linearly on the size $|V(H)| + |E(H)|$ of the excluded graph $H$ and the genus $g(G)$ of the graph $G$, and applies and extends the graph-minors work of Robertson and Seymour. Building on these results...

