Results 1 - 10 of 50
ILUM: A Multi-Elimination ILU Preconditioner For General Sparse Matrices
- SIAM J. Sci. Comput., 1999 - Cited by 49 (9 self)
Standard preconditioning techniques based on incomplete LU (ILU) factorizations offer a limited degree of parallelism, in general. A few of the alternatives advocated so far consist of either using some form of polynomial preconditioning, or applying the usual ILU factorization to a matrix obtained from a multicolor ordering. In this paper we present an incomplete factorization technique based on independent set orderings and multicoloring. We note that in order to improve robustness, it is necessary to allow the preconditioner to have an arbitrarily high accuracy, as is done with ILUs based on threshold techniques. The ILUM factorization described in this paper is in this category. It can be viewed as a multifrontal version of a Gaussian elimination procedure with threshold dropping, which has a high degree of potential parallelism. The emphasis is on methods that deal specifically with general unstructured sparse matrices such as those arising from finite element methods on un...
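The independent set ordering mentioned in this abstract can be illustrated with a small sketch. It is not the ILUM algorithm itself: it only shows, under the assumption of a symmetric sparsity pattern given as an adjacency dictionary, how a greedy independent set can be found and listed first, so that the leading block of the permuted matrix is diagonal and its unknowns can be eliminated independently. The function names are hypothetical.

```python
def greedy_independent_set(adj):
    """Return a maximal independent set of the graph `adj`.

    `adj` maps each vertex to the set of its neighbours (symmetric
    pattern, no self-loops). Vertices are visited in sorted order.
    """
    chosen = []
    blocked = set()
    for v in sorted(adj):
        if v not in blocked:
            chosen.append(v)
            blocked.add(v)          # v itself is now taken
            blocked.update(adj[v])  # neighbours can no longer be chosen
    return chosen


def independent_set_ordering(adj):
    """Permutation that lists an independent set first, then the rest."""
    independent = greedy_independent_set(adj)
    taken = set(independent)
    rest = [v for v in sorted(adj) if v not in taken]
    return independent + rest, len(independent)


# Pattern of a 5x5 tridiagonal matrix: unknown i couples with i-1 and i+1.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
order, block_size = independent_set_ordering(path)
print(order, block_size)  # [0, 2, 4, 1, 3] 3
```

Applying the same step again to the reduced system for the remaining unknowns would give a multilevel ordering; dropping strategies are outside the scope of this sketch.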
BoomerAMG: a Parallel Algebraic Multigrid Solver and Preconditioner
- Applied Numerical Mathematics, 2000 - Cited by 39 (3 self)
Driven by the need to solve linear systems arising from problems posed on extremely large, unstructured grids, there has been a recent resurgence of interest in algebraic multigrid (AMG). AMG is attractive in that it holds out the possibility of multigrid-like performance on unstructured grids. The sheer size of many modern physics and simulation problems has led to the development of massively parallel computers, and has sparked much research into developing algorithms for them. Parallelizing AMG is a difficult task, however. While much of the AMG method parallelizes readily, the process of coarse-grid selection, in particular, is fundamentally sequential in nature. We have previously introduced a parallel algorithm [7] for the selection of coarse-grid points, based on modifications of certain parallel independent set algorithms and the application of heuristics designed to ensure the quality of the coarse grids, and shown results from a prototype serial version of the algorithm. In this pa...
Parallel Optimisation Algorithms for Multilevel Mesh Partitioning
- Parallel Comput., 2000 - Cited by 37 (14 self)
Three parallel optimisation algorithms, for use in the context of multilevel graph partitioning of unstructured meshes, are described. The first, interface optimisation, reduces the computation to a set of independent optimisation problems in interface regions. The next, alternating optimisation, is a restriction of this technique in which mesh entities are only allowed to migrate between subdomains in one direction. The third treats the gain as a potential field and uses the concept of relative gain for selecting appropriate vertices to migrate. The results are compared and seen to produce partitions of very high global quality, very rapidly. The results are also compared with another partitioning tool and shown to be of higher quality, although taking longer to compute. © 2000 Elsevier Science B.V. All rights reserved.
What color is your Jacobian? Graph coloring for computing derivatives
- SIAM Rev., 2005 - Cited by 36 (7 self)
Graph coloring has been employed since the 1980s to efficiently compute sparse Jacobian and Hessian matrices using either finite differences or automatic differentiation. Several coloring problems occur in this context, depending on whether the matrix is a Jacobian or a Hessian, and on the specifics of the computational techniques employed. We consider eight variant vertex coloring problems here. This article begins with a gentle introduction to the problem of computing a sparse Jacobian, followed by an overview of the historical development of the research area. Then we present a unifying framework for the graph models of the variant matrix estimation problems. The framework is based upon the viewpoint that a partition of a matrix into structurally orthogonal groups of columns corresponds to distance-2 coloring an appropriate graph representation. The unified framework helps integrate earlier work and leads to fresh insights; enables the design of more efficient algorithms for many problems; leads to new algorithms for others; and eases the task of building graph models for new problems. We report computational results on two of the coloring problems to support our claims. Most of the methods for these problems treat a column or a row of a matrix as an atomic entity, and partition the columns or rows (or both). A brief review of methods that do not fit these criteria is provided. We also discuss results in discrete mathematics and theoretical computer science that intersect with the topics considered here.
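The notion of structurally orthogonal column groups in this abstract can be made concrete with a small sketch. Two columns are structurally orthogonal when no row holds a nonzero in both, so all columns of one group can be estimated from a single finite-difference evaluation. The greedy first-fit grouping below is one simple heuristic chosen for illustration, not a method taken from the article; the function name and the input layout (one set of nonzero column indices per row) are assumptions.

```python
def group_columns(pattern, n_cols):
    """Greedily partition columns into structurally orthogonal groups.

    `pattern[i]` is the set of column indices with a nonzero in row i.
    Returns `group_of`, where `group_of[j]` is the group index of column j.
    """
    # rows_of[j] = set of rows in which column j has a nonzero
    rows_of = [set() for _ in range(n_cols)]
    for i, cols in enumerate(pattern):
        for j in cols:
            rows_of[j].add(i)

    group_rows = []            # rows already covered by each group
    group_of = [None] * n_cols
    for j in range(n_cols):
        for g, covered in enumerate(group_rows):
            if covered.isdisjoint(rows_of[j]):  # no shared row: orthogonal
                covered |= rows_of[j]
                group_of[j] = g
                break
        else:
            # column j conflicts with every existing group: open a new one
            group_rows.append(set(rows_of[j]))
            group_of[j] = len(group_rows) - 1
    return group_of


# Sparsity pattern of a 5x5 tridiagonal Jacobian.
n = 5
tridiagonal = [{j for j in (i - 1, i, i + 1) if 0 <= j < n} for i in range(n)]
print(group_columns(tridiagonal, n))  # [0, 1, 2, 0, 1]
```

Here three groups suffice for five columns, so three function evaluations (plus the base point) would recover every nonzero of the tridiagonal Jacobian by forward differences.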
hypre: a Library of High Performance Preconditioners
- Lecture Notes in Computer Science, 2002 - Cited by 32 (1 self)
hypre is a software library for the solution of large, sparse linear systems on massively parallel computers. Its emphasis is on modern, powerful, and scalable preconditioners. hypre provides various conceptual interfaces to enable application users to access the library in the way they naturally think about their problems. This paper presents the conceptual interfaces in hypre. An overview of the preconditioners that are available in hypre is given, including some numerical results that show the efficiency of the library.
Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries
- Modern Software Tools in Scientific Computing, 1997 - Cited by 29 (0 self)
Parallel numerical software based on the message-passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. In fact, the programming model used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability. 1 Introduction Currently the only general-purpose, efficient, scalable approach to programming distributed-memory parallel systems is the message-pass...
Parallel Algorithms for the Adaptive Refinement and Partitioning of Unstructured Meshes
- In Proceedings of the Scalable High-Performance Computing Conference, 1997 - Cited by 26 (1 self)
The efficient solution of many large-scale scientific calculations depends on adaptive mesh strategies. In this paper we present new parallel algorithms to solve two significant problems that arise in this context: the generation of the adaptive mesh and the mesh partitioning. The crux of our refinement algorithm is the identification of independent sets of elements that can be refined in parallel. The objective of our partitioning heuristic is to construct partitions with good aspect ratios. We present run-time bounds and computational results obtained on the Intel DELTA for these algorithms. These results demonstrate that the algorithms exhibit scalable performance and have run-times small in comparison with other aspects of the computation. 1 Introduction Adaptive mesh refinement techniques have been shown to be very successful in reducing the computation and storage requirements for determining approximate solutions to many partial differential equations (PDEs) [9]. Rather than us...
A Comparison of Parallel Graph Coloring Algorithms
- 1995 - Cited by 25 (0 self)
Dynamic irregular triangulated meshes are used in adaptive grid partial differential equation (PDE) solvers, and in simulations of random surface models of quantum gravity in physics and cell membranes in biology. Parallel algorithms for random surface simulations and adaptive grid PDE solvers require coloring of the triangulated mesh, so that neighboring vertices are not updated simultaneously. Graph coloring is also used in iterative parallel algorithms for solving large irregular sparse matrix equations. Here we introduce some parallel graph coloring algorithms based on well-known sequential heuristic algorithms, and compare them with some existing parallel algorithms. These algorithms are implemented on both SIMD and MIMD parallel architectures and tested for speed, efficiency, and quality (the average number of colors required) for coloring random triangulated meshes and graphs from sparse matrix problems. 1 Introduction Many simulations in computational science discretize the ...
Scalable Parallel Graph Coloring Algorithms
- 2000 - Cited by 19 (7 self)
Finding a good graph coloring quickly is often a crucial phase in the development of efficient, parallel algorithms for many scientific and engineering applications. In this paper we consider the problem of solving the graph coloring problem itself in parallel. We present a simple and fast parallel graph coloring heuristic that is well suited for shared memory programming and yields an almost linear speedup on the PRAM model. We also present a second heuristic that improves on the number of colors used. The heuristics have been implemented using OpenMP. Experiments conducted on an SGI Cray Origin 2000 supercomputer using very large graphs from finite element methods and eigenvalue computations validate the theoretical run-time analysis.
Parallel Heuristics for Improved, Balanced Graph Colorings
- Journal of Parallel and Distributed Computing, 1996 - Cited by 18 (2 self)
The computation of good, balanced graph colorings is an essential part of many algorithms required in scientific and engineering applications. Motivated by an effective sequential heuristic, we introduce a new parallel heuristic, PLF, and show that this heuristic has the same expected runtime under the PRAM computational model as the scalable coloring heuristic introduced by Jones and Plassmann (JP). We present experimental results performed on the Intel DELTA that demonstrate that this new heuristic consistently generates better colorings and requires only slightly more time than the JP heuristic. In the second part of the paper we introduce two new parallel color-balancing heuristics, PDR(k) and PLF(k). We show that these heuristics have the desirable property that they do not increase the number of colors used by an initial coloring during the balancing process. We present experimental results that show that these heuristics are very effective in obtaining balanced colorings and, ...
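For readers unfamiliar with the JP heuristic named in this abstract, the following is a sequential simulation of a Jones-Plassmann-style coloring as it is commonly described: every vertex draws a random weight, and in each round the uncolored vertices that outweigh all of their uncolored neighbours form an independent set and take the smallest color not used by a neighbour. This is a sketch under those assumptions, not the PLF heuristic of the paper; the function name and the tie-breaking by vertex id are hypothetical choices.

```python
import random


def jones_plassmann_style_coloring(adj, seed=0):
    """Round-based coloring of the graph `adj` (vertex -> set of neighbours).

    Each round stands in for one parallel step: all `winners` could be
    colored simultaneously because no two of them are adjacent.
    """
    rng = random.Random(seed)
    # Random weight per vertex; the vertex id breaks ties deterministically.
    weight = {v: (rng.random(), v) for v in adj}
    color = {}
    uncolored = set(adj)
    while uncolored:
        # Local maxima among still-uncolored neighbours form an independent set.
        winners = [
            v for v in uncolored
            if all(weight[v] > weight[u] for u in adj[v] if u in uncolored)
        ]
        for v in winners:
            used = {color[u] for u in adj[v] if u in color}
            c = 0
            while c in used:  # smallest color absent from the neighbourhood
                c += 1
            color[v] = c
        uncolored.difference_update(winners)
    return color


# A 5-cycle: maximum degree 2, so at most 3 colors are ever needed.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = jones_plassmann_style_coloring(cycle)
print(coloring)
```

The loop always terminates because the heaviest uncolored vertex wins every round. The number of rounds, and hence the parallel depth, depends on the random weights.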

