Results 1-10 of 127
PETSc users manual
ANL-95/11, Revision 2.1.0, Argonne National Laboratory, 2001
Cited by 278 (20 self)
This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication. PETSc includes an expanding suite of parallel linear, nonlinear equation solvers and time integrators that may be used in application codes written in Fortran, C, and C++. PETSc provides many of the mechanisms needed within parallel application codes, such as parallel matrix and vector assembly routines. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper ...
Jacobian-free Newton-Krylov methods: a survey of approaches and applications
J. Comput. Phys.
Cited by 192 (6 self)
Jacobian-free Newton-Krylov (JFNK) methods are synergistic combinations of Newton-type methods for superlinearly convergent solution of nonlinear equations and Krylov subspace methods for solving the Newton correction equations. The link between the two methods is the Jacobian-vector product, which may be probed approximately without forming and storing the elements of the true Jacobian, through a variety of means. Various approximations to the Jacobian matrix may still be required for preconditioning the resulting Krylov iteration. As with Krylov methods for linear problems, successful application of the JFNK method to any given problem is dependent on adequate preconditioning. JFNK has potential for application throughout problems governed by nonlinear partial differential equations and integro-differential equations. In this survey article we place JFNK in context with other nonlinear solution algorithms for both boundary value problems (BVPs) and initial value problems (IVPs). We provide an overview of the mechanics of JFNK and attempt to illustrate the wide variety of preconditioning options available. It is emphasized that JFNK can be wrapped (as an accelerator) around another nonlinear fixed point method (interpreted as a preconditioning process, potentially with significant code reuse). The aim of this article is not to trace fully the evolution of JFNK, nor to provide proofs of accuracy or optimal convergence for all of the constituent methods, but rather to present the reader with a perspective on how JFNK may be applicable to problems of physical interest and to provide sources of further practical information. A review paper solicited by the Editor-in-Chief of the Journal of Computational ...
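The Jacobian-vector product mentioned in this abstract is commonly probed with a forward difference of the residual, J(u)v ≈ (F(u + εv) − F(u))/ε. The following is a minimal sketch of that idea in plain Python, not code from the survey. The function name `jacobian_vector_product`, the test residual `F`, and the rule used to pick ε are all assumptions made for illustration.

```python
import math
import sys

def jacobian_vector_product(F, u, v, eps=None):
    """Approximate J(u) v by a forward difference of the residual F.

    The Jacobian is never formed; only two residual evaluations are used.
    """
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    if eps is None:
        # Assumed heuristic: scale sqrt(machine epsilon) by the sizes of u and v.
        norm_u = math.sqrt(sum(x * x for x in u))
        eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    F_u = F(u)
    F_pert = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / eps for fp, f0 in zip(F_pert, F_u)]

def F(u):
    """Hypothetical 2x2 nonlinear residual used only for this demonstration."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]

u = [1.0, 2.0]
v = [1.0, -1.0]
approx = jacobian_vector_product(F, u, v)

# Analytic Jacobian of F is [[2x, 1], [1, 2y]], so J(u) v = [2x - 1, 1 - 2y].
exact = [2.0 * u[0] - 1.0, 1.0 - 2.0 * u[1]]
error = max(abs(a - e) for a, e in zip(approx, exact))
```

A Krylov solver such as GMRES would call `jacobian_vector_product` once per iteration in place of an explicit matrix-vector multiply; the difference step trades a small truncation error for not having to store the Jacobian.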
Globalized Newton–Krylov–Schwarz algorithms and software for parallel implicit CFD
Int. J. High Perform. Comput. Appl.
Cited by 44 (17 self)
Implicit solution methods are important in applications modeled by PDEs with disparate temporal and spatial scales. Because such applications require high resolution with reasonable turnaround, parallelization is essential. The pseudo-transient matrix-free Newton–Krylov–Schwarz (ΨNKS) algorithmic framework is presented as a widely applicable answer. This article shows that for the classical problem of three-dimensional transonic Euler flow about an M6 wing, ΨNKS can simultaneously deliver globalized, asymptotically rapid convergence through adaptive pseudo-transient continuation and Newton’s method; reasonable parallelizability for an implicit method through deferred synchronization and favorable communication-to-computation scaling in the Krylov linear solver; and high per-processor performance through attention to distributed memory and cache locality, especially through the Schwarz preconditioner. Two discouraging features of ΨNKS methods are their sensitivity to the coding of the underlying PDE discretization and the large number of parameters that must be selected to govern convergence. The authors therefore distill several recommendations from their experience and reading of the literature on various algorithmic components of ΨNKS, and they describe a freely available MPI-based portable parallel software implementation of the solver employed here.
pARMS: A parallel version of the algebraic recursive multilevel solver
Numer. Linear Algebra Appl.
A minimum overlap restricted additive Schwarz preconditioner and applications in 3D flow simulations
Contemporary Mathematics, 1998
Cited by 27 (2 self)
Numerical simulations of unsteady three-dimensional compressible flow problems require the solution of large, sparse, nonlinear systems of equations arising from the discretization of Euler or Navier-Stokes equations on unstructured, possibly dynamic, meshes. In this ...
Performance modeling and tuning of an unstructured mesh CFD application
In Proceedings of SC2000, 2000
Cited by 25 (6 self)
This paper describes performance tuning experiences with a three-dimensional unstructured grid Euler flow code from NASA, which we have reimplemented in the PETSc framework and ported to several large-scale machines, including the ASCI Red and Blue Pacific machines, the SGI Origin, the Cray T3E, and Beowulf clusters. The code achieves a respectable level of performance for sparse problems, typical of scientific and engineering codes based on partial differential equations, and scales well up to thousands of processors. Since the gap between CPU speed and memory access rate is widening, the code is analyzed from a memory-centric perspective (in contrast to traditional flop-orientation) to understand its sequential and parallel performance. Performance tuning is approached on three fronts: data layouts to enhance locality of reference, algorithmic parameters, and parallel programming model. This effort was guided partly by some simple performance models developed for the sparse matrix-vector product operation.
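To make the memory-centric view of the sparse matrix-vector product concrete, here is a small sketch in plain Python. It is not the model from the paper: the compressed sparse row (CSR) layout, the function names `csr_matvec` and `bytes_per_flop`, the 8-byte values and 4-byte indices, and the assumption that every array is streamed from memory exactly once are all illustrative choices.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def bytes_per_flop(n_rows, nnz, value_bytes=8, index_bytes=4):
    """Best-case memory traffic per floating point operation.

    Assumes (optimistically) that the matrix values, column indices,
    row pointers, x, and y are each moved through memory once.
    """
    traffic = (nnz * (value_bytes + index_bytes)   # values and column indices
               + (n_rows + 1) * index_bytes        # row pointers
               + 2 * n_rows * value_bytes)         # read x, write y
    flops = 2 * nnz                                # one multiply and one add per nonzero
    return traffic / flops

# 3x3 tridiagonal example: [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]

y = csr_matvec(values, col_idx, row_ptr, [1.0, 1.0, 1.0])
ratio = bytes_per_flop(n_rows=3, nnz=7)
```

Under these assumptions the kernel moves several bytes for every flop it performs, which is one way to see why memory bandwidth rather than peak floating point rate tends to bound its performance.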
On the interaction of architecture and algorithm in the domain-based parallelization of an unstructured grid incompressible flow code
In Proceedings of the Tenth International Conference on Domain Decomposition Methods, 1998
Cited by 25 (17 self)
The convergence rates and, therefore, the overall parallel efficiencies of additive Schwarz methods are often notoriously dependent on subdomain granularity. Except when effective coarse-grid operators and intergrid transfer operators are known, so that optimal multilevel preconditioners can be constructed, the number ...
High Performance Parallel Implicit CFD
Parallel Computing, 2000
Cited by 24 (3 self)
Fluid dynamical simulations based on finite discretizations on (quasi-)static grids scale well in parallel, but execute at a disappointing percentage of per-processor peak floating point operation rates without special attention to layout and access ordering of data. We document both claims from our experience with an unstructured grid CFD code that is typical of the state of the practice at NASA. These basic performance characteristics of PDE-based codes can be understood with surprisingly simple models, for which we quote earlier work, presenting primarily experimental results herein. These performance models and experimental results motivate algorithmic and software practices that lead to improvements in both parallel scalability and per node performance. This snapshot of ongoing work updates our 1999 Bell Prize-winning simulation on ASCI computers. Key words: parallel implicit solvers, unstructured grids, computational fluid dynamics, high-performance computing 1991 MSC: 65H20, 65N5...
An algebraic convergence theory for restricted additive Schwarz methods using weighted max norms
SIAM J. Numer. Anal., 2001
Cited by 19 (9 self)
Convergence results for the restrictive additive Schwarz (RAS) method of Cai and Sarkis [SIAM J. Sci. Comput., 21 (1999), pp. 792–797] for the solution of linear systems of the form Ax = b are provided using an algebraic view of additive Schwarz methods and the theory of multisplittings. The linear systems studied are usually discretizations of partial differential equations in two or three dimensions. It is shown that in the case of A symmetric positive definite, the projections defined by the methods are not orthogonal with respect to the inner product defined by A, and therefore the standard analysis cannot be used here. The convergence results presented are for the class of M-matrices (and more generally for H-matrices) using weighted max norms. Comparisons between different versions of the RAS method are given in terms of these norms. A comparison theorem with respect to the classical additive Schwarz method makes it possible to indirectly get quantitative results on rates of convergence which otherwise cannot be obtained by the theory. Several RAS variants are considered, including new ones and two-level schemes.
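As an illustration of the kind of iteration this abstract analyzes, the sketch below runs a one-level RAS stationary iteration, x ← x + Σ R̃ᵢᵀ Aᵢ⁻¹ Rᵢ (b − Ax), on a small 1D Laplacian, which is an M-matrix. It is a toy written for this listing, not the authors' code: the helper names, the two-subdomain splitting, the overlap width, and the dense local solver are all assumptions.

```python
def solve_dense(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def ras_iteration(A, b, subdomains, steps):
    """Stationary RAS iteration starting from x = 0.

    Each subdomain is a pair (overlapped, owned): the residual is restricted
    to the overlapped index list, the local problem is solved there, and only
    the entries in the non-overlapping owned set are added back.
    """
    x = [0.0] * len(b)
    for _ in range(steps):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        dx = [0.0] * len(b)
        for overlapped, owned in subdomains:
            A_loc = [[A[i][j] for j in overlapped] for i in overlapped]
            e_loc = solve_dense(A_loc, [r[i] for i in overlapped])
            for i, e in zip(overlapped, e_loc):
                if i in owned:  # restricted prolongation: discard overlap values
                    dx[i] += e
        x = [xi + di for xi, di in zip(x, dx)]
    return x

n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
# Two overlapping subdomains; the owned sets partition the indices 0..7.
subdomains = [(list(range(0, 5)), set(range(0, 4))),
              (list(range(3, 8)), set(range(4, 8)))]

x = ras_iteration(A, b, subdomains, steps=60)
residual = max(abs(bi - ai) for bi, ai in zip(b, matvec(A, x)))
```

Classical additive Schwarz would add back the whole local correction, including the overlap entries; dropping the `if i in owned` test in the loop above gives that variant for comparison.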
Analysis of a two-level Schwarz method with coarse spaces based on local Dirichlet-to-Neumann maps
Comput. Methods Appl. Math.
Cited by 15 (7 self)
Schwarz method with coarse spaces based on local Dirichlet–to–Neumann maps. computer