A column approximate minimum degree ordering algorithm
, 2000
Cited by 318 (52 self)
Sparse Gaussian elimination with partial pivoting computes the factorization PAQ = LU of a sparse matrix A, where the row ordering P is selected during factorization using standard partial pivoting with row interchanges. The goal is to select a column preordering, Q, based solely on the nonzero pattern of A such that the factorization remains as sparse as possible, regardless of the subsequent choice of P. The choice of Q can have a dramatic impact on the number of nonzeros in L and U. One scheme for determining a good column ordering for A is to compute a symmetric ordering that reduces fill-in in the Cholesky factorization of A^T A. This approach, which requires the sparsity structure of A^T A to be computed, can be expensive both in ...
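As a rough illustration of the A^T A approach this abstract mentions, the sketch below builds the off-diagonal nonzero pattern of A^T A from the pattern of A alone: two columns of A become adjacent whenever they share a nonzero row. The function name `ata_pattern` and the row-set representation are assumptions made for this example; they are not taken from the paper, and numerical cancellation is ignored.

```python
from itertools import combinations

def ata_pattern(rows, n_cols):
    """Return the off-diagonal nonzero pattern of A^T A as adjacency sets.

    rows: list of sets; rows[r] holds the column indices with a nonzero
    in row r of A (hypothetical input format for this sketch).
    """
    adj = [set() for _ in range(n_cols)]
    for cols in rows:
        # Every pair of columns sharing this row is adjacent in A^T A.
        for i, j in combinations(sorted(cols), 2):
            adj[i].add(j)
            adj[j].add(i)
    return adj

# Hypothetical 3-by-4 pattern used only for illustration.
rows = [{0, 1, 2}, {2, 3}, {0}]
adj = ata_pattern(rows, 4)
```

A row of A with k nonzeros contributes a k-by-k clique to the pattern, which is one way forming A^T A explicitly can become costly.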
A supernodal approach to sparse partial pivoting
 SIAM Journal on Matrix Analysis and Applications
, 1999
Cited by 262 (25 self)
We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. To perform most of the numerical computation in dense matrix kernels, we introduce the notion of unsymmetric supernodes. To better exploit the memory hierarchy, we introduce unsymmetric supernode-panel updates and two-dimensional data partitioning. To speed up symbolic factorization, we use Gilbert and Peierls's depth-first search with Eisenstat and Liu's symmetric structural reductions. We have implemented a sparse LU code using all these ideas. We present experiments demonstrating that it is significantly faster than earlier partial pivoting codes. We also compare performance with Umfpack, which uses a multifrontal approach; our code is usually faster.
Solving unsymmetric sparse systems of linear equations with PARDISO
 Journal of Future Generation Computer Systems
, 2004
Cited by 198 (12 self)
Supernode partitioning for unsymmetric matrices together with complete block diagonal supernode pivoting and asynchronous computation can achieve high gigaflop rates for parallel sparse LU factorization on shared memory parallel computers. The progress in weighted graph matching algorithms helps to extend these concepts further, and unsymmetric prepermutation of rows is used to place large matrix entries on the diagonal. Complete block diagonal supernode pivoting allows dynamical interchanges of columns and rows during the factorization process. The level-3 BLAS efficiency is retained, and an advanced two-level left–right looking scheduling scheme results in good speedup on SMP machines. These algorithms have been integrated into the recent unsymmetric version of the PARDISO solver. Experiments demonstrate that a wide set of unsymmetric linear systems can be solved and high performance is consistently achieved for large sparse unsymmetric matrices from real-world applications. Key words: Computational sciences, numerical linear algebra, direct solver, unsymmetric linear systems
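The row prepermutation described in this abstract can be pictured with a toy example. The sketch below picks the row permutation that maximizes the product of the absolute values on the diagonal. It swaps in an exhaustive search over all permutations for the weighted graph matching algorithms the abstract refers to, so it is only feasible for very small matrices; the function name `best_row_permutation` and the sample matrix are assumptions for illustration and are not part of PARDISO.

```python
from itertools import permutations

def best_row_permutation(a):
    """Return perm such that row perm[i] of `a` moves to position i,
    maximizing the product of |diagonal| entries.

    Brute force over all n! permutations (assumption: tiny dense input).
    """
    n = len(a)
    best, best_score = None, -1.0
    for perm in permutations(range(n)):
        score = 1.0
        for i in range(n):
            score *= abs(a[perm[i]][i])
        if score > best_score:
            best, best_score = perm, score
    return best

# Hypothetical matrix with zeros on its original diagonal.
a = [[0.0, 2.0, 0.1],
     [5.0, 0.0, 0.0],
     [0.1, 0.3, 4.0]]
perm = best_row_permutation(a)
permuted = [a[r] for r in perm]  # rows reordered so large entries sit on the diagonal
```

Here the first two rows are swapped, which moves the entries 5.0 and 2.0 onto the diagonal.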
A column preordering strategy for the unsymmetric-pattern multifrontal method
 ACM Transactions on Mathematical Software
, 2004
Cited by 100 (5 self)
A new method for sparse LU factorization is presented that combines a column preordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization. The column ordering is selected to give a good a priori upper bound on fill-in and then refined during numerical factorization (while preserving the bound). Pivot rows are selected to maintain numerical stability and to preserve sparsity. The method analyzes the matrix and automatically selects one of three preordering and pivoting strategies. The number of nonzeros in the LU factors computed by the method is typically less than or equal to those found by a wide range of unsymmetric sparse LU factorization methods, including left-looking methods and prior multifrontal methods.
A numerical evaluation of sparse direct solvers for the solution of large sparse, symmetric linear systems of equations
, 2005
Sparse Gaussian Elimination on High Performance Computers
, 1996
Cited by 40 (7 self)
This dissertation presents new techniques for solving large sparse unsymmetric linear systems on high performance computers, using Gaussian elimination with partial pivoting. The efficiencies of the new algorithms are demonstrated for matrices from various fields and for a variety of high performance machines. In the first part we discuss optimizations of a sequential algorithm to exploit the memory hierarchies that exist in most RISC-based superscalar computers. We begin with the left-looking supernode-column algorithm by Eisenstat, Gilbert and Liu, which includes Eisenstat and Liu's symmetric structural reduction for fast symbolic factorization. Our key contribution is to develop both numeric and symbolic schemes to perform supernode-panel updates to achieve better data reuse in cache and floating-point register...
Recent Advances in Direct Methods for Solving Unsymmetric Sparse Systems of Linear Equations
, 2001
WSMP: Watson sparse matrix package (Part II: direct solution of general sparse systems)
 IBM T. J. Watson Research
, 2000
Low-frequency variability in shallow-water models of the wind-driven ocean circulation. Part II: Time-dependent solutions
 J. Phys. Oceanogr.
, 2003
Cited by 34 (20 self)
Successive bifurcations, from steady states through periodic to aperiodic solutions, are studied in a shallow-water, reduced-gravity, 2½-layer model of the midlatitude ocean circulation subject to time-independent wind stress. The bifurcation sequence is studied in detail for a rectangular basin with an idealized spatial pattern of wind stress. The aperiodic behavior is studied also in a North Atlantic-shaped basin with realistic continental contours. The bifurcation sequence in the rectangular basin is studied in Part I, the present article. It follows essentially the one reported for single-layer quasigeostrophic and 1½-layer shallow-water models. As the intensity of the north-south-symmetric, zonal wind stress is increased, the nearly symmetric double-gyre circulation is destabilized through a perturbed pitchfork bifurcation. The low-stress steady solution, with its nearly equal subtropical and subpolar gyres, is replaced by an approximately mirror-symmetric pair of stable equilibria. The two solution branches so obtained are named after the inertial recirculation cell that is stronger, subtropical or subpolar, respectively. This perturbed pitchfork bifurcation and the associated Hopf bifurcations are robust to changes in the interface friction between the two active layers and the thickness H_2 of the lower active layer. They persist in the presence of asymmetries in the wind stress and of changes in the model's spatial resolution and finite-difference scheme. Time-dependent model behavior in the rectangular basin, as well as in the more realistic, North Atlantic-shaped one, is studied in Part II.