Results 1-10 of 138
A high-performance software package for semidefinite programs: SDPA 7
, 2010
Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy
 ACM Trans. Math. Softw
Abstract

Cited by 20 (1 self)
By using a combination of 32-bit and 64-bit floating point arithmetic the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techniques and sparse iterative techniques such as Krylov subspace methods. The approach presented here can apply not only to conventional processors but also to exotic technologies such as
Accelerating Scientific Computations with Mixed Precision Algorithms
, 2008
Abstract

Cited by 17 (2 self)
On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.
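The mixed-precision idea in the abstract above is commonly realized as iterative refinement: factor and solve in 32-bit arithmetic, then compute residuals and accumulate corrections in 64-bit arithmetic. The following is a minimal dense sketch of that general technique, not the authors' implementation. The helper names (`f32`, `lu_factor32`, `lu_solve32`, `solve_mixed`) are hypothetical, and 32-bit arithmetic is emulated by rounding through `struct`, which assumes no overflow and a nonsingular matrix.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor32(a):
    """LU factorization with partial pivoting; every result is rounded to 32-bit."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve32(lu, perm, b):
    """Forward and back substitution in emulated 32-bit arithmetic."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # L has a unit diagonal
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def solve_mixed(a, b, tol=1e-14, max_iter=20):
    """Solve a x = b: 32-bit factorization, 64-bit residuals and updates."""
    n = len(a)
    lu, perm = lu_factor32(a)
    x = lu_solve32(lu, perm, b)  # initial solution, 32-bit accuracy only
    for _ in range(max_iter):
        # Residual r = b - a x, computed in 64-bit arithmetic
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve32(lu, perm, r)  # correction reuses the cheap 32-bit factors
        x = [x[i] + d[i] for i in range(n)]  # update accumulated in 64-bit
    return x
```

The expensive step (the factorization) runs once in low precision, while each refinement step costs only a matrix-vector product and two triangular solves. Convergence relies on the matrix being reasonably well conditioned relative to 32-bit precision.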
On computing inverse entries of a sparse matrix in an out-of-core environment
, 2010
Abstract

Cited by 16 (5 self)
Abstract. The inverse of an irreducible sparse matrix is structurally full, so that it is impractical to think of computing or storing it. However, there are several applications where a subset of the entries of the inverse is required. Given a factorization of the sparse matrix held in out-of-core storage, we show how to compute such a subset efficiently, by accessing only parts of the factors. When there are many inverse entries to compute, we need to guarantee that the overall computation scheme has reasonable memory requirements, while minimizing the cost of loading the factors. This leads to a partitioning problem that we prove is NP-complete. We also show that we cannot get a close approximation to the optimal solution in polynomial time. We thus need to develop heuristic algorithms, and we propose: (i) a lower bound on the cost of an optimum solution; (ii) an exact algorithm for a particular case; (iii) two other heuristics for a more general case; and (iv) hypergraph partitioning models for the most general setting. We illustrate the performance of our algorithms in practice using the MUMPS software package on a set of real-life problems as well as some standard test matrices. We show that our techniques can improve the execution time by a factor of 50. Key words. Sparse matrices, direct methods for linear systems and matrix inversion, multifrontal method, graphs and hypergraphs. AMS subject classifications. 05C50, 05C65, 65F05, 65F50
An EScheduler-Based Data Dependence Analysis and Task Scheduling for Parallel Circuit Simulation
 TCAS-II
, 2011
Abstract

Cited by 12 (9 self)
Abstract—The sparse matrix solver has become the bottleneck
On optimal tree traversals for sparse matrix factorization
 In IPDPS’2011, the 25th IEEE International Parallel and Distributed Processing Symposium. IEEE Computer
, 2011
Abstract

Cited by 11 (7 self)
Abstract—We study the complexity of traversing tree-shaped workflows whose tasks require large I/O files. Such workflows typically arise in the multifrontal method of sparse matrix factorization. We target a classical two-level memory system, where the main memory is faster but smaller than the secondary memory. A task in the workflow can be processed if all its predecessors have been processed, and if its input and output files fit in the currently available main memory. The amount of available memory at a given time depends upon the ordering in which the tasks are executed. What is the minimum amount of main memory, over all postorder schemes, or over all possible traversals, that is needed for an in-core execution? We establish several complexity results that answer these questions. We propose a new, polynomial time, exact algorithm which runs faster than a reference algorithm. Next, we address the setting where the required memory renders a pure in-core solution unfeasible. In this setting, we ask the following question: what is the minimum amount of I/O that must be performed between the main memory and the secondary memory? We show that this latter problem is NP-hard, and propose efficient heuristics. All algorithms and heuristics are thoroughly evaluated on assembly trees arising in the context of sparse matrix factorizations. Keywords: Sparse matrix factorization, Multifrontal method,
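To make the postorder question in the abstract above concrete, here is a minimal sketch of the classical rule for the best postorder traversal (usually attributed to Liu): process the children of each node in decreasing order of subtree peak minus output file size. It is not the new exact algorithm the abstract announces. The memory model is an assumption: each node `i` has an execution file of size `n_i` and an output file of size `f_i`, and processing `i` needs the outputs of all its children plus `n_i + f_i` in memory. The function name and the tree encoding are hypothetical.

```python
def best_postorder_peak(tree, node):
    """Minimum peak memory over all postorder traversals of the subtree at `node`.

    `tree[node]` is a tuple (n_i, f_i, children): execution file size,
    output file size, and a list of child node names (assumed encoding).
    """
    n_i, f_i, children = tree[node]
    # For each child: (peak memory of its subtree, size of the output it leaves behind)
    sub = [(best_postorder_peak(tree, c), tree[c][1]) for c in children]
    # Visit children by decreasing (peak - output size)
    sub.sort(key=lambda pf: pf[0] - pf[1], reverse=True)
    peak = 0
    held = 0  # outputs of already-processed children kept in memory
    for child_peak, child_out in sub:
        peak = max(peak, held + child_peak)
        held += child_out
    # Processing the node itself: all child outputs plus its own files
    return max(peak, held + n_i + f_i)


# Hypothetical three-node example: root "r" with two leaf children.
example_tree = {
    "r": (1, 0, ["a", "b"]),
    "a": (5, 1, []),
    "b": (2, 4, []),
}
```

In the example, visiting `a` before `b` keeps only one unit of output in memory while `b` runs; the opposite order would hold four units while the larger task `a` runs. Traversals that are not postorders can need less memory still, which is the harder question the paper addresses.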
A Continuation Multilevel Monte Carlo algorithm
, 2014
Abstract

Cited by 10 (5 self)
We propose a novel Continuation Multi Level Monte Carlo (CMLMC) algorithm for weak approximation of stochastic models that are described in terms of differential equations either driven by random measures or with random coefficients. The CMLMC algorithm solves the given approximation problem for a sequence of decreasing tolerances, ending with the desired one. CMLMC assumes discretization hierarchies that are defined a priori for each level and are geometrically refined across levels. The actual choice of computational work across levels is based on parametric models for the average cost per sample and the corresponding weak and strong errors. These parameters are calibrated using Bayesian estimation, taking particular notice of the deepest levels of the discretization hierarchy, where only few realizations are available to produce the estimates. The resulting CMLMC estimator exhibits a nontrivial splitting between bias and statistical contributions. We also show the asymptotic normality of the statistical error in the MLMC estimator and justify in this way our error estimate that allows prescribing both required accuracy and confidence in the final result. Numerical examples substantiate the above results and illustrate the corresponding computational savings.
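For readers unfamiliar with the underlying estimator, the sketch below shows the plain multilevel Monte Carlo telescoping sum on which CMLMC builds: E[g_L] is written as E[g_0] plus the sum of E[g_l - g_(l-1)], and each term is sampled independently. It does not implement the continuation over tolerances or the Bayesian calibration described in the abstract. The toy model (`approx`, which quantizes a uniform sample to a grid of 2^level cells before squaring) and the fixed sample counts are assumptions chosen only for illustration.

```python
import random


def approx(u, level):
    """Hypothetical level-`level` approximation of u**2 (grid of 2**level cells)."""
    n = 2 ** level
    return (int(u * n) / n) ** 2


def mlmc_estimate(max_level, samples_per_level, rng):
    """Plain MLMC estimate of E[approx(U, max_level)] for U uniform on [0, 1)."""
    total = 0.0
    for level in range(max_level + 1):
        m = samples_per_level[level]
        acc = 0.0
        for _ in range(m):
            u = rng.random()
            fine = approx(u, level)
            # The same sample drives the coarse level, so the difference has small variance
            coarse = approx(u, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / m
    return total
```

A call such as `mlmc_estimate(6, [4000, 2000, 1000, 500, 250, 125, 62], random.Random(0))` approximates E[U^2] = 1/3 up to the level-6 discretization bias and the statistical error. Fewer samples are spent on the deeper, more expensive levels because the level differences there have smaller variance.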
Combinatorial problems in solving linear systems
, 2009
Abstract

Cited by 9 (3 self)
Numerical linear algebra and combinatorial optimization are vast subjects; as is their interaction. In virtually all cases there should be a notion of sparsity for a combinatorial problem to arise. Sparse matrices therefore form the basis of the interaction of these two seemingly disparate subjects. As the core of many of today’s numerical linear algebra computations consists of the solution of sparse linear systems by direct or iterative methods, we survey some combinatorial problems, ideas, and algorithms relating to these computations. On the direct methods side, we discuss issues such as matrix ordering; bipartite matching and matrix scaling for better pivoting; task assignment and scheduling for parallel multifrontal solvers. On the iterative method side, we discuss preconditioning techniques including incomplete factorization preconditioners, support graph preconditioners, and algebraic multigrid. In a separate part, we discuss the block triangular form of sparse matrices.