Parallel Numerical Linear Algebra
1993
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We
K-theory for operator algebras
Mathematical Sciences Research Institute Publications, 1998
Neumann used the same name for Hilbert spaces in the modern sense (complete inner product spaces), which he defined in 1928. p. 3 line 6: At the end of the line, 2ɛ should be 4ɛ. p. 3 I.1.2.3: The statement that a dense subspace of a Hilbert space H contains an orthonormal basis for H can be false if H
An Extended Set of Fortran Basic Linear Algebra Subprograms
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1986
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations, which should provide for efficient and portable implementations of algorithms for high-performance computers.
Automatically tuned linear algebra software
CONFERENCE ON HIGH PERFORMANCE NETWORKING AND COMPUTING, 1998
much of the technology and approach developed here can be applied to the other Level 3 BLAS and the general strategy can have an impact on basic linear algebra operations in general and may be extended to other important kernel operations.
Efficient and Practical Stochastic Subgradient Descent for Nuclear Norm Regularization
by highly optimized and parallelizable dense linear algebra operations on small matrices. Our practical algorithms always maintain a low-rank factorization of iterates that can be conveniently held in memory and efficiently multiplied to generate predictions in matrix completion settings. Empirical
ASCENT: Adaptive self-configuring sensor networks topologies
2004
Advances in microsensor and radio technology will enable small but smart sensors to be deployed for a wide range of environmental monitoring applications. The low per-node cost will allow these wireless networks of sensors and actuators to be densely distributed. The nodes in these dense networks
Linear Algebra Operators for GPU Implementation of Numerical Algorithms
ACM Transactions on Graphics, 2003
for the implementation of linear algebra operators on programmable graphics processors (GPUs), thus providing the building blocks for the design of more complex numerical algorithms. In particular, we propose a stream model for arithmetic operations on vectors and matrices that exploits the intrinsic parallelism
Benchmarking GPUs to tune dense linear algebra
2008
We present performance results for dense linear algebra using recent NVIDIA GPUs. Our matrix-matrix multiply routine (GEMM) runs up to 60% faster than the vendor's implementation and approaches the peak of hardware capabilities. Our LU, QR and Cholesky factorizations achieve up to 80
PEAS: A Robust Energy Conserving Protocol for Long-lived Sensor Networks
2003
ones. PEAS operations are based on each individual node's observation of the local environment and do not require any node to maintain per-neighbor node state. PEAS performance possesses a high degree of robustness in the presence of both node power depletions and unexpected failures. Our simulations
SPARSKIT: a basic tool kit for sparse matrix computations - Version 2
1994
/Boeing collection of matrices for which we provide a number of tools. Among other things, the package provides programs for converting data structures, printing simple statistics on a matrix, plotting a matrix profile, performing basic linear algebra operations with sparse matrices, and so on. Work done partly
