Results 1 - 2 of 2
Parallelization of the QR Decomposition with Column Pivoting Using Column Cyclic Distribution on Multicore and GPU Processors
"... Abstract. The QR decomposition with column pivoting (QRP) of a matrix is widely used for numerical rank revealing in applications. The performance of LAPACK implementation (DGEQP3) of the Householder QRP algorithm is limited by Level 2 BLAS operations required for updating the column norms. In this ..."
In this paper, we propose an implementation of the QRP algorithm using a distribution of the matrix columns in a round-robin fashion for better data locality and parallel memory bus utilization on multicore architectures. Our performance results show a 60% improvement over the routine DGEQP3 of Intel MKL
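The round-robin (column cyclic) distribution mentioned in this abstract can be illustrated with a small index-mapping sketch. This is not the paper's implementation, which the excerpt does not show; the function names `column_owner` and `distribute_columns` are hypothetical, and the sketch only shows which worker would own which column under the assumption that column j goes to worker j mod p.

```python
def column_owner(j, n_workers):
    """Worker that owns column j under a cyclic (round-robin) layout (assumed: j mod p)."""
    return j % n_workers


def distribute_columns(n_cols, n_workers):
    """Return, for each worker, the list of column indices it owns."""
    return [list(range(w, n_cols, n_workers)) for w in range(n_workers)]


if __name__ == "__main__":
    # Seven columns dealt out to three workers, one column at a time.
    for worker, cols in enumerate(distribute_columns(7, 3)):
        print(worker, cols)
```

Because consecutive columns land on different workers, the trailing part of the matrix stays evenly spread across workers as the factorization advances and leading columns are retired.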
F08 – Least-squares and Eigenvalue Problems (LAPACK) F08AGF (DORMQR) NAG Library Routine Document
"... Note: before using this routine, please read the Users' Note for your implementation to check the interpretation of bold italicised terms and other implementation-dependent details. 1 Purpose F08AGF (DORMQR) multiplies an arbitrary real matrix C by the real orthogonal matrix Q from a QR factorizati ..."
factorization computed by F08AEF (DGEQRF), F08BEF (DGEQPF) or F08BFF (DGEQP3).
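The idea behind multiplying by Q from a QR factorization without forming Q explicitly can be sketched in plain Python. This is an illustrative, unblocked Householder version under stated assumptions, not the NAG or LAPACK code: the names `householder_qr` and `apply_qt` are hypothetical, the reflector storage differs from LAPACK's, and only the product Q^T c for a single vector c is shown.

```python
import math


def householder_qr(a):
    """Householder QR of an m x n matrix (list of rows, m >= n).

    Returns (r, reflectors); each reflector is (k, v, v.v) acting on rows k..m-1.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    reflectors = []
    for k in range(n):
        x = [r[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column already zero below the diagonal; no reflector needed
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])  # sign choice avoids cancellation
        vv = sum(t * t for t in v)
        reflectors.append((k, v, vv))
        for j in range(k, n):  # apply H_k = I - 2 v v^T / (v^T v) to trailing columns
            s = 2.0 * sum(v[i] * r[k + i][j] for i in range(len(v))) / vv
            for i in range(len(v)):
                r[k + i][j] -= s * v[i]
    return r, reflectors


def apply_qt(reflectors, c):
    """Compute Q^T c by applying the stored reflectors in factorization order."""
    y = c[:]
    for k, v, vv in reflectors:
        s = 2.0 * sum(v[i] * y[k + i] for i in range(len(v))) / vv
        for i in range(len(v)):
            y[k + i] -= s * v[i]
    return y


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    r, refl = householder_qr(a)
    print(apply_qt(refl, [row[0] for row in a]))  # matches the first column of r
```

Applying the same reflectors in reverse order would give Q c instead of Q^T c, since each reflector is symmetric and Q is their product.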