12 citations found.
C.H.Bischof. Adaptive Blocking in the QR Factorization. The Journal of Supercomputing, 3:193--208, 1989.

This paper is cited in the following contexts:
Applying recursion to serial and parallel QR factorization.. - Elmroth, Gustavson   (11 citations)  (Correct)

....track of an iteration index and two indices equal to the first and last column the processor is working on. The sizes of the blocks to be updated depend on the size of the remaining matrix and the number of processors available. Near the end of the computation, a form of adaptive blocking is used [13]. As the remaining problem size decreases, both the number of processors and the sizes of the blocks being updated are decreased. A processor that will both update and factorize a block will update only the columns it will factor, i.e. last # first # jb # 1 in Figure 11. There is no fixed ....

C. Bischof, "Adaptive Blocking in the QR Factorization," J. Supercomputing 3, 193--208 (1989).


On Designing Portable High Performance . . . - Demmel, al. (1992)   (Correct)

....machine would require its own special tables. Second, we could devise an automatic installation procedure which could run just a few benchmarks and automatically produce the necessary tables. Third, we could devise algorithms which tuned themselves at run time, choosing parameters automatically [9, 10]. The choice of method depends on the degree of portability we desire; we return to this in Section 6 below. Finally, we have determined that floating point exception handling impacts efficiency. Since overflow is a fatal exception on some machines, completely portable code must avoid it at all ....

C. Bischof. Adaptive blocking in the QR factorization. J. Supercomputing, 3(3):193--208, 1989.


On Designing Portable High Performance . . . - Demmel, al. (1991)   (Correct)

....machine would require its own special tables. Second, we could devise an automatic installation procedure which could run just a few benchmarks and automatically produce the necessary tables. Third, we could devise algorithms which tuned themselves at run time, choosing parameters automatically [9, 10]. The choice of method depends on the degree of portability we desire; we return to this in section 6 below. Finally, we have determined that floating point exception handling impacts efficiency. Since overflow is a fatal exception on some machines, completely portable code must avoid it at all ....

C. Bischof. Adaptive blocking in the QR factorization. J. Supercomputing, 3(3):193--208, 1989.


Design And Performance Modeling Of Parallel Block Matrix.. - Dackland, Elmroth   (Correct)

....research projects have also been focusing on algorithms for matrix factorizations for DMM. For example, in the mid eighties non block algorithms were discussed in [16, 17, 23, 24]. For more references, see [14]. The development of distributed block algorithms have started more recently (e.g. see [3, 13]). This paper is a contribution to the design, analysis, and evaluation of distributed block algorithms for some matrix factorizations which are efficient, and scalable in the sense that they preserve their maximal performance (measured in Mflops/node) when both the problem size and the number of ....

C. Bischof, "Adaptive Blocking in the QR Factorization", The Journal of Supercomputing, No. 3, Vol. 3, Kluwer Academic Publishers, (1989), pp 193-208.


New Serial and Parallel Recursive QR Factorization.. - Elmroth, Gustavson (1998)   (7 citations)  (Correct)

....keep track of an iteration index and two indices equal to the first and last column the processor is working on. The size of the blocks to update depends on the size of the remaining matrix and the number of processors available. Near the end of the computation, a form of adaptive blocking is used [2]. As the remaining problem size decreases, both the number of processors and the sizes of the blocks being updated are decreased. A processor that will both update and factorize a block will only update the columns it will factor. There is no fixed synchronization in the algorithm; it is an ....

C. Bischof. Adaptive blocking in the QR factorization. The Journal of Supercomputing, 3:193--208, 1989.


Sparse Givens QR Factorization on a Multiprocessor - Touriño, Doallo, Zapata (1996)   (Correct)

....ERB CHGE CT92 0005) Householder reflections. Since these sequential algorithms have a high arithmetic complexity, the development of parallel algorithms is of considerable interest. Several parallel orthogonal factorization algorithms have been designed for various machines. We cite just a few: [3] for the Intel iPSC 1, [5] for the nCUBE 10, [9] for a network of transputers, [1] for the nCUBE 2, [2] for the CM 200, all of them for dense matrices; and [14] (CM 2), [13] (Fujitsu AP1000), [12] (Cray T3D) for sparse matrices. We have implemented the Givens method with column pivoting for ....

C.H.Bischof. Adaptive Blocking in the QR Factorization. The Journal of Supercomputing, 3:193--208, 1989.


Parallel Sparse Modified Gram-Schmidt QR Decomposition - Doallo, Fraguela.. (1996)   (Correct)

....Householder reflections and Givens rotations. Since these sequential algorithms have a high arithmetic complexity, the development of parallel algorithms is of considerable interest. Several parallel orthogonal factorization algorithms have been designed for various machines. We cite just a few: [2] for the Intel iPSC 1, [3] for the nCUBE 10, [4] for a network of transputers, [5] for the nCUBE 2, [6] for the CM 200, all of them for dense matrices; and [7] (CM 2), [8] (Fujitsu AP1000), [9] (Cray T3D) for sparse matrices. We have implemented the MGS procedure with column pivoting for sparse ....

Bischof, C.H.: Adaptive Blocking in the QR Factorization. The Journal of Supercomputing, 3 (1989) 193--208


Efficient Computation of the Singular Value Decomposition.. - Gu, Demmel, Dhillon (1994)   (1 citation)  (Correct)

....only $\frac{4}{3}n^3$ flops, 9 times fewer [13, p. 248]. In fact, on current computer architectures with steep memory hierarchies, using the SVD may take over 15 times longer than QR decomposition. This is because the QR decomposition algorithm can be reorganized to exploit the memory hierarchy [3], but the conventional SVD algorithm is much less amenable to this reorganization. The SVD is usually computed in two phases: Phase I: Use orthogonal transformations to reduce A to an upper bidiagonal matrix: $A = (U_1\ U_2) \begin{pmatrix} B \\ 0 \end{pmatrix} V^T$, (1.3) where $(U_1\ U_2) \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}$ ....

C. Bischof. Adaptive blocking in the QR factorization. J. Supercomputing, 3(3):193--208, 1989.


Dense and Iterative Concurrent Linear Algebra in.. - Bangalore.. (1993)   (3 citations)  (Correct)

....the size of the next panel dynamically, permitting a dynamic tradeoff of bandwidth, latency, and load balancing during the level 3 factorization. Van de Geijn utilizes the blocking of the data distribution to achieve a fixed panel size [10]. Bischof has also explored variable blocking algorithms [2, 3]. The following function calls implement LU factorization within Cdense: void lufactorCmatrixlvl2( Cluinfolvl2 LU, Cmatrix B, Cvector rhs) void lufactorCmatrixlvl3( Cluinfolvl3 LU, Cmatrix B, Cvector rhs) Both single (row replicated) and multiple right hand sides are supported. The ....

Christian H. Bischof. Adaptive blocking in the QR factorization. The Journal of Supercomputing, 3(3):193--208, 1989.


Parallel Block Matrix Factorizations for Distributed Memory.. - Dackland, Elmroth (1992)   (Correct)

....research projects have also been focusing on algorithms for matrix factorizations for DMM. For example, in the mid eighties non block algorithms were discussed in [15, 16, 22, 23]. For more references, see [13]. The development of distributed block algorithms have started more recently (e.g. see [3, 12]). This paper is a contribution to the design, analysis, and evaluation of distributed block algorithms for some matrix factorizations which are efficient, and scalable in the sense that they preserve their maximal performance when both the problem size and the number of processors increases. ....

C. Bischof, "Adaptive Blocking in the QR Factorization", The Journal of Supercomputing, No. 3, Vol. 3, Kluwer Academic Publishers, (1989), pp 193-208.


Sparse Givens QR Factorization on a Multiprocessor - Touriño, Doallo, Zapata (1996)   (Correct)

No context found.

C.H.Bischof. Adaptive Blocking in the QR Factorization. The Journal of Supercomputing, 3:193--208, 1989.


Parallel Sparse Modified Gram-Schmidt QR Decomposition - Doallo, Fraguela.. (1996)   (Correct)

No context found.

Bischof, C.H.: Adaptive Blocking in the QR Factorization. The Journal of Supercomputing, 3 (1989) 193-208

CiteSeer.IST - Copyright Penn State and NEC