### Table 1: Resource utilization for LU decomposition

"... In PAGE 2: ... Results and Analysis The proposed design for LU decomposition can handle varying block sizes and input matrix sizes. Table 1 shows the resource utilization of the three different steps of our design for a matrix of size 1024x1024 and a block size of 16. The compute engine was implemented with a single data path available for performing the computations in the inner-most loop. As seen in Table 1, the resource utilization for steps 3 and 4 are nearly 50% of a single FPGA. Hence, it is possible to increase the number of data paths to two or three and obtain further speed-up.... ..."

### Table 2: Stability of various pivoting schemes in LU decomposition

1993

"... In PAGE 9: ... Neither pairwise nor parallel pivoting require pivot search outside of two rows, but pairwise pivoting is inherently sequential in its access to rows, whereas parallel pivoting (as its name indicates) parallelizes easily. Table 2 summarizes the analysis in [77] of the speed and stability of these methods. The point is that in the worst case partial, pairwise and parallel pivoting are all unstable, but on average only parallel pivoting is unstable.... ..."

Cited by 8
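
For orientation, partial pivoting (the baseline among the schemes compared in the snippet above) can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited paper's code: the function name `lu_partial_pivot` and the packed-LU return format are hypothetical choices made for this example. Pairwise and parallel pivoting restrict the pivot search to two rows and are not shown.

```python
def lu_partial_pivot(a):
    """LU factorization with partial pivoting (illustrative sketch).

    Returns (perm, lu): row i of the permuted matrix is row perm[i] of `a`,
    and `lu` packs the unit-lower factor L (below the diagonal) and the
    upper factor U (on and above the diagonal).
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]  # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: search the whole of column k, rows k..n-1,
        # for the entry of largest magnitude.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate column k below the pivot, storing multipliers in place.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return perm, lu
```

Because every multiplier stored below the diagonal has magnitude at most 1, element growth is bounded per step, which is the property the stability comparison in the snippet is concerned with.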

### Table 2. Number of steps and LU-decompositions.

### Table 3 - Experimental results with LU decomposition program.

### Table 5.1: Efficiencies of the scheduling algorithms for (a) block LU decomposition, (b) row LU decomposition.

1998

### Table 8. Parameter values for the two variations of LU decomposition.

1991

"... In PAGE 20: ... Parameter values for the two variations of LU decomposition. The results of the above analysis are summarized in Table 8. Calculations for the default 120-processor system shows that the No-Copy algorithm produces unacceptable performance, whereas the Copy algorithm yields performance close to that of UMA-2.... ..."

Cited by 7

### Table 8: The scalability of the SDC algorithm versus LU decomposition

1997

"... In PAGE 13: ... Table 7: Actual and predicted performance of the SDC algorithm with Newton iteration for the spectral decomposition along the pure imaginary axis (Delta; times in seconds)

| n | 8 × 16 PEs, actual | 8 × 16 PEs, predicted | 16 × 16 PEs, actual | 16 × 16 PEs, predicted | 16 × 32 PEs, actual | 16 × 32 PEs, predicted |
|---|---|---|---|---|---|---|
| 1000 | - | - | 134 | 102 | 110 | 93 |
| 2000 | 502 | 402 | 448 | 320 | 336 | 269 |
| 3000 | 1037 | 921 | 792 | 687 | 576 | 542 |
| 4000 | - | - | 1436 | 1231 | 1014 | 927 |
| 8000 | - | - | - | - | 4268 | 3910 |

Table 8 compares the execution time cost to divide the spectrum once by the SDC algorithm with the cost for LU decomposition. The ratio of SDC to LU costs in each of the three categories, the cost of a flop, message initiation cost and inverse bandwidth cost, is shown in the third column and also displayed here: $\left( 67,\ \frac{160 + 23 \lg p}{6 + \lg p},\ \frac{90 + 40 \lg p}{3 + \frac{1}{4} \lg p} \right)$. These cost ratios vary slowly with the number of processors.... In PAGE 13: ... 1 The BLACS use protocol 2, and the communication pattern most closely resembles the “shift” timings. 2 α is from Table 8... ..."

Cited by 32
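
The three SDC-to-LU cost ratios quoted in the snippet above can be evaluated for a given processor count. The sketch below is an illustration only. It assumes the garbled expression reads as (67, (160 + 23 lg p)/(6 + lg p), (90 + 40 lg p)/(3 + (1/4) lg p)) and that lg denotes the base-2 logarithm. The function name `sdc_to_lu_cost_ratios` is hypothetical.

```python
import math


def sdc_to_lu_cost_ratios(p):
    """Return (flop, message-initiation, inverse-bandwidth) cost ratios.

    Assumes lg is log base 2 and that the expression in the snippet is
    reconstructed correctly; p is the number of processors (p >= 1).
    """
    lg_p = math.log2(p)
    flop_ratio = 67.0
    message_ratio = (160 + 23 * lg_p) / (6 + lg_p)
    bandwidth_ratio = (90 + 40 * lg_p) / (3 + 0.25 * lg_p)
    return flop_ratio, message_ratio, bandwidth_ratio


if __name__ == "__main__":
    # Processor counts matching the 8 x 16, 16 x 16 and 16 x 32 grids.
    for p in (128, 256, 512):
        print(p, sdc_to_lu_cost_ratios(p))
```

Since p enters only through lg p, doubling the processor count shifts each ratio by a small amount, which is consistent with the snippet's remark that the ratios vary slowly with the number of processors.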

### Table 3. SBT environment variables used with LU-decomposition.

2001

Cited by 1