### TABLE 1. Linear Algebra Kernel Descriptions

"... In PAGE 3: ... Sor improves more because it has a higher percentage of references removed. BLU Block LU BLUP Block LUP Chol Cholesky Decomposition Afold Adjoint Convolution Fold Convolution Seval Spline Evaluation Sor Successive Over Relaxation Linear Algebra Kernel Description TABLE1 . Linear Algebra Kernel Descriptions Linear Algebra Kernel Normalized Execution Time 92 93 61 69 90 95 85 90 VM MMk MMi LU LUP BLUBLUP Chol Afold Fold Seval Sor Mean 0 20 40 60 80 100 Original Optimized... ..."

### Table 3. Communication of linear algebra kernels

1995

"... In PAGE 5: ... Table 2 gives an overview of the data repre- sentation and layout for the dominating computations of the linear algebra kernels. Table3 shows the benchmarks clas- sified by the communication operations that they use, along with their associated array ranks. Finally, Table 4 demon- strates the computation (FLOP count) to communication ratio in the main loop of each linear algebra benchmark, memory usage for the implemented data types, as well as... ..."

Cited by 2

### Table 3. Communication of linear algebra kernels

"... In PAGE 4: ... Table 2 gives an overview of the data repre- sentation and layout for the dominating computations of the linear algebra kernels. Table3 shows the benchmarks clas- sified by the communication operations that they use, along with their associated array ranks. Finally, Table 4 demon- strates the computation (FLOP count) to communication ratio in the main loop of each linear algebra benchmark, memory usage for the implemented data types, as well as... ..."

### Table 1. MTL linear algebra operations.

1998

"... In PAGE 3: ...Table 1 MTL linear algebra operations. 3 MTL Algorithms Table1 lists the principal algorithms provided by MTL. This list seems sparse, but a large number of functions are available by combining the above algorithms with the strided(), scaled(), and trans() iterator adapters.... ..."

Cited by 9

### TABLE 1. Linear Algebra Kernel Descriptions

"... In PAGE 2: ... Table 1 gives a description of the kernels used in this study. Linear Algebra Kernel Description VM Vector-Matrix Multiply MMk Matrix Multiply reduction order MMi Matrix Multiply cache order LU LU Decomposition LUP LU Decomposition w/pivoting TABLE1 . Linear Algebra Kernel Descriptions Livermore Loop Kernel Normalized Execution Time 81 72 35 84 57 60 96 92 102 98 108 91 1 4 5 7 11 12 13 15 18 20 23 Mean 0 20 40 60 80 100 120 Original Optimized... ..."

### Table 1. MTL linear algebra operations.

1998

"... In PAGE 4: ...The MTL Generic Algorithms for Linear Algebra The Matrix Template Library provides a rich set of basic linear algebra opera- tions, roughly equivalent to the Level-1, Level-2 and Level-3 BLAS. Table1 lists the principle algorithms included in MTL. In the table, alpha and s are scalars, x,y,z are 1-D containers, A,B,C,E are row or column oriented matrices, U, L are upper and lower triangular matrices, and i is an iterator.... In PAGE 4: ... With BLAIS, the blocking sizes can be modi#0Ced at compile time through a few global constants, so that the algorithms can be customized for the memory hierarchy of a particular architecture. Note that in Table1 di#0Berent operations are not de#0Cned for each permutation of transpose, scaling, and striding. Instead, only one algorithm is provided, but it can be combined with the use of strided and scaled vector adapters, or the trans#28#29 method to create the permutations.... ..."

Cited by 7
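
The adapter idea described in the MTL snippets (one algorithm per operation, with scaling, striding, and transposition supplied by adapters) can be illustrated with a small sketch. MTL itself is a C++ template library; the Python below only mimics the concept and is not MTL's API. The adapter names `scaled`, `strided`, and `trans` come from the snippets, while their signatures here and the algorithm names `add`, `dot`, and `mult` are assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def scaled(x: Iterable[float], alpha: float) -> Iterator[float]:
    """Adapter: lazily yield alpha * x[i]; no scaled copy of x is built."""
    return (alpha * v for v in x)


def strided(x: Sequence[float], stride: int, start: int = 0) -> Iterator[float]:
    """Adapter: lazily yield x[start], x[start + stride], ... from x."""
    return (x[i] for i in range(start, len(x), stride))


def trans(a: Sequence[Sequence[float]]) -> Iterator[Tuple[float, ...]]:
    """Adapter: iterate over the rows of the transpose of a."""
    return zip(*a)


# One generic algorithm per operation (hypothetical names, not MTL's).
def add(x: Iterable[float], y: Iterable[float]) -> List[float]:
    """Elementwise x + y."""
    return [a + b for a, b in zip(x, y)]


def dot(x: Iterable[float], y: Iterable[float]) -> float:
    """Inner product of x and y."""
    return sum(a * b for a, b in zip(x, y))


def mult(a: Iterable[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Matrix-vector product; a is any iterable of rows."""
    return [dot(row, x) for row in a]


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]

# axpy (2*x + y) falls out of add + scaled; no separate axpy routine.
axpy = add(scaled(x, 2.0), y)

# Column 1 of a 2x3 row-major matrix stored flat, via strided.
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
col_sum = dot(strided(flat, 3, start=1), [1.0, 1.0])

# Transposed matrix-vector product falls out of mult + trans.
a = [[1.0, 2.0], [3.0, 4.0]]
atx = mult(trans(a), [1.0, 1.0])
```

Three algorithms and three adapters cover nine variants here, which is the combinatorial saving the snippet points to: the permutations of transpose, scaling, and striding are composed at the call site instead of being written out as separate routines.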