### Table 2: Break-even point for computing M (BEPCM) and break-even point for using M (BEPUM) in terms of q, obtained at interior-point iteration number 2. In the BEPUM column, the number of iterations per PCGLS step is equal to one. The quantity nnz(A) is the number of nonzero elements of A, and nnz(A)/mn represents the sparseness of A.

"... In PAGE 20: ... When the number of iterations per PCGLS step is fixed to one, we define the break-even point for using M (BEPUM) to be the largest integer q for which the cost of a PCGLS step is less than or equal to the cost of a direct step, where the cost is measured by floating point operations unless otherwise indicated. Table 2 gives the break-even points for both M1 and M2 for several Netlib linear programming problems. The figures in the column for BEPCM indicate that in our current implementation M1 is generally less expensive to compute than M2 is.... In PAGE 20: ... The figures in the column for BEPUM give upper bounds for choosing q. For a given problem, when the ratio RS/D is very small in Table 1, the difference in the corresponding figures for BEPCM and BEPUM in Table 2 is large, and vice versa. For a constant q, we define the break-even point for using M (BEPUM) as the largest number of PCGLS iterations for which the cost of a PCGLS step is less than or equal to the cost of a direct step, where the cost is... ..."
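
The BEPUM definition quoted above amounts to finding the largest q whose PCGLS step cost does not exceed the cost of a direct step. The following is a minimal sketch of that search, not the authors' procedure: the snippet does not give the actual flop-count formulas, so `pcgls_cost`, `toy_pcgls_cost`, `direct_cost`, and `q_max` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Optional


def break_even_point(
    pcgls_cost: Callable[[int], float],
    direct_cost: float,
    q_max: int,
) -> Optional[int]:
    """Return the largest q in [0, q_max] with pcgls_cost(q) <= direct_cost.

    Returns None if no q in the range satisfies the condition.
    """
    best = None
    for q in range(q_max + 1):
        if pcgls_cost(q) <= direct_cost:
            best = q
    return best


def toy_pcgls_cost(q: int) -> float:
    # Hypothetical cost model (an assumption, not taken from the paper):
    # a fixed part plus a part that grows quadratically with q.
    return 1000.0 + 150.0 * q * q


# Largest q for which the toy PCGLS step is no more expensive than
# a direct step costing 10000 "flops" in this made-up model.
q_star = break_even_point(toy_pcgls_cost, direct_cost=10000.0, q_max=50)
```

A linear scan is used so that the sketch does not rely on the cost being monotone in q; with a monotone cost model a bisection would do the same job faster.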

### Table X. Break-Even Point (in Seconds). Illustrates the Time Required for the Object Layout Adaptation to Pay Off. If the Unoptimized Program Version Ran Longer than the Break-Even Point, Performing the Data Layout Technique First and then Running the Optimized Program Version Would Perform Better. The Compilation Cost C0 Includes the Cost for Applying Standard Optimizations to the Application and Inserting Instrumentation Utilized Later by the Memory Optimization. C1 Includes the Cost for Reading the Collected Path Profiling Data and Creating the TRG Graph, Computing the New Memory Layout and Changing the Layout of All Live Objects, as well as the Cost for Generating Code for the New Memory Layout

### Table 4: Cost of the Monsoon and of the CRAY I. A cost advantage of a factor of 3.85 is a good starting point for the Monsoon. However, with decreasing memory bandwidth, the quality of the Monsoon drops significantly (figure 8). That does not occur for the CRAY I. The break-even point of both designs lies around a bandwidth of 8 bytes per 40d (20 ns). Thus, the Monsoon is only cost efficient for high-bandwidth memory systems.

"... In PAGE 7: ... Table 4 lists the cost of both optimized designs. Since RAMs, due to their regularity, require less chip area than other components, Massonne scaled the costs of the designs accordingly.... ..."

### Table VI. Break-even point (in seconds): Illustrates the time required for the optimization to pay off. If the unoptimized program version ran longer than the break-even point, performing trace scheduling first and then running the optimized program version would perform better overall. See accompanying text for an explanation of how these values are computed. The compilation cost C0 includes the cost for applying standard optimizations to the application and inserting instrumentation utilized later by the dynamic trace scheduler. C1 includes the cost for reading the collected path profiling data and re-optimizing the application using the trace scheduler that is guided by the path profiles.
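
The caption defers the actual computation to the accompanying text, which is not reproduced here. As a rough illustration of what a break-even time means, here is a sketch under a simple assumed model: the optimized version runs `speedup` times faster than the unoptimized one, and a one-time `overhead` (in seconds) is paid before it can run. Both parameter names and the model itself are assumptions, not the paper's formula, and the sketch does not say how C0 and C1 enter the overhead.

```python
def break_even_seconds(overhead: float, speedup: float) -> float:
    """Unoptimized run time T at which optimizing first starts to pay off.

    Assumed model (not from the source): optimizing first costs
    overhead + T / speedup, running unoptimized costs T.
    Setting the two equal gives T = overhead / (1 - 1 / speedup).
    """
    if speedup <= 1.0:
        # No speedup means the overhead is never recovered.
        return float("inf")
    return overhead / (1.0 - 1.0 / speedup)


# Example with made-up numbers: 2 s of overhead and a 2x speedup.
t_break_even = break_even_seconds(overhead=2.0, speedup=2.0)
```

Under this model, any unoptimized run longer than `t_break_even` seconds finishes sooner if the optimization is applied first, which matches the reading of the break-even point given in the caption.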

2003

Cited by 28