8 citations found.
Kuck and Associates. Kuck and Associates C++ User's Guide.

This paper is cited in the following contexts:
Mayfly: A Pattern for Lightweight Generic Interfaces - Siek, Lumsdaine (1999)   (1 citation)

....are met:
1. The functions are small, not recursive, and not virtual.
2. The objects must be on the stack, not dynamically allocated.

7.1 Performance Optimization Discussion

When these conditions are met, compilers can apply function inlining and lightweight object optimization [7, 13] to completely remove the overhead due to the function calls and to the objects. A function call introduces overhead because it takes on the order of 30 instructions (most of which are loads and stores) to create and fill a new frame on the stack. This is significant when the function body ....
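
The conditions quoted above can be illustrated with a minimal C++ sketch. The type `Scaled` and both function names are hypothetical and do not come from the cited paper; the sketch only shows a small, non-recursive, non-virtual member function called on a stack object, next to the hand-written loop that an optimizer may be able to reduce it to. Whether a given compiler actually performs that reduction is an assumption to check against its generated code.

```cpp
#include <cstddef>

// Hypothetical lightweight wrapper (illustrative name, not from the paper).
struct Scaled {
    const double* data;
    double factor;
    // Small, non-recursive, non-virtual: a candidate for inlining.
    double operator[](std::size_t i) const { return factor * data[i]; }
};

// Sums the scaled elements through the wrapper object.
inline double sum_scaled(const double* x, std::size_t n, double factor) {
    Scaled s{x, factor};  // lives on the stack, not dynamically allocated
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += s[i];
    }
    return total;
}

// Hand-written equivalent with no wrapper object and no function call.
inline double sum_scaled_by_hand(const double* x, std::size_t n, double factor) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += factor * x[i];
    }
    return total;
}
```

Both functions compute the same value; the only difference is the abstraction layer that inlining and lightweight object optimization are meant to remove.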


A Modern Framework for Portable High Performance Numerical Linear.. - Siek (1999)   (6 citations)

.... [16, 18] compile time prime number calculations [54] Tuned MTL algorithms for high performance Tiling and blocking techniques [10, 11, 12, 14, 32, 34, 35, 39, 60, 61] automatically tuned libraries [7, 59] Proved that iterators can be used in high performance arenas Optimizing compilers [33, 41], lightweight object optimization, inlining Created the Mayfly pattern Andrew Lumsdaine thought of the name Designed the ITL interface ITL implementation by Andrew Lumsdaine and Rich Lee Table 1.1. Breakdown of personal accomplishments vs. others related work and work used in this thesis. ....

....fashion until there are only basic data types (integers, floats, etc.). Then each reference through an object to one of its parts is replaced with a direct reference to the part. Note that this is only applied to objects on the stack (local variables). The Kuck and Associates C++ compiler [33] performs this optimization, which is also known as scalar replacement of aggregates [41]. The end result is that the data items within the objects can then be mapped appropriately to machine registers by the normal register allocation algorithms. With the use of ....
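
As a rough illustration of the scalar replacement described above, the following C++ sketch shows a nested stack object and a hand-expanded version in which every part is a separate scalar. The types `Point` and `Segment` and both function names are hypothetical; the second function is written by hand to show the shape of the result, and is not output from any particular compiler.

```cpp
// Hypothetical nested aggregates (illustrative names only).
struct Point {
    double x;
    double y;
};

struct Segment {
    Point a;
    Point b;
};

// Works through a local (stack) object and its nested parts.
inline double length_squared(double ax, double ay, double bx, double by) {
    Segment s{{ax, ay}, {bx, by}};
    double dx = s.b.x - s.a.x;
    double dy = s.b.y - s.a.y;
    return dx * dx + dy * dy;
}

// The same computation after the object has been broken down by hand into
// basic data types: each reference through the object is a direct reference
// to a scalar that a register allocator can place in a register.
inline double length_squared_scalars(double ax, double ay, double bx, double by) {
    double s_a_x = ax;
    double s_a_y = ay;
    double s_b_x = bx;
    double s_b_y = by;
    double dx = s_b_x - s_a_x;
    double dy = s_b_y - s_a_y;
    return dx * dx + dy * dy;
}
```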

[Article contains additional citation context not shown here]


Performance Benchmarking of Object Oriented MPI (OOMPI).. - Rijks, Squyres.. (1999)   (2 citations)

....OOMPI version 1.0.2g creates negligible overhead on top of the underlying MPI. This is because of both the thin design of OOMPI and the efficiency of modern optimizing C++ compilers. The use of inlining and of small object optimizations allows the compiler to turn OOMPI into a truly thin layer [7-9]. OOMPI and its design are discussed fully in [12] and [11]. Section 2 describes the experimental setup used to perform timing experiments. Section 3 provides performance timings and results. Finally, Section 4 discusses some of the ramifications of the results. 2 Experimental Setup The ....


The Matrix Template Library: A Unifying Framework for.. - Siek, Lumsdaine (1998)   (3 citations)

....The most exciting aspect of the Matrix Template Library is that we can provide a very high level of performance. Here we present the performance results for dense column-oriented matrix-vector product and sparse row-oriented matrix-vector product. The compilers used were Kuck and Associates C++ [5], and the Sun Solaris C compiler with maximum available optimizations. The experiments were run on a Sun UltraSPARC 170E. Fig. 4 shows the dense matrix-vector performance for MTL, Fortran BLAS, the Sun Performance Library, and TNT [10]. [Fig. 4 plot: axis tick labels omitted; horizontal axis N] ....


Generic Programming for High Performance Numerical Linear.. - Siek, Lumsdaine, Lee (1998)   (9 citations)

....(both public domain and vendor supplied). Fig. 4 shows the dense matrix-matrix product performance for MTL, Fortran BLAS, the Sun Performance Library, TNT [16] and ATLAS [17], all obtained on a Sun UltraSPARC 170E. The MTL and TNT executables were compiled using Kuck and Associates C++ (KCC) [11], in conjunction with the Solaris C compiler. ATLAS was compiled with the Solaris C compiler and the Fortran BLAS (obtained from Netlib) were compiled with the Solaris Fortran 77 compiler. All possible compiler optimization flags were used in all cases. To demonstrate portability across different ....


A Rational Approach to Portable High Performance: The Basic.. - Siek, Lumsdaine (1998)   (4 citations)

....added overhead in the layering because all the function calls are inlined. Using the FAST library allows the BLAIS routines to be expressed in a very simple and elegant fashion. The BLAIS library specification consists of fixed-size vector-vector, matrix-vector, and matrix-matrix routines.

    int x[4] = {1,1,1,1}, y[4] = {2,2,2,2};

    // STL
    template <class InIter1, class InIter2, class OutIter, class BinaryOp>
    OutIter transform(InIter1 first1, InIter1 last1, InIter2 first2,
                      OutIter result, BinaryOp binaryop);
    transform(x, x + 4, y, y, plus<int>());

    // FAST
    template <int N, class InIter1, class InIter2, class ....

....the layering because all the function calls are inlined. Using the FAST library allows the BLAIS routines to be expressed in a very simple and elegant fashion. The BLAIS library specification consists of fixed-size vector-vector, matrix-vector, and matrix-matrix routines.

    int x[4] = {1,1,1,1}, y[4] = {2,2,2,2};

    // STL
    template <class InIter1, class InIter2, class OutIter, class BinaryOp>
    OutIter transform(InIter1 first1, InIter1 last1, InIter2 first2,
                      OutIter result, BinaryOp binaryop);
    transform(x, x + 4, y, y, plus<int>());

    // FAST
    template <int N, class InIter1, class InIter2, class OutIter, class BinOp ....
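
The excerpt cuts off before the FAST signature is complete, so the FAST interface itself is not reproduced here. As a hedged sketch of the general idea of a fixed-size transform whose length is a template parameter, the following C++ uses a hypothetical helper `fixed_transform<N>` (an illustrative name, not the FAST API) that unrolls through template recursion, so every call is small and non-virtual and can be inlined.

```cpp
#include <functional>

// Hypothetical helper, NOT the FAST interface: applies op to N element
// pairs, with N fixed at compile time.
template <int N>
struct FixedTransform {
    template <class InIter1, class InIter2, class OutIter, class BinOp>
    static OutIter apply(InIter1 first1, InIter2 first2, OutIter result, BinOp op) {
        *result = op(*first1, *first2);
        // Recurse on the remaining N - 1 elements; no run-time loop counter.
        return FixedTransform<N - 1>::apply(++first1, ++first2, ++result, op);
    }
};

// Base case: nothing left to transform.
template <>
struct FixedTransform<0> {
    template <class InIter1, class InIter2, class OutIter, class BinOp>
    static OutIter apply(InIter1, InIter2, OutIter result, BinOp) {
        return result;
    }
};

// Convenience wrapper so the length is the only explicit template argument.
template <int N, class InIter1, class InIter2, class OutIter, class BinOp>
inline OutIter fixed_transform(InIter1 first1, InIter2 first2, OutIter result, BinOp op) {
    return FixedTransform<N>::apply(first1, first2, result, op);
}
```

With the arrays from the excerpt, `fixed_transform<4>(x, y, y, std::plus<int>())` would add `x` into `y` element by element, mirroring the STL `transform` call but without a run-time end iterator.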

[Article contains additional citation context not shown here]


The Matrix Template Library: A Unifying Framework for.. - Siek, Lumsdaine (1998)   (3 citations)

No context found.


A Rational Approach to Portable High Performance: The Basic.. - Siek, Lumsdaine (1998)   (4 citations)

No context found.

CiteSeer.IST - Copyright Penn State and NEC