
Acceleration of First and Higher Order Recurrences on Processors with Instruction Level Parallelism (1993)
Michael Schlansker, Vinod Kathail
Languages and Compilers for Parallel Computing




Abstract: This report describes parallelization techniques for accelerating a broad class of recurrences on processors with instruction level parallelism. We introduce a new technique, called blocked back-substitution, which has lower operation count and higher performance than previous methods. The blocked back-substitution technique requires unrolling and non-symmetric optimization of innermost loop iterations. We present metrics to characterize the performance of software-pipelined loops and compare...

Context of citations to this paper:

...by the optimization. Recently, transformations have been proposed which require that the loop be unrolled. Blocked backsubstitution [17] unrolls the loop b times and reduces the RecMII by a factor of b. Control recurrences within loops can also be accelerated by a factor of b...

...using a variety of transformations. These include expression reassociation, tree height reduction [11] and blocked back substitution [17]. Although ILP compilers may aggressively restructure computation, they typically preserve the program's original control structure. This...
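
The idea behind blocked back-substitution, as summarized in the abstract and the first citation context above, can be illustrated with a minimal sketch. This is not the authors' code: it assumes a first-order linear recurrence x[i] = a[i]*x[i-1] + b[i], and the function names, the `block` parameter, and the way the block coefficients are composed are all hypothetical choices made for illustration. Within each block of `block` iterations, the composite coefficients A and C do not depend on any x value, so only one multiply-add per block sits on the loop-carried dependence chain.

```python
def recurrence_naive(a, b, x0):
    """Reference version: x[i] = a[i] * x[i-1] + b[i], one dependent step per element."""
    x = []
    prev = x0
    for ai, bi in zip(a, b):
        prev = ai * prev + bi
        x.append(prev)
    return x


def recurrence_blocked(a, b, x0, block=4):
    """Illustrative blocked back-substitution for the same recurrence.

    Assumption: only the last element of each block is computed from the
    back-substituted form; the other elements use the original recurrence.
    """
    n = len(a)
    x = [0] * n
    carry = x0  # value of x just before the current block
    i = 0
    while i + block <= n:
        # Compose A, C so that x[i + block - 1] == A * carry + C.
        # This loop reads only a and b, so it is off the recurrence's critical path.
        A, C = 1, 0
        for k in range(i, i + block):
            A, C = a[k] * A, a[k] * C + b[k]

        # Interior elements of the block: original recurrence starting from carry.
        prev = carry
        for k in range(i, i + block - 1):
            prev = a[k] * prev + b[k]
            x[k] = prev

        # The only loop-carried step: one multiply-add per block.
        carry = A * carry + C
        x[i + block - 1] = carry
        i += block

    # Leftover iterations when n is not a multiple of block.
    prev = carry
    for k in range(i, n):
        prev = a[k] * prev + b[k]
        x[k] = prev
    return x
```

Both functions return the same values for integer inputs; with floating-point inputs the results may differ slightly, because the blocked form reassociates the arithmetic. In this sketch the chain of dependent operations between successive `carry` values is one multiply-add per `block` elements, which is consistent with the "factor of b" reduction in RecMII quoted above.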

Cited by:
Iterative Modulo Scheduling: An Algorithm for Software Pipelining.. - Rau (1994)
Height Reduction of Control Recurrences for ILP Processors - Michael Schlansker Vinod (1994)
Modulo Scheduling, Machine Representations, and.. - Eichenberger (1997)

Active bibliography (related documents):
0.7:   Solving Linear Recurrences with Loop Raking - Guy Blelloch School (1992)
0.5:   Control CPR: A Branch Height Reduction Optimization for .. - Schlansker, Mahlke.. (1999)
0.3:   Loop Optimization Techniques On Multi-Issue Architectures - Kaiser

Similar documents based on text:
1.1:   Parallelization of Control Recurrences for ILP Processors - Schlansker, Kathail, Anik (1994)
0.2:   Automatic architectural synthesis of VLIW and EPIC processors - Aditya, Rau, Kathail
0.2:   Compiler Code Transformations for Superscalar-Based.. - Mahlke, Chen.. (1992)

Related documents from co-citation:
4:   Some scheduling techniques and an easily schedulable horizontal architecture for.. - Rau, Glaeser - 1981
4:   Trace Scheduling: A Technique for Global Microcode Compaction - Fisher - 1981
4:   Parallelization of loops with exits on pipelined architectures - Tirumalai, Lee et al. - 1990

BibTeX entry:

M. Schlansker and V. Kathail, "Acceleration of first and higher order recurrences on processors with instruction level parallelism," in Proceedings of Languages and Compilers for Parallel Computing, 6th International Workshop, August 1993. http://citeseer.ist.psu.edu/schlansker93acceleration.html

@inproceedings{ schlansker93acceleration,
    author = "Michael S. Schlansker and Vinod Kathail",
    title = "Acceleration of First and Higher Order Recurrences on Processors with Instruction Level Parallelism",
    booktitle = "Languages and Compilers for Parallel Computing",
    pages = "406-429",
    year = "1993",
    url = "citeseer.ist.psu.edu/schlansker93acceleration.html" }
Citations (may not include all citations):
407   Trace Scheduling: A Technique for Global Microcode Compactio.. - Fisher - 1981
176   Some Scheduling Techniques and an Easily Schedulable Horizon.. - Rau, Glaeser - 1981
164   The Superblock: An Effective Technique for VLIW and Supersca.. - Hwu - 1993
156   The Multiflow Trace Scheduling Compiler - Lowney - 1993
104   The structure of Computers and Computations - Kuck - 1978
66   A Systolic Array Optimizing Compiler - Lam - 1987
46   The Journal of Supercomputing - Dehnert, Towle et al. - 1993
25   Recognizing and Parallelizing Bounded Recurrences - Callahan - 1991
24   ACM Transactions on Mathematical Software - Wang, Method et al. - 1981
21   Practical Parallel Band Triangular System Solvers - Chen, Kuck et al. - 1978
14   Some Aspects of the Cyclic Reduction Algorithm for Block Tri.. - Heller - 1976
13   Solving Triangular Systems on a Parallel Computer - Sameh, Brent - 1977
11   Data Flow and Dependence Analysis for Instruction Level Para.. - Rau - 1992
10   Parallel Tridiagonal Equation Solvers - Stone - 1975
10   Compiling Techniques for First-Order Linear Recurrences on a.. - Tanaka - 1988
7   Time and Parallel Processor Bounds for Linear Recurrence Sys.. - Chen, Kuck - 1975
4   Code Generation Schemas for Modulo Scheduled DO-Loops and WH.. - Rau, Schlansker et al. - 1992
2   Vectorization of Linear Recurrence Relations - Van Der Vorst, Dekker - 1989
2   Acceleration of Algebraic Recurrences on Processors with Ins.. - Schlansker, Kathail - 1993





Documents on the same site (http://www.hpl.hp.com/research/itc/car/papers/):
Code Size Minimization and Retargetable Assembly for custom .. - Aditya, Mahlke, Rau (2000)
Automatic architectural synthesis of VLIW and EPIC processors - Aditya, Rau, Kathail
Parallelization of Control Recurrences for ILP Processors - Schlansker, Kathail, Anik (1994)
