Compiler Code Transformations for Superscalar-Based High-Performance Systems (1992)  (18 citations)
Scott A. Mahlke, William Y. Chen, John C. Gyllenhaal, Wen-mei W. Hwu, Pohua P. Chang, Tokuzo Kiyohara
Proceedings Supercomputing '92



View or download:
uiuc.edu/IMPACT/ft...92optimization.ps

From:  uiuc.edu/Impact/peo...mahlke_pubs (more)

Abstract: Exploiting parallelism at both the multiprocessor level and the instruction level is an effective means for supercomputers to achieve high performance. With conventional compiler optimization techniques, however, the amount of instruction-level parallelism available to superscalar or VLIW node processors can be limited. This paper describes a set of compiler transformations designed to increase instruction-level parallelism. The effectiveness of these transformations is evaluated using...

Context of citations to this paper:

.... with other code optimizations, can increase instruction level parallelism and improve memory hierarchy locality [Alex93, Baco94, Davi94, Davi96, Mahl92], as described in chapter four and chapter five. A detailed example of loop unrolling has already been presented in...

...by the compiler and discusses the effect of loop unrolling on instruction size buffer and register pressure within the loop. Mahlke [19] studies optimizations which can increase instruction level parallelism for supercomputer. Loop unrolling is one of them. By analyzing...

Cited by:
The Impact SC140 Code Generator - Shannon (2002)
Runtime Predictability of Loops - de Alba, Kaeli (2001)
Compiler and Microarchitecture Mechanisms for Exploiting.. - Postiff (2001)

Active bibliography (related documents):
0.5:   Hyperblock Performance Optimizations For ILP Processors - August (1996)
0.2:   Using Profile Information to Assist Advanced.. - Chen, Mahlke.. (1992)
0.2:   Three Architectural Models for Compiler-Controlled.. - Chang, Warter.. (1995)

Similar documents based on text:
0.2:   Acceleration of First and Higher Order Recurrences on.. - Schlansker, Kathail (1993)
0.2:   Enhancing Instruction Level Parallelism Through.. - Bringmann (1995)
0.2:   Tolerating Data Access Latency with Register Preloading - William Chen (1992)

Related documents from co-citation:
6:   Trace Scheduling: A Technique for Global Microcode Compaction - Fisher - 1981
6:   IMPACT: An architectural framework for multiple-instruction-issue processors - Chang, Mahlke et al. - 1991
6:   Dependence Graphs and Compiler Optimization - Kuck - 1981

BibTeX entry:

S. A. Mahlke, W. Y. Chen, J. C. Gyllenhaal, W. W. Hwu, P. P. Chang, and T. Kiyohara, "Compiler Code Transformations for Superscalar-Based High-Performance Systems," in Proceedings of Supercomputing '92, Nov. 1992. http://citeseer.ist.psu.edu/mahlke92compiler.html

@inproceedings{mahlke92compiler,
    author = "Scott A. Mahlke and William Y. Chen and John C. Gyllenhaal and Wen-mei W. Hwu and Pohua P. Chang and Tokuzo Kiyohara",
    title = "Compiler Code Transformations for Superscalar-Based High-Performance Systems",
    booktitle = "Proceedings Supercomputing '92",
    publisher = "IEEE",
    address = "Minneapolis, MN",
    pages = "808--817",
    year = "1992",
    url = "http://citeseer.ist.psu.edu/mahlke92compiler.html"
}
Citations (may not include all citations):
1399   Compilers: Principles - Aho, Sethi et al. - 1986
407   Trace scheduling: A technique for global microcode compactio.. - Fisher - 1981
353   Software pipelining: An effective scheduling technique for V.. - Lam - 1988
296   Free Software Foundation - Stallman, porting - 1989
217   The PERFECT club benchmarks: Effective performance evaluatio.. - Berry - 1989
173   Bulldog: A Compiler for VLIW Architectures - Ellis - 1985
171   Dependence graphs and compiler optimizations - Kuck, Kuhn et al. - 1981
160   IMPACT: An architectural framework for multiple-instruction-.. - Chang, Mahlke et al. - 1991
130   A VLIW architecture for a trace scheduling compiler - Colwell, Nix et al. - 1987
104   The Structure of Computers and Computations - Kuck - 1978
90   Optimal loop parallelization - Aiken, Nicolau - 1988
37   The Cydra 5 departmental supercomputer - Rau, Yen et al. - 1989
33   KAP User's Guide - Associates, Champaign - 1988
31   The superblock: An effective structure for VLIW and supersca.. - Hwu - 1992
25   Instruction-level parallel processing - Fisher, Rau - 1991
6   Compilation of arithmetic expressions for parallel computati.. - Baer, Bovet - 1968
5   Combining as a compilation technique for VLIW architectures - Nakatani, Ebcioglu - 1989
5   Code compaction for parallel architectures - Anantha, Long - 1990
2   An instruction-level performance analysis of the Multiflow TR.. - Schuette, Shen - 1991





Documents on the same site (http://www.crhc.uiuc.edu/Impact/people/graduated/mahlke/mahlke_pubs.html):
Design And Implementation Of A Portable Global Code Optimizer - Mahlke (1992)
The Importance of Prepass Code Scheduling for.. - Chang, Lavery.. (1994)
Using Profile Information to Assist Classic Code Optimizations - Chang (1991)
