  Performance analysis of parallelizing compilers on the Perfect Benchmarks programs (1992) [114 citations — 18 self]

by William Blume, Rudolf Eigenmann
IEEE Transactions on Parallel and Distributed Systems
ftp://ftp.csrd.uiuc.edu/pub/CSRD_Reports/reports/1218.ps.gz

Abstract:

We have studied the effectiveness of parallelizing compilers and the underlying transformation techniques. This paper reports the speedups of the Perfect Benchmarks™ codes that result from automatic parallelization. We have further measured the performance gains caused by individual restructuring techniques. Specific reasons for the successes and failures of the transformations are discussed, and potential improvements that result in measurably better program performance are analyzed. Our most important findings are that available restructurers often cause insignificant performance gains in real programs and that only a few restructuring techniques contribute to this gain. However, we can also show that there is potential for advancing compiler technology so that many of the most important loops in these programs can be parallelized.

Citations

676 A data locality optimizing algorithm – Wolf, Lam - 1991
296 Advanced compiler optimizations for supercomputers – Padua, Wolfe - 1986
213 The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers – Berry, Chen, et al. - 1989
204 Supernode partitioning – Irigoin, Triolet - 1988
173 More iteration space tiling – Wolfe - 1989
169 Scanning polyhedra with DO loops – Ancourt, Irigoin - 1991
137 Practical dependence testing – Goff, Kennedy, et al. - 1991
110 Efficient and exact data dependence analysis – Maydan, Hennessy, et al.
105 Optimal loop parallelization – Aiken, Nicolau - 1988
102 Experience in the automatic parallelization of four perfect benchmark programs – Hoeflinger, Li, et al. - 1992
80 Compiler algorithms for synchronization – Midkiff, Padua - 1987
59 Runtime compilation methods for multicomputers – Wu, Saltz, et al. - 1991
26 An Effectiveness Study of Parallelizing Compiler Techniques – Eigenmann, Blume - 1991
24 The PERFECT club benchmarks: Effective performance evaluation of supercomputers – Sameh, Clementi, et al. - 1989
23 Vectorizing Compilers: A Test Suite and Results – Callahan, Dongarra, et al. - 1988
18 On reducing data synchronization in multiprocessed loops – Li, Abu-Sufah - 1987
16 Machine-Independent Evaluation of Parallelizing Compilers – Petersen, Padua - 1992
15 An Evaluation of Automatic and Interactive Parallel Programming Tools – Cheng, Pase - 1991
14 A Scheme to Enforce Data Dependence on Large Multiprocessor Systems – Zhu, Yew - 1987
12 Automatic Recognition of Induction & Recurrence Relations by Abstract Interpretation – Ammarguellat, Harrison - 1990
11 Removal of redundant dependences in DOACROSS loops with constant dependences – Krothapalli, Sadayappan - 1991
8 Performance Evaluation of three Automatic Vectorizer Packages – Arnold - 1982
7 The Effect of Restructuring Compilers on Program Performance for High-Speed Computers – Cytron, Kuck, et al. - 1985
6 Cedar Fortran and its Restructuring Compiler – Eigenmann, Hoeflinger, et al. - 1990
6 A Comparison Study of Automatically Vectorizing Fortran Compilers – Nobayashi, Eoyang - 1989
5 Optimization of data/control conditions in task graphs – Girkar, Polychronopoulos - 1992
4 An Evaluation of Vector FORTRAN 200 Generated by CYBER 205 and ETA-10 Pre-Compilation Tools – Braswell, Keech - 1988
3 Programmiertechniken für die Vektorisierung – Detert - 1987
3 Homogeneous Boolean Algebras may have non-simple automorphism groups – Kasahara, Honda, et al. - 1990
3 A comparative study of KAP and VAST: two automatic preprocessors with Fortran 8x output (Supercomputer 28) – Luecke, Coyle, et al. - 1988
2 Restructuring Fortran Programs for Cedar (to appear in Concurrency: Practice and Experience) – Eigenmann, Hoeflinger, et al. - 1992
2 QCD Optimization Report – Hoeflinger - 1991
1 GTS: Parallelization and vectorization of tight recurrences – Ayguadé, Labarta, et al.
1 QCD Optimization Report – Hoeflinger - 1991
1 A compilation scheme for macro-dataflow computation on hierarchical multiprocessor systems – Kasahara, Honda, et al. - 1990