Optimizing compilers have become essential to achieving high performance. Compilers apply optimizations, both simple and sophisticated, at different stages of compilation and obtain significant improvements, but little work has been done to characterize the effectiveness of optimizers or to understand where most of the improvement comes from. In this paper we study the performance impact of optimization in the context of our methodology for CPU performance characterization based on the abstract machine model. The abstract machine model treats all machines as different implementations of the same high-level language machine; in previous research, we used this model as a basis for analyzing machine and benchmark performance. Here we: 1) show that our model can be extended to characterize the performance improvement provided by optimizers and to predict the run time of optimized programs; 2) measure the effectiveness of several optimizing compilers in implementing different optimization techniques; and 3) analyze the optimization opportunities present in the Fortran SPEC benchmarks and other benchmarks.
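The abstract does not spell out how run time is predicted from the abstract machine model. As a minimal sketch, assuming the prediction is a linear combination of per-operation execution times weighted by how often a program executes each operation, the idea might look as follows. The operation names, timings, and counts are hypothetical values chosen for illustration, not measurements from the paper.

```python
def predict_run_time(op_counts, op_times):
    """Return the predicted run time as sum(count_i * time_i).

    op_counts: how many times the program executes each abstract operation.
    op_times:  measured time of each abstract operation on one machine.
    Both mappings are illustrative assumptions, not the paper's data.
    """
    missing = set(op_counts) - set(op_times)
    if missing:
        raise KeyError(f"no timing for operations: {sorted(missing)}")
    return sum(count * op_times[op] for op, count in op_counts.items())


# Hypothetical machine characterization, in microseconds per operation.
op_times_us = {"fp_add": 0.5, "fp_mul": 1.0, "array_ref": 0.25}

# Hypothetical dynamic operation counts for one program.
op_counts = {"fp_add": 1000, "fp_mul": 1000, "array_ref": 2000}

predicted_us = predict_run_time(op_counts, op_times_us)
print(predicted_us)
```

Under this assumed form, an optimizer's effect could be described either as a change in the operation counts (fewer operations executed) or as a change in the effective per-operation times; which of these the paper uses is not stated in the abstract.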