by Carlos Molina, Jordi Tubella
International Conference on Supercomputing
ftp://ftp.ac.upc.es/pub/reports/DAC/1998/UPC-DAC-1998-22.ps.Z
Abstract:
A mechanism for dynamic instruction-level reuse in superscalar microprocessors is presented. The underlying concept that the mechanism exploits is the run-time removal of redundant computations, in particular the elimination of common subexpressions and invariants. Removing redundant computation is a goal of optimizing compilers, but they sometimes fail because of their limited knowledge of the data. Moreover, the proposed mechanism can also remove quasi-redundant computations, such as subexpressions that usually produce the same result but occasionally differ depending on the data values, and which therefore cannot be eliminated by the compiler. Experimental results for the SPEC95 benchmarks show that, on average, the mechanism avoids the execution of about 32% of the dynamic instructions and provides a speedup of 1.10 in a superscalar microprocessor. An extensive evaluation of different configurations and a comparison with previous schemes are presented, as well as the performance potential of a perfect reuse engine.
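The abstract does not describe the hardware organization, so the following is only a minimal software sketch of the general idea of value-based reuse, not the authors' design. It assumes a small table indexed by instruction address and source operand values, with LRU replacement; the names `ReuseTable`, `simulate` and `TRACE`, the table capacity, and the toy trace are all hypothetical and chosen for illustration.

```python
from collections import OrderedDict
import operator

# Toy instruction set for the sketch (an assumption, not from the paper).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


class ReuseTable:
    """Hypothetical reuse table keyed by (pc, operand values), LRU-replaced."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.reused = 0    # instructions whose execution was skipped
        self.executed = 0  # instructions that had to be computed

    def run(self, pc, op, a, b):
        key = (pc, a, b)
        if key in self.entries:
            # Same instruction, same inputs: reuse the stored result.
            self.entries.move_to_end(key)
            self.reused += 1
            return self.entries[key]
        # Miss: execute the instruction and remember its result.
        result = OPS[op](a, b)
        self.executed += 1
        self.entries[key] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return result


def simulate(trace, capacity=4):
    table = ReuseTable(capacity)
    results = [table.run(*ins) for ins in trace]
    return results, table.reused, table.executed


# pc 0x10 is a loop invariant (always 3 * 4); pc 0x14 is quasi-redundant:
# its first operand is usually 5 but is 6 in one iteration.
TRACE = [
    (0x10, "mul", 3, 4), (0x14, "add", 5, 1),
    (0x10, "mul", 3, 4), (0x14, "add", 5, 1),
    (0x10, "mul", 3, 4), (0x14, "add", 6, 1),
    (0x10, "mul", 3, 4), (0x14, "add", 5, 1),
]

if __name__ == "__main__":
    results, reused, executed = simulate(TRACE)
    print(results, reused, executed)
```

Because the lookup compares operand values at run time, the quasi-redundant instruction is reused whenever its inputs repeat and is simply re-executed when they differ, which is the case a compiler cannot resolve statically.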
Citations
556 Structure and Interpretation of Computer Programs – Abelson, Sussman, et al. - 1984
351 Evaluating Future Microprocessors: the SimpleScalar Tool Set – Burger, Austin, et al. - 1996
231 Alternative implementations of two-level adaptive branch prediction – Yeh, Patt - 1992
200 Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers – Sohi - 1990
155 Dynamic Instruction Reuse – Sodani, Sohi - 1997
54 An empirical analysis of instruction repetition – Sodani, Sohi - 1998
44 Caching Function Results: Faster Arithmetic by Avoiding Unnecessary Computation – Richardson - 1992
36 Understanding the differences between value prediction and instruction reuse – Sodani, Sohi - 1998
32 An architectural alternative to optimizing compilers – Harbison - 1982
32 Exploiting basic block value locality with block reuse – Huang, Lilja
27 Exploiting Trivial and Redundant Computation – Richardson - 1993
10 On Division and Reciprocal Caches – Oberman, Flynn - 1995
7 Low Power Data Processing by Elimination of Redundant Computations – Azam, Franzon, et al. - 1997
2 Developing a tool for memoizing functions in C – McNamee, Hall - 1998
2 Dynamic Elimination of Pointer-Expressions – Weinberg, Nagle - 1998
1 The Performance Potential of Data Value Reuse – González, Tubella, et al. - 1998