by Scott A. Mahlke, Richard E. Hank, Roger A. Bringmann, John C. Gyllenhaal, David M. Gallagher, Wen-mei W. Hwu
http://www.crhc.uiuc.edu/ece412/papers/micro-94-prebranch.pdf
Abstract:
Branch instructions are recognized as a major impediment to exploiting instruction-level parallelism. Even with sophisticated branch prediction techniques, many frequently executed branches remain difficult to predict. An architecture supporting predicated execution may allow the compiler to remove many of these hard-to-predict branches, reducing the number of branch mispredictions and thereby improving performance. We present an in-depth analysis of the characteristics of those branches which are frequently mispredicted and examine the effectiveness of an advanced compiler to eliminate these branches. Over the benchmarks studied, an average of 27% of the dynamic branches and 56% of the dynamic branch mispredictions are eliminated with predicated execution support.
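To illustrate the idea of removing a hard-to-predict branch, here is a minimal C sketch of if-conversion at the source level. It is not the paper's compiler or instruction set: the function names `sum_above_branch` and `sum_above_predicated` are hypothetical, and the integer `p` only stands in for a predicate register that, on a predicated architecture, would guard the add instruction directly.

```c
#include <stddef.h>

/* Branching form: the loop body contains a conditional branch whose
   direction depends on the data, so it can be hard to predict. */
static int sum_above_branch(const int *a, size_t n, int threshold) {
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > threshold) {
            sum += a[i];
        }
    }
    return sum;
}

/* If-converted form (illustrative assumption): the comparison yields a
   predicate p (0 or 1) and the add always executes, with p nullifying
   its effect. The control dependence becomes a data dependence. */
static int sum_above_predicated(const int *a, size_t n, int threshold) {
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int p = a[i] > threshold;  /* predicate define */
        sum += p * a[i];           /* add is a no-op when p == 0 */
    }
    return sum;
}
```

Both functions return the same result for any input; for the array {3, 9, 1, 7, 4, 8} with a threshold of 5, each returns 24. Only the loop back-edge branch remains in the second version, which is the kind of branch a predictor usually handles well.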