8 citations found.
Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. 25th International Symposium on Microarchitecture, pages 129--139, December 1992.

This paper is cited in the following contexts:
Branch Prediction and Multithreading (V2) - Hily, Seznec (1996)

....to a synchronization (lock, barrier), and only for parallel applications (we did not simulate here the impact of context switches). Our study was focused on branch prediction, so we compared three different strategies: 2bit, gselect and gshare. Our model was inspired by the GAg predictor in [YP92b]. It has the advantage of being simple, and should allow the prediction in one cycle. It should reduce the risk of bad predictions due to the use of non-updated tables, which is important as we do not deal with the number of cycles normally necessary for the target address calculation. A ....

.... register with the lower order bits of the address [McF93]. gselect and gshare are names introduced by McFarling in [McF93]. When Return Address Stacks (RAS) are present, we used 12-entry stacks as in the DEC 21164 [Cor94], which should perform well and is a good compromise with complexity [YP92b]. A stack is implemented as a circular register file, which means that when the end ....

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. 25th International Symposium on Microarchitecture, pages 129--139, December 1992.


Dynamic Feature Selection for Hardware Prediction - Fern, Givan, Falsafi.. (2000)   (6 citations)

....SPECint95 suite. All benchmarks were simulated until completion and Table 1 shows the inputs and the number of static and dynamic branches for each of the benchmarks used. All predictor counters were updated directly after each prediction based on the true branch outcome. It was demonstrated in [30] that this counter update method as opposed to speculative updates or updating only after a branch has been resolved does not significantly impact the resulting accuracies. The GAp and PAp predictors utilized two bit saturating counters and were simulated using history lengths ranging from 8 to ....

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. In Proceedings of the 25th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 25), pages 129--139, November 1992.


Fast Accurate Instruction Fetch and Branch Prediction - Calder, Grunwald (1994)   (8 citations)

.... were used as a mechanism for branch prediction, effectively predicting the prior behavior of a branch; even small BTBs were found to be very effective [10, 15, 17]. More recently, there has been considerable interest in using BTBs to reduce instruction misfetch penalties; for example, Yeh et al. [21] propose using a very large BTB to improve prediction accuracy and reduce misfetch penalties. In fact, their BTB records a multitude of useful information to support wide issue processors. Wide issue processors fetch multiple instructions, roughly the size of a basic block. If a basic block ....

....instruction misfetch penalties, and why some architectures combine both BTBs and these accurate prediction mechanisms. 3 A BTB based instruction Fetch Architecture Figure 1 is a schematic representation of the branch prediction and instruction fetch architecture suggested by Yeh and Patt [21]. The current instruction address is concurrently offered to the instruction cache (not shown) providing the actual instruction, and to the BTB. A 32 entry return address stack handles return instructions. There are three important types of branches: direct or indirect branches, conditional ....

[Article contains additional citation context not shown here]

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. In 25th Workshop on Microprogramming and Microarchitecture, pages 129--139, Portland, Or, December 1992. ACM.


Branch Prediction Architectures for 64-bit Address Space - Brad Calder (1993)   (1 citation)

.... were used as a mechanism for branch prediction, effectively predicting the prior behavior of a branch; even small BTBs were found to be very effective [13, 8, 12]. More recently, there has been considerable interest in using BTBs to reduce instruction misfetch penalties; for example, Yeh et al. [16] propose using a 16KB BTB to improve prediction accuracy and reduce misfetch penalties. In fact, their BTB records a multitude of useful information to support wide issue processors. Wide issue processors fetch multiple instructions, roughly the size of a basic block. If a basic block address is ....

....as in the case of a BTB. There are myriad variations on the general idea; typically, the cache contains from 32 to 512 entries with varying degrees of associativity. The address of a branch site is used as a tag in the BTB, and matching data is used to predict the branch. As mentioned, Yeh [16] also uses this information to direct the instruction fetch for returns, unconditional jumps and indirect jumps. Other designs take the BTB and eliminate the site and target addresses from the table; hence the table only predicts the direction for conditional branches. These designs use the branch ....

[Article contains additional citation context not shown here]

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. In 19th Annual International Symposium on Microarchitecture, pages 129--139, Portland, Or, December 1992. ACM.


The Precomputed Branch Architecture - Calder, Grunwald (1999)

....Originally, BTBs were used as a mechanism for branch prediction, effectively predicting the prior outcome of a branch [21, 29, 32] and providing the target address. Researchers have proposed associating additional branch prediction information with each BTB entry to improve branch prediction [41], and a variation of this technique has been implemented in the Intel Pentium and PentiumPro architectures. The problem with this technique is the branch prediction information can only be used on a BTB hit, or when a branch address is found in the BTB. We call designs that associate branch ....

....simulation to compare the Precomputed Branch architecture to a design that makes aggressive use of branch target buffers. We simulated the decoupled branch architecture proposed in [5] because this architecture provides better overall branch performance than the coupled models proposed in [41] for the design space considered in this paper. In this section, we describe this architecture and follow that with a detailed description of the Precomputed Branch architecture. 3.1 A BTB based Instruction Fetch Architecture Figure 1 is a schematic representation of a conventional branch ....

[Article contains additional citation context not shown here]

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. In 25th International Symposium on Microarchitecture, pages 129--139, Portland, Or, December 1992. ACM.


Reducing Indirect Function Call Overhead In C++ Programs - Calder, Grunwald (1994)   (77 citations)

....subroutines called from a single indirect function call. This makes branches easier to predict. Some dynamic branch prediction mechanisms achieve 95-97% prediction accuracy [21, 27, 29]. This level of accuracy is needed for super scalar processors issuing several instructions per cycle [28]. The most relevant prior work was on predicting the destination of indirect function calls with hardware conducted by David Wall [26] while examining limits to instruction level parallelism. He states: Little work has been done on predicting the destinations of indirect jumps, but it might pay ....

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. In 19th Annual International Symposium on Microarchitecture, pages 129--139, Portland, Or, December 1992. ACM.


Managing Abstraction-Induced Complexity - David Keppel (1993)

....state. Fixed abstractions allow clients to perform branches without worrying about implementation details of how the processor pipeline is updated. Performance is typically good if branches are rare or the pipeline is short, but performance suffers with deep pipelines and frequent branches [YP92]. Fixed abstractions are generally simple to implement and use. They also provide control over just the operations that the client wants performed. Fixed implementations are also efficient when the implementation is a good match to the client's use of the service. However, fixed abstractions ....

....thread is mostly I/O bound [PS83]. Architectures provide adaptive branch prediction, where the hardware keeps information about each branch instruction. When a given instruction is executed, the predictor attempts to update pipeline state in a way that reflects common use of the branch instruction [HP90, YP92]. Adaptive systems generally perform well to the extent that they can deduce the needs of the client. However, adaptive systems are only as good as the tuning information that they deduce, and they can lag behind when they rely on past behavior to predict future needs. An adaptive system that ....

Tse-Yu Yeh and Yale N. Patt. A Comprehensive Instruction Fetch Mechanism for a Processor Supporting Speculative Execution. Proceedings of Micro-25 (0-8186-3175-9/92, IEEE), 1992.


Reducing Branch Costs via Branch Alignment - Calder, Grunwald (1994)   (41 citations)

....a form of branch alignment. Their variant of branch alignment only considered if-then-else constructs. Later, Bray and Flynn [4] extended the work of McFarling et al. while examining various branch target buffer (BTB) architectures. Yet, they also only examined if-then-else constructs. Yeh et al. [26] commented that with trace scheduling, taken branches could only be reduced from 62% of the executed conditional branches to 50% of executed conditional branches. The earlier study by Hwu and Chang [18] showed a 58% fall through rate after branch alignment. The papers by McFarling and ....

....prediction, such as branch target buffers (BTB) and pattern history tables (PHT) to accurately predict 90-95% of the branches. Originally, BTBs were used as a mechanism for branch prediction, effectively predicting the prior behavior of a branch; even small BTBs were found to be very effective [4, 17, 20, 22, 26]. The Intel Pentium is an example of a current architecture using BTBs; it has a 256 entry BTB organized as a 64 line four way associative cache. Only branches that are taken are entered into the BTB. If a branch address appears in the BTB, the stored address is used to fetch future ....

[Article contains additional citation context not shown here]

Tse-Yu Yeh and Yale N. Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. In 25th Annual International Symposium on Microarchitecture, pages 129--139, Portland, Or, December 1992. ACM.

CiteSeer.IST - Copyright Penn State and NEC