Abstract: Instruction cache miss latency is becoming an increasingly important performance bottleneck, especially
for commercial applications. Although instruction prefetching is an attractive technique for
tolerating this latency, we find that existing prefetching schemes are insufficient for modern superscalar
processors since they fail to issue prefetches early enough (particularly for non-sequential accesses).
To overcome these limitations, we propose a new instruction prefetching technique whereby...
Context of citations to this paper:
...[Figure 2. Example of prefetch insertion.] ...prefetch indirect jumps [6]. However, since our experimental results indicate that the marginal performance benefit of supporting indirect prefetches is quite small...
...instructions ahead of branch points if they are not covered by next N line prefetching and if they are estimated to cause cache misses [21]. Luk and Mowry compare their hybrid technique to next N line prefetching on its own and Markov prefetching. Next N line prefetching...
Cited by:
Cache Prefetching - Berg (2002)
Cooperative Prefetching: Compiler and Hardware Support for.. - Luk, Mowry (1998)
Active bibliography (related documents):
0.2: Code Placement using Temporal Profile Information - Gloy (1998)
0.2: Branch History Guided Instruction Prefetching - Srinivasan, Davidson, Tyson.. (2001)
0.1: High-Performance Frontends for Trace Processors - Jacobson (1999)
Similar documents based on text:
0.4: "prefetching For Reducing Cache Misses" - Prepared By Ayse
0.4: Design and Evaluation of a Compiler Algorithm for Prefetching - Mowry, Lam, Gupta (1992)
0.1: Tolerating Latency by Prefetching Java Objects - Cahoon, McKinley (1999)
Related documents from co-citation:
2: Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Assoc.. - Jouppi - 1990
2: Prefetching using markov predictors - Joseph, Grunwald - 1997
2: Contrasting characteristics and cache performance of technical and multi-user co.. - Maynard, Donnelly et al. - 1994
BibTeX entry:
C.-K. Luk and T. C. Mowry. Compiler and hardware support for automatic instruction prefetching: A cooperative approach. Technical Report CMU-CS-98-140, Carnegie Mellon University, June 1998. http://citeseer.ist.psu.edu/mowry98compiler.html
@misc{ luk98compiler,
author = "C. Luk and T. Mowry",
title = "Compiler and hardware support for automatic instruction prefetching: A
cooperative approach",
text = "C.-K. Luk and T. C. Mowry. Compiler and hardware support for automatic
instruction prefetching: A cooperative approach. Technical Report CMU-CS-98-140,
Carnegie Mellon University, June 1998.",
year = "1998",
url = "citeseer.ist.psu.edu/mowry98compiler.html" }
Citations (may not include all citations):
866: Techniques and Tools - Aho, Sethi et al. - 1986
443: Improving direct-mapped cache performance by the addition of.. - Jouppi - 1990
222: MIPS RISC Architecture - Kane, Heinrich - 1992
136: superscalar microprocessor - Yeager - 1996
104: Prefetching using markov predictors - Joseph, Grunwald - 1997
87: Computing Surveys - Smith - 1982
67: Contrasting characteristics and cache performance of technic.. - Maynard, Donnelly et al. - 1994
59: Branch history table prediction of moving target branches du.. - Kaeli, Emma - 1991
36: Sequential program prefetching in memory hierarchies - Smith - 1978
34: Prefetching in supercomputer instruction caches - Smith, Hsu - 1992
29: Data prefetching on the HP PA - Santhanam, Gornish et al. - 1997
29: Compiler techniques for data prefetching on the PowerPC - Bernstein, Cohen et al. - 1995
11: Instruction prefetching of system codes with layout optimize.. - Xia, Torrellas - 1996
4: Wrong-path prefetching - Pierce, Mudge - 1996
2: The Postgres95 User Manuel v - Yu, Chen - 1996