Abstract: Despite large caches, main-memory access latencies still cause significant
performance losses in many applications. Numerous hardware and software
prefetching schemes have been proposed to tolerate these latencies.
Software prefetching typically provides better prefetch accuracy than hardware,
but is limited by prefetch instruction overheads and the compiler's
limited ability to schedule prefetches sufficiently far in advance to cover
level-two cache miss latencies. Hardware prefetching can be...
Cited by:
Improving Cache Locality for Thread-Level Speculation Systems - Fung (2005)
Performance Implications Of Future-Generation Memory Systems.. - Fertig (2003)
Active bibliography (related documents):
3.6: Guided Region Prefetching: A Cooperative.. - Wang, Burger.. (2003)
3.6: Next-Generation Memory Systems - Wang (2004)
0.6: Predictor-Directed Data Prefetching for Pointer-based Applications - Sair (2003)
Similar documents based on text:
0.7: Reducing DRAM Latencies with an Integrated Memory Hierarchy Design - Lin, Reinhardt et al. (2001)
0.4: A Static Filter for Reducing Prefetch Traffic - Srinivasan, Tyson, Davidson (1999)
0.4: An Automated Method for Software Controlled Cache Prefetching - Zucker, Lee, Flynn (1998)
Related documents from co-citation:
2: Prefetching using Markov predictors - Joseph, Grunwald - 1997
2: Effective hardware-based data prefetching for high-performance processors - Chen, Baer - 1995
BibTeX entry:
Z. Wang, D. Burger, K. S. McKinley, S. K. Reinhardt, and C. C. Weems, "Guided region prefetching: A cooperative hardware/software approach," in Proceedings of the 30th Annual International Symposium on Computer Architecture, 2003, pp. 388-397. http://citeseer.ist.psu.edu/598342.html
@misc{ wang03guided,
author = "Zhenlin Wang and Doug Burger and Kathryn S. McKinley and Steven K. Reinhardt
and Charles C. Weems",
title = "Guided Region Prefetching: A Cooperative Hardware/Software Approach",
text = "Z. Wang, D. Burger, K. S. McKinley, S. K. Reinhardt, and C. C. Weems, ``Guided
region prefetching: A cooperative hardware/software approach,'' in Proceedings
of the 30th Annual International Symposium on Computer Architecture, 2003,
pp. 388-397.",
year = "2003",
url = "citeseer.ist.psu.edu/598342.html" }
Citations (may not include all citations):
474: A data locality optimizing algorithm - Wolf, Lam - 1991
443: Improving direct-mapped cache performance by the addition of.. - Jouppi - 1990
344: Design and evaluation of a compiler algorithm for prefetchin.. - Mowry, Lam et al. - 1992
162: Improving data locality with loop transformations - McKinley, Carr et al. - 1996
149: Software prefetching - Callahan, Kennedy et al. - 1991
137: Compiler optimizations for improving data locality - Carr, McKinley et al. - 1994
121: An architecture for software-controlled data prefetching - Klaiber, Levy - 1991
104: Prefetching using Markov predictors - Joseph, Grunwald - 1997
104: Compiler-based prefetching for recursive data structures - Luk, Mowry - 1996
98: Evaluating stream buffers as a secondary cache replacement - Palacharla, Kessler - 1994
90: Reducing memory latency via non-blocking and prefetching cac.. - Chen, Baer - 1992
87: Computing Surveys - Smith - 1982
73: Dependence based prefetching for linked data structures - Roth, Moshovos et al. - 1998
67: The SimpleScalar tool set, version 2.0 - Burger, Austin - 1997
48: Speculative precomputation: Long-range prefetching of delinq.. - Collins, Wang et al. - 2001
46: Basic block distribution analysis to find periodic behavior .. - Sherwood, Perelman et al. - 2001
46: Precise miss analysis for program transformations with cache.. - Ghosh, Martonosi et al. - 1998
41: SPAID: Software prefetching in pointer- and call-intensive e.. - Lipasti, Schmidt et al. - 1995
38: Fixed and adaptive sequential prefetching in shared-memory m.. - Dahlgren, Dubois et al. - 1993
38: Effective jump-pointer prefetching for linked data structure.. - Roth, Sohi - 1999
37: Tolerating memory latency through software-controlled pre-ex.. - Luk - 2001
34: A prefetching technique for irregular accesses to linked dat.. - Karlsson, Dahlgren et al. - 2000
33: Load latency tolerance in dynamically scheduled processors - Srinivasan, Lebeck - 1999
32: Speeding up irregular applications in shared memory multipro.. - Zhang, Torrellas - 1995
28: Data prefetching by dependence graph precomputation - Annavaram, Patel et al. - 2001
22: Effectiveness of hardware-based stride and sequential prefet.. - Dahlgren, Stenstrom - 1995
21: Generalized correlation-based hardware prefetching - Charney, Reeves - 1995
20: Dead-block prediction and dead-block correlating prefetchers - Lai, Fide et al. - 2001
18: Reducing DRAM latencies with an integrated memory hierarchy .. - Lin, Reinhardt et al. - 2001
15: pull: Data movement for linked data structures - Yang, Lebeck - 1997
15: Predictor-directed stream buffers - Sherwood, Sair et al. - 2000
14: Data flow analysis for software prefetching linked data stru.. - Cahoon, McKinley - 2001
12: Using a user-level memory thread for correlation prefetching - Solihin, Lee et al. - 2002
12: Exploring the design space of future CMPs - Huh, Burger et al. - 2001
10: A compiler-assisted data prefetch controller - Vanderwiel, Lilja - 1999
10: Distributed predictive cache design for high performance mem.. - Alexander, Kedem - 1996
8: Design and evaluation of compiler algorithms for pre-executi.. - Kim, Yeung - 2002
6: content-directed data prefetching mechanism - Cooksey, Jordan et al. - 2002
6: Efficient discovery of regular stride patterns in irregular .. - Wu - 2002
6: Effective hardware based data prefetching - Chen, Baer - 1995
5: Memory-side prefetching for linked data structures - Hughes, Adve - 2001
3: integrated hardware/software scheme shared memory multiproces.. - Veidenbaum et al. - 1994
3: An effective programmable prefetch engine for high-performanc.. - Chen - 1995
3: An overview of the SPHINX speech recognition system - Lee, Hon et al. - 1990
3: Hybrid compiler/hardware prefetching multiprocessor using low.. - Dubois et al. - 1997
3: Simple and effective array prefetching for Java - Cahoon, McKinley - 2002
Documents on the same site (http://www.cs.utexas.edu/users/dburger/cv.html):
Parallelizing Appbt for a Shared-Memory Multiprocessor - Burger (1995)
The SimpleScalar Tool Set, Version 2.0 - Burger, Austin (1997)
Accuracy vs. Performance in Parallel Simulation of.. - Burger, Wood (1995)
CiteSeer.IST - Copyright Penn State and NEC