Abstract: Software-controlled data prefetching is a promising technique for
improving the performance of the memory subsystem to match
today's high-performance processors. While prefetching is useful in
hiding the latency, issuing prefetches incurs an instruction overhead
and can increase the load on the memory subsystem. As a result,
care must be taken to ensure that such overheads do not exceed the
benefits.
Cited by:
Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)
Data Trace Cache: An Application Specific Cache.. - Ramaswamy, Sreeram.. (2005)
Software Methods to Improve Data Locality and Cache Behavior - Beyls (2004)
Active bibliography (related documents):
0.5: Chief: A Simulation Environment for Studying Parallel Systems - Pavlos Konas (1994)
0.4: A Blocked All-Pairs Shortest-Paths Algorithm - Gayathri Venkataraman Sartaj
0.4: Dynamic Access Ordering for Symmetric Shared-Memory Multiprocessors - McKee (1994)
Similar documents based on text:
0.4: Hybrid Compiler/Hardware Prefetching for Multiprocessors.. - Skeppstedt, Dubois (1997)
0.4: Compiler and Hardware Support for Automatic Instruction.. - Mowry, Luk (1998)
0.4: A Compiler-Assisted Data Prefetch Controller - VanderWiel, Lilja
Related documents from co-citation:
31: Tolerating latency through software-controlled prefetching in shared-memory mult.. - Mowry, Gupta - 1991
27: Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Assoc.. - Jouppi - 1990
23: Software prefetching - Callahan, Kennedy et al. - 1991
BibTeX entry:
T. C. Mowry, M. S. Lam, and A. Gupta. Design and evaluation of a compiler algorithm for prefetching. In ASPLOS-V, pages 62--73, October 1992. http://citeseer.ist.psu.edu/mowry92design.html
@inproceedings{ mowry92design,
author = "Todd C. Mowry and Monica S. Lam and Anoop Gupta",
title = "Design and evaluation of a compiler algorithm for prefetching",
booktitle = "Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems ({ASPLOS})",
journal = "SIGPLAN Notices",
volume = "27",
number = "9",
publisher = "ACM Press",
address = "New York, NY",
isbn = "0-89791-534-8",
pages = "62--73",
year = "1992",
url = "http://citeseer.ist.psu.edu/mowry92design.html" }
Citations (may not include all citations):
2441: Johns Hopkins University Press - Golub, Van Loan - 1989
496: Splash: Stanford parallel applications for shared memory - Singh, Weber et al. - 1991
474: A data locality optimizing algorithm - Wolf, Lam - 1991
376: The cache performance and optimizations of blocked algorithm.. - Lam, Rothberg et al. - 1991
353: Software pipelining: An effective scheduling technique for v.. - Lam - 1988
249: Tolerating latency through software-controlled prefetching in.. - Mowry, Gupta - 1991
217: NASA Ames Research Center - Bailey, Barton et al. - 1991
216: Strategies for cache and local memory management by global p.. - Gannon, Jalby et al. - 1988
176: Some Scheduling Techniques and an Easily Schedulable Horizon.. - Rau, Glaeser - 1981
149: Software prefetching - Callahan, Kennedy et al. - 1991
130: A VLIW architecture for a trace scheduling compiler - Colwell, Nix et al. - 1987
122: An effective on-chip preloading scheme to reduce data access.. - Baer, Chen - 1991
121: Architecture for software-controlled data prefetching - Klaiber, Levy - 1991
109: Comparative evaluation of latency reducing and tolerating te.. - Gupta, Hennessy et al. - 1991
107: Software Methods for Improvement of Cache Performance on Sup.. - Porterfield - 1989
83: Compiler-Directed Data Prefetching in Multiprocessors with Me.. - Gornish, Granston et al. - 1990
82: On estimating and enhancing cache effectiveness - Ferrante, Sarkar et al. - 1991
50: Data access microarchitectures for superscalar processors wi.. - Chen, Mahlke et al. - 1991
49: Overlapped loop support in the Cydra - Dehnert, Hsu et al. - 1989
42: Lockup-free instruction fetch/prefetch cache organization - Kroft - 1981
41: Sharlit: A tool for building optimizers - Tjiang, Hennessy - 1992
41: The impact of hierarchical memory systems on linear algebra.. - Gallivan, Jalby et al. - 1987
28: Technical Report CSL-TR - Smith, pixie - 1991
17: The Effectiveness of Caches and Data Prefetch Buffers in Lar.. - Lee - 1987
11: Compile time analysis for data prefetching - Gornish - 1989
9: The influence of memory hierarchy on algorithm organization:.. - Gannon, Jalby - 1987
8: The organization of matrices and matrix operations in a page.. - McKeller, Coffman - 1969
4: Automatic program transformations for virtual memory compute.. - Abu-Sufah, Kuck et al. - 1979
Documents on the same site (http://www.eecg.toronto.edu/~tcm/CourseECE1760.html):
The Performance Impact of Flexibility in the Stanford FLASH.. - Heinrich (1994)
Synchronization and Communication in the T3E Multiprocessor - Scott (1996)
An Integrated Compile-Time/Run-Time Software.. - Dwarkadas, Cox.. (1996)
CiteSeer.IST - Copyright Penn State and NEC