
Design and Evaluation of a Compiler Algorithm for Prefetching (1992)  (344 citations)
Todd C. Mowry, Monica S. Lam, Anoop Gupta
SIGPLAN Notices




 


Abstract: Software-controlled data prefetching is a promising technique for improving the performance of the memory subsystem to match today's high-performance processors. While prefetching is useful in hiding the latency, issuing prefetches incurs an instruction overhead and can increase the load on the memory subsystem. As a result, care must be taken to ensure that such overheads do not exceed the benefits.
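
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the compiler algorithm from the paper; it is a hand-written loop that uses the GCC/Clang builtin `__builtin_prefetch`, and the constants `LINE_DOUBLES` and `PREFETCH_AHEAD` are hypothetical values chosen only for illustration (64-byte cache lines and a fixed prefetch distance are assumptions). The loop issues at most one prefetch per assumed cache line instead of one per element, which is one plausible way to limit the instruction overhead.

```c
#include <stddef.h>

/* Hypothetical tuning values, not taken from the paper. */
#define LINE_DOUBLES   8   /* assumes 64-byte cache lines and 8-byte doubles */
#define PREFETCH_AHEAD 64  /* assumed prefetch distance, in elements */

/* Sum an array while issuing at most one prefetch per assumed cache line. */
double sum_with_prefetch(const double *a, size_t n)
{
    double total = 0.0;
    size_t i = 0;

    /* Main loop: each iteration consumes one line's worth of elements. */
    for (; i + LINE_DOUBLES <= n; i += LINE_DOUBLES) {
        /* Prefetch for reading (0) with high temporal locality (3).
           Skipped near the end so no address past the array is formed. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);

        for (size_t j = 0; j < LINE_DOUBLES; j++)
            total += a[i + j];
    }

    /* Remainder loop for the last partial line; no prefetches needed. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

Whether such a loop is faster than the plain version depends on the machine and the data size; for arrays that already fit in the cache, the extra prefetch instructions are pure overhead.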

Cited by:
Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)
Data Trace Cache: An Application Specific Cache.. - Ramaswamy, Sreeram.. (2005)
Software Methods to Improve Data Locality and Cache Behavior - Beyls (2004)

Active bibliography (related documents):
0.5:   Chief: A Simulation Environment for Studying Parallel Systems - Pavlos Konas (1994)
0.4:   A Blocked All-Pairs Shortest-Paths Algorithm - Gayathri Venkataraman Sartaj
0.4:   Dynamic Access Ordering for Symmetric Shared-Memory Multiprocessors - McKee (1994)

Similar documents based on text:
0.4:   Hybrid Compiler/Hardware Prefetching for Multiprocessors.. - Skeppstedt, Dubois (1997)
0.4:   Compiler and Hardware Support for Automatic Instruction.. - Mowry, Luk (1998)
0.4:   A Compiler-Assisted Data Prefetch Controller - VanderWiel, Lilja

Related documents from co-citation:
31:   Tolerating latency through software-controlled prefetching in shared-memory mult.. - Mowry, Gupta - 1991
27:   Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Assoc.. - Jouppi - 1990
23:   Software prefetching - Callahan, Kennedy et al. - 1991

BibTeX entry:

T. C. Mowry, M. S. Lam, and A. Gupta. Design and evaluation of a compiler algorithm for prefetching. In ASPLOS-V, pages 62--73, October 1992. http://citeseer.ist.psu.edu/mowry92design.html

@inproceedings{ mowry92design,
    author = "Todd C. Mowry and Monica S. Lam and Anoop Gupta",
    title = "Design and evaluation of a compiler algorithm for prefetching",
    booktitle = "Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems ({ASPLOS})",
    journal = "SIGPLAN Notices",
    volume = "27",
    number = "9",
    publisher = "ACM Press",
    address = "New York, NY",
    isbn = "0-89791-534-8",
    pages = "62--73",
    year = "1992",
    url = "citeseer.ist.psu.edu/mowry92design.html" }
Citations (may not include all citations):
2441   Johns Hopkins University Press - Golub, Van Loan - 1989
496   Splash: Stanford parallel applications for shared memory - Singh, Weber et al. - 1991
474   A data locality optimizing algorithm - Wolf, Lam - 1991
376   The cache performance and optimizations of blocked algorithm.. - Lam, Rothberg et al. - 1991
353   Software pipelining: An effective scheduling technique for v.. - Lam - 1988
249   Tolerating latency through software-controlled prefetching in.. - Mowry, Gupta - 1991
217   NASA Ames Research Center - Bailey, Barton et al. - 1991
216   Strategies for cache and local memory management by global p.. - Gannon, Jalby et al. - 1988
176   Some Scheduling Techniques and an Easily Schedulable Horizon.. - Rau, Glaeser - 1981
149   Software prefetching - Callahan, Kennedy et al. - 1991
130   A VLIW architecture for a trace scheduling compiler - Colwell, Nix et al. - 1987
122   An effective on-chip preloading scheme to reduce data access.. - Baer, Chen - 1991
121   Architecture for software-controlled data prefetching - Klaiber, Levy - 1991
109   Comparative evaluation of latency reducing and tolerating te.. - Gupta, Hennessy et al. - 1991
107   Software Methods for Improvement of Cache Performance on Sup.. - Porterfield - 1989
83   Compiler-Directed Data Prefetching in Multiprocessors with Me.. - Gornish, Granston et al. - 1990
82   On estimating and enhancing cache effectiveness - Ferrante, Sarkar et al. - 1991
50   Data access microarchitectures for superscalar processors wi.. - Chen, Mahlke et al. - 1991
49   Overlapped loop support in the Cydra - Dehnert, Hsu et al. - 1989
42   Lockup-free instruction fetch/prefetch cache organization - Kroft - 1981
41   Sharlit: A tool for building optimizers - Tjiang, Hennessy - 1992
41   The impact of hierarchical memory systems on linear algebra .. - Gallivan, Jalby et al. - 1987
28   Technical Report CSL-TR - Smith, pixie - 1991
17   The Effectiveness of Caches and Data Prefetch Buffers in Lar.. - Lee - 1987
11   Compile time analysis for data prefetching - Gornish - 1989
9   The influence of memory hierarchy on algorithm organization:.. - Gannon, Jalby - 1987
8   The organization of matrices and matrix operations in a page.. - McKellar, Coffman - 1969
4   Automatic program transformations for virtual memory compute.. - Abu-Sufah, Kuck et al. - 1979





Documents on the same site (http://www.eecg.toronto.edu/~tcm/CourseECE1760.html):
The Performance Impact of Flexibility in the Stanford FLASH.. - Heinrich (1994)
Synchronization and Communication in the T3E Multiprocessor - Scott (1996)
An Integrated Compile-Time/Run-Time Software.. - Dwarkadas, Cox.. (1996)
