
Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors (1991)  (249 citations)
Todd Mowry, Anoop Gupta
Journal of Parallel and Distributed Computing




Abstract: The large latency of memory accesses is a major obstacle in obtaining high processor utilization in large-scale shared-memory multiprocessors. Although the provision of coherent caches in many recent machines has alleviated the problem somewhat, cache misses still occur frequently enough that they significantly lower performance. In this paper we evaluate the effectiveness of non-binding software-controlled prefetching, as proposed in the Stanford DASH Multiprocessor, to address this problem...

Cited by:
Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)
Optimizing Compiler for a CELL Processor - Eichenberger, O'Brien, O'Brien.. (2005)
Software Methods to Improve Data Locality and Cache Behavior - Beyls (2004)

Similar documents (at the sentence level):
7.1%:   Comparative Evaluation of Latency Reducing and.. - Gupta, Hennessy.. (1991)

Active bibliography (related documents):
0.5:   Performance Evaluation of Memory Consistency Models.. - Gharachorloo, Gupta.. (1991)
0.5:   Architectural Support for Parallel Reductions in.. - Garzaran.. (2001)
0.3:   Two Techniques to Enhance the Performance of Memory.. - Kourosh Gharachorloo (1991)

Similar documents based on text:
0.3:   Optimizing Supercompilers for - Supercomputers The Mit
0.3:   Compiler-Based Prefetching for Recursive Data Structures - Luk (1996)
0.3:   The Interaction of Software Prefetching with ILP.. - Ranganathan, Pai.. (1997)

Related documents from co-citation:
35:   Design and evaluation of a compiler algorithm for prefetching - Mowry, Lam et al. - 1992
28:   The Stanford Dash Multiprocessor - Lenoski, Laudon et al. - 1992
27:   Lockup-Free Instruction Fetch/Prefetch Cache Organization - Kroft - 1981

BibTeX entry:

Todd Mowry and Anoop Gupta. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 12:87--106, June 1991. http://citeseer.ist.psu.edu/mowry91tolerating.html

@article{ mowry91tolerating,
    author = "T. Mowry and A. Gupta",
    title = "Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors",
    journal = "Journal of Parallel and Distributed Computing",
    volume = "12",
    number = "2",
    publisher = "Academic Press",
    address = "San Diego, New York, Boston, London, Sydney, Tokyo, Toronto",
    pages = "87--106",
    year = "1991",
    url = "citeseer.ist.psu.edu/mowry91tolerating.html" }
Citations (may not include all citations):
357   The directory-based cache coherence protocol for the DASH mul.. - Lenoski, Laudon et al. - 1990
213   Weak ordering - a new definition - Adve, Hill - 1990
212   April: A processor architecture for multiprocessing - Agarwal, Lim et al. - 1990
130   Memory consistency and event ordering in scalable shared-mem.. - Gharachorloo, Lenoski et al. - 1990
83   Compiler-Directed Data Prefetching in Multiprocessors with M.. - Gornish, Granston et al. - 1990
72   MASA: A multithreaded processor architecture for parallel sy.. - Halstead, Tetsuya - 1988
48   Evaluating the performance of four snooping cache coherency .. - Eggers, Katz - 1989
48   Portable Programs for Parallel Processors - Lusk, Overbeek - 1987
42   Lockup-free instruction fetch/prefetch cache organization - Kroft - 1981
36   Cache coherence protocols: Evaluation using a multiprocessor.. - Archibald, Baer - 1986
33   Parallel MIMD Computation: The HEP Supercomputer and Its Ap.. - Kowalik - 1985
31   Cache memories, Computing Surveys - Smith - 1982
31   Data prefetching in shared memory multiprocessors - Lee, Yew et al. - 1987
21   Design of Scalable Shared-Memory Multiprocessors: The DASH A.. - Lenoski, Gharachorloo et al. - 1990
20   Multiprocessor cache design considerations - Lee, Yew et al. - 1987
15   Concurrent miss resolution in multiprocessor caches - Scheurich, Dubois - 1988
11   Software Methods for Improvement of Cache Performance on Sup.. - Porterfield - 1989
8   Technical Report CSL-TR - Goldschmidt, Davis et al. - 1990
7   and improvement of the cache behavior of shared data in cach.. - Torrellas, Lam et al. - 1990
7   Vectorization of a particle simulation method for hypersonic.. - McDonald, Baganoff - 1988
6   Parallel Distributed-Time Logic Simulation - Soule, Gupta - 1989
5   and event ordering in multiprocessors - Dubois, Scheurich et al. - 1988
5   The Effectiveness of Caches and Data Prefetch Buffers in La.. - Lee - 1987
4   Performance evaluation of memory consistency models for shar.. - Gharachorloo, Gupta et al. - 1991
2   Experimental Parallel Computing Architectures: Volume 1 - Sp.. - Kuck, Davidson et al. - 1987





Documents on the same site (http://www.cs.cmu.edu/~tcm/Papers.html):
Predicting Data Cache Misses in Non-Numeric Applications.. - Mowry, Luk (1997)
Automatic Compiler-Inserted I/O Prefetching for.. - Mowry, Demke, Krieger (1996)
Informing Loads: Enabling Software To Observe And.. - Horowitz.. (1995)
