Abstract: Techniques that can cope with the large latency of memory accesses
are essential for achieving high processor utilization in large-scale
shared-memory multiprocessors. In this paper, we consider four architectural
techniques that address the latency problem: (i) hardware
coherent caches, (ii) relaxed memory consistency, (iii) software-controlled
prefetching, and (iv) multiple-context support. While
some studies of benefits of the individual techniques have been
done, no study evaluates all of...
Cited by:
Memory Latency Reduction via Data Prefetching and Data Forwarding .. - Poulsen (1994)
Fast Accurate Simulation of Large Shared-Memory Multiprocessors Revised
High-Performance Frontends for Trace Processors - Jacobson (1999)
Similar documents (at the sentence level):
10.3%: Tolerating Latency Through Software-Controlled Prefetching in.. - Mowry, Gupta (1991)
5.3%: Performance Evaluation of Memory Consistency Models.. - Gharachorloo, Gupta.. (1991)
5.3%: Memory Consistency Models for Shared-Memory Multiprocessors - Gharachorloo (1995)
Active bibliography (related documents):
0.4: Exploiting Thread-Level Parallelism On . . . - Lo (1998)
0.3: Architectural and Implementation Tradeoffs in the Design of.. - James Laudon (1992)
0.2: Balanced Multithreading: Increasing Throughput via a.. - Tune, Kumar, Tullsen, .. (2004)
Similar documents based on text:
0.3: A Comparative Evaluation of Software Techniques to Hide Memory.. - Lizy Kurian (1995)
0.3: Optimizing Supercompilers for Supercomputers (The MIT Press)
0.3: Comparative Evaluation of Latency Tolerance Techniques for.. - Mowry, Chan, Lo (1998)
Related documents from co-citation:
46: Tolerating latency through software-controlled prefetching in shared-memory mult.. - Mowry, Gupta - 1991
28: SPLASH: Stanford parallel applications for shared memory - Singh, Weber et al. - 1992
24: How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Progr.. - Lamport - 1979
BibTeX entry:
Anoop Gupta, John Hennessy, Kourosh Gharachorloo, Todd Mowry, and Wolf-Dietrich Weber. Comparative evaluation of latency reducing and tolerating techniques. In Proceedings of the 18th International Symposium on Computer Architecture, pages 254--263. IEEE, May 1991. http://citeseer.ist.psu.edu/gupta91comparative.html
@inproceedings{ gupta91comparative,
author = "A. Gupta and J. Hennessy and K. Gharachorloo and T. Mowry and W.-D. Weber",
title = "Comparative Evaluation of Latency Reducing and Tolerating Techniques",
booktitle = "Proceedings of the 18th International Symposium on Computer Architecture ({ISCA})",
journal = "ACM Computer Architecture News, SIGARCH",
volume = "19",
number = "3",
publisher = "ACM Press",
address = "New York, NY",
pages = "254--263",
year = "1991",
url = "citeseer.ist.psu.edu/gupta91comparative.html" }
Citations (may not include all citations):
468: Memory consistency and event ordering in scalable shared-mem.. - Gharachorloo, Lenoski et al. - 1990
357: The directory-based cache coherence protocol for the DASH mu.. - Lenoski, Laudon et al. - 1990
249: Tolerating latency through software-controlled prefetching in.. - Mowry, Gupta - 1991
213: Weak ordering - A new definition - Adve, Hill - 1990
212: April: A processor architecture for multiprocessing - Agarwal, Lim et al. - 1990
165: Memory access buffering in multiprocessors - Dubois, Scheurich et al. - 1986
157: Architecture and applications of the HEP multiprocessor comp.. - Smith - 1981
155: Cache coherence protocols: Evaluation using a multiprocessor.. - Archibald, Baer - 1986
107: Software Methods for Improvement of Cache Performance on Sup.. - Porterfield - 1989
92: Performance evaluation of memory consistency models for shar.. - Gharachorloo, Gupta et al. - 1991
90: The IBM research parallel processor prototype - Pfister, Brantley et al. - 1985
83: Compiler-directed data prefetching in multiprocessors with me.. - Gornish, Granston et al. - 1990
72: MASA: A multithreaded processor architecture for parallel sy.. - Halstead, Fujita - 1988
55: Exploring the benefits of multiple hardware contexts in a mu.. - Weber, Gupta - 1989
48: Portable Programs for Parallel Processors - Lusk, Overbeek - 1987
48: Evaluating the performance of four snooping cache coherency .. - Eggers, Katz - 1989
43: Performance tradeoffs in multithreaded processors - Agarwal - 1989
42: Lockup-free instruction fetch/prefetch cache organization - free, prefetch et al. - 1981
31: Data prefetching in shared memory multiprocessors - Lee, Yew et al. - 1987
17: The Effectiveness of Caches and Data Prefetch Buffers in Lar.. - Lee - 1987
10: Technical Report no - Goodman, sequential - 1989
8: Technical Report CSL-TR - Goldschmidt, Davis et al. - 1990
7: The Butterfly parallel processor - Schmidt - 1987
7: and improvement of the cache behavior of shared data in cach.. - Torrellas, Lam et al. - 1990
7: Vectorization of a particle simulation method for hypersonic.. - McDonald, Baganoff - 1988
6: Parallel distributed-time logic simulation - Soule, Gupta - 1989
6: Hierarchical cache/bus architecture shared memory multiprocess.. - Hierarchical, architecture et al. - 1987
6: Toward dataflow/von Neumann hybrid architecture - dataflow, architecture et al. - 1988
4: Analysis of multithreaded architectures for parallel computi.. - Saavedra-Barrera, Culler et al. - 1990
1: How to make a multiprocessor computer that correctly execute.. - Lamport - 1979
Documents on the same site (http://www.eecg.toronto.edu/~tcm/Papers.html):
Informing Loads: Enabling Software To Observe And.. - Horowitz.. (1995)
Compiler-Based Prefetching for Recursive Data Structures - Luk (1996)
Informing Memory Operations: Providing Memory Performance.. - Horowitz (1996)
CiteSeer.IST - Copyright Penn State and NEC