by Hazim Abdel-shafi, Sarita V. Adve, Vikram S. Adve
http://www-sal.cs.uiuc.edu/~vadve/Papers/hpca97remote-write.ps.gz
Abstract:

Prefetching is a widely used consumer-initiated mechanism to hide communication latency in shared-memory multiprocessors. However, prefetching is inapplicable or insufficient for some communication patterns such as irregular communication, pipelined loops, and synchronization. For these cases, a combination of two fine-grain, producer-initiated primitives (referred to as remote writes) is better able to reduce the latency of communication. This paper demonstrates experimentally that remote writes provide significant performance benefits in cache-coherent shared-memory multiprocessors with and without prefetching. Further, the combination of remote writes and prefetching is able to eliminate most of the memory system overhead in the applications, except misses due to cache conflicts.
