  Reducing Controller Contention in Shared-Memory Multiprocessors Using Combining and Two-Phase Routing

by Sarah A. M. Talbot, Paul H. J. Kelly
http://www.doc.ic.ac.uk/~samt/improving_tech.letter.ps.gz

Abstract:

In simple cache coherency protocols, serialisation can occur when many simultaneous accesses are made to data held in a single node, and when many accesses involve a common "home" node controller. This is ameliorated in various designs with a hierarchical or clustered structure. In this paper we investigate the idea of routing requests via an intermediate "proxy" node where combining is used to reduce contention. We present a hashing-based proxy placement scheme, and evaluate a "reactive" approach which invokes proxying only when contention occurs. Simulation results using various benchmarks show that the hotspot contention which occurs in pathological examples can be dramatically reduced, while performance on well-behaved applications is essentially unaffected.
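
The idea of hashing requests to a proxy and combining them there can be illustrated with a small sketch. This is not the authors' scheme: the abstract does not give the hash function or the cluster layout, so the function names (`proxy_for`, `combine_at_proxies`), the modulo-based clustering, and the parameters `num_nodes` and `num_clusters` are all assumptions made for illustration only.

```python
from collections import defaultdict


def proxy_for(requester, block_addr, num_nodes, num_clusters):
    """Pick a proxy node for a request (hypothetical hashing scheme).

    Assumption: nodes are split into clusters by `node % num_clusters`,
    and the block address selects one member of the requester's cluster.
    """
    cluster = requester % num_clusters
    members = [n for n in range(num_nodes) if n % num_clusters == cluster]
    return members[block_addr % len(members)]


def combine_at_proxies(requests, num_nodes, num_clusters):
    """Group (requester, block_addr) read requests by (proxy, block_addr).

    Each group stands for one combined request that the proxy would
    forward to the home node on behalf of all requesters in the group.
    """
    pending = defaultdict(list)
    for requester, block_addr in requests:
        proxy = proxy_for(requester, block_addr, num_nodes, num_clusters)
        pending[(proxy, block_addr)].append(requester)
    return dict(pending)


# Hot spot: all 16 nodes read the same block at the same time.
hot_block = 0x40
requests = [(node, hot_block) for node in range(16)]
combined = combine_at_proxies(requests, num_nodes=16, num_clusters=4)
print(len(requests), "requests combined into", len(combined))
```

Under these assumptions the home node controller handles one request per cluster rather than one per requester. A "reactive" variant would take the direct route by default and apply `proxy_for` only after contention at the home node has been detected.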
