  Relaxed Consistency and Coherence Granularity in DSM Systems: A Performance Evaluation (1997) [44 citations — 7 self]

by Yuanyuan Zhou, Liviu Iftode, Jaswinder Pal Singh, Kai Li, Brian R. Toonen, Ioannis Schoinas, Mark D. Hill, David A. Wood
In Proceedings of the 6th ACM Symposium on Principles and Practice of Parallel Programming
http://www.cs.princeton.edu/~yzhou/paper/ppopp97.ps

Abstract:

During the past few years, two main approaches have been taken to improve the performance of software shared memory implementations: relaxing consistency models and providing fine-grained access control. Their performance tradeoffs, however, are not well understood. This paper studies these tradeoffs on a platform that provides access control in hardware but runs coherence protocols in software. We compare the performance of three protocols across four coherence granularities, using 12 applications on a 16-node cluster of workstations. Our results show that no single combination of protocol and granularity performs best for all the applications. The combination of a sequentially consistent (SC) protocol and fine granularity works well with 7 of the 12 applications. The combination of a multiple-writer, home-based lazy release consistency (HLRC) protocol and page granularity works well with 8 out of the 12 applications. For applications that suffer performance losses in moving to coarser granularity under sequential consistency, the performance can usually be regained quite effectively using relaxed protocols, particularly HLRC. We also find that the HLRC protocol performs substantially better than a single-writer lazy release consistent (SW-LRC) protocol at coarse granularity for many irregular applications. For our applications and platform, when we use the original versions of the applications ported directly from hardware-coherent shared memory, we find that the SC protocol with 256-byte granularity performs best on average. However, when the best versions of the applications are compared, the balance shifts in favor of HLRC at page granularity.

Citations

847 Memory coherence in shared virtual memory systems – Li, Hudak - 1989
801 How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs – Lamport - 1979
784 Myrinet: A Gigabit-per-second Local Area Network – Boden, Cohen, et al. - 1995
637 Memory consistency and event ordering in scalable shared-memory multiprocessors – Gharachorloo, Lenoski, et al. - 1990
530 Implementation and performance of Munin – Carter, Bennett, et al. - 1991
477 TreadMarks: Distributed shared memory on standard workstations and operating systems – Keleher, Dwarkadas, et al. - 1994
360 The Midway distributed shared memory system – Bershad, Zekauskas, et al. - 1993
323 Tempest and Typhoon: User-Level Shared Memory – Reinhardt, Larus, et al. - 1994
291 High Performance Messaging on Workstations: Illinois Fast Messages (FM) for Myrinet – Pakin, Lauria, et al. - 1995
202 Shasta: A Low Overhead, Software-Only Approach for Supporting Fine-Grain Shared Memory – Scales, Gharachorloo, et al. - 1996
179 A Shared Virtual Memory System for Parallel Computing – Li - 1988
137 Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems – Zhou, Iftode, et al. - 1996
133 Scope consistency: A bridge between release consistency and entry consistency – Iftode, Singh, et al. - 1996
121 Adaptive software cache management for distributed shared memory architectures – Bennett, Carter, et al. - 1990
107 Analysis of cache invalidation patterns in multiprocessors – Weber, Gupta - 1989
99 The Relative Importance of Concurrent Writers and Weak Consistency Models – Keleher - 1996
93 Improving Release-Consistent Shared Virtual Memory Using Automatic Update – Iftode, Dubnicki, et al. - 1996
82 SoftFLASH: Analyzing the Performance of Clustered Distributed Virtual Shared Memory – Erlichson, Nuckolls, et al. - 1996
65 Lazy consistency for software distributed shared memory – Keleher, Cox, et al. - 1992
60 Application restructuring and performance portability across shared virtual memory and hardware-coherent multiprocessors – Jiang, Shan, et al. - 1997
54 Understanding Application Performance on Shared Virtual Memory Systems – Iftode, Singh, et al.
49 LogP Performance Assessment of Fast Network Interfaces – Culler, Liu, et al.
28 Implementing Fine-Grain Distributed Shared Memory On Commodity SMP Workstations – Schoinas, Falsafi, et al. - 1996
16 Typhoon-zero implementation: The vortex module – Pfile - 1995
15 A coherency model for virtually shared memory – Borrmann, Herdieckerhoff - 1990
8 Fine-grain Access for Distributed Shared Memory – Schoinas, Falsafi, et al. - 1994
2 Typhoon-Zero Implementation: The Vortex Module – Robert - 1995
2 Performance Evaluation of Two Home-Based Lazy Release Consistency Protocols for Shared Virtual Memory Systems – Zhou, Iftode, et al. - 1996
1 Decoupled Hardware Support for Distributed Shared Memory – Reinhardt, Pfile, et al. - 1996