  A performance study of memory consistency models (1992) [42 citations — 2 self]

by Richard N. Zucker, Jean-Loup Baer
In Proceedings of the 19th Annual International Symposium on Computer Architecture
http://web.cps.msu.edu/~wrightr7/cps822/zucker.ps

Abstract:

Recent advances in technology are such that the speed of processors is increasing faster than memory latency is decreasing. Therefore the relative cost of a cache miss is becoming more important. However, the full cost of a cache miss need not be paid every time in a multiprocessor. The frequency with which the processor must stall on a cache miss can be reduced by using a relaxed model of memory consistency. In this paper, we present the results of instruction-level simulation studies on the relative performance benefits of using different models of memory consistency. Our vehicle of study is a shared-memory multiprocessor with processors and associated write-back caches connected to global memory modules via an Omega network. The benefits of the relaxed models, and their increasing hardware complexity, are assessed with varying cache size, line size, and number of processors. We find that substantial benefits can be accrued by using relaxed models but the magnitudes of the benefits depend on the architecture being modeled, the benchmarks, and how the code is scheduled. We did not find any major difference in levels of improvement among the various relaxed models.

Citations

801 How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs – Lamport - 1979
637 Memory consistency and event ordering in scalable shared-memory multiprocessors – Gharachorloo, Lenoski, et al. - 1990
487 The cache performance and optimizations of blocked algorithms – Lam, Rothberg, et al. - 1991
338 The Directory-Based Cache Coherence Protocol for the Dash Multiprocessor – Lenoski - 1990
274 Lockup-free instruction fetch/prefetch cache organisation – Kroft - 1981
264 Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors – Mowry, Gupta - 1991
236 Cooperating Sequential Processes – Dijkstra - 1968
220 A New Solution to Coherence Problems in Multicache Systems – Censier, Feautrier - 1978
207 Weak Ordering -- A New Definition – Adve, Hill - 1990
196 Memory access buffering in multiprocessors – Dubois, Scheurich, et al. - 1986
151 Reducing memory latency via nonblocking and prefetching caches – Chen, Baer - 1992
132 Performance evaluation of memory consistency models for shared-memory multiprocessors – Gharachorloo, Gupta, et al. - 1991
109 Comparative evaluation of latency reducing and tolerating techniques – Gupta, Hennessy, et al. - 1991
106 Two techniques to enhance the performance of memory consistency models – Gharachorloo, Gupta, et al. - 1991
80 The effect of sharing on the cache and bus performance of parallel programs – Eggers, Katz - 1989
43 Programming for different memory consistency models – Gharachorloo, Adve, et al. - 1992
22 Implementing sequential consistency in cache-based systems – Adve, Hill - 1990
12 PCP: A parallel extension of C that is 99% fat free – Brooks - 1988
11 The Cerberus Multiprocessor Simulator – Axelrod, Darmohray - 1989
6 Gaussian techniques on shared-memory multiprocessors – Darmohray - 1988
3 Parallel Quicksand: Sorting on the Sequent – Kahan, Ruzzo - 1991
2 On synchronization patterns of parallel programs – Baer, Zucker - 1991
1 32 User's Guide. A Read/Write Statistics Program – Ridge