Identification And Optimization Of Sharing Patterns For Scalable Shared-Memory Multiprocessors (1998)  (4 citations)
Stefanos Kaxiras

View or download:
wisc.edu/galileo/kaxir...kaxiras2.ps.gz
wisc.edu/galileo/kaxir...kaxiras1.ps.gz

From:  wisc.edu/~arch/uwarch/index
From:  128.105.7.11/~kaxiras/kaxiras

Abstract: Distributed shared-memory architectures typically employ a directory-based protocol to maintain cache coherence. Identifying sharing patterns in parallel programs and applying specialized optimizations can increase cache-coherence protocol efficiency and yield performance improvements. In this thesis, I propose and study both optimizations to sharing patterns and techniques to identify sharing patterns. The main thrust of the thesis is GLOW, a comprehensive optimization for wide sharing---a...

Context of citations to this paper:

.... this extra traffic, the release of the lock is substantially slowed down by the fact that SCI must invalidate the sharers one at a time [Kax98]. Other coherence protocols typically use broadcast to invalidate sharers, which would improve the release performance of test test...

.... optimizations have been proposed to address some of these specific patterns [Per93, Kax98]. More general schemes have also been proposed, but they remain costly in hardware, they require on-chip modification, or large extension...

Cited by:
Memory Dependence Prediction - Andreas Ioannis Moshovos
The Coherence Predictor Cache: A Resource-Efficient and .. - Nilsson, Landin..
Hardware Prediction for Data Coherency of Scientific Codes on.. - Acquaviva, Jalby (2000)

Active bibliography (related documents):
1.3:   Kiloprocessor Extensions to SCI - Stefanos Kaxiras (1996)
1.2:   Kaxiras@cs.wisc.edu - Sc Edu
1.1:   Request Combining in Multiprocessors with Arbitrary.. - Lebeck, Sohi (1994)

Similar documents based on text:
0.1:   DataScalar: A Memory-Centric Approach to Computing - Stefanos Kaxiras Doug
0.1:   DataScalar Architectures and the SPSD Execution Model - Burger, Kaxiras, Goodman (1996)
0.1:   Simultaneous Multithreaded DSPs: Scaling from High.. - Kaxiras, Berenbaum..

Related documents from co-citation:
3:   Alternative implementations of two-level adaptive branch predictions - Yeh, Patt - 1992
3:   Cooperative Shared Memory: Software and Hardware for Scalable Multiprocessors - Hill, Larus et al. - 1992
2:   The NYU Ultracomputer -- Designing a MIMD Shared-Memory Parallel Machine - Gottlieb, Grishman et al. - 1983

BibTeX entry:

Stefanos Kaxiras. Identification and Optimization of Sharing Patterns for Scalable Shared-Memory Multiprocessors. PhD thesis, University of Wisconsin, Madison, WI, 1998. http://citeseer.ist.psu.edu/kaxiras98identification.html

@phdthesis{ kaxiras98identification,
  author = "Stefanos Kaxiras",
  title = "Identification and Optimization of Sharing Patterns for Scalable Shared-Memory
    Multiprocessors",
  school = "University of Wisconsin",
  address = "Madison, WI",
  year = "1998",
  url = "http://citeseer.ist.psu.edu/kaxiras98identification.html" }
Citations (may not include all citations):
3972   Introduction to Algorithms - Cormen, Leiserson et al. - 1990
723   Memory Coherence in Shared Virtual Memory Systems - Li, Hudak - 1986
723   Memory Coherence in Shared Virtual Memory Systems - Li, Hudak - 1989
606   How to Make a Multiprocessor Computer that Correctly Execute.. - Lamport - 1979
496   SPLASH: Stanford Parallel Applications for Shared Memory - Singh, Weber et al. - 1992
478   The Stanford DASH Multiprocessor - Lenoski - 1992
326   TreadMarks: Shared Memory Computing on Networks of Workstati.. - Amza - 1996
268   Tempest and Typhoon: User-Level Shared Memory - Reinhardt, Larus et al. - 1994
241   A Study of Branch Prediction Strategies - Smith - 1981
222   The SGI Origin: A cc-NUMA Highly Scalable Server - Laudon, Lenoski - 1997
213   Weak Ordering - A New Definition - Adve, Hill - 1990
204   Munin: Distributed Shared Memory Based on Type-Specific Memo.. - Carter, Bennett et al. - 1990
195   A New Solution to Coherence Problems in Multicache Systems - Censier, Feautrier - 1978
191   The MIT Alewife Machine: A Large-Scale Distributed-Memory Mu.. - Agarwal - 1992
173   'Hot Spot' Contention and Combining in Multistage Interconnec.. - Pfister, Norton - 1985
170   LimitLESS Directories: A Scalable Cache Coherence Scheme - Chaiken, Kubiatowicz et al. - 1991
166   The Wisconsin Wind Tunnel: Virtual Prototyping of Parallel C.. - Reinhardt, Hill et al. - 1993
159   The NYU Ultracomputer -- Designing a MIMD Shared-Memory Paralle.. - Gottlieb, Grishman et al. - 1983
156   An Evaluation of Directory Schemes for Cache Coherence - Agarwal, Horowitz et al. - 1988
147   Alternative Implementations of Two-Level Adaptive Branch Pre.. - Yeh, Patt - 1992
145   DDM --- A Cache-Only Memory Architecture - Hagersten, Landin et al. - 1992
131   Fine-grain Access Control for Distributed Shared Memory - Schoinas - 1994
130   Memory Consistency and Event Ordering in Scalable Shared-Mem.. - Gharachorloo, Lenoski et al. - 1990
127   Highly Parallel Computing - Almasi, Gottlieb - 1994
114   CRAY T3D System Architecture Overview - Inc, Architecture - 1993
102   Dynamic Speculation and Synchronization of Data Dependences - Moshovos, Breach et al. - 1997
96   A Characterization of Sharing in Parallel Programs and its A.. - Eggers, Katz - 1988
95   Application-Specific Protocols for User-Level Shared Memory - Falsafi, Lebeck et al. - 1994
85   Cache Write Policies and Performance - Jouppi - 1993
81   A Case for Networks of Workstations: NOW - Anderson, Culler et al. - 1994
77   STiNG: A CC-NUMA Computer System for the Commercial Marketpl.. - Lovett, Clapp - 1996
76   The Wisconsin Multicube: A New Large-Scale Cache-Coherent Mul.. - Goodman, Woest - 1988
70   Adaptive Cache Coherency for Detecting Migratory Shared Data - Cox, Fowler - 1993
64   Missing the Memory Wall: The Case for Processor/Memory Integ.. - Saulsbury, Pong et al. - 1996
62   Distributing Hot-Spot Addressing in Large-Scale Multiprocesso.. - Yew, Tzeng et al. - 1987
61   Improving the Accuracy and Performance of Memory Communicati.. - Tyson, Austin - 1997
61   Where is Time Spent in Message-Passing and Shared-Memory Prog.. - Chandra, Larus et al. - 1994
57   An Argument for Simple COMA - Saulsbury, Wilkinson et al. - 1995
52   An Adaptive Cache Coherence Protocol Optimized for Migratory.. - Stenström, Brorsson et al. - 1993
46   Streamlining Inter-operation Memory Communication via Data D.. - Moshovos, Sohi - 1997
45   The case for Intelligent RAM - Patterson - 1997
45   Dynamic Self-Invalidation: Reducing Coherence Overhead in Sh.. - Lebeck, Wood - 1995
42   Synchronization Without Contention - Mellor-Crummey, Scott - 1991
41   A Data Cache with Multiple Caching Strategies Tuned to Diff.. - Gonzalez, Aliagas et al. - 1997
38   Programming for Different Memory Consistency Models - Gharachorloo, Adve et al. - 1992
36   Multiprocessors Should Support Simple Memory Consistency Mod.. - Hill - 1998
35   Performance of the SCI Ring - Scott, Goodman et al. - 1992
30   Tempest: A Substrate for Portable Parallel Programs - Hill, Larus et al. - 1995
30   Cost-Effective Parallel Computing - Wood, Hill - 1995
27   Mechanisms for Cooperative Shared Memory - Wood - 1993
26   Kendall Square Research Technical Summary - Research - 1992
26   Using Prediction to Accelerate Coherence Protocols - Mukherjee, Hill - 1998
24   Implementing Sequential Consistency In Cache-Based Systems - Adve, Hill - 1990
22   Efficient Synchronization: Let Them Eat QOLB - Kägi, Burger et al. - 1997
20   Implementing Fine-Grain Distributed Shared Memory on Commodi.. - Schoinas - 1996
18   Synchronization and Communication in the T3E Multiprocessor - Scott - 1996
18   Analysis of Cache Invalidation Patterns in Multiprocessors - Weber, Gupta - 1989
17   An Evaluation of Fine-Grain Producer-Initiated Communication.. - Abdel-Shafi, Hall et al. - 1997
17   Data Forwarding in Scalable Shared-Memory Multiprocessors - Koufaty, Chen et al. - 1995
16   Tutorial: Interconnection Networks for Parallel and Distribu.. - Wu, Feng - 1984
16   The CM-5 Connection Machine: A Scalable Supercomputer - Hillis, Tucker - 1993
16   Software Caching on Cache-Coherent Multiprocessors - Bianchini, LeBlanc - 1992
15   Predictability of Load/Store Instruction Latencies - Abraham, Sugumar et al. - 1993
15   Comparative Performance Evaluation of Cache-Coherent NUMA and.. - Stenström, Joe et al. - 1992
14   Interconnection Networks for Multiprocessors and Multicomput.. - Varma, Raghavendra - 1994
14   Dynamic Pointer Allocation for Scalable Cache Coherence Dire.. - Simoni, Horowitz - 1991
12   The NAS Parallel Benchmarks: Summary and Preliminary Results - Bailey - 1991
12   On the Inclusion Properties for Multi-Level Cache Hierarchies - Baer, Wang - 1988
12   A Set of Efficient Synchronization Primitives for a Large-Sc.. - Goodman, Vernon et al. - 1989
11   Introducing Memory into Switch Elements of Multiprocessor In.. - Mizrahi, Baer et al. - 1989
10   Extending the Scalable Coherent Interface for Large-Scale Sh.. - Johnson - 1993
10   The SPLASH-2 Programs: Characterization and Methodological C.. - Woo, Ohara et al. - 1995
10   Address Translation Mechanisms in Network Interfaces - Schoinas, Hill - 1998
10   Distributed Vector Architectures: Beyond a Single Vector-IRA.. - Kaxiras, Sugumar et al. - 1997
10   Techniques for Reducing Overheads of Shared-Memory Multiproc.. - Kägi, Aboulenein et al. - 1995
9   Request Combining in Multiprocessors with Arbitrary Intercon.. - Lebeck, Sohi - 1994
8   Accuracy vs. Performance in Parallel Simulation of Interconn.. - Burger, Wood - 1995
8   An Evaluation of Directory Protocols for Medium-Scale Shared.. - Mukherjee, Hill - 1994
8   Eager Combining: A Coherency Protocol for Increasing Effecti.. - Bianchini, LeBlanc - 1994
8   The GLOW Cache Coherence Protocol Extensions for Widely-shar.. - Kaxiras, Goodman - 1996
7   The Scalable Tree Protocol---a Cache Coherence Approach for .. - Nilsson, Stenström - 1992
7   Architectural Choices for Multi-Level Cache Hierarchies - Baer, Wang - 1987
6   Adaptive Migratory Scheme for Distributed Shared Memory - Kim, Vaidya - 1997
6   A New Approach to Cache Management - Tyson - 1995
5   Categorizing Network Traffic in Update-Based Protocols on Sc.. - Bianchini, LeBlanc et al. - 1996
5   A Study of Three Dynamic Approaches to Handle Widely-shared .. - Kaxiras, Gjessing et al. - 1998
5   Effects of Architectural and Technological Advances on the H.. - Abandah, Davidson - 1998
5   Performance of Pruning-Cache Directories for Large-Scale Mul.. - Scott, Goodman - 1993
4   Simulation of the SCI Transport Layer on the Wisconsin Wind .. - Burger, Goodman - 1995
4   Interconnect Topologies with Point-to-Point Rings - Johnson, Goodman - 1992
3   The Exemplar System - Corporation - 1994
3   Toward the Design of Large-Scale, Shared-Memory Multiprocess.. - Scott - 1992
3   Characterizing Shared-Memory Applications: A Case Study of t.. - Abandah - 1997
3   Two Economical Directory Schemes for Large-Scale Cache-Coher.. - Maa, Pradhan et al. - 1991
2   Paragon Technical Summary - Corporation - 1993
2   The GLOW Cache Coherence Extensions for Widely-shared Data - Kaxiras, Goodman - 1996
2   Shared Memory Computer Method and Apparatus - Sullivan, Cohn - 1987
2   Using Proxies to Reduce Controller Contention in Large Share.. - Bennett, Kelly et al. - 1996
2   Kiloprocessor Extensions to SCI - Kaxiras - 1996
2   Distributed Vector Architectures: Fine Grain Parallelism wit.. - Kaxiras, Sugumar - 1997
2   IEEE Standard for Cache Optimization for Large Numbers of Pr.. - James, Kaxiras - 1995
1   Stable Performance for cc-NUMA Using First Touch Page Placem.. - Talbot, Kelly
1   The Use of Instruction-Based Prediction in Hardware Shared-Me.. - Kaxiras - 1998
1   Improving Request-Combining for Widely-shared Data in Shared.. - Kaxiras, Goodman - 1998
1   Software Combining Algorithms for Distributed Hot-Spot Addre.. - Yew, Tang - 1990

Documents on the same site (http://www.cs.wisc.edu/~arch/uwarch/index.html):
Mechanisms for Distributed Shared Memory - Reinhardt (1996)

CiteSeer.IST - Copyright Penn State and NEC