Abstract: Distributed shared-memory architectures typically employ a directory-based protocol to maintain
cache coherence. Identifying sharing patterns in parallel programs and applying specialized
optimizations can increase cache-coherence protocol efficiency and yield performance
improvements. In this thesis, I propose and study both optimizations to sharing patterns and
techniques to identify sharing patterns.
The main thrust of the thesis is GLOW, a comprehensive optimization for wide sharing---a...
Context of citations to this paper:
.... this extra traffic, the release of the lock is substantially slowed down by the fact that SCI must invalidate the sharers one at a time [Kax98]. Other coherence protocols typically use broadcast to invalidate sharers, which would improve the release performance of test test...
.... optimizations have been proposed to address some of these specific patterns [Per93][Kax98]. More general schemes have also been proposed, but they remain costly in hardware, they require on-chip modification, or large extension...
Cited by:
Memory Dependence Prediction - Andreas Ioannis Moshovos
The Coherence Predictor Cache: A Resource-Efficient and .. - Nilsson, Landin..
Hardware Prediction for Data Coherency of Scientific Codes on.. - Acquaviva, Jalby (2000)
Active bibliography (related documents):
1.3: Kiloprocessor Extensions to SCI - Stefanos Kaxiras (1996)
1.2: Kaxiras@cs.wisc.edu - Sc Edu
1.1: Request Combining in Multiprocessors with Arbitrary.. - Lebeck, Sohi (1994)
Similar documents based on text:
0.1: DataScalar: A Memory-Centric Approach to Computing - Stefanos Kaxiras Doug
0.1: DataScalar Architectures and the SPSD Execution Model - Burger, Kaxiras, Goodman (1996)
0.1: Simultaneous Multithreaded DSPs: Scaling from High.. - Kaxiras, Berenbaum..
Related documents from co-citation:
3: Alternative Implementations of Two-Level Adaptive Branch Prediction
- Yeh, Patt - 1992
3: Cooperative Shared Memory: Software and Hardware for Scalable Multiprocessors
- Hill, Larus et al. - 1992
2: The NYU Ultracomputer -- Designing a MIMD Shared-Memory Parallel Machine (context) - Gottlieb, Grishman et al. - 1983
BibTeX entry:
Stefanos Kaxiras. Identification and Optimization of Sharing Patterns for Scalable Shared-Memory Multiprocessors. PhD thesis, University of Wisconsin, Madison, WI, 1998. http://citeseer.ist.psu.edu/kaxiras98identification.html
@misc{ kaxiras98identification,
author = "S. Kaxiras",
title = "Identification and Optimization of Sharing Patterns for Scalable Shared-Memory
Multiprocessors",
text = "Stefanos Kaxiras. Identification and Optimization of Sharing Patterns for
Scalable Shared-Memory Multiprocessors. PhD thesis, University of Wisconsin,
Madison, WI, 1998.",
year = "1998",
url = "citeseer.ist.psu.edu/kaxiras98identification.html" }
Citations (may not include all citations):
3972
Introduction to Algorithms (context) - Cormen, Leiserson et al. - 1990
723
Memory Coherence in Shared Virtual Memory Systems
- Li, Hudak - 1986
723
Memory Coherence in Shared Virtual Memory Systems
- Li, Hudak - 1989
606
How to Make a Multiprocessor Computer that Correctly Execute.. (context) - Lamport - 1979
496
SPLASH: Stanford Parallel Applications for Shared Memory (context) - Singh, Weber et al. - 1992
478
The Stanford DASH Multiprocessor (context) - Lenoski - 1992
326
TreadMarks: Shared Memory Computing on Networks of Workstati..
- Amza - 1996
268
Tempest and Typhoon: User-Level Shared Memory
- Reinhardt, Larus et al. - 1994
241
A Study of Branch Prediction Strategies (context) - Smith - 1981
222
The SGI Origin: A cc-NUMA Highly Scalable Server (context) - Laudon, Lenoski - 1997
213
Weak Ordering - A New Definition
- Adve, Hill - 1990
204
Munin: Distributed Shared Memory Based on Type-Specific Memo..
- Carter, Bennett et al. - 1990
195
A New Solution to Coherence Problems in Multicache Systems (context) - Censier, Feautrier - 1978
191
The MIT Alewife Machine: A Large-Scale Distributed-Memory Mu..
- Agarwal - 1992
173
"Hot Spot" Contention and Combining in Multistage Interconnec.. (context) - Pfister, Norton - 1985
170
LimitLESS Directories: A Scalable Cache Coherence Scheme
- Chaiken, Kubiatowicz et al. - 1991
166
The Wisconsin Wind Tunnel: Virtual Prototyping of Parallel C..
- Reinhardt, Hill et al. - 1993
159
The NYU Ultracomputer -- Designing a MIMD Shared-Memory Paralle.. (context) - Gottlieb, Grishman et al. - 1983
156
An evaluation of Directory schemes for Cache Coherence
- Agarwal, Horowitz et al. - 1988
147
Alternative Implementations of Two-Level Adaptive Branch Pre..
- Yeh, Patt - 1992
145
DDM --- A Cache-Only Memory Architecture
- Hagersten, Landin et al. - 1992
131
Fine-grain Access Control for Distributed Shared Memory
- Schoinas - 1994
130
Memory Consistency and Event Ordering in Scalable Shared-Mem.. (context) - Gharachorloo, Lenoski et al. - 1990
127
Highly Parallel Computing (context) - Almasi, Gottlieb - 1994
114
CRAY T3D System Architecture Overview (context) - Cray Research, Inc. - 1993
102
Dynamic Speculation and Synchronization of Data Dependences
- Moshovos, Breach et al. - 1997
96
A Characterization of Sharing in Parallel Programs and its A.. (context) - Eggers, Katz - 1988
95
Application-Specific Protocols for User-Level Shared Memory
- Falsafi, Lebeck et al. - 1994
85
Cache Write Policies and Performance
- Jouppi - 1993
81
A Case for Networks of Workstations: NOW (context) - Anderson, Culler et al. - 1994
77
STiNG: A CC-NUMA Computer System for the Commercial Marketpl..
- Lovett, Clapp - 1996
76
The Wisconsin Multicube: A New Large-Scale Cache-Coherent Mul.. (context) - Goodman, Woest - 1988
70
Adaptive Cache Coherency for Detecting Migratory Shared Data
- Cox, Fowler - 1993
64
Missing the Memory Wall: The Case for Processor/Memory Integ.. (context) - Saulsbury, Pong et al. - 1996
62
Distributing Hot-Spot Addressing in Large-Scale Multiprocesso.. (context) - Yew, Tzeng et al. - 1987
61
Improving the Accuracy and Performance of Memory Communicati..
- Tyson, Austin - 1997
61
Where is Time Spent in Message-Passing and Shared-Memory Prog..
- Chandra, Larus et al. - 1994
57
An Argument for Simple COMA
- Saulsbury, Wilkinson et al. - 1995
52
An Adaptive Cache Coherence Protocol Optimized for Migratory.. (context) - Stenström, Brorsson et al. - 1993
46
Streamlining Inter-operation Memory Communication via Data D.. (context) - Moshovos, Sohi - 1997
45
The case for Intelligent RAM
- Patterson - 1997
45
Dynamic Self-Invalidation: Reducing Coherence Overhead in Sh..
- Lebeck, Wood - 1995
42
Synchronization Without Contention (context) - Mellor-Crummey, Scott - 1991
41
A Data Cache with Multiple Caching Strategies Tuned to Diff.. (context) - Gonzalez, Aliagas et al. - 1997
38
Programming for Different Memory Consistency Models
- Gharachorloo, Adve et al. - 1992
36
Multiprocessors Should Support Simple Memory Consistency Mod..
- Hill - 1998
35
Performance of the SCI Ring (context) - Scott, Goodman et al. - 1992
30
Tempest: A Substrate for Portable Parallel Programs
- Hill, Larus et al. - 1995
30
Cost-Effective Parallel Computing
- Wood, Hill - 1995
27
Mechanisms for Cooperative Shared Memory
- Wood - 1993
26
Kendall Square Research Technical Summary (context) - Kendall Square Research - 1992
26
Using Prediction to Accelerate Coherence Protocols
- Mukherjee, Hill - 1998
24
Implementing Sequential Consistency In Cache-Based Systems
- Adve, Hill - 1990
22
Efficient Synchronization: Let Them Eat QOLB (context) - Kägi, Burger et al. - 1997
20
Implementing Fine-Grain Distributed Shared Memory on Commodi..
- Schoinas - 1996
18
Synchronization and Communication in the T3E Multiprocessor (context) - Scott - 1996
18
Analysis of Cache Invalidation Patterns in Multiprocessors (context) - Weber, Gupta - 1989
17
An Evaluation of Fine-Grain Producer-Initiated Communication..
- Abdel-Shafi, Hall et al. - 1997
17
Data Forwarding in Scalable Shared-Memory Multiprocessors
- Koufaty, Chen et al. - 1995
16
Tutorial: Interconnection Networks for Parallel and Distribu.. (context) - Wu, Feng - 1984
16
The CM-5 Connection Machine: A Scalable Supercomputer (context) - Hillis, Tucker - 1993
16
Software Caching on Cache-Coherent Multiprocessors
- Bianchini, LeBlanc - 1992
15
Predictability of Load/Store Instruction Latencies (context) - Abraham, Sugumar et al. - 1993
15
Comparative Performance Evaluation of Cache-Coherent NUMA and.. (context) - Stenström, Joe et al. - 1992
14
Interconnection Networks for Multiprocessors and Multicomput.. (context) - Varma, Raghavendra - 1994
14
Dynamic Pointer Allocation for Scalable Cache Coherence Dire..
- Simoni, Horowitz - 1991
12
The NAS Parallel Benchmarks: Summary and Preliminary Results (context) - Bailey - 1991
12
On the Inclusion Properties for Multi-Level Cache Hierarchies (context) - Baer, Wang - 1988
12
A Set of Efficient Synchronization Primitives for a Large-Sc.. (context) - Goodman, Vernon et al. - 1989
11
Introducing Memory into Switch Elements of Multiprocessor In.. (context) - Mizrahi, Baer et al. - 1989
10
Extending the Scalable Coherent Interface for Large-Scale Sh..
- Johnson - 1993
10
The SPLASH-2 Programs: Characterization and Methodological C.. (context) - Woo, Ohara et al. - 1995
10
Address Translation Mechanisms in Network Interfaces
- Schoinas, Hill - 1998
10
Distributed Vector Architectures: Beyond a Single Vector-IRA.. (context) - Kaxiras, Sugumar et al. - 1997
10
Techniques for Reducing Overheads of Shared-Memory Multiproc.. (context) - Kägi, Aboulenein et al. - 1995
9
Request Combining in Multiprocessors with Arbitrary Intercon..
- Lebeck, Sohi - 1994
8
Accuracy vs. Performance in Parallel Simulation of Interconn..
- Burger, Wood - 1995
8
An Evaluation of Directory Protocols for Medium-Scale Shared..
- Mukherjee, Hill - 1994
8
Eager Combining: A Coherency Protocol for Increasing Effecti..
- Bianchini, LeBlanc - 1994
8
The GLOW Cache Coherence Protocol Extensions for Widely-shar.. (context) - Kaxiras, Goodman - 1996
7
The Scalable Tree Protocol---a Cache Coherence Approach for ..
- Nilsson, Stenström - 1992
7
Architectural Choices for Multi-Level Cache Hierarchies (context) - Baer, Wang - 1987
6
Adaptive Migratory Scheme for Distributed Shared Memory
- Kim, Vaidya - 1997
6
A New Approach to Cache Management
- Tyson - 1995
5
Categorizing Network Traffic in Update-Based Protocols on Sc..
- Bianchini, LeBlanc et al. - 1996
5
A Study of Three Dynamic Approaches to Handle Widely-shared .. (context) - Kaxiras, Gjessing et al. - 1998
5
Effects of Architectural and Technological Advances on the H.. (context) - Abandah, Davidson - 1998
5
Performance of Pruning-Cache Directories for Large-Scale Mul.. (context) - Scott, Goodman - 1993
4
Simulation of the SCI Transport Layer on the Wisconsin Wind ..
- Burger, Goodman - 1995
4
Interconnect Topologies with Point-to-Point Rings (context) - Johnson, Goodman - 1992
3
The Exemplar System (context) - Corporation - 1994
3
Toward the Design of Large-Scale, Shared-Memory Multiprocess..
- Scott - 1992
3
Characterizing Shared-Memory Applications: A Case Study of t..
- Abandah - 1997
3
Two Economical Directory Schemes for Large-Scale Cache-Coher.. (context) - Maa, Pradhan et al. - 1991
2
Paragon Technical Summary (context) - Corporation - 1993
2
The GLOW Cache Coherence Extensions for Widely-shared Data (context) - Kaxiras, Goodman - 1996
2
Shared Memory Computer Method and Apparatus (context) - Sullivan, Cohn - 1987
2
Using Proxies to Reduce Controller Contention in Large Share..
- Bennett, Kelly et al. - 1996
2
Kiloprocessor Extensions to SCI
- Kaxiras - 1996
2
Distributed Vector Architectures: Fine Grain Parallelism wit.. (context) - Kaxiras, Sugumar - 1997
2
IEEE Standard for Cache Optimization for Large Numbers of Pr.. (context) - James, Kaxiras - 1995
1
Stable Performance for cc-NUMA Using First Touch Page Placem..
- Talbot, Kelly
1
The Use of Instruction-Based Prediction in Hardware Shared-Me.. (context) - Kaxiras - 1998
1
Improving Request-Combining for Widely-shared Data in Shared.. (context) - Kaxiras, Goodman - 1998
1
Software Combining Algorithms for Distributed Hot-Spot Addre.. (context) - Yew, Tang - 1990
Documents on the same site (http://www.cs.wisc.edu/~arch/uwarch/index.html):
Mechanisms for Distributed Shared Memory - Reinhardt (1996)