Abstract: Efficient locking synchronization primitives are essential for achieving high
performance in fine-grain, shared-memory parallel programs. One function of
locking primitives is to enable exclusive access to shared data and critical sections
of code. In this dissertation, I make the following six contributions. (1) I propose
a framework, the synchronization period, in which to reason about the
inefficiencies of locking primitives. (2) I identify four previously proposed locking
mechanisms (local...
Context of citations to this paper:
.... dependent on the cache line size; if the shared data does not fit in the cache line, that shared data will not benefit from collocation [12]. We have devised a novel synchronization architecture as a solution to the processor synchronization problems when encountered in a System...
Cited by:
Shared Memory Parallelization of Data Mining Algorithms.. - Jin, Yang, Agrawal (2004)
A System-on-a-Chip Lock Cache with Task Preemption Support - Akgul, Lee (2001)
Similar documents (at the sentence level):
5.6%: Efficient Synchronization: Let Them Eat QOLB - Kägi, Burger, Goodman (1997)
Active bibliography (related documents):
1.2: Speculation-Based Techniques for Lockfree Execution of Lock-Based .. - Rajwar (2002)
1.1: An Analysis of the Interactions of Overhead-Reducing .. - Kägi, Aboulenein.. (1995)
0.8: SOFTQOLB: An Ultra-Efficient Synchronization Primitive for.. - Kägi, Goodman
Similar documents based on text:
0.2: Selecting Locking Primitives for Parallel Programs - Paul Mckenney Sequent (1996)
0.1: Improving the Throughput of Synchronization by Insertion of.. - Rajwar, Kägi, Goodman (2000)
0.1: Journals - Robert
Related documents from co-citation:
2: Parallel data mining for association rules on shared-memory multi-processors
- Zaki, Ogihara et al. - 1996
2: Scalable parallel data mining for association rules
- Han, Karypis et al. - 1997
2: Efficient synchronization: Let them eat QOLB
- Kägi, Burger et al. - 1997
BibTeX entry:
A. Kagi, Mechanisms for Efficient Shared-Memory Lock-Based Synchronization, Ph.D. Thesis, Computer Sciences, University of Wisconsin, Madison, 1999. http://citeseer.ist.psu.edu/kagi99mechanisms.html
@misc{ kagi99mechanisms,
author = "A. Kagi",
title = "Mechanisms for Efficient Shared-Memory Lock-Based Synchronization",
text = "A. Kagi, Mechanisms for Efficient Shared-Memory Lock-Based Synchronization,
Ph.D. Thesis, Computer Sciences, University of Wisconsin, Madison, 1999.",
year = "1999",
url = "http://citeseer.ist.psu.edu/kagi99mechanisms.html" }
Documents on the same site (http://www.cs.wisc.edu/~galileo/pubs.html):
System-Level Implications of Processor-Memory Integration - Burger (1997)
Hardware Techniques To Improve The Performance Of The.. - Burger (1998)
Appeared in ASPLOS-III, April 1989, pp. 64-75 - Efficient Synchronization..