B-H. Lim and A. Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, October 1994.


This paper is cited in the following contexts:


Synchronization Support Using Full/Empty Tagged Shared Memory .. - Vlassov, Moritz

....synchronization and multithreading partly in hardware and partly in software. Hardware support includes full empty bits, special memory operations and a special full empty trap. The software support includes trap handler routines for fast context switching and waiting for synchronization [13]. As illustrated in [28] for the Alewife machine, the fine grain synchronization expressed with synchronizing data structures (J structures) and supported at low level with the full empty word level synchronization, outperforms the coarse grain barrier synchronization. The experience reported in ....

B.-H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pages 25--35, San Jose, CA, October 1994. ACM.


Non-Blocking Timeout in Scalable Queue-Based Spin Locks - Scott (2002)   (3 citations)

....of commercial OLTP codes. We also plan to develop variants that block in the scheduler on timeout [9, 16], cooperate with the scheduler to avoid preemption while in a critical section [6, 10], or adapt dynamically between test and set and queue based locking in response to observed contention [11]. In a related vein, we are developing a tool to help verify the correctness of locking algorithms by transforming source code automatically into input for a model checker. 5 Acknowledgments. Our thanks to Paul Martin and Mark Moir of Sun Labs, Boston, for their help in obtaining results on the ....

B.-H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 25--35, San Jose, CA, October 1994.


Combining Funnels: A new twist on an old tale... - Shavit, Zemach (1998)

....is discussed in detail in Section 4 of this paper and in [23]. There is no obvious way of getting around these difficulties. For example, combining trees are not amenable to an adaptive strategy which might shrink the tree when average load is low (e.g. the reactive locks of Lim and Agarwal [16]) since there is no easy way to lower the number of nodes and still limit simultaneous access to a node to no more than two processors. Furthermore, decentralized algorithms for dynamically changing tree size (see for example the Reactive Diffracting Trees of Della Libera and Shavit [6]) tend to ....

....of the independent choices made by individual processors. Adaptive algorithms, allowing the data structure to change behavior to accommodate different access frequencies, have been used both in locking (see Karlin et al. [14] and Lim and Agarwal [17]) and for more general fetch and Phi operations [16]. The work of Lim and Agarwal [16] showed the performance benefit of dynamically switching between locking an object and using (static) combining trees, based on whether the overhead of the latter justifies the added potential for parallelism. Combining funnels take this idea one step further by ....

[Article contains additional citation context not shown here]

B.H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25-35, 1994.


Integrating Non-blocking Synchronisation in Parallel.. - Tsigas, Zhang (2002)

....[7] Some evaluation studies have also been performed for specific data structure implementations. Most of these performance evaluations were using micro benchmarks and were performed on small scale symmetric multiprocessors, as well as distributed memory machines [3, 8, 10, 11, 16] or simulators [10, 13]. Micro benchmarks are useful since they enable easy isolation of performance issues, but the real goal of better synchronisation methods is to improve performance of real applications, which micro benchmarks may not represent well. A substantial number of realistic scalable applications now ....

B. Lim and A. Agarwal, Reactive Synchronization Algorithms for Multiprocessors, in Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25-35, October 1994.


Hierarchical Backoff Locks for Nonuniform Communication.. - Radovic, Hagersten (2003)   (1 citation)

.... of the lock, and contenders ordered after it will not be affected [5, 16, 18]. However, the complicated software queuing locks are less efficient for uncontested locks, which have led to the creation of even more complicated adaptive hybrid proposals in the quest for a general purpose solution [14]. Shared memory architectures with a nonuniform memory access time to the shared memory (CC NUMAs) are gaining popularity. Most systems that form NUMA architectures also have the characteristic of a nonuniform communication architecture (NUCA) in which the access time from a processor to other ....

....are used, and the implementation is vulnerable to starvation. Alternative Approaches. The fact that some synchronization algorithms perform well under low contention periods and others under high contention periods is the basic idea behind the reactive synchronization presented by Lim and Agarwal [14]. Reactive algorithms will dynamically switch among several software lock implementations. Typically, spin locks (TATAS EXP) are used during the low contention phase, and queue based locks (MCS) are used during the high contention phase [11]. Reactive algorithms demonstrate modest performance ....

[Article contains additional citation context not shown here]

B.-H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), pages 25--35, Oct. 1994.
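
The switching idea described in the context above (spin lock under low contention, queue-based lock under high contention) can be illustrated with a minimal sketch. This is not the algorithm from Lim and Agarwal's paper: the class name ReactivePolicy, the two thresholds, and the streak-based hysteresis rule are all assumptions made for illustration. The sketch only models the protocol choice and does not implement the locks themselves.

```python
from dataclasses import dataclass


@dataclass
class ReactivePolicy:
    """Hypothetical protocol selector driven by observed contention."""

    to_queue_after: int = 4  # assumed: contended acquires in a row before using the queue lock
    to_spin_after: int = 4   # assumed: uncontended acquires in a row before spinning again
    mode: str = "spin"       # "spin" stands for TATAS with backoff, "queue" for an MCS-style lock
    streak: int = 0          # consecutive observations that argue for the other mode

    def observe(self, contended: bool) -> str:
        """Record one lock acquisition and return the protocol for the next one."""
        # In spin mode, contention argues for switching; in queue mode, its absence does.
        wants_switch = contended if self.mode == "spin" else not contended
        self.streak = self.streak + 1 if wants_switch else 0
        limit = self.to_queue_after if self.mode == "spin" else self.to_spin_after
        if self.streak >= limit:
            self.mode = "queue" if self.mode == "spin" else "spin"
            self.streak = 0
        return self.mode
```

Requiring several consecutive observations before switching is one simple way to avoid oscillating between protocols when contention fluctuates; other rules are possible.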


Speculation-Based Techniques for Lockfree Execution of Lock-Based .. - Rajwar (2002)

....queue spins. When the current lock holder leaves the critical section, it simply clears the value pointed to by the address it maintains. Lim and Agarwal proposed reactive synchronization, a technique that attempts to select the software primitive best suited for a given level of lock contention [113]. Woest and Goodman [168] present a quantitative and qualitative comparison of test and set, MCS, and QOLB. Kägi et al. [81] were the first to perform a comprehensive performance comparison of various popular synchronization algorithms. The study concluded that for the set of benchmarks used, QOLB ....

Beng-Hong Lim and Anant Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Proceedings of the Sixth Symposium on Architectural Support for Programming Languages and Operating Systems, pages 25--35, October 1994.


Efficient Synchronization for Nonuniform Communication.. - Radovic, Hagersten (2002)

.... Approaches The fact that some synchronization algorithms perform well under low contention periods and others under high contention periods is the basic idea behind the reactive synchronization presented by Lim and Agarwal a couple of years after the first proposals for queue based locks [LA94]. Reactive synchronization algorithms will dynamically switch among several software lock implementations. Typically, spin locks (TATAS EXP) are used during the low contention phase, and queue based locks (MCS) are used during the high contention phase [KBG97]. The goal of reactive synchronization ....

B-H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), pages 25--35, October 1994.


A Comparison of Software and Hardware Synchronization Mechanisms For Distributed..

.... The designers of traditional multiprocessors have included hardware support only for simple operations such as compare and swap and load linked store conditional, while high level synchronization primitives such as locks, barriers, and condition variables have been implemented in software [9, 14, 15]. With the advent of directory based distributed shared memory (DSM) multiprocessors with significant flexibility in their cache controllers [7, 12, 17] it is worthwhile considering whether this flexibility should be used to support higher level synchronization primitives in hardware. In ....

....traditional multiprocessors have included hardware support only for simple operations such as test and set, fetch and op, compare and swap, and load linked store conditional. Higher level synchronization primitives such as locks, barriers, and condition variables have been implemented in software [9, 14,15]. However, these designs were all originally based on simple bus based shared memory architectures. Recent scalable shared memory designs have employed distributed directories and increasingly sophisticated and flexible node controllers [7, 12, 17] In particular, as part of maintaining data ....

[Article contains additional citation context not shown here]

Beng-Hong Lim and Anant Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), pages 25--35, October 1994.


Integrating Non-blocking Synchronisation in Parallel.. - Tsigas, Zhang (2002)

....[7] Some evaluation studies have also been performed for specific data structure implementations. Most of these performance evaluations were using micro benchmarks and were performed on small scale symmetric multiprocessors, as well as distributed memory machines [3, 10, 11, 8, 16] or simulators [10, 13]. Micro benchmarks are useful since they enable easy isolation of performance issues, but the real goal of better synchronisation methods is to improve performance of real applications, which micro benchmarks may not represent well. A substantial number of realistic scalable applications now ....

B. Lim and A. Agarwal, Reactive Synchronization Algorithms for Multiprocessors, in Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25-35, October 1994.


Lock Coarsening: Eliminating Lock Overhead in Automatically.. - Diniz, Rinard (1996)   (9 citations)

....[11] 9.4 Efficient Synchronization Algorithms Other researchers have addressed the issue of synchronization overhead reduction. This work has concentrated on the development of more efficient implementations of synchronization primitives using various protocols and waiting mechanisms [9, 13]. The research presented in this article is orthogonal to and synergistic with this work. Lock coarsening reduces the lock overhead by reducing the frequency with which the generated parallel code acquires and releases locks, not by providing a more efficient implementation of the locking ....

B-H. Lim and A. Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, October 1994. ACM Press.


Evaluating The Performance of Non-Blocking Synchronisation on.. - Tsigas, Zhang (2000)

....power. Some evaluation studies have also been performed for specific data structure implementations [15]. Most of these performance evaluations were using micro benchmarks and were performed on small scale symmetric multiprocessors, as well as distributed memory machines [3, 9, 10, 7] or simulators [9, 12]. Micro benchmarks are useful since they enable easy isolation of performance issues, but the real goal of better synchronisation methods is to improve performance of real applications, which microbenchmarks may not ....

B. Lim and A. Agarwal, Reactive Synchronization Algorithms for Multiprocessors, in Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25-35, October 1994.


The Performance of Concurrent Red-Black Tree Algorithms - Hanke (1998)

....lock costs that we performed on several types of Sun and SGI machines. For this, we measured the time to execute a loop that acquires a lock, then waits some constant time, releases the lock, and again delays some time. This test program is similar to that used by Anderson [2] or Lim and Agarwal [22]. In order to be able to transfer the measured time to the psim model, we executed analogous loops for the arithmetic operations and converted the results to psim time. Given in psim time the measured lock costs were about 20 to 30 units of time. 4.2.1 Comparison of the Algorithms In the first ....

B.-H. Lim and A. Agarwal. Reactive synchronization algorithms for multiprocessors. In Proc. 6th Symposium on Architectural Support for Programming Languages and Operating Systems, pages 25--35, 1994.
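
The lock-cost test loop described in the context above (acquire a lock, wait a constant time, release it, then delay again) might look roughly like the following sketch. The function names, default parameters, and the use of Python threads are assumptions for illustration; because of the interpreter's global lock, the numbers it produces say little about hardware lock costs and only show the shape of the loop.

```python
import threading
import time


def busy_wait(seconds: float) -> None:
    """Spin until at least `seconds` have elapsed."""
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


def lock_loop(lock: threading.Lock, iterations: int, hold_s: float, delay_s: float) -> float:
    """Run the acquire / hold / release / delay loop; return mean seconds per iteration."""
    start = time.perf_counter()
    for _ in range(iterations):
        with lock:              # acquire; released on leaving the block
            busy_wait(hold_s)   # constant time inside the critical section
        busy_wait(delay_s)      # constant delay before the next acquire
    return (time.perf_counter() - start) / iterations


def run(threads: int = 2, iterations: int = 200,
        hold_s: float = 1e-5, delay_s: float = 1e-5) -> list:
    """Run the loop on several threads sharing one lock; return each thread's mean time."""
    lock = threading.Lock()
    results = [0.0] * threads

    def worker(index: int) -> None:
        results[index] = lock_loop(lock, iterations, hold_s, delay_s)

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return results
```

Subtracting hold_s + delay_s from each returned mean gives a rough per-iteration overhead that includes both the lock operations and any time spent waiting for the other threads.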


Combining Funnels: A Dynamic Approach To Software Combining - Shavit, Zemach (2000)   (1 citation)

....node handle this increased parallelism and contention effectively complicates the protocol used in the tree nodes and increases latency. Combining trees are also not easily amenable to an adaptive strategy which shrinks the tree when average load is low (e.g. the reactive locks of Lim and Agarwal [19]) since there is no clear way to lower the number of nodes and at the same time limit simultaneous access to a node to no more than two processors. Furthermore, decentralized algorithms for dynamically changing tree size (see for example the Reactive Diffracting Trees of Della Libera and Shavit ....

....provides the basis for an adaptive combining structure. Adaptive algorithms, allowing the data structure to change behavior to accommodate different access frequencies, have been used both in locking (see Karlin et al. [17] and Lim and Agarwal [20]) and for more general fetch and Phi operations [19]. The work of Lim and Agarwal [19] showed the performance benefit of dynamically switching between locking an object and using (static) combining trees, based on whether the overhead of the latter justifies the added potential for parallelism. Combining funnels take this idea one step further by ....

[Article contains additional citation context not shown here]

B.H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25--35, 1994.


Eliminating Synchronization Overhead in Automatically.. - Diniz, Rinard (1999)   (1 citation)

.... to inline procedures in C programs [Chang et al. 1992] to drive instruction scheduling algorithms [Chen et al. 1994] to help place code so as to minimize the impact on the memory hierarchy [Pettis and Hansen 1990] to minimize the overhead associated with cache coherency in CC NUMA machines [Chilimbi and Larus 1994], to aid in register allocation [Morris 1991; Wall 1986] and to direct the compiler to frequently executed parts of the program so that the compiler can apply further optimizations [Fernandez 1995; Anderson et al. 1997] Brewer [Brewer 1995] describes a system that uses statistical modeling to ....

....programs [Plevyak et al. 1995] Because access region expansion is designed to reduce the overhead in sequential executions of such programs, it does not address the trade off between lock overhead and waiting overhead. The goal is simply to minimize the lock overhead. Lim and Agarwal [Lim and Agarwal 1994] developed a reactive synchronization mechanism for synchronization operations in multiprocessors. The basic idea is to change the implementation of the locking constructs based on the observed contention. This reactive synchronization mechanism resembles dynamic feedback in that it uses a dynamic ....

LIM, B.-H. AND AGARWAL, A. 1994. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems. ACM Press, New York, NY, 25--37.


Synchronization Transformations for Parallel Computing - Diniz, Rinard (1997)   (11 citations)

....Efficient Synchronization Algorithms Other researchers have addressed the issue of synchronization overhead reduction. This work has concentrated on the development of more efficient implementations of synchronization primitives using various protocols and waiting mechanisms [Goodman et al. 1989; Lim and Agarwal 1994]. The research presented in this article is orthogonal to and synergistic with this work. Lock elimination reduces the lock overhead by reducing the frequency with which the generated parallel code acquires and releases locks, not by providing a more efficient implementation of the locking ....

LIM, B.-H. AND AGARWAL, A. 1994. Reactive synchronization algorithms for multiprocessors. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York, San Jose, CA.


Evaluating Synchronization on Shared Address Space.. - Kumar, Jiang.. (1999)   (11 citations)

....barrier microbenchmarks on a bus based Sequent Symmetry multiprocessor and a BBN Butterfly distributed memory, non coherent shared address space machine. Since then, other studies of synchronization on cache coherent systems have been performed, but they have either been performed on simulators [8, 6] or have used microbenchmarks to evaluate synchronization performance [8, 4, 7, 5] A lot has changed in the last decade since the classic study [10] On the systems side, scalable, hardware coherent machines with physically distributed memory, not examined in that study, have become very popular ....

....a BBN Butterfly distributed memory, non coherent shared address space machine. Since then, other studies of synchronization on cache coherent systems have been performed, but they have either been performed on simulators [8, 6] or have used microbenchmarks to evaluate synchronization performance [8, 4, 7, 5]. A lot has changed in the last decade since the classic study [10] On the systems side, scalable, hardware coherent machines with physically distributed memory, not examined in that study, have become very popular for moderate to large scale computing. The speeds of processors relative to ....

[Article contains additional citation context not shown here]

Lim, B.-H., and Agarwal, A. Reactive synchronization algorithms for multiprocessors. In Architectural Support for Programming Languages and Operating Systems (San Jose, California, October 4--7, 1994), pp. 25--35.


Scalable Concurrent Priority Queue Algorithms - Shavit, Zemach (1999)

....Though other structures like diffracting trees [32] and counting networks [3] provide efficient implementations of fetch and increment, their operations cannot be readily transformed into the new bounded fetch and increment required for our priority queues. As the research of Lim and Agarwal [22, 23], Della Libera and Shavit [12] and Karlin et al. [19] has shown, the key to delivering good performance over a wide range of concurrency levels, is the ability of a data structure to adapt to the load actually encountered. The adaption techniques of Lim and Agarwal [22] use a centralized form of ....

....of Lim and Agarwal [22, 23], Della Libera and Shavit [12] and Karlin et al. [19] has shown, the key to delivering good performance over a wide range of concurrency levels, is the ability of a data structure to adapt to the load actually encountered. The adaption techniques of Lim and Agarwal [22] use a centralized form of coordination that replaces one entire data structure by another, say, an MCS queuelock with a combining tree, in order to handle higher (respectively, lower) load. Our approach here is to avoid replacing one complete structure with another, as this would require a more ....

[Article contains additional citation context not shown here]

B.H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25--35, 1994.


Combining Funnels: A new twist on an old tale. . . - Shavit, Zemach (1998)

....is discussed in detail in Section 4 of this paper and in [23]. There is no obvious way of getting around these difficulties. For example, combining trees are not amenable to an adaptive strategy which might shrink the tree when average load is low (e.g. the reactive locks of Lim and Agarwal [16]) since there is no easy way to lower the number of nodes and still limit simultaneous access to a node to no more than two processors. Furthermore, decentralized algorithms for dynamically changing tree size (see for example the Reactive Diffracting Trees of Della Libera and Shavit [6]) tend ....

....the independent choices made by individual processors. Adaptive algorithms, allowing the data structure to change behavior to accommodate different access frequencies, have been used both in locking (see Karlin et al. [14] and Lim and Agarwal [17]) and for more general fetch and Phi operations [16]. The work of Lim and Agarwal [16] showed the performance benefit of dynamically switching between locking an object and using (static) combining trees, based on whether the overhead of the latter justifies the added potential for parallelism. Combining funnels take this idea one step further ....

[Article contains additional citation context not shown here]

B.H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25--35, 1994.


Combining Funnels: A new twist on an old tale. . . - Shavit, Zemach (1998)

....The downside of static assignment is that even if the tree is rarely accessed by all P processors simultaneously, its depth must still be log P . This set up is not amenable to an adaptive strategy which might shrink the tree when average load is low (e.g. the reactive locks of Lim and Agarwal [16]) since there is no easy way to lower the number of nodes and still limit simultaneous access to a node to no more than two processors. Furthermore, decentralized algorithms for dynamically changing tree size (see for example the Reactive Diffracting trees of Della Libera and Shavit [6]) tend to ....

....of the independent choices made by individual processors. Adaptive algorithms, allowing the data structure to change behavior to accommodate different access frequencies, have been used both in locking (see Karlin et al. [14] and Lim and Agarwal [17]) and for more general fetch and Phi operations [16]. The work of Lim and Agarwal [16] showed the performance benefit of dynamically switching between locking an object and using (static) combining trees, based on whether the overhead of the latter justifies the added potential for parallelism. Combining funnels take this idea one step further by ....

[Article contains additional citation context not shown here]

B.H. Lim and A. Agarwal. Reactive Synchronization Algorithms for Multiprocessors. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25--35, 1994.


Lock Coarsening: Eliminating Lock Overhead in Automatically.. - Diniz, Rinard (1996)   (9 citations)

No context found.

B-H. Lim and A. Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, October 1994.


Permission to Make Digital Or Hard Copies of All Or Part.. - Personal Or Classroom

No context found.

B.-H. Lim and A. Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth Symposium on Architectural Support for Programming Languages and Operating Systems, pages 25--35, Oct. 1994.


A Comparison of Software and Hardware Synchronization.. - Carter, Kuo, Kuramkote (1996)   (9 citations)

No context found.

Beng-Hong Lim and Anant Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), pages 25--35, October 1994.


The System-on-a-Chip Lock Cache - Akgul (2004)

No context found.

Lim, B. H. and Agarwal, A., "Reactive Synchronization Algorithms for Multiprocessors," Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pp. 25--35, October 1994.


Performance Implication of Fine-Grained Synchronization in .. - Merino, Vlassov, al.

No context found.

Lim, B.H. and Agarwal, A.: "Reactive synchronization Algorithms for Multiprocessors", Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, 1994


Reactive Diffracting Trees - Della-Libera, Shavit (1997)   (5 citations)

No context found.

B. H. Lim and A. Agarwal. Reactive synchronization algorithms for multiprocessors. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pages 25--35, October 1994.

