Combining Funnels: A New Twist on an Old Tale (1998)

by Nir Shavit, Asaph Zemach
In Proceedings of the 17th Annual ACM Symposium on Principles of Distributed Computing (PODC), Santa Barbara
http://www.math.tau.ac.il/~asaph/export/cf.ps.gz

Abstract:

We enhance the well-established software combining synchronization technique to create combining funnels. Previous software combining methods used a statically assigned tree whose depth was logarithmic in the total number of processors in the system. On shared-memory multiprocessors, the new method dynamically builds combining trees whose depth is logarithmic in the actual number of processors accessing the data structure concurrently. The structure is composed of a series of combining layers through which processors' requests are funneled. These layers use randomization instead of a rigid tree structure to allow processors to find partners for combining. By using an adaptive scheme, the funnel can change width and depth to accommodate different access frequencies without requiring global agreement as to its size. Rather, processors choose the parameters of the protocol privately, making this scheme very simple to implement and tune. When we add an "elimination" mechanism to the funnel structure, the randomly constructed "tree" is transformed into a "forest" of disjoint (and on average shallower) trees of requests, thus enhancing the level of parallelism and decreasing latency. We present two new linearizable combining-funnel-based data structures: a fetch-and-add object and a stack. We study the performance of these structures by benchmarking them against the most efficient software implementations of fetch-and-add and stacks known to date, combining trees and elimination trees, on a simulated shared-memory multiprocessor using Proteus. Our empirical data show that combining-funnel-based fetch-and-add outperforms combining trees of fixed height by as much as 70%. In fact, even compared to combining trees optimized for a given load, funnel performance is the same or better. Elimination trees, which are not linearizable, are 10% faster than funnels under the highest load, but as load drops combining funnels adapt their size, giving them a 34% lead in latency.
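
The layered, randomized combining described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' protocol: it has no real concurrency, no adaptive width or depth, and no elimination. The names `Request` and `simulate_funnel`, and the parameters `layers`, `width`, and `seed`, are hypothetical and chosen for illustration. Each request picks a random slot in a layer; two requests that meet in a slot combine into one tree, and the surviving roots then apply their summed increments to the central counter and hand results back down their trees.

```python
import random


class Request:
    """One pending fetch-and-add; after combining it may root a subtree."""

    def __init__(self, pid, amount):
        self.pid = pid
        self.total = amount   # sum of increments in the subtree rooted here
        self.children = []    # (offset, request) pairs combined into this one


def simulate_funnel(counter, amounts, layers=3, width=4, seed=0):
    """Sequentially simulate one batch of requests passing through a funnel.

    Hypothetical illustration only: real funnels run concurrently and adapt
    their width and depth, which this sketch does not model.
    """
    rng = random.Random(seed)
    roots = [Request(pid, a) for pid, a in enumerate(amounts)]
    for _ in range(layers):
        slots = {}       # slot index -> request waiting there for a partner
        combined = []
        for req in roots:
            slot = rng.randrange(width)
            partner = slots.pop(slot, None)
            if partner is None:
                slots[slot] = req
            else:
                # req's subtree is ordered right after partner's subtree.
                partner.children.append((partner.total, req))
                partner.total += req.total
                combined.append(partner)
        # Combined roots and unmatched requests both move to the next layer.
        roots = combined + list(slots.values())

    results = {}

    def distribute(req, base):
        # A request receives the counter value just before its own increment.
        results[req.pid] = base
        for offset, child in req.children:
            distribute(child, base + offset)

    for root in roots:
        distribute(root, counter)   # one central access per surviving root
        counter += root.total
    return counter, results, len(roots)


final, results, central_accesses = simulate_funnel(100, [1] * 16)
```

In this sketch `central_accesses` counts how many requests reach the central counter after combining, which is at most the number of original requests, and every request still receives a distinct return value consistent with some sequential order of the increments.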