Results 1  10
of
10
The Complexity of Renaming
"... We study the complexity of renaming, a fundamental problem in distributed computing in which a set of processes need to pick distinct names from a given namespace. We prove an individual lower bound of Ω(k) process steps for deterministic renaming into any namespace of size subexponential in k, whe ..."
Abstract

Cited by 15 (10 self)
 Add to MetaCart
(Show Context)
We study the complexity of renaming, a fundamental problem in distributed computing in which a set of processes need to pick distinct names from a given namespace. We prove an individual lower bound of Ω(k) process steps for deterministic renaming into any namespace of size subexponential in k, where k is the number of participants. This bound is tight: it draws an exponential separation between deterministic and randomized solutions, and implies new tight bounds for deterministic fetchandincrement registers, queues and stacks. The proof of the bound is interesting in its own right, for it relies on the first reduction from renaming to another fundamental problem in distributed computing: mutual exclusion. We complement our individual bound with a global lower bound of Ω(k log(k/c)) on the total step complexity of renaming into a namespace of size ck, for any c ≥ 1. This applies to randomized algorithms against a strong adversary, and helps derive new global lower bounds for randomized approximate counter and fetchandincrement implementations, all tight within logarithmic factors. 1
On The Power of Hardware Transactional Memory to Simplify Memory Management
"... Dynamic memory management is a significant source of complexity in the design and implementation of practical concurrent data structures. We study how hardware transactional memory (HTM) can be used to simplify and streamline memory reclamation for such data structures. We propose and evaluate sever ..."
Abstract

Cited by 14 (6 self)
 Add to MetaCart
(Show Context)
Dynamic memory management is a significant source of complexity in the design and implementation of practical concurrent data structures. We study how hardware transactional memory (HTM) can be used to simplify and streamline memory reclamation for such data structures. We propose and evaluate several new HTMbased algorithms for the “Dynamic Collect ” problem that lies at the heart of many modern memory management algorithms. We demonstrate that HTM enables simpler and faster solutions, with better memory reclamation properties, than prior approaches. Despite recent theoretical arguments that HTM provides no worstcase advantages, our results support the claim that HTM can provide significantly better commoncase performance, as well as reduced conceptual complexity.
On the cost of concurrency in transactional memory
 CoRR
"... The promise of software transactional memory (STM) is to combine an easytouse programming interface with an efficient utilization of the concurrentcomputing abilities provided by modern machines. But does this combination come with an inherent cost? We evaluate the cost of concurrency by measur ..."
Abstract

Cited by 10 (7 self)
 Add to MetaCart
(Show Context)
The promise of software transactional memory (STM) is to combine an easytouse programming interface with an efficient utilization of the concurrentcomputing abilities provided by modern machines. But does this combination come with an inherent cost? We evaluate the cost of concurrency by measuring the amount of expensive synchronization that must be employed in an STM implementation that ensures positive concurrency, i.e., allows for concurrent transaction processing in some executions. We focus on two popular progress conditions that provide positive concurrency: progressiveness and permissiveness. We show that in permissive STMs, providing a very high degree of concurrency, a transaction performs a linear number of expensive synchronization patterns with respect to its readset size. In contrast, progressive STMs provide a very small degree of concurrency but, as we demonstrate, can be implemented using at most one expensive synchronization pattern per transaction. However, we show that even in progressive STMs, a transaction has to “protect ” (e.g., by using locks or strong synchronization primitives) a linear amount of data with respect to its writeset size. Our results suggest that looking for high degrees of concurrency in STM implementations may bring a considerable synchronization cost.
An Ω(n log n) Lower Bound on the Cost of Mutual Exclusion
, 2006
"... We prove an Ω(n log n) lower bound on the number of nonbusywaiting memory accesses by any deterministic algorithm solving n process mutual exclusion that communicates via shared registers. The cost of the algorithm is measured in the state change cost model, a variation of the cache coherent model. ..."
Abstract

Cited by 8 (1 self)
 Add to MetaCart
We prove an Ω(n log n) lower bound on the number of nonbusywaiting memory accesses by any deterministic algorithm solving n process mutual exclusion that communicates via shared registers. The cost of the algorithm is measured in the state change cost model, a variation of the cache coherent model. Our bound is tight in this model. We introduce a novel information theoretic proof technique. We first establish a lower bound on the information needed by processes to solve mutual exclusion. Then we relate the amount of information processes can acquire through shared memory accesses to the cost they incur. We believe our proof technique is flexible and intuitive, and may be applied to a variety of other problems and system models.
Tight Bounds for Asynchronous Renaming
, 2011
"... This paper presents the first tight bounds on the complexity of sharedmemory renaming, a fundamental problem in distributed computing in which a set of processes need to pick distinct identifiers from a small namespace. We first prove an individual lower bound of Ω(k) process steps for deterministi ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
(Show Context)
This paper presents the first tight bounds on the complexity of sharedmemory renaming, a fundamental problem in distributed computing in which a set of processes need to pick distinct identifiers from a small namespace. We first prove an individual lower bound of Ω(k) process steps for deterministic renaming into any namespace of size subexponential in k, where k is the number of participants. The bound is tight: it draws an exponential separation between deterministic and randomized solutions, and implies new tight bounds for deterministic concurrent fetchandincrement counters, queues and stacks. The proof is based on a new reduction from renaming to another fundamental problem in distributed computing: mutual exclusion. We complement this individual bound with a global lower bound of Ω(k log(k/c)) on the total step complexity of renaming into a namespace of size ck, for any c ≥ 1. This result applies to randomized algorithms against a strong adversary, and helps derive new global lower bounds for randomized approximate counter implementations, that are tight within logarithmic factors. On the algorithmic side, we give a protocol that transforms any sorting network into a strong adaptive renaming algorithm, with expected cost equal to the depth of the sorting network. This gives a tight adaptive renaming algorithm with expected step complexity O(log k), where k is the contention in the current execution. This algorithm is the first to achieve sublinear time, and it is timeoptimal as per our randomized lower bound. Finally, we use this renaming protocol to build monotoneconsistent counters with logarithmic step complexity and linearizable fetchandincrement registers with polylogarithmic cost.
Lower bounds for restricteduse objects
 In TwentyFourth ACM Symposium on Parallel Algorithms and Architectures
, 2012
"... ABSTRACT Concurrent objects play a key role in the design of applications for multicore architectures, making it imperative to precisely understand their complexity requirements. For some objects, it is known that implementations can be significantly more efficient when their usage is restricted. ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
(Show Context)
ABSTRACT Concurrent objects play a key role in the design of applications for multicore architectures, making it imperative to precisely understand their complexity requirements. For some objects, it is known that implementations can be significantly more efficient when their usage is restricted. However, apart from the specific restriction of oneshot implementations, where each process may apply only a single operation to the object, very little is known about the complexities of objects under general restrictions. This paper draws a more complete picture by defining a large class of objects for which an operation applied to the object can be "perturbed" L consecutive times, and proving lower bounds on the time and space complexity of deterministic implementations of such objects. This class includes boundedvalue max registers, limiteduse approximate and exact counters, and limiteduse collect and compareandswap objects; L depends on the number of times the object can be accessed or the maximum value it supports. For implementations that use only historyless primitives, we prove lower bounds of Ω(min(log L, n)) on the worstcase step complexity of an operation, where n is the number of processes; we also prove lower bounds of Ω(min(L, n)) on the space complexity of these objects. When arbitrary primitives can be used, we prove that either some operation incurs Ω(min(log L, n)) memory stalls or some operation performs Ω(min(log L, n)) steps. * Supported in part by the Israel Science Foundation (grant number 1227/10). † Supported by the Simons Postdoctoral Fellows Program ‡ Supported in part by the Israel Science Foundation (grant number 1227/10). Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. In addition to these deterministic lower bounds, the paper establishes a lower bound on the expected step complexity of restricteduse randomized approximate counting in a weak oblivious adversary model.
Lower bounds in distributed computing
, 2008
"... Distributed computing is the study of achieving cooperative behavior between independent computing processes with possibly conflicting goals. Distributed computing is ubiquitous in the Internet, wireless networks, multicore and multiprocessor computers, teams of mobile robots, etc. In this thesis, ..."
Abstract
 Add to MetaCart
Distributed computing is the study of achieving cooperative behavior between independent computing processes with possibly conflicting goals. Distributed computing is ubiquitous in the Internet, wireless networks, multicore and multiprocessor computers, teams of mobile robots, etc. In this thesis, we study two fundamental distributed computing problems, clock synchronization and mutual exclusion. Our contributions are as follows. 1. We introduce the gradient clock synchronization (GCS) problem. As in traditional clock synchronization, a group of nodes in a bounded delay communication network try to synchronize their logical clocks, by reading their hardware clocks and exchanging messges. We say the distance between two nodes is the uncertainty in message delay between the nodes, and we say the clock skew between the nodes is their difference in logical clock values. GCS studies clock skew as a function of distance. We show that surprisingly, every clock synchronization log D algorithm exhibits some execution in which two nodes at distance one apart have Ω( log log D)
The Price of being Adaptive (Extended Abstract)
, 2015
"... Mutual exclusion is a fundamental distributed coordination problem. Sharedmemory mutual exclusion research focuses on localspin algorithms and uses the remote memory references (RMRs) metric. To ensure the correctness of concurrent algorithms in general, and mutual exclusion algorithms in particu ..."
Abstract
 Add to MetaCart
(Show Context)
Mutual exclusion is a fundamental distributed coordination problem. Sharedmemory mutual exclusion research focuses on localspin algorithms and uses the remote memory references (RMRs) metric. To ensure the correctness of concurrent algorithms in general, and mutual exclusion algorithms in particular, it is often required to prohibit certain reorderings of memory instructions that may compromise correctness, by inserting memory fence (a.k.a. memory barrier) instructions. Memory fences incur nonnegligible overhead and may significantly increase time complexity. A mutual exclusion algorithm is adaptive to total contention (or simply adaptive), if the time complexity of every passage (an entry to the critical section and the corresponding exit) is a function of total contention, that is, the number of processes, k, that participate in the execution in which that passage is performed. We say that an algorithm A is fadaptive (and that f is an adaptivity function of A), if the time complexity of every passage in A is O f(k). Adaptive implementations are desirable when contention is much smaller than the total number of processes, n, sharing the implementation. Recent work [5] presented the first read/write mutual exclusion algorithm with asymptotically optimal complexity under both the RMRs and fences metrics: each passage through the critical section incurs O(log n) RMRs and a constant number of fences. The algorithm works in the popular Total Store Ordering (TSO) model. The algorithm of [5] is nonadaptive, however, and they posed the question of whether there exists an adaptive mutual exclusion algorithm with the same complexities. We provide a negative answer to this question, thus capturing an inherent cost of adaptivity. In fact, we prove a stronger result: adaptive read/write mutual exclusion algo
Lower Bounds for RestrictedUse Objects (Extended Abstract)
, 2012
"... Concurrent objects play a key role in the design of applications for multicore architectures, making it imperative to precisely understand their complexity requirements. For some objects, it is known that implementations can be significantly more efficient when their usage is restricted. However, a ..."
Abstract
 Add to MetaCart
Concurrent objects play a key role in the design of applications for multicore architectures, making it imperative to precisely understand their complexity requirements. For some objects, it is known that implementations can be significantly more efficient when their usage is restricted. However, apart from the specific restriction of oneshot implementations, where each process may apply only a single operation to the object, very little is known about the complexities of objects under general restrictions. This paper draws a more complete picture by defining a large class of objects for which an operation applied to the object can be “perturbed ” L consecutive times, and proving lower bounds on the time and space complexity of deterministic implementations of such objects. This class includes boundedvalue max registers, limiteduse approximate and exact counters, and limiteduse collect and compareandswap objects; L depends on the number of times the object can be accessed or the maximum value it can support. For implementations that use only historyless primitives, we prove lower bounds of Ω(min(log L, n)) on the worstcase step complexity of an operation, where n is the number of processes; we also prove lower bounds of Ω(min(L, n)) on the space complexity of these objects. When arbitrary primitives can be used, we prove that either some operation incurs Ω(min(log L, n)) memory stalls or some operation performs Ω(min(log L, n)) steps. In addition to these deterministic lower bounds, the paper establishes a lower bound on the expected step complexity of restricteduse randomized approximate counting in a weak oblivious adversary model.
The Renaming Problem: Recent Developments and Open Questions
"... The theory of distributed computing centers around a set of fundamental problems, also known as tasks, usually considered in variants of the two classic models of distributed computation: asynchronous sharedmemory and asynchronous messagepassing [50]. These fundamental ..."
Abstract
 Add to MetaCart
The theory of distributed computing centers around a set of fundamental problems, also known as tasks, usually considered in variants of the two classic models of distributed computation: asynchronous sharedmemory and asynchronous messagepassing [50]. These fundamental