On the Weakest Failure Detector Ever
 PODC'07
, 2007
Cited by 13 (4 self)
Many problems in distributed computing are impossible when no information about process failures is available. It is common to ask what information about failures is necessary and sufficient to circumvent some specific impossibility, e.g., consensus, atomic commit, mutual exclusion, etc. This paper asks what information about failures is needed to circumvent any impossibility and sufficient to circumvent some impossibility. In other words, what is the minimal yet nontrivial failure information? We present an abstraction, denoted Υ, that provides very little failure information. In every run of the distributed system, Υ eventually informs the processes that some set of processes in the system cannot be the set of correct processes in that run. Although seemingly weak, for it might provide random information for an arbitrarily long period
Looking for the Weakest Failure Detector for k-Set Agreement in Message-passing Systems
 Is Πk the End of the Road?, INRIA, 2009, http://hal.inria.fr/inria00384993/en/, PI 1929
Cited by 12 (3 self)
Abstract: In the k-set agreement problem, each process (in a set of n processes) proposes a value and has to decide a proposed value in such a way that at most k different values are decided. While this problem can easily be solved in asynchronous systems prone to t process crashes when k > t, it cannot be solved when k ≤ t. For several years, the failure detector-based approach has been investigated to circumvent this impossibility. While the weakest failure detector class to solve the k-set agreement problem in read/write shared-memory systems has recently been discovered (PODC 2009), the situation is different in message-passing systems, where the weakest failure detector classes are known only for the extreme cases k = 1 (consensus) and k = n − 1 (set agreement). This paper introduces a candidate for the general case. It presents a new failure detector class, denoted Πk, and shows Π1 = Σ × Ω (the weakest class for k = 1), and Πn−1 = L (the weakest class for k = n − 1). Then, the paper investigates the structure of Πk and shows it is the combination of two failure detector classes denoted Σk and Ωk (that generalize the previous “quorums” and “eventual leaders” failure detector classes). Finally, the paper proves that Σk is a necessary requirement (as far as information on failures is concerned) to solve the k-set agreement problem in message-passing systems. The paper also presents a Πn−1-based algorithm that solves the (n − 1)-set agreement problem. This algorithm provides a new algorithmic insight into the way the (n − 1)-set agreement problem can be solved in asynchronous message-passing systems (insight from the point of view of the non-partitioning constraint defined by Σn−1).
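The problem statement in this abstract (each process decides a proposed value, and at most k different values are decided) can be made concrete with a small checker. This is only an illustrative sketch, not code from the paper: the function name `satisfies_k_set_agreement` and the dictionary-based representation of a run are assumptions introduced here, and the sketch checks only the safety part of the definition, not termination.

```python
def satisfies_k_set_agreement(proposals, decisions, k):
    """Check the safety properties of k-set agreement for one finished run.

    proposals: dict mapping process id -> proposed value (hypothetical encoding)
    decisions: dict mapping process id -> decided value, for processes that decided
    k:         maximum number of distinct decided values allowed
    """
    proposed_values = set(proposals.values())
    decided_values = set(decisions.values())

    # Validity: every decided value was proposed by some process.
    validity = decided_values <= proposed_values
    # Agreement: at most k different values are decided.
    agreement = len(decided_values) <= k
    return validity and agreement


if __name__ == "__main__":
    proposals = {1: "a", 2: "b", 3: "c"}
    # Two distinct decided values: fine for k = 2, too many for k = 1 (consensus).
    decisions = {1: "a", 2: "a", 3: "c"}
    print(satisfies_k_set_agreement(proposals, decisions, 2))
    print(satisfies_k_set_agreement(proposals, decisions, 1))
```

With k = 1 the check reduces to consensus, and with k = n − 1 to set agreement, the two extreme cases named in the abstract.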
On set consensus numbers
 In DISC
, 2009
Cited by 7 (1 self)
Abstract. It is conjectured that the only way a failure detector (FD) can help solving n-process tasks is by providing k-set consensus for some k ∈ {1,..., n} among all the processes. It was recently shown by Zieliński that any FD that allows for solving a given n-process task that is unsolvable read-write wait-free, also solves (n − 1)-set consensus. In this paper, we provide a generalization of Zieliński’s result. We show that any FD that solves a colorless task that cannot be solved read-write k-resiliently, also solves k-set consensus. More generally, we show that every colorless task T can be characterized by its set consensus number: the largest k ∈ {1,..., n} such that T is solvable (k − 1)-resiliently. A task T with set consensus number k is, in the failure detector sense, equivalent to k-set consensus, i.e., a FD solves T if and only if it solves k-set consensus. As a corollary, we determine the weakest FD for solving k-set consensus in every environment, i.e., for all assumptions on when and where failures might occur.
Building and Using Quorums Despite Any Number of Process Crashes
 In 5th European Dependable Computing Conference (EDCC’05)
, 2003
Cited by 5 (2 self)
Failure detectors of the class denoted P^t eventually suspect all crashed processes in a permanent way (completeness) and ensure that, at any time, no more than n − t − 1 alive processes are falsely suspected (accuracy), n being the total number of processes. This paper first shows that a simple combination of such a failure detector with a two-step communication pattern can provide the processes with an interesting intersection property on sets of values. As an example illustrating the benefit and the property that such a combination can provide when designing protocols, a leader-based consensus protocol whose design relies on its systematic use is presented. Then the paper presents a P^t-based protocol that builds quorums in systems where up to t processes can crash with t < n.
(Almost) all objects are universal in message passing systems (Extended Abstract)
 In International Symposium on Distributed Computing
, 2005
Cited by 4 (3 self)
This paper shows that all shared atomic object types that can solve consensus among k > 1 processes have the same weakest failure detector in a message passing system with process crash failures. In such a system, object types such as test-and-set, fetch-and-add, and queue, known to have weak synchronization power in a shared memory system, are thus, in a precise sense, equivalent to universal types like compare-and-swap, known to have the strongest synchronization power. In the particular case of a message passing system of two processes, we show that, interestingly, even a register is in that sense universal.
The Multiplicative Power of Consensus Numbers
, 2010
Cited by 3 (1 self)
The Borowsky-Gafni (BG) simulation algorithm is a powerful reduction algorithm that shows that t-resilience of decision tasks can be fully characterized in terms of wait-freedom. Said in another way, the BG simulation shows that the crucial parameter is not the number n of processes but the upper bound t on the number of processes that are allowed to crash. The BG algorithm considers colorless decision tasks in the base read/write shared memory model. (Colorless means that if a process decides a value, any other process is allowed to decide the very same value.) This paper considers system models made up of n processes prone to up to t crashes, and where the processes communicate by accessing read/write atomic registers (as assumed by the BG) and (differently from the BG) objects with consensus number x, accessible by at most x processes (with x ≤ t < n). Let ASM(n, t, x) denote such a system model. While the BG simulation has shown that the models ASM(n, t, 1) and ASM(t + 1, t, 1) are equivalent, this paper focuses on the pair (t, x) of parameters of a system model. Its main result is the following: the system models ASM(n1, t1, x1) and ASM(n2, t2, x2) have the same computational power for colorless decision tasks if and only if ⌊t1/x1⌋ = ⌊t2/x2⌋. As can be seen, this contribution complements and extends the BG simulation. It shows that consensus numbers have a multiplicative power with respect to failures, namely the system models ASM(n, t′, x) and ASM(n, t, 1) are equivalent for colorless decision tasks iff (t × x) ≤ t′ ≤ (t × x) + (x − 1).
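The closing condition of this abstract, (t × x) ≤ t′ ≤ (t × x) + (x − 1), says that t′ divided by x and rounded down equals t, so two models can be compared through the integer quotient of their t and x parameters. The sketch below illustrates that arithmetic only. The function names are hypothetical, the parameter n is ignored, and the constraint x ≤ t < n from the abstract is assumed rather than checked.

```python
def same_power_for_colorless_tasks(t1, x1, t2, x2):
    """Compare ASM(n1, t1, x1) and ASM(n2, t2, x2) via the floor of t/x.

    Assumes x1, x2 >= 1; the abstract's side condition x <= t < n is not checked.
    """
    return t1 // x1 == t2 // x2


def within_multiplicative_interval(t_prime, t, x):
    """Interval form from the abstract: t*x <= t' <= t*x + (x - 1)."""
    return t * x <= t_prime <= t * x + (x - 1)


if __name__ == "__main__":
    # With x = 3, tolerating t' = 6, 7 or 8 crashes matches t = 2 crashes
    # in the plain read/write model (x = 1); t' = 9 does not.
    for t_prime in (6, 7, 8, 9):
        print(t_prime,
              within_multiplicative_interval(t_prime, 2, 3),
              same_power_for_colorless_tasks(t_prime, 3, 2, 1))
```

Both functions agree on every input, which is the sense in which the interval condition is a special case of comparing quotients with x2 = 1.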
The impossibility of boosting distributed service resilience
 In The 25th International Conference on Distributed Computing Systems
, 2005
Cited by 2 (2 self)
We study f-resilient services which are guaranteed to operate as long as no more than f of the associated processes fail. We prove two theorems about the impossibility of boosting the resilience of such services. Our first theorem allows any connection pattern between processes and services but assumes these services to be atomic objects. The theorem says that no distributed system in which processes coordinate using reliable registers and f-resilient atomic objects can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. In contrast, we show that it is possible to boost the resilience of systems solving problems easier than consensus: the 2-set consensus problem is solvable for 2n processes and 2n − 1 failures (i.e., wait-free) using n-process consensus services resilient to n − 1 failures (i.e., wait-free). We also introduce the larger class of failure-oblivious services. These are services that cannot use information about failures, but are not necessarily atomic objects (where each invocation by a process results in a single response to that same process). An important instance of such a service is totally ordered broadcast. We show that the first theorem and its proof generalize to failure-oblivious services.
The Computational Structure of Progress Conditions
Cited by 2 (0 self)
Abstract. Understanding the effect of different progress conditions on the computability of distributed systems is an important and exciting research direction. For a system with n processes, we define exponentially many new progress conditions and explore their properties and strength. We cover all the known, symmetric and asymmetric, progress conditions and many new interesting conditions. Together with our technical results, the new definitions provide a deeper understanding of synchronization and concurrency.
The CHT Play: An Informal Note on the Necessary Part of the Proof that Ω is the Weakest Failure Detector for Consensus
This note gives a high-level and informal account of the necessary part of the proof that Ω is the weakest failure detector to implement consensus with a majority of correct processes. The proof originally appeared in a widely cited but rarely understood paper by Chandra, Hadzilacos and Toueg. We describe it here as a play in five acts, preceded by a prologue and followed by an epilogue.
Building and Using P^t-Based Quorums Despite Any Number t of Process Crashes
, 2003
Failure detectors of the class denoted P^t eventually suspect all crashed processes in a permanent way (completeness) and ensure that, at any time, no more than n − t − 1 alive processes are falsely suspected (accuracy), n being the total number of processes. This paper shows that a simple combination of such a failure detector with a two-step communication pattern can provide the processes with an interesting intersection property on sets of values. As an example illustrating the benefit and the property that such a combination can provide when designing protocols, a leader-based consensus protocol whose design relies on its systematic use is presented.