Results 1-10 of 64
Byzantine disk paxos: optimal resilience with Byzantine shared memory
Distributed Computing, 2006
Cited by 48 (3 self)
We present Byzantine Disk Paxos, an asynchronous shared-memory consensus protocol that uses a collection of n > 3t disks, t of which may fail by becoming non-responsive or arbitrarily corrupted. We give two constructions of this protocol; that is, we construct two different building blocks, each of which can be used, along with a leader oracle, to solve consensus. One building block is a shared wait-free safe register. The second building block is a regular register that satisfies a weaker termination (liveness) condition than wait freedom: its write operations are wait-free, whereas its read operations are guaranteed to return only in executions with a finite number of writes. We call this termination condition finite writes (FW), and show that consensus is solvable with FW-terminating registers and a leader oracle. We construct each of these reliable registers from n > 3t base registers, t of which can be non-responsive or Byzantine. All the previous wait-free constructions in this model used at least 4t + 1 fault-prone registers, and we are not familiar with any prior FW-terminating constructions in this model.
Categories and Subject Descriptors: B.3.2 [Memory Structures]: Design Styles—shared memory; D.4.5 [Operating Systems]: Reliability—fault-tolerance;
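The resilience bound quoted in this abstract can be made concrete with a little arithmetic. The sketch below only restates the two figures given above (n > 3t for Byzantine Disk Paxos, at least 4t + 1 for the earlier wait-free constructions); the function names are hypothetical and do not come from the paper.

```python
def min_registers_optimal(t: int) -> int:
    """Smallest n satisfying n > 3t, the bound stated in the abstract."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return 3 * t + 1


def min_registers_previous(t: int) -> int:
    """Lower end of the 'at least 4t + 1' figure quoted for earlier constructions."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return 4 * t + 1


def tolerates(n: int, t: int) -> bool:
    """True if n base registers satisfy the n > 3t requirement for t faults."""
    return n > 3 * t


if __name__ == "__main__":
    # Print, for a few fault counts, the register counts under each bound.
    for t in range(1, 4):
        print(t, min_registers_optimal(t), min_registers_previous(t))
```

Under these bounds, tolerating each additional faulty disk costs three more base registers rather than four.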
Active Disk Paxos with infinitely many processes
In Proceedings of the 21st ACM Symposium on Principles of Distributed Computing (PODC’02), 2002
Cited by 48 (8 self)
We present an improvement to the Disk Paxos protocol by Gafni and Lamport which utilizes extended functionality and flexibility provided by Active Disks and supports unmediated concurrent data access by an unlimited number of processes. The solution facilitates coordination by an infinite number of clients using finite shared memory. It is based on a collection of read-modify-write objects with faults, that emulate a new, reliable shared memory abstraction called a ranked register. The required read-modify-write objects are readily available in Active Disks and in Object Storage Device controllers, making our solution suitable for state-of-the-art Storage Area Network (SAN) environments.
Deconstructing paxos
SIGACT News
Cited by 45 (11 self)
The Paxos part-time parliament protocol of Lamport provides a very practical way to implement a fault-tolerant deterministic service by replicating it over a distributed message passing system. The contribution of this paper is a faithful deconstruction of Paxos that preserves its efficiency in terms of forced logs, messages and communication steps. The key to our faithful deconstruction is the factorisation of the fundamental algorithmic principles of Paxos within two abstractions: weak leader election and round-based consensus, itself based on a round-based register abstraction. Using those abstractions, we show how to reconstruct, in a modular manner, known and new variants of Paxos. In particular, we show how to (1) alleviate the need for forced logs if some processes remain up for sufficiently long, (2) augment the resilience of the algorithm against unstable processes, (3) enable single process decision with shared commodity disks, and (4) reduce the number of communication steps during stable periods of the system.
Communication-efficient leader election and consensus with limited link synchrony
In PODC, 2004
Cited by 43 (10 self)
We study the degree of synchrony required to implement the leader election failure detector Ω and to solve consensus in partially synchronous systems. We show that in a system with n processes and up to f process crashes, one can implement Ω and solve consensus provided there exists some (unknown) correct process with f outgoing links that are eventually timely. In the special case where f = 1, an important case in practice, this implies that to implement Ω and solve consensus it is sufficient to have just one eventually timely link; all the other links in the system, Θ(n²) of them, may be asynchronous. There is no need to know which link p → q is eventually timely, when it becomes timely, or what is its bound on message delay. Surprisingly, it is not even required that the source p or destination q of this link be correct: either p or q may actually crash, in which case the link p → q is eventually timely in a trivial way, and it is useless for sending messages. We show that these results are in a sense optimal: even if
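To give a sense of scale for the f = 1 case described in this abstract, the sketch below counts directed links. It assumes a fully connected system in which every ordered pair of distinct processes has its own link; that topology is an assumption made here for illustration, since the abstract only states the Θ(n²) order. The function names are hypothetical.

```python
def directed_links(n: int) -> int:
    """Directed links p -> q between n processes, assuming full connectivity."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1)


def possibly_asynchronous_links(n: int, timely: int = 1) -> int:
    """Links that may stay asynchronous when `timely` links are eventually timely."""
    total = directed_links(n)
    if not 0 <= timely <= total:
        raise ValueError("timely must lie between 0 and the total link count")
    return total - timely


if __name__ == "__main__":
    # Print, for a few system sizes, the total and possibly asynchronous links.
    for n in (3, 5, 10):
        print(n, directed_links(n), possibly_asynchronous_links(n))
```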
Anti-Ω: the weakest failure detector for set agreement
2007
Cited by 25 (1 self)
Number 694 Anti-Ω: the weakest failure detector for set agreement
Obstruction-Free algorithms can be practically wait-free
In Distributed Algorithms, P. Fraigniaud, Ed. Lecture Notes in Computer Science, 2005
Cited by 18 (0 self)
The obstruction-free progress condition is weaker than previous nonblocking progress conditions such as lock-freedom and wait-freedom, and admits simpler implementations that are faster in the uncontended case. Pragmatic contention management techniques appear to be effective at facilitating progress in practice, but, as far as we know, none guarantees progress. We present a transformation that converts any obstruction-free algorithm into one that is wait-free when analyzed in the unknown-bound semisynchronous model. Because all practical systems satisfy the assumptions of the unknown-bound model, our result implies that, for all practical purposes, obstruction-free implementations can provide progress guarantees equivalent to wait-freedom. Our transformation preserves the advantages of any pragmatic contention manager, while guaranteeing progress.
The Alpha of Indulgent Consensus
2006
Cited by 17 (0 self)
This paper presents a simple framework unifying a family of consensus algorithms that can tolerate process crash failures and asynchronous periods of the network, also called indulgent consensus algorithms. Key to the framework is a new abstraction we introduce here, called Alpha, and which precisely captures consensus safety. Implementations of Alpha in shared memory, storage area network, message passing and active disk systems are presented, leading to directly derived consensus algorithms suited to these communication media. The paper also considers the case where the number of processes is unknown and can be arbitrarily large.
Verifying safety properties with the TLA+ proof system.
In Jürgen Giesl and Reiner Hähnle, editors, Intl. Joint Conf. Automated Reasoning (IJCAR 2010), 2010
Cited by 17 (7 self)
Overview. TLAPS, the TLA+ proof system, is a platform for the development and mechanical verification of TLA+ proofs. The TLA+ proof language is declarative, and understanding proofs requires little background beyond elementary mathematics. The language supports hierarchical and non-linear proof construction and verification, and it is independent of any verification tool or strategy. Proofs are written in the same language as specifications; engineers do not have to translate their high-level designs into the language of a particular verification tool. A proof manager interprets a TLA+ proof as a collection of proof obligations to be verified, which it sends to backend verifiers that include theorem provers, proof assistants, SMT solvers, and decision procedures. The first public release of TLAPS is available from [1], distributed with a BSD-like license. It handles almost all the non-temporal part of TLA+ as well as the temporal reasoning needed to prove standard safety properties, in particular invariance and step simulation, but not liveness properties. Intuitively, a safety property asserts what is permitted to happen; a liveness property asserts what must happen; for a more formal overview, see
Foundations. TLA+ is a formal language based on TLA (the Temporal Logic of Actions). It has always been possible to assert correctness properties of systems in TLA+, but not to write their proofs. We have added proof constructs based on a hierarchical style for writing informal proofs
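The notion of invariance mentioned in this abstract can be illustrated outside TLA+. The Python sketch below is not TLA+ and does not use TLAPS; it is a toy explicit-state search, closer in spirit to a model checker than to a deductive proof system, over a hypothetical two-process lock invented here for illustration. It checks that a predicate holds in every reachable state, which is what it means for that predicate to be an invariant.

```python
from collections import deque

# A state is (pc0, pc1, lock): each pc is "idle" or "crit";
# lock is the id of the process holding it, or None if free.
INIT = ("idle", "idle", None)


def next_states(state):
    """Yield every state reachable from `state` in one step."""
    pcs, lock = [state[0], state[1]], state[2]
    for i in (0, 1):
        if pcs[i] == "idle" and lock is None:
            new = pcs.copy()
            new[i] = "crit"          # process i acquires the lock
            yield (new[0], new[1], i)
        elif pcs[i] == "crit":
            new = pcs.copy()
            new[i] = "idle"          # process i releases the lock
            yield (new[0], new[1], None)


def mutual_exclusion(state):
    """Safety predicate: the two processes are never both critical."""
    return not (state[0] == "crit" and state[1] == "crit")


def check_invariant(init, step, inv):
    """Breadth-first search; returns (True, None) or (False, violating_state)."""
    seen = {init}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        if not inv(s):
            return False, s
        for t in step(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True, None


if __name__ == "__main__":
    print(check_invariant(INIT, next_states, mutual_exclusion))
```

A search like this can only establish safety over a finite state space; it says nothing about liveness, such as whether a waiting process eventually enters its critical section.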
On implementing omega in systems with weak reliability and synchrony assumptions
Distributed Computing, 2003