Linear Space Algorithm for On-line Detection of Global Predicates
- Proc. International Workshop on Structures in Concurrency Theory (STRICT '95), 1995
Cited by 14 (0 self)
A fundamental problem in debugging and monitoring is detecting whether the state of a system satisfies some predicate. Cooper and Marzullo defined this problem as Possibly(\Phi) for distributed computations. This paper presents the first on-line algorithm using linear space that solves this problem in the general case, improving on all existing algorithms in both time and space. It is particularly interesting for the detection of Possibly(\Phi) on potentially infinite computations. To our knowledge, it is also the only detection algorithm that does not make use of vectors of timestamps. The presented algorithm is based on a structural property of the consistent cuts lattice, leading to a new structure for studying distributed computations: the consistent cuts tree. Keywords: Distributed Computation, Causality Relation, Global Predicates, Consistent Cuts, Ideal Lattice, Ideal Tree, On-line Algorithms. 1 Introduction In this paper we study the detection of Possibly(\Phi) ...
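To make the Possibly(\Phi) problem concrete, here is a minimal brute-force sketch. It is not the linear-space algorithm of the paper: it simply enumerates every cut of a small, finite computation and keeps the consistent ones. The representation (event counts per process, messages as index tuples) and the names `is_consistent` and `possibly` are assumptions made for illustration.

```python
from itertools import product

def is_consistent(cut, messages):
    """A cut is consistent if every message received inside it was also sent inside it.

    cut[i] is the number of events of process i included in the cut.
    Each message is (sender, send_index, receiver, recv_index), with 1-based event indices.
    """
    return all(
        cut[snd] >= snd_idx
        for (snd, snd_idx, rcv, rcv_idx) in messages
        if cut[rcv] >= rcv_idx
    )

def possibly(num_events, messages, phi):
    """Return the first consistent cut satisfying phi, or None if there is none.

    Brute force: time and space grow with the size of the cut lattice, so this
    only illustrates the definition on tiny computations.
    """
    for cut in product(*(range(k + 1) for k in num_events)):
        if is_consistent(cut, messages) and phi(cut):
            return cut
    return None

# Hypothetical computation: two processes with two events each;
# event 1 of process 0 sends a message received as event 2 of process 1.
messages = [(0, 1, 1, 2)]
witness = possibly([2, 2], messages, lambda cut: cut[1] == 2)
```

In this example the cut (0, 2) is rejected because it contains the receive but not the send, so the first witness found is (1, 2).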
Detecting Locally Stable Predicates without Modifying Application Messages
- In Proceedings of the 7th International Conference on Principles of Distributed Systems (OPODIS), La
, 2003
Cited by 9 (6 self)
In this paper, we give an efficient algorithm to determine whether a locally stable predicate has become true in an underlying computation.
Message-Optimal and Latency-Optimal Termination Detection Algorithms for Arbitrary Topologies
- in: Proceedings of the 18th Symposium on Distributed Computing (DISC), 2004
Cited by 8 (6 self)
Detecting termination of a distributed computation is a fundamental problem in distributed systems. We present two optimal algorithms for detecting termination of a non-diffusing distributed computation for an arbitrary topology. Both algorithms are optimal in terms of message complexity and detection latency. The first termination detection algorithm has to be initiated along with the underlying computation. The message complexity of this algorithm is Θ(N + M) and its detection latency is Θ(D), where N is the number of processes in the system, M is the number of application messages exchanged by the underlying computation, and D is the diameter of the communication topology. The second termination detection algorithm can be initiated at any time after the underlying computation has started. The message complexity of this algorithm is Θ(E + M) and its detection latency is Θ(D), where E is the number of channels in the communication topology. Key words: termination detection, quiescence detection, optimal message complexity, optimal detection latency
Consistent Checkpointing in Message Passing Distributed Systems
- Rapport de Recherche, INRIA - (France) n
, 1995
Cited by 7 (1 self)
A global checkpoint of a distributed computation is a set of local checkpoints (local states), one per process. Determining consistent global checkpoints is an important problem for many distributed applications (e.g. fault-tolerance, distributed debugging, property detection, etc). This paper concentrates on such determinations. A precedence relation on checkpoint intervals (such intervals are sets of events produced by processes between two successive local checkpoints) is introduced and analyzed. It is shown that a local checkpoint is useless (i.e. it cannot participate in any consistent global checkpoint) iff some pattern appears in this precedence relation. Then an adaptive checkpointing algorithm is introduced. This algorithm, assuming processes take local checkpoints independently, requires them to take as few additional checkpoints as possible so that none of the previously taken checkpoints is useless. It is based on the prevention of the previously mentioned pattern...
Distributed evaluation: a tool for constructing distributed detection programs
- Proc. Int. Conf. on Theory of Computing and Systems, Springer-Verlag, LNCS 601, (Galil, Dolev, Rodeh Ed
, 1992
Cited by 3 (3 self)
Methodological design of distributed programs is of major concern to master parallelism. Due to their role in distributed systems, the class of observation or detection programs, whose aim is to observe or detect properties of an observed program, is very important. The detection of a property generally rests upon consistent evaluations of a predicate; such a predicate can be global, i.e. involve states of several processes and channels of the observed program. Unfortunately, in a distributed system, the consistency of an evaluation cannot be trivially obtained. This is a central problem in distributed evaluations. This paper addresses the problem of distributed evaluation, as a basic tool for the design of a general distributed detection program.
Improving the efficacy of a termination detection algorithm
Cited by 2 (0 self)
An important problem in distributed systems is to detect termination of a distributed computation. A distributed computation is said to have terminated when all processes have become passive and all channels have become empty. We focus on two attributes of a termination detection algorithm. First, whether the distributed computation starts from a single process or from multiple processes: diffusing computation versus non-diffusing computation. Second, whether the detection algorithm should be initiated along with the computation or can be initiated anytime after the computation has started: simultaneous initiation versus delayed initiation. We show that any termination detection algorithm for a diffusing computation can be transformed into a termination detection algorithm for a non-diffusing computation. We also demonstrate that any termination detection algorithm for simultaneous initiation can be transformed into a termination detection algorithm for delayed initiation. We prove the correctness of our transformations, and show that our transformations have only a small impact on the performance of the given termination detection algorithm. Key words: distributed system, termination detection, algorithm transformation, diffusing and non-diffusing computations, simultaneous and delayed initiations, worst-case and average-case message complexities
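The termination condition stated in this abstract (all processes passive, all channels empty) can be written as a predicate over a global state. The sketch below only encodes that definition; it is not any of the detection algorithms or transformations described in these papers, and it assumes the global state passed in was already obtained consistently. The function name and the data layout are hypothetical.

```python
def has_terminated(passive, in_transit):
    """Evaluate the termination condition on one (assumed consistent) global state.

    passive: list of booleans, one per process (True means the process is passive).
    in_transit: dict mapping a channel (sender, receiver) to its number of undelivered messages.
    """
    all_passive = all(passive)
    all_channels_empty = all(count == 0 for count in in_transit.values())
    return all_passive and all_channels_empty

# Hypothetical two-process system with one channel in each direction.
quiet = has_terminated([True, True], {(0, 1): 0, (1, 0): 0})
pending = has_terminated([True, True], {(0, 1): 1, (1, 0): 0})
```

The second call shows why checking process states alone is not enough: both processes are passive, but a message still in transit can reactivate its receiver.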
Detecting Arbitrary Stable Properties Using Efficient Snapshots
Cited by 1 (0 self)
A stable property continues to hold in an execution once it becomes true. Detecting arbitrary stable properties efficiently in distributed executions is still an open problem. The known algorithms for detecting arbitrary stable properties and snapshot algorithms used to detect such stable properties suffer from drawbacks such as the following: they incur the overhead of a large number of messages per global snapshot, or alter application message headers, or use inhibition, or use the execution history, or assume a strong property such as causal delivery of messages in the system. We solve the problem of detecting an arbitrary stable property efficiently under the following assumptions: P1) The application messages should not be modified, not even by timestamps or
Coherence in Distributed Persistent Object Systems
, 1993
Distributed system builders are faced with the task of meeting a variety of requirements on the global behaviour of the target system, such as stability, fault-tolerance and failure recovery, concurrency control, commitment, and consistency of replicated data. Coherence means satisfying these types of requirements, although the subset may vary from system to system. This paper describes an approach to coherence enforcement in distributed persistent object systems based upon system-wide backtracking. The approach is optimistic in the sense that violations of coherence are resolved rather than prevented; backtracking is the agent of this resolution. The coherence support is realised as a transaction service, supported by the backtrack capability. * This work is supported by JISC/NTI 229, "Distributed SuperComputing: High Speed Scalable Networking", alias WARP. Coherence in Persistent Object Systems 1 W1-93 1 Introduction It is important that persistent object systems support d...
A family of optimal termination detection algorithms
- DOI 10.1007/s00446-007-0031-3
An important problem in distributed systems is to detect termination of a distributed computation. A computation is said to have terminated when all processes have become passive and all channels have become empty. In this paper, we present a suite of algorithms for detecting termination of a non-diffusing computation for an arbitrary communication topology under a variety of conditions. All our termination detection algorithms have optimal message complexity. Furthermore, they have optimal detection latency when message processing time is ignored.

