Closure and Convergence: A Foundation of Fault-Tolerant Computing
 IEEE Transactions on Software Engineering
, 1993
Cited by 132 (30 self)
We give a formal definition of what it means for a system to "tolerate" a class of "faults". The definition consists of two conditions. First, if a fault occurs when the system state is within a set of "legal" states, the resulting state is within some larger set and, if faults continue occurring, the system state remains within that larger set (Closure). Second, if faults stop occurring, the system eventually reaches a state within the legal set (Convergence). We demonstrate the applicability of our definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems. Further, using the definition, we obtain a simple classification of fault-tolerant systems and discuss methods for their systematic design.
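The two conditions in this abstract can be checked mechanically on a small finite-state model. The following is a minimal sketch, not the paper's formalism: the names `is_closed`, `converges`, and `tolerates`, the toy counter system, and the extra requirement that the legal set itself be closed under program steps are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Set

State = int
Step = Callable[[State], Iterable[State]]


def is_closed(states: Set[State], step: Step) -> bool:
    """Closure: every step taken from a state in `states` stays in `states`."""
    return all(nxt in states for s in states for nxt in step(s))


def converges(span: Set[State], legal: Set[State], program: Step) -> bool:
    """Convergence: every fault-free computation from `span` reaches `legal`.

    Grows the set of states from which all program paths are known to reach
    `legal`; a cycle or deadlock outside `legal` is never added to that set.
    """
    good = set(legal)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            successors = set(program(s))
            if successors and successors <= good:
                good.add(s)
                changed = True
    return span <= good


def tolerates(legal: Set[State], span: Set[State],
              program: Step, fault: Step) -> bool:
    """Both conditions, with `span` playing the role of the larger set."""
    return (legal <= span
            and is_closed(legal, program)   # assumption: legal set is stable
            and is_closed(span, program)
            and is_closed(span, fault)
            and converges(span, legal, program))


# Hypothetical toy system: a counter that the program drains toward 0,
# while faults may bump it up, but never above 3.
def program(x: State) -> Iterable[State]:
    return [max(x - 1, 0)]


def fault(x: State) -> Iterable[State]:
    return [min(x + 1, 3)]


LEGAL = {0}
SPAN = {0, 1, 2, 3}
```

With these definitions, `tolerates(LEGAL, SPAN, program, fault)` holds, whereas choosing the span `{0, 1}` fails closure because a fault can move the counter from 1 to 2.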
Hundreds of Impossibility Results for Distributed Computing
 Distributed Computing
, 2003
Cited by 52 (5 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault tolerance, different communication media, and randomization. The resource bounds refer to time, space and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
Conditions on input vectors for consensus solvability in asynchronous distributed systems
 Journal of the ACM
, 2001
Cited by 39 (13 self)
This article introduces and explores the condition-based approach to solve the consensus problem in asynchronous systems. The approach studies conditions that identify sets of input vectors for which it is possible to solve consensus despite the occurrence of up to f process crashes. The first main result defines acceptable conditions and shows that these are exactly the conditions for which a consensus protocol exists. Two examples of realistic acceptable conditions are presented, and proved to be maximal, in the sense that they cannot be extended and remain acceptable. The second main result is a generic consensus shared-memory protocol for any acceptable condition. The protocol always guarantees agreement and validity, and terminates (at least) when the inputs satisfy the condition with which the protocol has been instantiated, or when there are no crashes. An efficient version of the protocol is then designed for the message-passing model that works when f < n/2, and it is shown that no such protocol exists when f ≥ n/2. It is also shown how the protocol’s safety can be traded for its liveness.
Distributed computing with advice: Information sensitivity of graph coloring
 In 34th International Colloquium on Automata, Languages and Programming (ICALP)
, 2007
Cited by 31 (13 self)
We study the amount of information (advice) about a graph that must be given to its nodes in order to achieve fast distributed computations. The required size of the advice makes it possible to measure the information sensitivity of a network problem. A problem is information sensitive if little advice is enough to solve the problem rapidly (i.e., much faster than in the absence of any advice), whereas it is information insensitive if it requires giving a lot of information to the nodes in order to ensure fast computation of the solution. In this paper, we study the information sensitivity of distributed graph coloring.
Tight Bounds for Parallel Randomized Load Balancing
 Computing Research Repository
, 1992
Cited by 18 (7 self)
We explore the fundamental limits of distributed balls-into-bins algorithms, i.e., algorithms where balls act in parallel, as separate agents. This problem was introduced by Adler et al., who showed that non-adaptive and symmetric algorithms cannot reliably perform better than a maximum bin load of Θ(log log n / log log log n) within the same number of rounds. We present an adaptive symmetric algorithm that achieves a bin load of two in log* n + O(1) communication rounds using O(n) messages in total. Moreover, larger bin loads can be traded in for smaller time complexities. We prove a matching lower bound of (1 − o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads at an asymptotically optimal message complexity of O(n). The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls are not globally consistent. In order to show that our technique indeed yields tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time. As an application, we consider the following problem. Given a fully connected graph of n nodes, where each node needs to send and receive up to n messages, and in each round each node may send one message over each link, deliver all messages as quickly as possible to their destinations. We give a simple and robust algorithm of time complexity O(log* n) for this task and provide a generalization to the case where all nodes initially hold arbitrary sets of messages. Completing the picture, we give a less practical, but asymptotically optimal algorithm terminating within O(1) rounds. All these bounds hold with high probability.
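The round bounds in this abstract are stated in terms of the iterated logarithm log* n, which grows extremely slowly. As a rough illustration only (the function name `log_star` and the choice of base 2 are assumptions, and this is not code from the paper), it can be computed as follows:

```python
import math


def log_star(n: float) -> int:
    """Iterated logarithm (base 2): how many times log2 must be applied
    to n before the result drops to 1 or below."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count
```

For example, `log_star(16)` is 3 and `log_star(65536)` is 4, so a bound of log* n + O(1) rounds is close to constant for any practical input size.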
Solving vector consensus with a wormhole
 IEEE Transactions on Parallel and Distributed Systems
, 2005
Cited by 16 (10 self)
This paper presents a solution to the vector consensus problem for Byzantine asynchronous systems augmented with wormholes. Wormholes prefigure a hybrid distributed system model, embodying the notion of an enhanced part of the system with “good” properties otherwise not guaranteed by the “normal” weak environment. A protocol built for this type of system runs in the asynchronous part, where f out of n ≥ 3f + 1 processes might be corrupted by malicious adversaries. However, sporadically, processes can rely on the services provided by the wormhole for the correct execution of simple operations. One of the nice features of this setting is that it is possible to keep the protocol completely time-free and, in addition, to circumvent the FLP impossibility result by hiding all time-related assumptions in the wormhole. Furthermore, from a performance perspective, it leads to the design of a protocol with a good time complexity. Index Terms: Distributed systems, Byzantine asynchronous protocols, consensus.
A distributed protocol for dynamic address assignment in mobile ad hoc networks
 IEEE Trans. Mobile Computing
, 2006
Cited by 15 (0 self)
A Mobile Ad hoc NETwork (MANET) is a group of mobile nodes that form a multi-hop wireless network. The topology of the network can change randomly due to unpredictable mobility of nodes and propagation characteristics. Previously, it was assumed that the nodes in the network were assigned IP addresses a priori. This may not be feasible as nodes can enter and leave the network dynamically. A dynamic IP address assignment protocol like DHCP requires centralized servers that may not be present in MANETs. Hence, we propose a distributed protocol for dynamic IP address assignment to nodes in MANETs. The proposed solution guarantees unique IP address assignment under a variety of network conditions including message losses, network partitioning and merging. Simulation results show that the protocol incurs low latency and communication overhead for an IP address assignment. Index Terms: MANET, address allocation, IP networks.
A new Adaptive Accrual Failure Detector for Dependable Distributed Systems
 In ACM Symposium on Applied Computing (SAC 2007)
, 2007
Cited by 14 (8 self)
The detection of failures in distributed environments is a crucial part of developing dependable, robust, and self-healing systems. The contribution of this paper is a new failure detection algorithm that can be described as an adaptive accrual algorithm coupled with features to increase flexibility and decrease computation costs. Furthermore, our evaluation results show a very good detection quality in the case of message losses.
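An accrual failure detector reports a graded suspicion level rather than a binary alive/failed verdict, and an adaptive one derives that level from recently observed heartbeat timing. The sketch below illustrates the general idea only and is not claimed to be the algorithm of this paper: the class name `AccrualDetector`, the sliding-window size, and the rule "fraction of recorded inter-arrival times no longer than the current silence" are assumptions chosen for simplicity.

```python
from collections import deque
from typing import Deque, Optional


class AccrualDetector:
    """Hypothetical adaptive accrual detector over a sliding window."""

    def __init__(self, window: int = 100) -> None:
        # Only the most recent `window` inter-arrival times are kept,
        # so the detector adapts to changing network conditions.
        self.intervals: Deque[float] = deque(maxlen=window)
        self.last: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        """Record the arrival time of a heartbeat from the monitored process."""
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def suspicion(self, now: float) -> float:
        """Suspicion in [0, 1]: share of past intervals that were no longer
        than the time elapsed since the last heartbeat."""
        if self.last is None or not self.intervals:
            return 0.0
        elapsed = now - self.last
        shorter = sum(1 for d in self.intervals if d <= elapsed)
        return shorter / len(self.intervals)
```

An application would compare `suspicion(now)` against its own threshold, which lets different consumers trade detection time against false positives without changing the detector.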
Characterizing topological assumptions of distributed algorithms in dynamic networks
 In Proc. 16th Intl. Conf. on Structural Information and Communication Complexity (SIROCCO
, 2009
Cited by 14 (10 self)
Besides the complexity in time or in number of messages, a common approach for analyzing distributed algorithms is to look at their assumptions on the underlying network. This paper focuses on the study of such assumptions in dynamic networks, where the connectivity is expected to change, predictably or not, during the execution. Our main contribution is a theoretical framework dedicated to such analysis. By combining several existing components (local computations, graph relabellings, and evolving graphs), this framework makes it possible to express detailed properties on the network dynamics and to prove that a given property is necessary, or sufficient, for the success of an algorithm. Consequences of this work include (i) the possibility to compare distributed algorithms on the basis of their topological requirements, (ii) the elaboration of a formal classification of dynamic networks with respect to these properties, and (iii) the possibility to check automatically whether a network trace belongs to one of the classes, and consequently to know which algorithm should run on it. Key words: Dynamic networks, distributed algorithms, evolving graphs, local interactions, topological assumptions.
Communication Algorithms with Advice
, 2009
Cited by 12 (8 self)
We study the amount of knowledge about a communication network that must be given to its nodes in order to efficiently disseminate information. Our approach is quantitative: we investigate the minimum total number of bits of information (minimum size of advice) that has to be available to nodes, regardless of the type of information provided. We compare the size of advice needed to perform broadcast and wakeup (the latter is a broadcast in which nodes can transmit only after getting the source information), both using a linear number of messages (which is optimal). We show that the minimum size of advice permitting the wakeup with a linear number of messages in an n-node network is Θ(n log n), while the broadcast with a linear number of messages can be achieved with advice of size O(n). We also show that the latter size of advice is almost optimal: no advice of size o(n) can permit broadcast with a linear number of messages. Thus an