35 citations found.
C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314-321a, Philadelphia, 1996.

This paper is cited in the following contexts:
On Scalable and Efficient Distributed Failure Detectors - Gupta, Chandra, Goldszmidt (2001)   (16 citations)

....a guide to designers of failure detector algorithms for real systems. For example, most distributed applications have opted to circumvent the impossibility result by relying on failure detector algorithms that guarantee completeness deterministically while achieving efficiency only probabilistically [1, 2, 4, 6, 7, 8, 14]. The recent emergence of applications for large scale distributed systems has created a need for failure detector algorithms that minimize the network load (in bytes per second, or equivalently, messages per second with a limit on maximum message size) used, as well as the load imposed on ....

.... are primarily based on the weakness of the model required to implement them, in order to solve the Distributed Consensus Agreement problem [11]. Proposals for implementable failure detectors have sometimes assumed network models with weak unreliability semantics, e.g. timed asynchronous model [8], quasi synchronous model [2], partial synchrony model [12], etc. These proposals have treated failure detectors only as a tool to efficiently reach agreement, ignoring their efficiency from an application designer's viewpoint. For example, most failure detectors such as [12] provide eventual ....

[Article contains additional citation context not shown here]

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314--321a, May 1996.


Timing Failure Detection with a Timely Computing Base - Casimiro, Veríssimo (1999)   (1 citation)

....failures) is absent. Adding a short amount of synchrony, namely by allowing processes to access a local clock with bounded drift rate, it becomes possible to tackle problems with timeliness specifications. In particular, it is possible to detect late events and to construct fail aware services [9]. However, achieving simultaneously the two properties required for perfect timing failure detection is still not possible [4] That is only possible, in fact, if the model over which the TFD service is constructed is, at least in part, synchronous. This has implications both in terms of the ....

Christof Fetzer and Flaviu Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, USA, May 1996. ACM.


Consensus and Membership in Synchronous and Asynchronous.. - Galleni, Powell (1996)   (2 citations)

....a far better term would have been time bounded . One model of this kind, particularly interesting for its applicability to many existing systems, is the timed asynchronous model [Cristian Schmuck 1995] which allows the implementation of deterministic protocols given that a stability predicate [Fetzer Cristian 1996] holds. The stability predicate takes into account the fact that the system is currently exhibiting a synchronous behavior. Since most existing distributed systems are likely to alternate between long stability periods and comparatively short instability intervals, it is reasonable to assume that ....

....many existing distributed systems are based on operating systems and communication services which achieve timely communication most of the time, such systems are likely to alternate between long stability periods and comparatively short instability intervals. In such systems, a stability predicate [Fetzer Cristian 1996] can be defined which states whether the system is currently exhibiting a synchronous behavior. When the stability predicate holds, agreement can be reached. In a timed asynchronous system the termination conditions are conditionally timed, that is, bounded termination is possible only if a ....

C. Fetzer and F. Cristian, Fail-Awareness in Timed Asynchronous Systems, Department of Computer Science and Engineering, University of California, San Diego, California, Technical Report, NCSE 95-453, February 1996.


How to Build a Timely Computing Base using Real-Time Linux - Casimiro, Martins.. (2000)

....principle. The idea is to transform unexpected timing failures into crash failures, enforcing a fail silent behavior. The measures are carried out using monitoring mechanisms based on fail awareness techniques, i.e. techniques that allow a component to realize it has suffered a timing failure [10]. As shown in Figure 2, the role of these mechanisms, which we call self checking mechanisms, is to observe the interactions between the TCB services and the system hardware resources, detect violations of assumptions, and, if that happens, activate a fail silence switch. ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, USA, May 1996. ACM.


A practical Building block for solving . . . - Hurfin, al.

....to initially disseminate a message, whereas the presented solution needs only a simple broadcast (B). 15 Basically, such an adaptation includes the exchange of initial values between processes within the consensus protocol. 16 When considering the timed asynchronous distributed system model [9], a SGV protocol can be derived from the consensus protocol described in [19]. procedure propose(v i ) 1) view i [ v i ; 2) state i undecided; r i 0; ts i 0; so, ....

Fetzer C., and Cristian F. Fail-Awareness in Timed Asynchronous Systems. Proc. 15th ACM Symposium on Principles of Distributed Computing, ACM Press, May 1996, pp. 314-321.


Fault-Tolerant Distributed Systems: a Modular Approach to the.. - Raynal (1996)

....protocols. 4 This means that if timers are appropriately set and the system stabilizes during a sufficiently long period, the properties required by Eventual Weak failure detectors are satisfied. In such cases protocols using those failure detectors will deliver their results in finite time ([10]). 5 Communication primitives similar to Multisend(m,P) have been used in the V kernel ([6]) and in the Parallel Virtual Machine package PVM ([12]). Reliable multicast in asynchronous systems: AS Rel Multicast(m,P). The aim of this primitive is to ....

C. Fetzer and F. Cristian. Fail-Awareness in Timed Asynchronous Systems. Proc. 15th ACM Symposium on Principles of Distributed Computing, ACM Press, May 1996, pp. 314-321.


Thread-based vs Event-based Implementation of a Group.. - Shivakant Mishra And (1998)   (1 citation)

....are unblocked by the pthread_cond_signal function call, when needed. Progress of the Timewheel group communication service is dependent on the individual clocks of different processors being synchronized. A group member whose clock is not synchronized with the other members leaves the group (see [4] for details). So, to ensure progress, the timer thread used by the clock synchronization protocol needs to be scheduled as soon as its timer expires. We give this thread the highest priority. The other five threads are of equal priorities. A consequence of unequal priorities is that race ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In 15th ACM PODC, Philadelphia, PA, May 1996.


The Timely Computing Base - Veríssimo, Casimiro (1999)

..... moreover, since all processes can be informed by the TCB of failure occurrences, switching to a fail safe state can be done in a controlled way in the case of distributed or replicated applications. Several examples of applications with a fail safe state can be encountered in the literature[29, 20, 21]. In the examples described in [20] and [21] the authors show how a fail safe application can be implemented in the timed asynchronous model. The detection of timing failures is done by using fail aware services but the assurance of crucial safety properties also requires communication by time in ....

....informed by the TCB of failure occurrences, switching to a fail safe state can be done in a controlled way in the case of distributed or replicated applications. Several examples of applications with a fail safe state can be encountered in the literature[29, 20, 21] In the examples described in [20] and [21] the authors show how a fail safe application can be implemented in the timed asynchronous model. The detection of timing failures is done by using fail aware services but the assurance of crucial safety properties also requires communication by time in the realm of the application. ....

Christof Fetzer and Flaviu Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, USA, May 1996. ACM.


Available Fail-Safe Systems - Essame, Arlat, Powell (1997)   (1 citation)

....drift exceeds a predefined bound. Consequently, the local clock of an operational unit has a bounded rate of drift from real time. Thus, the system satisfies the timed asynchronous model [6] Since delayed messages can impair safety, a fail aware datagram service similar to the one described in [7] is available for message delivery. Thus delayed messages can be thrown away. 3.1 Replication Both availability and safety are key issues in railway systems. To achieve the availability requirements, some functions are replicated on different components of the system. Three modes of replication ....

C. Fetzer and F. Cristian, "Fail-Awareness in Timed Asynchronous Systems", in Proc. 15th ACM Symp. on Principles of Distributed Computing, (Philadelphia, USA), pp.314-321, May 1996.


On the Role of Time in Distributed Systems - Veríssimo (1997)   (1 citation)

....on fairly synchronous kernel level communication servers. Hybrid architectural models can cast these differences, to the advantage of protocols, as described in [30] However, a large scale infrastructure is not fully synchronous either, and it would be dangerous to suppose so, as conjectured in [1, 11]. Recent developments have consolidated solutions to the problems raised in this section. A formal basis to address models that are neither fully asynchronous nor fully synchronous was laid down, in the form of a new class of time related models. Tools that further help manage time have appeared, ....

....this is where lay, for example, one of the differences between the timed asynchronous and the quasi synchronous approaches. The timed asynchronous model uses a TFD based on probabilistic round trip delay measurement, that makes weak assumptions about the underlying infrastructure serving the TFD [11]: uncertain communication bounds; no global time. The quasi synchronous approach uses a time triggered TFD, based on stronger assumptions: global time; deterministic communication [2]. The type of TFD chosen for a given system depends on the type of application, and of infrastructure available. 8 ....

Christof Fetzer and Flaviu Cristian. Fail-awareness in timed asynchronous systems. In 15th ACM Symposium on Principles of Distributed Computing, pages 314--321, Philadelphia, USA, May 1996.


A Comparison of Timed Asynchronous Systems and Asynchronous.. - Fetzer (1999)   Self-citation (Fetzer)

.... mechanism described in [12]. Second, we define a weaker failure detector that allows one to provide the same properties as the timed asynchronous model (modulo process recoveries). Hence, one can use the same techniques that were introduced for the timed asynchronous model, e.g. fail awareness [10, 12], to program systems with failure detector . In this section we define a variant of the timed asynchronous system model . We call this variant 1 . Its main restriction is that we exclude in runs in which processes recover after a crash, i.e. in 16 a crashed ....

FETZER, C., AND CRISTIAN, F. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing (Philadelphia, May 1996), pp. 314--321a. http://www.cs.ucsd.edu/cfetzer/FA.


FORTRESS: A System to Support Fail-Aware Real-Time Applications - Fetzer, Cristian (1997)   (1 citation)  Self-citation (Fetzer Cristian)

....units. Otherwise, is timely. Services are typically specified by a set of properties. For example, an external clock synchronization service is specified by the following property: a clock is at any time at most some given apart from real time. Fortress is based on the fail awareness paradigm [10]: services have to monitor themselves to detect when they cannot provide all their properties anymore. This detection has to be timely. Before a property of a service becomes invalid, the service has to signal this condition to its clients. The goal of that signaling is to simplify failure ....

.... a membership service in a time free asynchronous system [2] unless one introduces an additional mechanism like a failure detector [3] We proposed a different way to change the specification of a synchronous service such that it becomes implementable in timed asynchronous systems: fail awareness [10]. A server is required to provide its standard synchronous semantics as long as the failure frequency is within some given bound, and the server has to signal to its clients an exception whenever it provides an exception semantics instead of its standard semantics. For example, a fail aware ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996.


Fail-Aware Clock Synchronization - Fetzer, Cristian (1996)   (1 citation)  Self-citation (Fetzer Cristian)

....clocks: fail aware clock synchronization. The specification of that new service is derived using the general concept of fail awareness: this concept allows to transform the specification of a synchronous service into a fail aware service such that it becomes implementable in asynchronous systems [1]. To illustrate the usage of fail aware clock synchronization, we show how it can be used to solve the highly available, fail aware leader election problem. The specification of fail aware clock synchronization requires that each time server maintains a synchronization indicator, for any time ....

....the synchronized clocks (i.e. their synchronization indicators are true) of a majority of processes within some given reading error at the end of a round, can keep its synchronization indicator true for the next round. Otherwise, s synchronization indicator is (automatically) switched to false [1, 4]. Notes 1. This work was presented at the Dagstuhl Seminar on Clock Synchronization, March, 1996. It also appears in the Journal of Real Time Systems, 1997. ....

Fetzer, C. and Cristian, F., "Fail-Awareness in Timed Asynchronous Systems", Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, May 1996, Philadelphia, pp. 314--321a.


The Timed Asynchronous Distributed System Model - Cristian, Fetzer (1999)   (78 citations)  Self-citation (Fetzer Cristian)

....example, we are concerned about how one can ensure that there are no two leaders at any point in real time and we are not interested in solutions where there are no two leaders in virtual time. This difference is important for real time systems that have to interact with external processes. In [15] we introduced the notion of fail awareness as a systematic means of transforming synchronous service specifications into fail aware specifications that become implementable in timed asynchronous systems. The idea is that processes have to provide their synchronous properties as long as the ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996. http://www.cs.ucsd.edu/~cfetzer/FA.


The Timewheel Group Communication System - Mishra, Fetzer, Cristian (2002)   (3 citations)  Self-citation (Fetzer Cristian)

....protocol provides a consistent system wide view of which team members are operational at any given moment in time. In this paper, we present the timewheel atomic broadcast protocol and the timewheel group membership protocol. These two protocols along with a clock synchronization protocol [24] comprise the timewheel group communication system. The timewheel group communication system provides four unique characteristics that distinguish it from other group communication services [12, 8, 44, 52, 9, 3, 39, 4, 21, 50, 45, 19, 5, 7] First, this system has been designed for a timed ....

....when broadcasting an update. Termination The termination semantics requires that all updates proposed by stable processes are eventually delivered by all stable processes. Furthermore, only the updates proposed by stable processes are delivered. The predicate stable is defined in [24]. A process p is said to be stable iff (1) p is timely, (2) p can communicate in a timely manner with a majority of processes in the team, and all these processes are stable, and (3) p can detect all messages from non stable processes as being late, and can reject them. Order The ....

[Article contains additional citation context not shown here]

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, Philadelphia, PA, May 1996.


Fail-Awareness: An Approach to Construct Fail-Safe Systems - And (1997)   Self-citation (Fetzer Cristian)

....service that classifies all messages with a transmission delay greater than some as slow and messages with a transmission delay of at most as fast . The second constant was introduced because a receiver of a message can only determine the transmission delay of with some error [9]. The following definition of disconnected is based on the fact that processes can identify slow messages: a process disconnected from a b d fe iff every message that receives in from has a transmission delay of more than pB time units (see Figure 4) Common situations in ....

....aims to support the design of real time systems by simplifying the detection of situations when the delays of messages and processes become so high that not all performance failures can be masked and an application has to switch to a fail safe mode. We introduced the concept of fail awareness in [9] as a general method of transforming synchronous service specifications into weaker, fail aware service specifications that are implementable in timed asynchronous systems [5] In our earlier work on fail awareness [9] we did not address the issue of partitionable operation: only servers in the ....

[Article contains additional citation context not shown here]

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996. http://www.christof.org/FA.


Using Fail-Awareness to Design Adaptive Real-Time Applications. - Christof Fetzer And (1997)   (2 citations)  Self-citation (Fetzer Cristian)

....failure rate. Our approach uses the concept of fail awareness to adapt the quality of service. Fail awareness is a concept that detects all non maskable failures of a server and propagates that information to the clients of the server using indicators. We introduced the notion of fail awareness in [7]. A more detailed description of Fortress can be found in [11] and an overview of fail awareness in a partitionable setting is given in [10]. The fail aware services are given in [8, 12, 9]. A simple synchronized traffic signaling example that demonstrates the use of fail aware services in a ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996.


A Fail-Aware Membership Service - Fetzer, Cristian (1996)   Self-citation (Fetzer Cristian)

....designed for an asynchronous system subject to partitioning. Second, when an unstable process cannot keep up with the other processes in its partition, it has to detect that and indicate to its clients that its membership is out of date. We thus say that the membership service is fail aware [9]. 3 Stable Partitions Informally, a set of processes of messages sent between these processes per round are delivered in a timely manner, and (3) from any other partition either no or only old messages arrive. Formally, we define a stable partition by a stability predicate. To this end, ....

....service that classifies all messages with a transmission delay greater than some as slow and messages with a transmission delay of at most as fast . The second constant was introduced because a receiver of a message can only determine the transmission delay of within some bounds [9]. The following definition of disconnected is based on the fact that processes can identify slow messages: a process disconnected from a process iff any message that receives in has a transmission delay of more than time units. A common situation in which two ....

[Article contains additional citation context not shown here]

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996.


A Fail-Aware Datagram Service - Fetzer, Cristian (1998)   (7 citations)  Self-citation (Fetzer Cristian)

....information using the fail aware datagram service. 2 Related Work Many synchronous distributed services are specified by using safety properties, i.e. properties that should always hold. To implement a safety property, it is often necessary that one has certain communication. Fail awareness [10, 12] is a general method for extending the safety properties of a fault tolerant synchronous service by an exception indicator so that the new, extended service becomes implementable in distributed systems with uncertain communication but with access to local hardware clocks. The idea is that the ....

....the fail aware datagram service delivers is classified as either slow or fast . Such a classification of a message can also be performed with the help of internally synchronized clocks. Even though internal clock synchronization can be achieved in asynchronous systems by probabilistic methods [1, 10], the solution proposed in this paper (which does not use synchronized clocks) achieves a better precision since we only need to have pairwise synchronized clocks in the sense that any two connected processes know approximately the distance between their own two hardware clocks. The fail aware ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996. http://www.christof.org/FA.


Fail-Aware Failure Detectors - Fetzer, Cristian (1996)   (5 citations)  Self-citation (Fetzer Cristian)

....is at most one leader: when current leader, and is suspected by a majority of processes (that could elect a new leader) failure detector module suspects and this selfsuspicion lets know that it has been demoted to make place for a new leader. 2 Related Work Fail awareness [7] is a general concept to extend safety properties of a fault tolerant synchronous service by an exception indicator so that the new service becomes implementable in timed asynchronous systems [5] The idea is that the indicator tells a server and its clients whether a safety property currently ....

....and weak fail awareness but no other properties. In this section we sketch how QPSR can be implemented in timed asynchronous systems [5] provided they satisfy a certain progress assumption [6]. The protocol depends upon fail aware datagram and fail aware clock synchronization services [7]. For self containment, before we describe our protocol for , we give a brief overview of the timed asynchronous system model and these two services. 6.1 Timed Systems and Progress Assumptions The timed asynchronous system model does not guarantee an upper bound on message transmission and ....

[Article contains additional citation context not shown here]

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996.


Derivation of Fail-Aware Membership Service Specifications - Fetzer, Cristian (1996)   (2 citations)  Self-citation (Fetzer Cristian)

....time on the current membership while [3] does only require that processes agree on the order in which they remove or include processes. We show in [11] how the fail aware specifications that we derive in this paper can efficiently be implemented in timed asynchronous systems. We introduced in [9] the notion of fail awareness as a general method for extending properties of a fault tolerant synchronous service by an exception indicator so that the new, extended service becomes implementable in timed asynchronous systems. The idea is that each server uses its indicator to tell its clients ....

....processes change their member set within some i real time units of each other. While for synchronous systems the internal time base is based on deterministic internal clock synchronization, for asynchronous systems we base it on fail aware internal clock synchronization (specification given below) [9]. [Figure 10: The best temporal agreement two up processes can achieve is i : their member set changes can be up to i apart from each other.] has a clock . We denote the set of clock time values by j8 and ....

[Article contains additional citation context not shown here]

FETZER, C., AND CRISTIAN, F. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing (Philadelphia, May 1996), pp. 314--321a. http://- cfetzer/FA/fa.html.


The Timewheel Group Membership Protocol - Mishra, Fetzer, Cristian (1998)   Self-citation (Fetzer Cristian)

....envelope of real time. In the timed asynchronous system model, it is not possible to keep correct clocks synchronized all the time, because it allows a (very unlikely) run in which no process can communicate with any other process. We therefore use a fail aware clock synchronization protocol [15] that guarantees that (1) any process p knows at any time if its clock is synchronized, and (2) whenever the underlying datagram and process service allows this, p s clock is synchronized. A process p that cannot keep its clock synchronized is removed from the current group by the group membership ....

....term process to refer to a process in that team. The timewheel group membership protocol maintains a consistent system wide current group (sometimes also called view ) of processes that exhibit synchronous behavior . The meaning of synchronous behavior is formalized by predicate Delta stable [15]. A process p is Delta stable iff (1) p is timely, 2) p can communicate in a timely manner with a majority of processes and all these processes are Delta stable, and (3) p can detect all messages from non Delta stable processes as being late and, therefore, can reject them. The membership ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, Philadelphia, PA, May 1996.


The Timed Asynchronous Distributed System Model - Cristian, Fetzer (1999)   (78 citations)  Self-citation (Fetzer Cristian)   (Correct)

....for example, we are concerned about how one can ensure that there are no two leaders at any point in real time and we are not interested in solutions where there are no two leaders in virtual time. This difference is important for real time systems that have to interact with external processes. In [18] we introduced the notion of fail-awareness as a systematic means of transforming synchronous service specifications into fail-aware specifications that become implementable in timed asynchronous systems. The idea is that processes have to provide their synchronous properties as long as the ....

C. Fetzer and F. Cristian, "Fail-awareness in timed asynchronous systems," in Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, Philadelphia, May 1996, pp. 314-321a, http://www.christof.org/FA.


The Timed Asynchronous Distributed System Model - Cristian, Fetzer (1999)   (78 citations)  Self-citation (Fetzer Cristian)   (Correct)

....example, we are concerned about how one can ensure that there are no two leaders at any point in real time and we are not interested in solutions where there are no two leaders in virtual time. This difference is important for real time systems that have to interact with external processes. In [15] we introduced the notion of fail-awareness as a systematic means of transforming synchronous service specifications into fail-aware specifications that become implementable in timed asynchronous systems. The idea is that processes have to provide their synchronous properties as long as the ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 314--321a, Philadelphia, May 1996. http://www.cs.ucsd.edu/~cfetzer/FA.


Synchronous and Asynchronous Group Communication (Long Version) - Cristian (1996)   (39 citations)  Self-citation (Cristian)   (Correct)

....learn of new updates, joins, and failures are bounded only when certain stability conditions hold. The stability condition considered in this paper is system stability as defined in section 2. Weaker stability conditions, such as majority stability and Delta stability are investigated in [19, 18]. Third, because delays are unbounded in asynchronous systems, ensuring agreement on initial group states requires more work than in the synchronous case. An earlier paper [16] explored a suite of four increasingly strong asynchronous membership specifications. All protocols described in [16] ....

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. Technical Report CSE95-453, UCSD, 1995. Available via anonymous ftp at cs.ucsd.edu as /pub/team/failAwareness.ps.Z.


Timed Asynchronous System Model - Cristian, Fetzer (1997)   (29 citations)  Self-citation (Fetzer Cristian)   (Correct)

....use of conditional timeliness properties. To further highlight the similarities and differences which exist between the synchronous and the timed asynchronous system models, [7] compares the properties of fundamental synchronous and asynchronous services such as membership and atomic broadcast. In [16] we introduced the notion of fail-awareness as a systematic means of transforming synchronous service specifications into (fail-aware) specifications that are implementable in timed asynchronous systems. Progress assumptions, which require that, infinitely often, some majority set of processes ....

....m that p receives in [s; t] from q has a transmission delay of more than Delta delta time units. A common situation in which two processes are Delta disconnected is when the network between them is overloaded or at least one of the processes is slow. One can use a fail-aware datagram service [16] to classify messages with a transmission delay greater than some Delta delta as slow and messages with a transmission delay of at most delta as fast. Messages with a transmission delay within (delta; Delta] are either classified as slow or fast. We use the predicate ....

[Article contains additional citation context not shown here]

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, Philadelphia, May 1996.


Halo Membership Service: A speci - Membership Service For   (Correct)

No context found.

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314-321a, Philadelphia, 1996.


Halo Membership Service: A specific membership service for.. - Banuls, Galdamez (2004)   (Correct)

No context found.

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314-321a, Philadelphia, 1996.


Membership Service for Open Clusters - Banuls, Galdamez (2003)   (Correct)

No context found.

C. Fetzer and F. Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314-321a, Philadelphia, 1996.


Membership Service For Open Groups - Banuls, Galdamez (2003)   (Correct)

No context found.

Christof Fetzer and Flaviu Cristian. Fail-awareness in timed asynchronous systems. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314-321a, Philadelphia, 1996.


The Timely Computing Base Model and Architecture - Verissimo, Casimiro   (Correct)

No context found.

C. Fetzer and F. Cristian, "Fail-awareness in timed asynchronous systems," in Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, Philadelphia, USA, May 1996, pp. 314--321a.


Nonintrusive Failure Detection and Recovery for.. - Sultan, Bohra..   (Correct)

No context found.

C. Fetzer and F. Cristian. Fail-Awareness in Timed Asynchronous Systems. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC'96), pages 314--321a, Philadelphia, 1996.
