12 citations found.
R. Guerraoui and A. Schiper. Fault-tolerance by replication in distributed systems. In Reliable Software Technologies - AdaEurope '96, LNCS 1088, June 1996.

This paper is cited in the following contexts:
Algorithms for Location-Independent Communication between.. - Wojciechowski (2001)   (2 citations)

....e.g. state checkpointing, message logging and recovery. We can also replicate the server on different sites to enhance system availability and fault tolerance. Group communication can provide adequate multicast primitives for implementing either primary backup or active replication [GS96]. Mechanisms similar to Home Servers have been used in many systems which support process migration, e.g. in Sprite [DO91]. Caching has been used, e.g. in LOCUS [PW85] and V [Che88], allowing operations to be sent directly to a remote process without passing through another site. If the cached ....

R. Guerraoui and A. Schiper. Fault-tolerance by replication in distributed systems. In Reliable Software Technologies - AdaEurope '96, LNCS 1088, June 1996.


Distributed Object-Oriented Real-Time Systems using a Hybrid Model .. - Moody

....is imperative. Both CORBA and Ada's DSA provide an exception when communication fails (with varying time out specification mechanisms). Once assumed failure occurs, a new object must be found. This example provides a good vehicle to discuss many framework issues dealing with fault tolerance [10][11]. Issues such as replicating the single name server, knowing about a known set of backups, providing heart beats or handling connection exceptions have been examined by different Boeing applications, with the goal of shielding this from the user's code leading to more error proof and ....

R. Guerraoui, A. Schiper, "Fault-Tolerance by Replication in Distributed Systems" Proceedings on Reliable Software Technologies - Ada-Europe'96, Lecture Notes in Computer Science v1088, Springer-Verlag


Enforcing Strong Consistency with Semantically View.. - Pereira, Rodrigues.. (2001)   (2 citations)

....to implement fault tolerant highly available services using the primitives provided by group communication toolkits. For instance, the reliable and view synchronous multicast [7, 19] services offered by such toolkits have been shown to be an adequate foundation for primary backup replication [6]. On the other hand, in practice the use of group communication in the implementation of services that require stable high throughput is impaired by temporarily slow replicas or network links. In fact, due to flow control mechanisms a single slow component affects the overall performance of the ....

.... relation (Figure 2) exists in a maximal element [...] such that [...] preceded by a maximal element [...] (by the algorithm and FIFO order) that overwrites [...]. Since [...] ....

[Footnote 1: Except for the way our algorithm sends state updates to the backups, it is identical to that in [6].]

[Figure 3: Impact of an increasingly slower replica on the performance of primary-backup replication. Axes: invocation rate, perturbation; series: reliable, semantic.]

R. Guerraoui and A. Schiper. Fault-tolerance by replication in distributed systems. In Reliable Software Technologies - Ada-Europe'96, LNCS 1088, pages 38--57. Springer-Verlag, June 1996.


High-Available Enterprise JavaBeans Using Group.. - Pasin, Riveill, Weber

....by CNPq Brazil under grant 200594 00 1. [Footnote 2: work partially sponsored by the French Ministry of Research (RNTL project ARCAD).] Replication requires synchronizing replicas, even in the presence of client's concurrent access and node crashes. It can be easily done using group communication primitives [3]. Although these computational models (group communication and transactional) seem to be opposite, we agree with some authors that believe that group communication and transactions can be matched [5, 6 and 7]. Our approach uses group communication to provide an efficient and high available EJB ....

....EJB service as a new property to the current EJB specification: the replication approach is hidden to users and the overhead, generated by the replica synchronization, is reduced using suitable design decisions. Our approach is based on a group communication primitive called total order multicast [3]. This primitive enables a client sending messages to a group of process (replicas) with the guarantee that all processes agree on the set and on the order that the messages will be delivered. Group communication primitives allow good throughput and response time under replicated database systems ....

[Article contains additional citation context not shown here]

Guerraoui, R. and Schiper, A. Fault-Tolerance by replication in distributed systems. In Proc Conference on Reliable Software Technologies (invited paper), p. 38-57. Springer Verlag, LNCS 1088, June 1996.


An Ada Library to Program Fault-tolerant Distributed .. - Guerra, Miranda.. (1997)   (1 citation)

....new programming paradigms arises. The basic approach to fault tolerance using standard hardware components is the use of distributed systems with hardware and software replication. The two main software techniques used there are the primary backup approach, and the active replication paradigm [19]. Compared with the primary backup approach, the active replication technique offers the additional advantage that it allows for continuous service in the .... [Footnote: This work has been partially funded by the Spanish Research Council (CICYT) contract numbers TIC94 0162 C02 01 and TIC96 0614.]

Guerraoui, R. and Schiper, A. Fault-Tolerance by Replication in Distributed Systems. Proceedings of the Reliable Software Technologies---Ada-Europe '96 Conference, LNCS 1088, Springer Verlag.


Object Groups in Transactional CORBA Systems - Havenstein (1999)

....relieve programmers from the most difficult tasks involved in implementing replicated processes, such as handshaking and consensus. [Figure 2-1: Active Replication, showing a client and Replica1, Replica2, Replica3.] The two main replication techniques are Active Replication and Passive Replication (see e.g. Guerraoui and Schiper 1996). In the active replication strategy, clients multicast requests to the group of replicas, and each replica processes and replies to every request. This is a symmetric model, i.e. each replica plays the same role. In contrast, in the passive replication strategy, only one of the ....

Guerraoui, R., and A. Schiper. 1996. "Fault-Tolerance by Replication in Distributed Systems." Proceedings International Conference on Parallel and Distributed computing (PDCS '96). Dijon, France. 125-130.


Fault Tolerance in Distributed Ada 95 - Wolf (1997)   (1 citation)

....we will only consider crash failures [6] of nodes. Also, we will at first only treat homogeneous distributed systems. Concerning the network, our assumptions are given by those made by the underlying group communication system we use for replica management, i.e. we assume an asynchronous network [4] with unreliable failure detectors [3]. In addition, the replication scheme should be as transparent as possible for the application since we consider replication a configuration concern. This explains our choice of replication as the basic technique for achieving fault tolerance instead of more ....

Guerraoui, R.; Schiper, A.: "Fault Tolerance by Replication in Distributed Systems", Proceedings of AdaEurope '96, Montreux, Switzerland, June 1996; published as Lecture Notes in Computer Science 1088, pp. 38 - 57, Springer 1996


Implementing Highly-Available WWW Servers based on.. - Baldoni, Bonamoneta, .. (1999)

.... sees a consistent value for its own replicated state; Previous points can be achieved either by developing non fault tolerant software over redundant and expensive hardware (like Stratus and Tandem Systems) or by exploiting techniques based on redundant software, i.e. software replication [13]. Software replication is a low cost technique providing fault tolerance that requires objects and processes hosted on different workstations to (i) interact in order to maintain consistency among replicas of objects and (ii) to cooperate in order to give the illusion to clients that the service ....

....in order to maintain consistency among replicas of objects and (ii) to cooperate in order to give the illusion to clients that the service is provided by a single entity. Replication allows the service to survive despite a workstation crash. There are two main classes of replication techniques [4, 5, 13]: Primary Backup replication and Active replication. In Primary Backup replication (also named passive replication) a given object y is called primary and it is responsible for handling client requests and for maintaining the value of other y's replicas on the other sites consistent with its ....

[Article contains additional citation context not shown here]

Guerraoui R., Schiper A., "Fault-Tolerance by Replication in Distributed Systems", in Proc. Reliable Software Technologies -- Ada-Europe'96, Springer Verlag, LNCS 1088, 1996.


Scalable Atomic Multicast - Rodrigues, Guerraoui, Schiper (1998)   (8 citations)  Self-citation (Guerraoui)

....not require causal order delivery of messages. SCALATOM does not require a membership service which might lead to incorrectly suspect correct processes and cause them to crash. Notice that not relying on a membership service does not preclude reintegrating crashed process after their recovery (see [13]). We have compared the performances of SCALATOM with those of (less scalable) total order multicast algorithms. Not surprisingly, the price to pay for scalability is reflected in the high latency of SCALATOM, i.e. in the number of communication steps needed to deliver a message to the ....

R. Guerraoui and A. Schiper. Fault-Tolerance by Replication in Distributed Systems. In Proc Conference on Reliable Software Technologies (invited paper), pages 38--57. Springer Verlag, LNCS 1088, June 1996.


Scalable Atomic Multicast - Rodrigues, Guerraoui, Schiper (1998)   (8 citations)  Self-citation (Guerraoui Schiper)

....not require causal order delivery of messages. SCALATOM does not require a membership service which might lead to incorrectly suspect correct processes and cause them to crash. Notice that not relying on a membership service does not preclude reintegrating crashed process after their recovery (see [14]). We have compared the performances of SCALATOM with those of (less scalable) total order multicast algorithms. Not surprisingly, the price to pay for scalability is reflected in the high latency of SCALATOM, i.e. in the number of communication steps needed to deliver a message to the ....

R. Guerraoui and A. Schiper. Fault-Tolerance by Replication in Distributed Systems. In Proc Conference on Reliable Software Technologies (invited paper), pages 38--57. Springer Verlag, LNCS 1088, June 1996.


Consensus Service: a modular approach for building.. - Guerraoui, Schiper (1996)   (10 citations)  Self-citation (Guerraoui Schiper)

....by the Isis system [2] can be seen as an atomic (in the sense of all or nothing) multicast for dynamic groups of processes. This primitive is adequate, for example, in the context of the primary backup replication technique, to multicast the update message from the primary to the backups [11]. Consider a dynamic group g, i.e. a group whose membership changes during the lifetime of the system, for example as the result of the crash of one of its members. A crashed process $p_i$ is removed from the group; if $p_i$ later recovers, then it rejoins the group (usually with a new identifier) ....

.... Function VSInitValue($dataReceived_j$): $batch(k) \leftarrow dataReceived_j$; $v_{k+1}(g) \leftarrow \{p_i \mid unstable_i \in dataReceived_j\}$; return $(batch(k), v_{k+1}(g))$. When using view synchronous multicast to implement the primary backup replication technique, the set $unstable_i$ contains at most one message [11]. The pair $(batch(k), v_{k+1}(g))$ returned by the VSInitValue function is the initial value of server $s_j$ for the consensus, and not yet the decision of the consensus service. For every server $s_j$, the initial value of $s_j$ is such that $v_{k+1}(g)$ contains a majority of processes of $v_k(g)$ and ....

R. Guerraoui and A. Schiper. Fault-Tolerance by Replication in Distributed Systems. In Proc Conference on Reliable Software Technologies (invited paper) . Springer Verlag, June 1996. To appear.


Scalable Atomic Multicast - Rodrigues, Guerraoui, Schiper (1998)   (9 citations)  Self-citation (Guerraoui Schiper)

....not require causal order delivery of messages. SCALATOM does not require a membership service which might lead to incorrectly suspect correct processes and cause them to crash. Notice that not relying on a membership service does not preclude reintegrating crashed process after their recovery (see [13]). We have compared the performances of SCALATOM with those of (less scalable) total order multicast algorithms. Not surprisingly, the price to pay for scalability is reflected in the high latency of SCALATOM, i.e. in the number of communication steps needed to deliver a message to the ....

R. Guerraoui and A. Schiper. Fault-Tolerance by Replication in Distributed Systems. In Proc Conference on Reliable Software Technologies (invited paper), pages 38--57. Springer Verlag, LNCS 1088, June 1996.
