R.H. Campbell, B. Randell. Error Recovery in Asynchronous Systems. IEEE TSE-12, 8. 1986
....Further, since this control is performed at the operating system level, no modification to the feature is required to allow its use. The approach shares similarities with software fault tolerance solutions [20] and error recovery techniques such as documented by Shin [25] and Campbell and Randell [10]. To implement this approach a feature is nested inside a feature controller. The feature is a conventional non transactional state machine implementation providing call processing functionality and the feature controller is responsible for cocooning this to make it work correctly with the ....
R. H. Campbell and B. Randell. Error recovery in asynchronous systems. IEEE Trans. Software Engineering, SE-12(8):811--826, 1986.
....To provide a more general method, an exception graph representing an exception hierarchy can be utilised. If several exceptions are raised concurrently, then the multiple exceptions are resolved into the exception that is the root of the smallest subtree containing all the raised exceptions [Campbell Randell 1986]. In principle, each CA action should have its own exception graph. Figure 4 shows an example of an exception graph containing three primitive exceptions e1, e2, e3 at level 0. The resolving exception e1 e2 at level one will be raised when e1 and e2 are raised concurrently. ....
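The resolution rule quoted above (pick the root of the smallest subtree containing all raised exceptions) can be sketched as a lowest-common-ancestor lookup. This is a minimal illustration only, assuming the exception graph is a tree; the `PARENT` table, the `universal` root, the node name `e1_e2`, and the function names are hypothetical and are not taken from the cited papers.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical exception tree, stored as child -> parent (None marks the root).
# "universal" is an assumed top-level exception covering everything.
PARENT: Dict[str, Optional[str]] = {
    "universal": None,
    "e1_e2": "universal",  # resolving exception covering e1 and e2
    "e1": "e1_e2",
    "e2": "e1_e2",
    "e3": "universal",
}


def path_to_root(exc: str) -> List[str]:
    """Return exc followed by its ancestors, ending at the root."""
    path: List[str] = []
    node: Optional[str] = exc
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def resolve(raised: Iterable[str]) -> str:
    """Return the root of the smallest subtree containing every raised exception."""
    raised = list(raised)
    if not raised:
        raise ValueError("no exceptions raised")
    # Walk upward from the first exception; the first node that is also an
    # ancestor (or the node itself) of every other exception is the answer.
    others = [set(path_to_root(e)) for e in raised[1:]]
    for node in path_to_root(raised[0]):
        if all(node in ancestors for ancestors in others):
            return node
    raise ValueError("exceptions do not share a common root")
```

For instance, under this assumed tree `resolve(["e1", "e2"])` yields `e1_e2`, while raising `e1` and `e3` together resolves to the assumed `universal` root.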
....1982] the timeout mechanism is the most practical way of detecting process desertion). In this respect we distinguish entry and exit desertion for a given CA action since, we believe, they should be treated separately. The basic idea of coordinating roles' recovery activities within an action [Campbell Randell 1986] can be naturally extended to the treatment of real time exceptions. Either forward recovery, backward recovery, or a combination of both can be used. The only requirement is to make error recovery predictable and to involve it early enough to be effective. This can be achieved using ....
R.H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems," IEEE Trans. Soft. Eng., vol. SE-12, no.8, pp.811-826, 1986.
....nature (including software tasks, people, plants, documents, organisations, etc.) participate in cooperative activities. 1. Atomic Actions and Asynchrony. Atomic actions have proven to be a very efficient way of structuring complex concurrent systems and of providing their fault tolerance [LA90, CR86, XR00]. Several participants (objects, processes, etc.) enter an action to perform some joint cooperative activity. When the goal of this cooperation is achieved, the participants leave the action. Action execution is atomic for its environment because information is not allowed to cross the action ....
....and system fault tolerance using the same structuring technique. Forward error recovery is known to be the most general way of providing application level fault tolerance [LA90]. We share many researchers' view of exception handling as the most powerful software fault tolerance mechanism [C89, CR86]. In the context of atomic actions cooperative exception handling was introduced as the main feature of tolerating faults [CR86]. Within this approach, any exception raised by any action participant is to be handled cooperatively by all action participants; when handling is not ....
[Article contains additional citation context not shown here]
R.H. Campbell, B. Randell. Error Recovery in Asynchronous Systems. IEEE TSE-12, 8, 1986, 811-826.
....any complex systems during their integration to make system level error containment and exception handling easier. Researchers working on system dependability realise that there are many situations when it is not enough to recover only one process of complex concurrent and distributed systems [R75, C86, X95] because erroneous information can be propagated among processes, mistakes can be made in designing process joint activity; exceptions raised concurrently in several processes can be the symptoms of the same problem. This understanding is not common for CBSD (to the best of our knowledge only ....
....integrated systems out of dynamic atomic actions and associating system level exception handling with such actions introducing local error detection and exception handling for each component implementing these local functionalities using wrapping techniques. We believe that atomic actions [C86] form the sound basis of structuring integrated systems mainly because they offer a recursive approach to building complex systems and for incorporating exception handling into them. Several participants (threads, objects, etc.) enter such an action and cooperate inside it to achieve joint goals ....
[Article contains additional citation context not shown here]
R.H. Campbell, B. Randell. Error Recovery in Asynchronous Systems. IEEE TSE-12, 8 (1986) 811-826
....state of CA action implementation and is important for future research in CA actions. 1. Atomic Actions Atomic actions (or conversations) are a well known technique intended for structuring complex concurrent systems in which several activities (processes, threads, active objects) cooperate [1, 2]. These activities (action participants) enter the action and cooperate within its scope in such a way that no information flow can cross the action border. They leave the action synchronously when all of them have agreed on the action outcome. The action execution is invisible and indivisible for ....
....to design them with monitor semantics [10] (e.g. as Ada 95 protected objects). Private local objects are used by individual action participants and represent their internal states. CA actions can use both BER and FER as well as their combination. In this respect they are similar to conversations [2]. When FER is used, the action body is the exception context in which exceptions can be declared, exception handlers are associated with each role and exception resolution is used to resolve several exceptions raised by several roles; the failure exception is used to inform the containing action ....
Campbell, R.H., Randell, B. Error recovery in asynchronous systems, IEEE Trans., 1986, SE-12, 8
....in Transactional Drago, a single exception is propagated to inform about the cause of the abort. If more than one thread of a transaction finishes with an unhandled exception, exception resolution is performed in Transactional Drago in order to propagate a single exception. Exception resolution [CR86] allows to choose an exception that represents all the exceptions that have been concurrently raised. Transactional Drago provides a default resolution scheme that is applied when concurrent exceptions are raised (in contrast with Ada, where no exception resolution scheme is provided and ....
R. H. Campbell and B. Randell. Error Recovery in Asynchronous Systems. IEEE TSE, 12(8):811-826, August 1986.
....tests have been satisfied, the processes leave the conversation. Otherwise, they restore their states from the recovery points and may try and execute a different alternate. Atomic Actions. Later on, conversations have been enhanced with additional forward error recovery and exception resolution [11], resulting in so called atomic actions [4]. This means that an exception that has been raised in a process that is part of an atomic action will be propagated to all other participating processes of that action. Since multiple exceptions can be raised concurrently, an exception resolution ....
R. H. Campbell and B. Randell: "Error Recovery in Asynchronous Systems". IEEE Transactions on Software Engineering (SE) SE-12(8), August 1986.
.... interaction that provides facilities for exception handling, in particular including means of: • Handling Concurrent Exceptions: when an exception occurs in one of the bodies of a participant, if it is not dealt with by that participant, the exception must be propagated to all participants [5] [6]. A DMI must provide a way of dealing with exceptions that can be raised by one or more participants. If several different exceptions are raised concurrently, then the dependable multiparty interaction mechanism uses a process of exception resolution to decide upon a common exception that will ....
R. H. Campbell and B. Randell. "Error Recovery in Asynchronous Systems". In IEEE Transactions on Software Engineering, SE-12(8), pp. 811-826, 1986.
....of peer entities. This paradigm appeared as early as in [Powell et al. 1988] where it is called multipoint association, and also in [Peterson et al. 1989] where it is called conversation, a term that we avoid in order not to cause confusion with a different paradigm with the same name described in [Campbell Randell 1986], and also discussed below. Multipeer interactions are the kind of interaction one might wish among managers of a distributed database, a group of commerce servers, a group of TTP servers, or a group of participants running a cryptographic agreement (e.g. contract signing). Communication ....
....coordinated atomic actions each providing different guarantees. Atomic transactions are a well known structuring mechanism that are best suited to competitive interactions. Atomic transactions guarantee the properties of atomicity, consistency, isolation and durability (ACID). Conversations [Campbell Randell 1986] are traditionally used for cooperative systems and employ coordinated exception handling for tolerating faults. Coordinated atomic actions (or CA actions) [Xu et al. 1995][Xu et al. 1999] are a structuring mechanism that integrates and extends conversations and atomic transactions. The former ....
R. H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems", IEEE Trans. Software Engineering, SE-12 (8), pp.811-26, 1986.
....idea behind this scheme is that of resolving all exceptions signalled by all versions: if we are to follow the intention of NVP we should vote here and ignore the minority results (including exceptions) as produced by faulty versions. Another reason why we believe that using exception resolution [19] is not adequate here is that when all exceptions signalled by versions are resolved and a covering exception (a concerted one in the terminology adopted in [12]) is calculated, the states of versions, generally speaking, do not correspond to this exception and are inconsistent. This is because ....
Campbell, R. H. and Randell, B. (1986) Error Recovery in Asynchronous Systems. IEEE TSE, SE-12, 811-826.
.... previous paragraph, a dependable multiparty interaction has to provide the following properties: • Handling of Concurrent Exceptions: when an exception occurs in one of the bodies of a participant, if it is not dealt with by that participant, the exception must be propagated to all participants [4] [5]. A DMI must provide a way of dealing with exceptions that can be raised by one or more participants. If several different exceptions are raised concurrently, then the DMI mechanism has to decide which same exception will be raised in all participants. • Synchronisation Upon Exit: all ....
R. H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems". In IEEE Transactions on Software Engineering, SE-12(8), pp. 811-826, 1986.
....by the role, then it must be propagated to the other roles in the CA action. Since it is possible for several roles to raise exceptions at more or less the same time, a process of exception resolution is necessary in order to agree on the exception to be propagated and handled within the CA action [9]. Once an agreed exception has been propagated to all of the roles involved in the CA action, then error recovery starts. It may still be possible to complete the execution of the CA action successfully using either forward or backward error recovery [10]. If it is not possible to achieve either a ....
R. H. Campbell and B. Randell. Error Recovery in Asynchronous Systems. IEEE Trans. Soft. Eng., SE-12(8), pp. 811-826, 1986.
.... for participants to interact and to coordinate their execution (external objects can be used as well). The CA action mechanism also provides a basic framework for exception handling, which can support a variety of fault tolerance mechanisms aimed at tolerating both hardware and software faults [9, 10]. 3. A Complete Example. In order to present the proposed methodology and the CA action design, we consider the following simple example: computing the sum of the integers present in a multiset. The computing of the sum follows the Gamma paradigm [6]: a chemical reaction removes two values from a ....
R. H. Campbell, and B. Randell: `Error recovery in asynchronous systems', IEEE Transactions on Software Engineering, 1986, 12(8), pp. 811-826.
....these with the behaviour flows in the stored signature behaviour. Deviations from the signature behaviour are reported as feature interactions, and the FIM subsequently applies a resolution technique. The resolution techniques were inspired by error recovery in distributed operating systems [8,9]. Each time a new SLP is provisioned on the network, a new signature behaviour store has to be generated to represent its behaviour in order to enable feature interaction management. The cycle of FIM learning and management is illustrated in Figure 3. Signature behaviour stores for all ....
....of the interaction or the function of the SLPis, attempting to selectively absorb or reject the events could lead to deadlock situations. However, using this type of information also violated the aims of the approach. An investigation of error recovery techniques in distributed operating systems [8,9] resulted in a viable resolution approach. Adopting the philosophy of damage limitation from forward error recovery, it was decided that the best way to resolve this type of interaction was to end the call in a controlled manner and connect all parties involved in the call to a call failed ....
R.H.Campbell and B.Randell, "Error Recovery in Asynchronous Systems", IEEE Transactions on Software Engineering, Vol.12, No.8, August 1986, pp811-826.
....has its own peculiarities and advantages both in implementation and use, so it is usually discussed separately. Tolerating software faults is much more difficult to secure for concurrent systems. Forward error recovery for concurrent systems should rely on a kind of resolution mechanism [14] which provides joint recovery by simultaneously raising appropriate exceptions in the set of cooperating processes. B. Randell [13] proposed the concept of conversation which was intended to provide joint backward recovery of several processes exchanging information. Each process taking part in a ....
Campbell, R. H. and Randell, B. Error Recovery in Asynchronous Systems. IEEE Trans. Softw. Engng. SE-12: 811-826; 1986.
....conversation (Lee and Anderson 1990). Basically, these features provide error detection and recovery within conversations: when an error has been detected, the corresponding recovery starts. Conversations can use backward error recovery, forward error recovery, or a combination of these (Campbell and Randell 1986; Lee and Anderson 1990). In any case, recovery has to be coordinated, and all conversation participants have to be involved in it. Backward error recovery does not depend on the application much and can be made transparent (or provided, to a considerable degree, by the conversation support) ....
....by the conversation support) because it uses the rollback of all conversation participants to recover the system. Forward recovery usually relies on an exception mechanism and may incorporate an additional mechanism to resolve multiple exceptions raised in several conversation participants (Campbell and Randell 1986) . This can be done by imposing a partial order on all conversation exceptions in such a way that a higher exception has a handler capable of handling any lower exception. Exception handlers are attached to each conversation participant, and the basic scheme of forward error recovery is to call ....
[Article contains additional citation context not shown here]
Campbell, R.H. and Randell, B. (1986). Error recovery in asynchronous systems.
....are very prone to faults and errors. Various fault tolerance techniques for coping with hardware and software faults can provide a practical way of improving the dependability of such systems. Moreover, because faults can have an impact on, or arise from, the environment of a computing system [Campbell Randell 1986], some forms of error recovery may require stepping outside the boundaries of a computer system (i.e. considering the computer system and its environment recursively as an entire distributed system at a higher level of abstraction). In current practice, however, the majority of fault tolerant ....
....If the failure persists, the action will produce the exceptional outcome defined previously, and at the same time switch on the alarm signal to inform the user of the failure. For each (enclosing or nested) action, various exceptions are defined based on failure analysis and an exception graph [Campbell Randell 1986][Xu et al. 1998a] for resolving concurrent exceptions is defined. For example, the LoadPress1 action may contain exceptions such as pr1 failure (press 1 failure), b sensor failure (blank sensor failure), arm1 failure1 (blank lost), arm1 failure2 (can't drop the blank), rs m failure (rotary sensor ....
R.H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems," IEEE Trans. on Soft. Eng., vol.SE-12, no.8, pp.811-826, 1986.
....handling(e; r). Note that all the roles within a given action are supposed to be able to handle the same exception. However, in a complex concurrent system, there is a possible complication that several exceptions can occur at the same time and thus require a process of exception resolution [CR86, RXR96]. We now define a new predicate resolving(c) that is true in the state in which action c starts resolving multiple exceptions raised concurrently, and define R(e1, ..., ek) as a set function that returns an exception which covers all the exceptions e1, ..., ek that occurred ....
R. H. Campbell and B. Randell. Error recovery in asynchronous systems. IEEE Trans. on Software Engineering, SE-12(8):811--826, 1986.
....very prone to faults and errors. Various fault tolerance techniques for coping with hardware and software faults can provide a practical way of improving the dependability of such systems. Moreover, because certain faults can have an impact on, or arise from, the environment of a computing system [Campbell and Randell 1986], some forms of error recovery may require stepping outside the boundaries of a computer system (i.e. considering the computer system and its environment recursively as an entire system at a higher level of abstraction). However, in reality the majority of fault tolerant computing systems do not ....
.... approach to structuring dependable concurrent systems to facilitate error recovery is the conversation concept [Randell 1975], an approach that provides full support for cooperative concurrent activities, and which was extended to cover forward error recovery via coordinated exception handling in [Campbell and Randell 1986]. However, without special support for object interactions and consistent access to shared objects, it has proved difficult to use the conversation concept to control concurrency and facilitate error recovery in an object oriented system [Gregory and Knight 1989][Xu, Randell et al. 1995a]. In this ....
[Article contains additional citation context not shown here]
R.H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems," IEEE Trans. on Soft. Eng., vol. SE-12, no.8, pp.811-826, 1986.
....is invoked. However, this notion typically still assumes that there are absolute outermost transactions, and that outputs to the world outside the database system, e.g. to the users, that take place after such outermost transactions end, must be presumed to be valid. The conversation scheme [17] provides a means of coordinating the recovery provisions of interacting threads so as to avoid the domino effect, without making assumptions regarding output or input validation. Figure 6 shows an example where three threads communicate within a conversation and the threads T1 and T2 communicate ....
....the exception, and both roles transfer control to their respective exception handlers H1 and H2 for this particular exception, which then attempt to perform forward error recovery. When multiple exceptions are raised within an action, a resolution algorithm based on an exception resolution graph [17, 21] is used to identify the appropriate covering exception, and hence the set of exception handlers to be used in this situation. The effects of erroneous operations on external objects are repaired, if possible, by putting the objects into new correct states so that the CA action is able to exit ....
Campbell, R. H. and Randell, B. (1986) Error recovery in asynchronous systems. IEEE Trans. Softw. Eng., SE-12, 811--826.
....for coping with hardware and software faults can provide a practical way of improving the dependability of such systems. These typically use fault masking or backward error recovery. However, because faults can have an impact on, or arise from, the environment of a computing system [Campbell Randell 1986], some forms of error recovery may require stepping outside the boundaries of a computer system (i.e. considering the computer system and its environment recursively as an entire distributed system at a higher level of abstraction), in which case backward error recovery by the computer system will ....
....the exception and both roles transfer control to their respective exception handlers H1 and H2 for this particular exception, which then attempt to perform forward error recovery. When multiple exceptions are raised within an action, a resolution algorithm based on an exception resolution graph [Campbell Randell 1986][Xu et al. 1998a] is used to identify the appropriate exception, and hence the set of exception handlers to be used. The effects of erroneous operations on external objects are repaired, if possible, by putting the objects into new correct states so that the CA action is able to exit with an ....
R.H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems," IEEE Trans. on Soft. Eng., vol.SE-12, no.8, pp.811-826, 1986.
....object systems, exception resolution, nested atomic actions. 1 Introduction. Concurrent and distributed computing systems often give rise to complex asynchronous and interacting activities. The provision of exception handling and error recovery becomes very difficult in such circumstances [Campbell Randell 1986]. One way to control the entire complexity, and hence facilitate error recovery, is to somehow restrict interaction and communication. Atomic actions are the usual tool employed in both research and practice to achieve this goal. Most of the existing schemes for exception handling in concurrent ....
....both research and practice to achieve this goal. Most of the existing schemes for exception handling in concurrent systems use the concept of an atomic action as a unit of error confinement, though there is no clear consensus on how to handle exceptions when asynchronous activities occur [Jalote Campbell 1986][Taylor 1986]. Many new architectural developments in the area of distributed computing systems are, to some extent, object based or object oriented (OO). The OO technique, with its modularity, flexibility and reusability features, can be usefully exploited for handling complexity and ....
[Article contains additional citation context not shown here]
R.H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems," IEEE Trans. Soft. Eng., vol. SE-12, no.8, pp.811-826, 1986.
....not in the conversation. The concept of a conversation permits only strict nesting. If a process within a conversation raises an exception, then an appropriate error recovery mechanism must be invoked. A coordinated error recovery strategy between all the processes in the conversation is required [5]. Error handlers can use a mixture of forward and backward recovery techniques. For example, the state of a process may be rolled back to the recovery line or compensating actions may be performed to correct the erroneous state. Note that the incorporation of forward error ....
....and to define a single handler to cope with a group of related exceptions. Regardless of whether an exception is raised by one or several of the participating objects in a CA action, the fault tolerance measures must necessarily involve all of the objects that are participating in that CA action [5]. Thus, each participating object in the CA action should suffer the same exception. It is important that all the participating objects have exception handlers for each possible exception (though the use of a default exception handler provided by the underlying system is permitted). These handlers ....
[Article contains additional citation context not shown here]
R.H. Campbell and B. Randell, "Error Recovery in Asynchronous Systems," IEEE Trans. Soft. Eng., vol. SE-12, no.8, pp.811-826, 1986.
No context found.
R.H. Campbell, B. Randell. Error Recovery in Asynchronous Systems. IEEE TSE-12, 8. 1986
No context found.
R. H. Campbell and B. Randell. Error recovery in asynchronous systems. Transactions on Software Engineering, SE-12(8):811-826, 1986.