Closure and Convergence: A Foundation of Fault-Tolerant Computing
- IEEE Transactions on Software Engineering
, 1993
Cited by 103 (28 self)
We give a formal definition of what it means for a system to "tolerate" a class of "faults". The definition consists of two conditions: One, if a fault occurs when the system state is within a set of "legal" states, the resulting state is within some larger set and, if faults continue occurring, the system state remains within that larger set (Closure). And two, if faults stop occurring, the system eventually reaches a state within the legal set (Convergence). We demonstrate the applicability of our definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems. Further, using the definition, we obtain a simple classification of fault-tolerant systems and discuss methods for their systematic design.
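The two conditions in this abstract can be illustrated on a small finite-state system. The following is only a sketch under stated assumptions, not the paper's formalism: the toy counter program, the fault model, and the names `is_closed`, `converges`, and `tolerates` are all hypothetical, and the extra requirement that the legal set is itself closed under program steps is an assumption added here.

```python
# Hypothetical toy system: a counter that the program decrements toward 0.
LEGAL = {0}           # "legal" states
SPAN = {0, 1, 2, 3}   # the larger set that faults may push the state into


def program(x):
    """Fault-free transitions: decrement until 0, then stay at 0."""
    return {x - 1} if x > 0 else {0}


def fault(x):
    """Assumed fault model: the counter may be corrupted to any value in SPAN."""
    return set(SPAN)


def is_closed(states, step):
    """Closure: every transition from a state in `states` stays in `states`."""
    return all(nxt in states for s in states for nxt in step(s))


def converges(span, legal, step):
    """Convergence: every fault-free run from `span` eventually reaches `legal`.

    Grows the set of states whose successors are all already known to be good.
    States on a cycle outside `legal`, or with no successor, are never added.
    """
    good = set(legal)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            succ = step(s)
            if succ and all(n in good for n in succ):
                good.add(s)
                changed = True
    return span <= good


def tolerates(legal, span, step, fault_step):
    """Closure of both sets plus convergence from the span to the legal set."""
    return (
        is_closed(legal, step)
        and is_closed(span, lambda s: step(s) | fault_step(s))
        and converges(span, legal, step)
    )


if __name__ == "__main__":
    print(tolerates(LEGAL, SPAN, program, fault))
    # A program that never moves violates convergence from states 1..3.
    print(converges(SPAN, LEGAL, lambda x: {x}))
```

The check is exhaustive over a finite state space, so it only applies to systems small enough to enumerate.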
Compositional Design of RT Systems: A Conceptual Basis for Specification of Linking Interfaces
- Research report, Technische Universität Wien, Institut für Technische Informatik, Treitlstr. 1-3/182-1, 1040
, 2003
Cited by 37 (4 self)
Composition of a system is driven by the (a) identification and specification of basic components, and (b) specification of the interactions across the components, i.e., the communication linkages, that are needed to communicate value and temporal information across the components from which the aggregate system results. This paper addresses compositional design of distributed Real-Time (RT) systems focusing specifically on the role of specification of linking interfaces (LIFs) across components.
From Defects to Failures: a View of Dependable Computing
- ACM SIGARCH Computer Architecture News – Special Issue on Architectural Support for Operating Systems
, 1998
On the Specification of Linking Interfaces in Distributed Real-Time Systems
- Institut fuer Technische Informatik
, 2002
Cited by 2 (1 self)
This paper is concerned with building large distributed real-time systems out of computational components that interact by the exchange of messages across linking interfaces (LIFs). The notions of an operational and a meta-level specification of a LIF of a component are introduced
Replication for Fault Tolerant Software Using a Functional and Attribute Grammar Based Computation Model
- PhD thesis, School of Information Science, Japan Advanced Institute of Science and Technology
, 1998
Cited by 1 (0 self)
As people's reliance on computer systems increases, it is of primary importance for these systems to be dependable. This new dependability requirement increases the need for the development of fault-tolerant software. Designing and implementing fault-tolerant software is a difficult task, especially when implementing fault-tolerant parallel software. Only a few programming languages support both fault tolerance and parallel programming. These languages are based on an imperative language paradigm, and most fault tolerance techniques are developed for that paradigm. The imperative language paradigm increases system complexity. Novel fault tolerance techniques for implementing fault-tolerant software based on a different language paradigm have to be developed in order to decrease system complexity and increase performance. This dissertation presents a novel replication technique for implementing fault-tolerant parallel software based on a declarative language paradigm. The repli...
Fault-Tolerant System Reliability In The Presence Of Imperfect Diagnostic Coverage
, 1989
This paper examines the effects of less than perfect diagnostics coverage on system reliability. The mathematical background for analyzing the coverage factor of fault-tolerant systems is presented in detail as well as specific examples of practical systems and their relative reliability measures.
"In a complex system, malfunction and even total nonfunction may not be detected for long periods, if ever." (John Gall)
A Multi-Level View of Dependable Computing
, 1993
This paper serves a dual purpose. It presents a unified framework and terminology for the study of computer system dependability. It also surveys the field of dependable computing in light of the proposed framework. Specifically, impairments to dependability are viewed from six levels, each being more abstract than the previous one. It is argued that all of these levels are useful, in the sense that proven dependability assurance techniques can be applied at each level, and that it is beneficial to have distinct, precisely defined terminology for describing impairments to, and procurement strategies for, computer system dependability at these levels. The six levels are: (1) Defect level or component level, dealing with deviant atomic parts. (2) Fault level or logic level, dealing with deviant signal values or path selections. (3) Error level or information level, dealing with deviant data or internal states. (4) Malfunction level or system level, dealing with deviant functional behavior. (5) Degradation level or service level, dealing with deviant performance. (6) Failure level or result level, dealing with deviant outputs or actions. Briefly, a hardware or software component may be defective (hardware may also become defective due to wear and aging). Certain system states will expose the defect, resulting in the development of faults

