DieHard: probabilistic memory safety for unsafe languages
in PLDI ’06, 2006
Cited by 93 (13 self)
Applications written in unsafe languages like C and C++ are vulnerable to memory errors such as buffer overflows, dangling pointers, and reads of uninitialized data. Such errors can lead to program crashes, security vulnerabilities, and unpredictable behavior. We present DieHard, a runtime system that tolerates these errors while probabilistically maintaining soundness. DieHard uses randomization and replication to achieve probabilistic memory safety by approximating an infinite-sized heap. DieHard’s memory manager randomizes the location of objects in a heap that is at least twice as large as required. This algorithm prevents heap corruption and provides a probabilistic guarantee of avoiding memory errors. For additional safety, DieHard can operate in a replicated mode where multiple replicas of the same application are run simultaneously. By initializing each replica with a different random seed and requiring agreement on output, the replicated version of DieHard increases the likelihood of correct execution because errors are unlikely to have the same effect across all replicas. We present analytical and experimental results that show DieHard’s resilience to a wide range of memory errors, including a heap-based buffer overflow in an actual application.
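The randomized placement described in this abstract can be illustrated with a toy model. This is a minimal sketch, not DieHard's actual allocator: the class name `RandomizedHeap`, its parameters, and the slot-based layout are assumptions made for illustration. It keeps the heap at most half full (expansion factor 2) and picks a free slot at random for every allocation.

```python
import random


class RandomizedHeap:
    """Toy model of random object placement in an over-provisioned heap.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self, max_live_objects, expansion=2, seed=None):
        # The heap has `expansion` times as many slots as the program needs.
        self.in_use = [False] * (max_live_objects * expansion)
        self.max_live_objects = max_live_objects
        self.live = 0
        self.rng = random.Random(seed)

    def allocate(self):
        """Return the index of a randomly chosen free slot."""
        if self.live >= self.max_live_objects:
            raise MemoryError("heap would exceed its fill limit")
        # With expansion=2 at most half the slots are taken, so each
        # probe finds a free slot with probability of at least 1/2.
        while True:
            index = self.rng.randrange(len(self.in_use))
            if not self.in_use[index]:
                self.in_use[index] = True
                self.live += 1
                return index

    def free(self, index):
        """Release a slot; freeing an already free slot is ignored."""
        if self.in_use[index]:
            self.in_use[index] = False
            self.live -= 1
```

Because neighbouring slots are often empty, a small overflow out of one slot in this model is less likely to land on a live object, and replicas created with different `seed` values would place the same objects at different indices.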
Diversity against Accidental and Deliberate Faults
Computer Security, Dependability, and Assurance: From Needs to Solutions, 1998
Cited by 29 (2 self)
The paper is aimed at examining the relationship between the three topics of the workshops that gave rise to this book: security, fault tolerance, and software assurance. Those three topics can be viewed as different facets of dependability. The paper focuses on diversity, as a desirable approach for addressing the classes of faults that underlie all these topics, i.e., design faults and intrusion faults.
1. Introduction The paper is aimed at examining the relationship between the three topics of the workshops that gave rise to this book: security, fault tolerance and software assurance. Those three topics can be viewed as different facets of dependability [29, 33] (see also the paper by Brian Randell in this volume). The second section is devoted to a fault classification, which identifies three major classes of faults: physical faults, design faults, and (human-machine) interaction faults, where the latter two classes can be either accidental or deliberate. The classes of faults that ...
Definition and analysis of hardware- and software-fault-tolerant architectures
IEEE Computer, 1990
Cited by 22 (0 self)
Both experimental and real-life safety-related systems have begun to use design diversity to tolerate software faults. Such systems focus strongly on design faults, where the term “design” encompasses everything from system requirements to realization during both initial production and future modifications. Design faults are a source of common-mode failures, which defeat ...
An Exception Handling Framework for N-Version Programming in Object Oriented Systems
in ISORC ’00, Object-Oriented Real-Time Distributed Computing, 2000 Proceedings. Third IEEE International Symposium, March 2000, pages 226–233
Cited by 7 (3 self)
This paper proposes an approach for introducing exception handling into object-oriented N-version programming (NVP). We start by outlining general principles of structuring systems with diversity and show why it is important to use exceptions while developing and using diversely-developed software. Internal version exceptions and external exceptions, which the diversely-designed class can propagate, are clearly separated in our framework: each version has its own internal exceptions, but the external exceptions of all versions have to be the same and identical to the interface exceptions of the whole class. This scheme requires an adjudicator of a special kind to allow interface exception signalling when a majority of versions have signalled the same exception. We demonstrate these ideas using a general framework for introducing NVP into object-oriented systems which we have developed recently [1]. This framework follows all principles of structured NVP: software diversity is introduced here at the level of classes and encapsulated into the diversely-designed class. We discuss the internal structure of this class and the interfaces of its subcomponents, and show how the NVP controller works, how version execution is coordinated, and how re-use operates here. This framework makes use of many advantages of object-oriented programming. For the demonstration, it has been implemented in Ada. The paper finishes with a comparison of our proposal with some existing NVP schemes and with a discussion of our future work.
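The adjudication rule mentioned in this abstract (signal an interface exception only when a majority of versions signalled the same one) can be sketched as follows. This is an illustrative assumption, not the paper's Ada framework: the outcome encoding as `("ok", value)` or `("exc", name)` tuples and the names `adjudicate`, `InterfaceException`, and `NoMajorityError` are hypothetical.

```python
from collections import Counter


class InterfaceException(Exception):
    """Raised when a majority of versions signalled the same exception."""


class NoMajorityError(Exception):
    """Raised when no outcome is shared by a strict majority of versions."""


def adjudicate(outcomes):
    """Pick the majority outcome among version results.

    Each outcome is ("ok", value) for a normal result or ("exc", name)
    for an external exception; values must be hashable.
    """
    if not outcomes:
        raise ValueError("at least one version outcome is required")
    outcome, votes = Counter(outcomes).most_common(1)[0]
    # A strict majority is required: more than half of all versions.
    if votes * 2 <= len(outcomes):
        raise NoMajorityError("versions disagree")
    kind, payload = outcome
    if kind == "exc":
        raise InterfaceException(payload)
    return payload
```

For example, `adjudicate([("ok", 3), ("ok", 3), ("exc", "Overflow")])` returns the agreed value, while two versions reporting the same exception name cause `InterfaceException` to be raised to the caller.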
An "asymmetric" Approach to the Assessment of Safety-Critical Software During Certification
2000
Cited by 5 (5 self)
This paper describes the authors' general approach to software assessment during certification and licensing. This kind of software assessment has a specific character, given the limited time, material, and human resources available to the experts. The proposed "asymmetric" approach makes it possible to identify the most important areas, where the main effort of the software assessment should be concentrated.
On Distribution of Coordinated Atomic Actions
ACM Operating Systems Review, 1997
Cited by 3 (1 self)
... this paper is to discuss how distributed CA action schemes can be realised. In particular, we outline different ways of action component distribution, trade-offs, and applications for which these schemes are applicable. We discuss a wide range of schemes (some of them have not yet been implemented) based on a classification of various approaches to CA action distribution; to do this we analyse all possible ways of different action component distribution. We believe that this general discussion should help to better understand the current state of CA action implementation and is important for future research in CA actions.
Integrating Dependability Analysis into the Design of Distributed Systems
Proceedings of the 5th European Computer Conference (COMPEURO91), 1991
Cited by 2 (1 self)
The increasing importance and criticality of computers in technical applications have spurred a demand for dependable computers. In these applications acceptance of the system is strongly influenced by dependability aspects. Consideration of dependability has to be an integral part of system development and has to start at early phases in the design. This paper presents our approach for the integration of dependability analysis into the design of distributed systems. This design methodology supports a close and efficient interaction of design creation and design evaluation activities. The feasibility of this approach is shown by applying it to the application "Rolling Ball", thereby demonstrating the interaction of design creation activities and dependability analysis.
1 Introduction The tremendous advances in hardware technology and the significant decrease of computers' costs have caused an increased demand for distributed computer systems. Modern technology now allows the use of co...
Fault-Tolerant Partitioning Scheduling Algorithms in Real-Time Multiprocessor Systems
Cited by 2 (0 self)
This paper presents the performance analysis of several well-known partitioning scheduling algorithms in real-time and fault-tolerant multiprocessor systems. Both static and dynamic scheduling algorithms are analyzed. The partitioning scheduling algorithms studied here are heuristic algorithms formed by combining any of the bin-packing algorithms with any of the schedulability conditions for the Rate-Monotonic (RM) and Earliest-Deadline-First (EDF) policies. A tool is developed that enables experimental evaluation of the performance of the algorithms from the graph of tasks. The results show that, among the several partitioning algorithms evaluated, the RM-Small-Task (RMST) algorithm is the best static algorithm and EDF-Best-Fit (EDF-BF) is the best dynamic algorithm for non-fault-tolerant systems. For fault-tolerant systems, which require about 49% more processors, the results show that RM-First-Fit Decreasing Utilization (RM-FFDU) is the best static algorithm and EDF-BF is the best dynamic algorithm. To decrease the number of processors in fault-tolerant systems, the RMST is modified. The results show that the modified RMST decreases the number of required processors by between 7% and 78% in comparison with the original RMST, the RM-FFDU, and other well-known static partitioning scheduling algorithms.
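The idea of combining a bin-packing heuristic with a schedulability condition can be sketched generically. The example below is not one of the algorithms evaluated in the paper: it pairs plain first-fit (optionally in decreasing-utilization order) with the classic Liu and Layland utilization bound for RM, n(2^(1/n) - 1), and the function names are hypothetical.

```python
def rm_utilization_bound(task_count):
    """Liu and Layland sufficient bound for RM on one processor."""
    return task_count * (2 ** (1 / task_count) - 1)


def first_fit_rm(utilizations, decreasing=False):
    """Assign task utilizations to processors with first-fit.

    A task is placed on the first processor whose total utilization,
    including the new task, stays within the RM bound for its new
    task count; otherwise a new processor is opened.
    """
    tasks = sorted(utilizations, reverse=True) if decreasing else list(utilizations)
    processors = []
    for u in tasks:
        if not 0 < u <= 1:
            raise ValueError("each utilization must be in (0, 1]")
        for assigned in processors:
            if sum(assigned) + u <= rm_utilization_bound(len(assigned) + 1):
                assigned.append(u)
                break
        else:
            # No existing processor passed the test: open a new one.
            processors.append([u])
    return processors
```

Because the bound is only a sufficient condition, this sketch may open more processors than an exact schedulability test would; swapping in a different test or a different bin-packing order changes only the inner condition or the sort step.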
Programming Notations for Expressing Error Recovery in a Distributed Object-Oriented Language
In 1st Open Workshop of the BROADCAST project, 1993
Cited by 2 (0 self)
This paper investigates the definition of linguistic constructs that help achieve greater reliability within distributed applications. Our proposal is sketched in the framework of distributed object-oriented programming. The object-oriented paradigm exhibits nice properties to meet the reliability requirement. In particular, the facilities of inheritance and subtyping encourage software re-use. Furthermore, the base programming model that we have chosen embeds a notion, called multi-operation, that supports hierarchical as well as redundant design methodology for distributed applications. From the perspective of failure handling during software execution, we introduce an exception handling mechanism for the above framework. This mechanism, defined according to the object-oriented paradigm, supports both forward and backward error recovery, backward error recovery being provided through distributed actions that are atomic with respect to exceptions.
Transient Fault Tolerance in Mobile Agent Based Computing
Infocomp Journal of Computer Science, 2005
Cited by 2 (1 self)
Agent technology is emerging as a new paradigm in the areas of distributed and mobile computing. An agent is a computational entity capable of relocating its code, data, and execution state to another host. Mobile agents' code often experiences transient faults, resulting in partial or complete loss during execution at a host machine. A protocol for fault-tolerant agents prevents partial or complete loss of a mobile agent at a host. This article describes how to detect and recover from random transient bit errors in an agent after its arrival at a host and before it starts executing there, in order to maintain the agent's availability, by comparing the agent's states using time and space redundancy. In the proposed self-repair approach, a software fix for fault tolerance exists alongside the agent. This generalized scheme is useful for recovering any kind of distributed agent from hardware transient faults (at a host). This paper presents a fault-tolerance mechanism for mobile agents that attempts to detect and correct any bit errors that may occur at a host after agents' mobility on a Web Agent-based Service Providing (WASP) platform. Although in modern distributed systems the communication stack handles bit errors and error correction is used on multiple layers (for example, in the transport layer), the proposed approach is intended to supplement conventional error-detecting and error-correcting codes.
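One common way to use space redundancy against transient bit errors is to keep three copies of a state and take a bitwise majority vote. The sketch below shows only that generic technique; it is an assumption for illustration and not the protocol of the paper, and the function names are hypothetical.

```python
def copies_disagree(copy_a, copy_b, copy_c):
    """Detect whether any of the three copies differs from the others."""
    return not (copy_a == copy_b == copy_c)


def majority_vote(copy_a, copy_b, copy_c):
    """Rebuild a state from three copies by bitwise majority.

    Each output bit takes the value held by at least two copies, so a
    bit flipped in only one copy is corrected.
    """
    if not (len(copy_a) == len(copy_b) == len(copy_c)):
        raise ValueError("copies must have the same length")
    return bytes(
        (a & b) | (a & c) | (b & c)
        for a, b, c in zip(copy_a, copy_b, copy_c)
    )
```

A short usage example: if `state = b"agent-state"` and one copy has a single flipped bit, `majority_vote(state, damaged, state)` returns the original bytes, while `copies_disagree` reports that a repair was needed. The vote cannot correct a bit position that is damaged in two copies at once.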

