D.J. Taylor, D.E. Morgan, and J.P. Black, "Redundancy in Data Structures: Improving Software Fault Tolerance," IEEE Trans. on Software Engineering, vol. 6, no. 6, 585-594, November 1980.
....defensive checks, data audits, process activity and resource checks, and modular, hierarchical error recovery. A related approach to data audit is robust data structures, which aims at detecting and correcting errors in stored data structures that contain carefully deployed redundancy. Taylor [Blac80, Tay80, Tay85] presents examples of robust storage structures and their practical implementation. Commercial off-the-shelf database systems from Oracle [Oracle] and Sybase [Sybase], and the in-memory database from TimesTen [TimesTen], include utilities to perform consistency checks of database integrity. For ....
D.J. Taylor, D.E. Morgan, and J.P. Black, "Redundancy in Data Structures: Improving Software Fault Tolerance," IEEE Trans. on Software Engineering, vol. 6, no. 6, 585-594, November 1980.
....discuss work related to the detection and correction of such errors. This work is directly comparable to our new methods of detecting corruption in database management systems based on parity words. Some of these techniques, especially those which fall under the heading robust data structures [78, 79], relate to the issue of recoverable mutual exclusion, in that they represent general techniques by which data structures that are guarded by a recoverable spin lock may be recovered to a consistent state after halting failures, as well as certain other failures. We note that commercial databases ....
....it is recommended that a few words be reserved between each page's image in the buffer manager as a defense wall to detect writes that erroneously extend beyond a page's boundary. Other techniques suggested by Kuspert are based on adding redundancy to data structures (similar to proposals in [78] discussed below):
- Marking free data with special bit patterns.
- Storing redundant information about lists to verify length information.
- Checks for pointer consistency based on a specific B tree implementation.
- Adding sequence numbers to chained hash table overflow buckets to ....
[Article contains additional citation context not shown here]
D. Taylor, D. Morgan, and J. Black. Redundancy in data structures: Improving software fault tolerance. IEEE Transactions on Software Engineering, 6(6):585-- 594, November 1980.
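The "defense wall" idea quoted in the context above (a few reserved words between page images, checked to catch writes that run past a page boundary) can be illustrated with a minimal sketch. This is not code from any of the cited papers; the guard pattern, page size, and function names are all hypothetical choices made for illustration.

```python
# Hypothetical guard pattern and page size, chosen only for this sketch.
GUARD = b"\xde\xad\xbe\xef"
PAGE_SIZE = 16


def make_buffer(num_pages):
    """Lay out the buffer as: page, guard, page, guard, ..."""
    buf = bytearray()
    for _ in range(num_pages):
        buf += bytes(PAGE_SIZE) + GUARD
    return buf


def page_offset(i):
    """Byte offset of page i, accounting for the guard after each page."""
    return i * (PAGE_SIZE + len(GUARD))


def corrupted_guards(buf, num_pages):
    """Return indices of pages whose trailing guard word was overwritten."""
    bad = []
    for i in range(num_pages):
        start = page_offset(i) + PAGE_SIZE
        if bytes(buf[start:start + len(GUARD)]) != GUARD:
            bad.append(i)
    return bad


buf = make_buffer(3)
# Simulate a bad write that runs two bytes past the end of page 1.
off = page_offset(1)
buf[off:off + PAGE_SIZE + 2] = b"x" * (PAGE_SIZE + 2)
damaged = corrupted_guards(buf, 3)
```

A check of this kind only detects overruns that actually touch a guard word; a stray write that lands entirely inside another page goes unnoticed.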
....extend beyond a page's boundary. These techniques serve to detect certain bad writes as well as imposing certain integrity constraints to catch DBMS programming errors. Our techniques address the first area (protection against bad writes) but are far more general. Taylor, Morgan and Black [TMB80a, TMB80b] provide some theoretical structure to the design of data structures that can recover from certain failures. However, this work is not in the context of a DBMS, and does not apply to corruption of general application data, since it handles only components such as pointers and counts. ....
D. Taylor, D. Morgan, and J. Black. Redundancy in data structures: Improving software fault tolerance. IEEE Transactions on Software Engineering, 6(6):585-- 594, November 1980.
....consistency [Frison Wensley 1982]. The knowledge of certain properties of the system may allow the required redundancy to be limited. Classical examples are given by regularities of a structural nature: error detecting and correcting codes [Peterson Weldon 1972], robust data structures [Taylor et al. 1980], multiprocessors and networks [Pradhan 1986, Rennels 1986], algorithm based fault tolerance [Huang Abraham 1982]. The faults that can be tolerated are then dependent upon the properties considered since these properties are directly involved in the fault assumptions made during the design. ....
D. J. Taylor, D. E. Morgan and J. P. Black, "Redundancy in Data Structures: Improving Software Fault Tolerance", IEEE Transactions on Software Engineering, SE-6 (6), pp.383-94, 1980.
....large number of error scenarios and do not require a traditional reboot. Instead they can lead to application recovery if it exists (1) kernel-assisted retry (2 4) or finally to containment by execution termination if other recovery fails. Armoring data structures (such as that by Taylor et al. [19]) aims to make critical data structures more resilient to asynchronous notification of errors and to errors in the data structures themselves. This support aims to enable roll back for re-execution once errors are contained and/or replication for data corruption recovery. Armoring critical data ....
Taylor, D., et al., "Redundancy in Data Structures: Improving Software Fault Tolerance", IEEE T-SE, vol 6, 1980, pp 595-602.
....of application code with direct access to the database buffers. Kuspert [10] presents a number of specific techniques for detecting corruption of DBMS data structures. These techniques are ad hoc in the sense that they are designed for specific DBMS storage structures. Taylor, Morgan and Black [22] provide some theoretical structure to the design of data structures that can recover from certain failures. However, this work is not in the context of a DBMS, and does not apply to corruption of general application data, since it handles only components such as pointers and counts. Finally, we ....
D. Taylor, D. Morgan, and J. Black. Redundancy in data structures: Improving software fault tolerance. IEEE Transactions on Software Engineering, 6(6):585--594, Nov. 1980.
....to an error. 1 Introduction A richly diverse and still growing collection of techniques have been developed to perform software based fault tolerance including: recovery blocks [18], N version programming [3], program checkers [4, 5], algorithm based methods [12, 16], robust data structures [13, 14, 24, 25, 26], certification trails [20, 21, 22, 23, 27], and other methods [1]. In this paper, we consider the problem of detecting errors in the answers given in response to data structure queries. For many programs a substantial fraction of the intricate error-prone code resides in the routines associated ....
....test can be placed at the end of a recovery block so that if an error is discovered an alternate method for computing the answer can be attempted. The programs discussed in this paper can function as acceptance tests. Finally, we mention the notion of a repairable or robust data structure [13, 14, 24, 25, 26]. In this technique, a data structure is augmented with additional information, such as redundant values and links. The idea is that if the data structure should become corrupted, then by using the redundant information it may be possible to restore the data structure to its correct form. Some of ....
Taylor, D. J., Morgan, D. E., Black, J. P., "Redundancy in Data Structures: Improving Software Fault Tolerance," Trans. Soft. Eng., 585-594, v. 6, 1980.
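The context above describes augmenting a data structure with redundant values and links so that corruption can be detected and, where possible, undone. A minimal sketch of that idea follows, assuming a circular doubly linked list with a header node and a stored element count. The class and method names are hypothetical, and the repair routine assumes the back pointers are intact; it is an illustration of the general technique, not the construction from the cited paper.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.prev = None


class RobustList:
    """Circular doubly linked list with a header node and a stored count.

    The back pointers and the count are the redundant information.
    """

    def __init__(self):
        self.head = Node(None)
        self.head.next = self.head
        self.head.prev = self.head
        self.count = 0

    def append(self, value):
        node = Node(value)
        last = self.head.prev
        node.prev, node.next = last, self.head
        last.next = node
        self.head.prev = node
        self.count += 1
        return node

    def check(self):
        """True if each forward link is mirrored by a back link and the
        number of nodes reached matches the stored count."""
        seen = 0
        node = self.head
        while True:
            nxt = node.next
            if nxt is None or nxt.prev is not node:
                return False
            node = nxt
            if node is self.head:
                break
            seen += 1
            if seen > self.count:  # guards against an endless walk
                return False
        return seen == self.count

    def repair_forward(self):
        """Rebuild every forward link by walking the back pointers.
        Assumes the back pointers and the count are undamaged."""
        node = self.head
        for _ in range(self.count + 1):
            node.prev.next = node
            node = node.prev

    def values(self):
        out = []
        node = self.head.next
        while node is not self.head:
            out.append(node.value)
            node = node.next
        return out


lst = RobustList()
a, b, c = lst.append(1), lst.append(2), lst.append(3)
a.next = c                    # simulate a corrupted forward pointer
detected = not lst.check()    # the back-link mismatch exposes the damage
lst.repair_forward()
```

After the repair, `lst.check()` succeeds again and `lst.values()` returns the original three elements. If a back pointer or the count were damaged instead, this particular repair routine would not be the right one to apply.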
....is equivalent to a failure of the processor the backup controller and processor become active. The backup controller aborts current internal STM operations and restores a consistent object state. In order to do this, all algorithms executed by a controller manipulate robust data structures [Taylor et al. 80]. The memory banks are made up of a set of memory chips organized by columns (i.e. each chip is referenced by different address bits). In order to tolerate soft (non-permanent) errors when reading memory, an Error Correcting Code (ECC) mechanism is used to correct a false bit. Soft errors are ....
D.J. Taylor, D.E. Morgan, & J.P. Black. Redundancy in data structures: Improving software fault tolerance. IEEE Transactions on Software Engineering, SE-6(6):585--594, November 1980.
....Applications Various features that could be included in a safety kernel have been built into almost every safety critical system utilizing software. Common techniques include watchdog timers, input and output assertions [32], sequencing checkers [44], fault tolerant data structures [48], software isolation [1], and software self checking [20]. These techniques have been incorporated largely in an ad hoc fashion. Some of the systems that are presently the state of the art in this area are described below. The control of an electric generation turbine is a safety critical task in that ....
....information in the data to facilitate error detection. Operation During operation, the primary data integrity concern is that some entity in the computer system will be able to access and alter memory that is critical to policy enforcement. Although some sort of fault tolerant data structures [48] might be effective for detecting corruption of this type, there is no means for detecting corruption of all of the data because much of this data is never accessed directly by the safety kernel (e.g. process state data). A primary means of ensuring data integrity is to employ memory protection ....
Taylor, D. J., D. E. Morgan, and J. P. Black, "Redundancy in Data Structures: Improving Software Fault Tolerance," IEEE Transactions on Software Engineering Vol. SE-6 (Nov. 1980) pp. 585-594.
.... blocks [13], program checkers [5][4], algorithm based fault tolerance methods [7], and also certification trails [15][16][17]. It is also possible to detect and/or correct faults in the memory directly, using error detecting/correcting codes [6][14][21], fault tolerant data structures [8][9][18][19][20], and limited memory checkers [1][3]. We will discuss these last two techniques in the next section. At the hardware level, one can use triple modular redundancy [2] and also watchdog processors [12] to try to detect memory faults. In this paper, we discuss an approach based on the ....
Taylor, D. J., Morgan, D. E., Black, J. P., "Redundancy in Data Structures: Improving Software Fault Tolerance," IEEE Trans. Soft. Eng., pp. 585-594, vol. 6, Nov. 1980.
....Because only one team of programmers is required, a process pair is considerably cheaper than an N version program. Auragen [13] used a similar scheme. Another spatial temporal redundancy hybrid method uses redundant data in the same address space to reconstruct data structures damaged by errors [76]. When an error is detected during an operation on the data structure, the structure is rebuilt using the redundant data and the operation is retried. A system can only tolerate software errors if these errors are detected in the first place. The most common approach ....
D. Taylor, D. Morgan, and J. Black. Redundancy in data structures: Improving software fault tolerance. IEEE Transactions on Software Engineering, SE-6, May 1980.