| Chillarege R., Bowen N. S., "Understanding Large System Failures - A Fault Injection Experiment", Proc. FTCS-19, pp. 356-363, 1989. |
.... of the fault tolerance mechanisms, the fault tolerance coverage, that is usually defined as the probability of system recovery given that a fault exists [Bouricius et al. 1969]. Fault tolerance mechanisms are commonly assessed by carrying out fault injection experiments [Segall et al. 1988, Chillarege and Bowen 1989, Gunneflo et al. 1989, Walter 1990, Choi et al. 1991, Arlat et al. 1993, Kanawati et al. 1995]. A single fault injection experiment consists of injecting a fault condition into a simulation or a prototype of a fault tolerant system and observing the behavior of the system to determine whether or ....
R. Chillarege and N. S. Bowen, "Understanding Large System Failures --- A Fault Injection Experiment", in Proc. 19th Int. Symp. on Fault-Tolerant Computing (FTCS-19), (Chicago, IL, USA), pp. 356-363, IEEE Computer Society Press, 1989.
....of using software fault injection to improve systematically the robustness of a large software system through several iterations. Software implemented fault injection [34] is used commonly to compare the robustness of different systems [61], [69], [42], understand how systems behave during a fault [16], [10], [39], and validate fault tolerant mechanisms [4], [31], [57], [13]. However, there are very few case studies that use fault injection for all three of these purposes to guide the design and implementation of a fault-tolerant system. In our study, we use fault injection for all three of ....
....mechanisms, and comparing the robustness of different systems. See [34] for an excellent introduction to the overall area and a summary of much of the past work on fault injection. Fault injection is traditionally used to understand how systems behave during a fault. Chillarege and Bowen [16] use fault injection to characterize large system failures. They inject software bugs on a commercial transaction processing system, and analyze the crash data to measure system component failure rates and fault latency. Barton et al. [10] use the FIAT fault injection tool to inject memory bit ....
R. Chillarege and N.S. Bowen, "Understanding Large System Failure--A Fault Injection Experiment," Proc. 1989.
.... benchmark tests for this purpose (e.g. see [1 5]), usually based on some form of fault injection testing focused on single computers [6, 11]. In spite of these efforts, and even considering that fault injection techniques are commonly used by developers to assess and tune their designs (e.g. see [7 10]), nothing has emerged which has gained even modest adoption in the industry for making comparisons among systems. Researchers acknowledge that: emulated faults will not represent the variety and scope of actual field faults [6, 14]; fault injection cannot predict actual availability or MTBF ....
....for making comparisons among systems. Researchers acknowledge that: emulated faults will not represent the variety and scope of actual field faults [6, 14]; fault injection cannot predict actual availability or MTBF [13]; comparison of dissimilar architectures is extremely problematic [7, 13, 14]. The authors, working as members of the IFIP WG 10.4 Dependability Benchmarking SIG (SIGDeB) [15], are proposing a different method for making dependability comparisons. This method is to create a standardized classification system that could rate systems in each of the dimensions that affect ....
R. Chillarege and N. S. Bowen, "Understanding Large System Failures -- A Fault Injection Experiment", in Proc. 19th Int. Symp. on Fault-Tolerant Computing (FTCS-19), Chicago, IL, USA, 1989, pp. 356-363 (IEEE CS Press).
....experiments. However, as the complexity of contemporary computer systems increases as a result of using highly integrated VLSI chips, it is becoming more difficult, or nearly impossible, to evaluate dependability with HFIs alone. On the other hand, software-implemented fault injectors (SFIs) [10, 11, 12, 13, 14] have been proposed as less expensive and more controllable alternatives. Although SFI techniques such as overwriting memory or register contents are becoming popular, they still face many difficulties, such as limited accessibility to hardware, perturbation to workloads, and poor time resolution ....
R. Chillarege and N. S. Bowen, "Understanding large system failures --- a fault injection experiment," in Proc. IEEE FTCS, pp. 356--363, June 1989.
.... The techniques of fault injection that we use are also not uncommon in the fault tolerance community, where fault injection is commonly used in a case specific manner to verify fault tolerant systems, to generate models of fault tolerance behavior, and to study fault propagation [4], [11], [12], [14], [27], [35]. However, most of this work uses either very low level hardware fault injection that requires expensive and dangerous equipment (such as heavy ion bombarders) [11] or software implemented fault injection. The former is not tractable for general use because of the cost and ....
R. Chillarege and N. Bowen. Understanding Large System Failure---A Fault Injection Experiment. In Proceedings of the 1989.
....by the user at the application level and the physical location within the memory image was obtained from compiler and loader information. Although this work provided valuable results, it was not able to inject transient faults. The concept of failure acceleration was introduced by Chillarege in [10], where faults were injected by modifying memory contents under software control. Another tool named DOCTOR [11] is capable of injecting processor, memory and communication faults on a distributed real time system called HARTS. Processor faults are injected by modifying the application's executable ....
R. Chillarege and N. Bowen, "Understanding Large Systems Failures - A Fault Injection Experiment", in Proc. of 19th FTCS, pp. 356-363, Chicago, June 1989.
....was required to correct errors in them. Thus there was a tradeoff between the cost of initial development time and the cost of adapting modules to a new specification. A related area are fault injection studies [2, 14] which dynamically insert bugs into the system to see how it crashes or survives [6]. These focus mostly on robustness in the face of artificial errors, whereas we are interested more in the features of actual errors. Another approach is explicit testing, such as the fuzz studies that compare how a set of systems utilities behaved in the face of random inputs [17, 18]. In ....
R. Chillarege and N. Bowen. Understanding Large System Failures - A Fault Injection Experiment. In The 19th International Symposium on Fault Tolerant Computing, June 1989.
....required to correct errors in them. Thus there was a tradeoff between the cost of initial development time and the cost of adapting modules to a new specification. A related area are fault injection studies [2, 14] which dynamically insert bugs into the system to see how it crashes or survives [6]. These focus mostly on robustness in the face of artificial errors, whereas we are interested more in the features of actual errors. Another approach is explicit testing, such as the fuzz studies that compare how a set of systems utilities behaved in the face of random inputs [17, 18]. In ....
R. Chillarege and N. Bowen. Understanding Large System Failures - A Fault Injection Experiment. In The 19th International Symposium on Fault Tolerant Computing, June 1989.
....required to correct errors in them. Thus there was a tradeoff between the cost of initial development time and the cost of adapting modules to a new specification. A related area are fault injection studies [2, 14] which dynamically insert bugs into the system to see how it crashes or survives [6]. These focus mostly on robustness in the face of artificial errors, whereas we are interested more in the features of actual errors. Another approach is explicit testing, such as the fuzz studies that compare how a set of systems utilities behaved in the face of random inputs [17, 18]. In ....
R. Chillarege and N. Bowen. Understanding Large System Failures - A Fault Injection Experiment. In The 19th International Symposium on Fault Tolerant Computing, June 1989.
.... injections have mostly been conducted to validate error detection and error recovery mechanisms of a system, where the main aim was to quantify the effectiveness (coverage, latency etc) of the different proposed mechanisms [1]. A comprehensive discussion on Fault Injection (FI) appears in [9]. In [3], FI experiments were conducted to characterize large system failures, whereby concepts such as potential hazards and failure acceleration were introduced. In [6], FI experiments were conducted to obtain pertinent information on possible locations for EDMs and ERMs. Our approach differs from that ....
R. Chillarege, N. Bowen, "Understanding Large System Failures - A Fault Injection Experiment," Proc FTCS-19, pp. 356-363, 1989.
....contribute to the estimation of the coverage and efficacy of fault tolerance and complement the analytical techniques previously described. Independently of the abstraction level applied, fault injection campaigns contribute to the Fault Forecasting by studying either error propagation [Chillarege and Bowen 1989; Choi and Iyer 1992; Steininger and Schweinzer 1995] or error latency [Chillarege and Iyer 1987; Geist et al. 1990] or the coverage of fault tolerance mechanisms [Powell et al. 1993; Walter 1990]. Many techniques and methods have been developed and integrated in tools specific for fault ....
R. Chillarege and N.S. Bowen. "Understanding large system failures - a fault injection experiment," in 19th International Symposium on Fault-Tolerant Computing (FTCS-19), pp. 356-363, Chicago, IL, USA, IEEE Computer Society Press, 1989.
.... Approach Our method for experimentally estimating the error permeability values of software modules is based on fault injection (FI). FI artificially introduces faults and/or errors into a system and has been used for evaluation and assessment of dependability for several years, see for example [Chillarege89], [Arlat90] and [Fabre99]. A comprehensive survey of experimental analysis of dependability appears in [Iyer96]. For analysis of the raw experimental data, we make use of so called Golden Run Comparisons (GRC). A Golden Run is a trace of the system executing without any injections being made, ....
Chillarege R., Bowen N. S., "Understanding Large System Failures -- A Fault Injection Experiment", Proc. 19th Int. Symp. on Fault-Tolerant Comp., pp. 356-363, 1989.
....can thus lead to significant economic losses or even loss of human lives. Obviously, such a system must undergo a rigorous dependability validation and verification. Fault injection is an attractive approach to the experimental dependability validation of fault tolerant systems (see e.g. [1], [2]) as it provides the means for a detailed study of the complex interaction between faults, errors, failures and fault handling mechanisms (Figure 1.1). Dependability validation of fault tolerant systems by fault injection addresses fault removal and fault forecasting [3]. In the case of ....
R. Chillarege, N. S. Bowen, "Understanding Large System Failures - A Fault Injection Experiment", Proc. 19th Int. Symp. On Fault Tolerant Computing, pp. 356-363, June, 1989.
....Although a fault has been triggered, the tester might see the manifestation of the fault after control has passed through several modules or components. It may take days before a fault manifests itself. A set of faults is required for the purpose of injection. Having a generic set of faults [1, 3, 4, 10, 11, 12] for a class of system helps in automating the process of fault injection. The faults can be provided in a fault injection tool and can be selected by the tester for injection. Fault injection has mostly been performed only on selected systems because of a lack of tools that can perform ....
R. Chillarege and N. Bowen. "Understanding Large System Failures --- a Fault Injection Experiment". In Proceedings 19th International Symposium Fault-Tolerant Computing, pages 356--363, 1989.
....capabilities of systems. Hardware fault injection [1, 13, 31] and simulation approaches for injecting hardware failures [7, 10, 14] have received much attention in the past. Recent efforts have focused on software fault injection by inserting faults into system memory to emulate errors [6, 30]. Others have emulated fault injection into CPU components [21], typically by setting voltages on pins or wires. However, fault injection and testing dependability of distributed systems has received very little attention until recently [3, 11, 12, 16]. Most of the recent work in this area have ....
R. Chillarege and N. S. Bowen. Understanding large system failures --- a fault injection experiment. In Proc. Int'l Symp. on Fault-Tolerant Computing, pages 356--363, June 1989.
.... heavily relies on the efficiency of the fault tolerance mechanisms, the fault tolerance coverage, that is usually defined as the probability of system recovery given that a fault exists [4]. Fault injection experiments are commonly carried out to assess the fault tolerance mechanisms [3, 5, 6, 8, 12, 15, 17]. A single fault injection experiment consists of injecting a fault condition into a simulation or a prototype of a fault tolerant system and observing the behavior of the system to determine whether or not the injected fault has been properly handled by the system's fault tolerance mechanisms. In ....
R. Chillarege and N. S. Bowen, "Understanding Large System Failures --- A Fault Injection Experiment", in Proc. 19th Int. Symp. on Fault-Tolerant Computing (FTCS-19), (Chicago, IL, USA), pp. 356-363, IEEE Computer Society Press, June 1989.
No context found.
Chillarege R., Bowen N. S., "Understanding Large System Failures - A Fault Injection Experiment", Proc. FTCS-19, pp. 356-363, 1989.
No context found.
R. Chillarege, N. Bowen, "Understanding Large System Failures -- A Fault Injection Experiment", Proc. FTCS 19, pp. 356--363, 1989
No context found.
R. Chillarege and N. S. Bowen. Understanding large system failures - a fault injection experiment. In Proceedings of the International Symposium on Fault Tolerant Computing, June 1989.
No context found.
R. Chillarege and N.S. Bowen, "Understanding Large System Failures---A Fault Injection Experiment," Proc. 19th Int'l Symp. Fault-Tolerant Computing (FTCS-19), pp. 356-363, 1989.
No context found.
R. Chillarege and N. Bowen. Understanding Large System Failure---A Fault Injection Experiment. In Proceedings of the 1989.
No context found.
Chillarege R., Bowen N. S., "Understanding Large System Failures - A Fault Injection Experiment", Proc. FTCS-19, pp. 356-363, 1989.
No context found.
R. Chillarege and N. S. Bowen, "Understanding large system failures --- a fault injection experiment," in Proc. IEEE FTCS, pp. 356--363, June 1989.
No context found.
Chillarege, R. and Bowen, N. S., "Understanding large system failures - A fault injection experiment," in Proceedings of the 19th International Symposium on Fault-Tolerant Computing, pp. 356--363, IEEE, June 1989.
No context found.
R. Chillarege and N. S. Bowen, "Understanding Large System Failures --- A Fault Injection Experiment", Proc. 19th Int. Symp. on Fault-Tolerant Computing (FTCS-19), pp. 356-363, Chicago, IL, USA, IEEE Computer Society Press, 1989.