K. Nagaraja, R. Bianchini, R. P. Martin, and T. D. Nguyen. Using fault model enforcement to improve availability. In Proc. 2nd Workshop on Evaluating and Architecting System Dependability, San Jose, CA, 2002.
....[103] 4.1.1 Crash Only and Fault Model Enforcement A crash only system makes it affordable to coerce every detected failure into component level crash(es); this leads to a simple fault model in that components only need to know how to recover from one type of failure. Fault model enforcement [77] uses such an approach to turn unknown faults into crashes, effectively coercing reality into a well understood, simple fault model. By performing recovery based on this fault model, [77] managed to improve availability in a cluster system. Much existing literature assumes unrealistic fault models ....
....model in that components only need to know how to recover from one type of failure. Fault model enforcement [77] uses such an approach to turn unknown faults into crashes, effectively coercing reality into a well understood, simple fault model. By performing recovery based on this fault model, [77] managed to improve availability in a cluster system. Much existing literature assumes unrealistic fault models (e.g. that failures are uncorrelated and occur according to well behaved tractable distributions) for analysis of system behavior; fault model enforcement can increase the impact of ....
K. Nagaraja, R. Bianchini, R. P. Martin, and T. D. Nguyen. Using fault model enforcement to improve availability. In Proc. 2nd Workshop on Evaluating and Architecting System Dependability, San Jose, CA, 2002.
....very limited. However, the more complex the fault model, the more complex the detection and recovery code, leading to higher chances for bugs. Further, detection would likely require additional monitoring hardware, leading to higher cost as well. One idea that we have recently explored in [24, 26] is to define a limited fault model and then to enforce that fault model during operation of the server. We refer to this approach as Fault Model Enforcement (FME). As an example FME policy, in [24] we enforced the node crash model in PRESS by forcing any fault that leads to the separation of a ....
....monitoring hardware, leading to higher cost as well. One idea that we have recently explored in [24, 26] is to define a limited fault model and then to enforce that fault model during operation of the server. We refer to this approach as Fault Model Enforcement (FME). As an example FME policy, in [24] we enforced the node crash model in PRESS by forcing any fault that leads to the separation of a process node from the main group to cause the automatic reboot of that node. While this is an extreme example of FME, it does improve the availability of PRESS substantially, as well as reduces the ....
K. Nagaraja, R. Bianchini, R. Martin, and T. D. Nguyen. Using Fault Model Enforcement to Improve Availability. In Proceedings of the Second Workshop on Evaluating and Architecting System dependabilitY (EASY), Oct. 2002.
....no single definition adequately captures the system state that is necessary to provide a highly available service. As a result, these diverging fault models can lead to inconsistent recovery actions. To address this problem, we implement a novel technique called Fault Model Enforcement (FME) [25] that can be leveraged to address these discrepancies. The key idea behind FME is that it is too difficult to build a complex cluster based system that is tolerant to all possible faults. Thus, service designers should define a simplified abstract fault model that the service will tolerate. ....
....their view of the system state may overlap. Moreover, because each subsystem has slightly different definitions for fault symptoms and what it means for a component to fail, this overlapping can lead to conflicting recovery behaviors. We show that our novel Fault Model Enforcement (FME) [25] technique provides one approach to overcoming such conflicts. We start our discussion by adding a front end and extra processing capacity to PRESS to detect and hide node failures from end clients. This is a good starting point because it is a standard industrial solution. Then, we extend PRESS ....
K. Nagaraja, R. Bianchini, R. Martin, and T. D. Nguyen. Using Fault Model Enforcement to Improve Availability. In Proceedings of the Second Workshop on Evaluating and Architecting System dependability (EASY), Oct. 2002.
K. Nagaraja, R. Bianchini, R. Martin, and T. Nguyen. Using Fault Model Enforcement to Improve Availability. In 2nd Workshop on Evaluating and Architecting System Dependability (EASY), San Jose, CA, October 2002.