A Case for Two-Level Distributed Recovery Schemes (1995)  (23 citations)
Nitin H. Vaidya
Measurement and Modeling of Computer Systems



Abstract: Most distributed and multiprocessor recovery schemes proposed in the literature are designed to tolerate an arbitrary number of failures. In this paper, we demonstrate that it is often advantageous to use "two-level" recovery schemes. A two-level recovery scheme tolerates the more probable failures with low performance overhead, while the less probable failures may be tolerated with a higher overhead. By minimizing the overhead for the more frequently occurring failure scenarios, our approach is ...

Context of citations to this paper:

.... selection of such an interval is for the most part a solved problem [25, 34]. There has been important research in parallel systems [16, 33, 36], but the results are less unified. No previous work has addressed the issue of processor availability following a failure in...

Cited by:
Hazim Shafi - Ibm Research Burnet
A Large-Scale Study of Failures in High-Performance.. - Bianca Schroeder Garth
Processor Allocation and Checkpoint Interval Selection in.. - Plank, Thomason (2001)

Similar documents (at the sentence level):
19.4%:   A Case for Multi-Level Distributed Recovery Schemes - Vaidya (1994)
15.9%:   Another Two-Level Failure Recovery Scheme: Performance Impact of.. - Vaidya (1994)

Active bibliography (related documents):
0.5:   On Checkpoint Latency - Vaidya (1995)
0.4:   A Survey of Rollback-Recovery Protocols in.. - Elnozahy, Alvisi.. (1996)
0.3:   Some Thoughts on Distributed Recovery - Vaidya (1994)

Similar documents based on text:
0.2:   Recovery Schemes for High Availability and High Performance .. - Lundberg, Häggander
0.2:   Staggered Consistent Checkpointing - Vaidya (1999)
0.2:   On Staggered Checkpointing - Vaidya (1996)

Related documents from co-citation:
12:   Libckpt: Transparent checkpointing under Unix - Plank, Beck et al. - 1995
9:   A longitudinal survey of internet host reliability - Long, Muir et al. - 1995
9:   A survey of rollback-recovery protocols in message-passing systems - Elnozahy, Johnson et al. - 1996

BibTeX entry:

N. H. Vaidya, "A case for two-level distributed recovery schemes", in Proc. of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1995, pp. 64--73. http://citeseer.ist.psu.edu/vaidya95case.html

@inproceedings{ vaidya95case,
    author = "Nitin H. Vaidya",
    title = "A Case for Two-Level Distributed Recovery Schemes",
    booktitle = "Measurement and Modeling of Computer Systems",
    pages = "64-73",
    year = "1995",
    url = "citeseer.ist.psu.edu/vaidya95case.html" }
Citations (may not include all citations):
572   Distributed snapshots: Determining global states in distribu.. - Chandy, Lamport - 1985
109   Sender-based message logging - Johnson, Zwaenepoel - 1987
58   Nonblocking and orphan-free message logging protocols - Alvisi, Hoppe et al. - 1993
31   Queueing and Computer Science Applications - Trivedi, Statistics - 1988
27   Efficient Checkpointing on MIMD Architectures - Plank - 1993
22   A first order approximation to the optimum checkpoint interv.. - Young - 1974
21   Computer Organization & Design: The Hardware/Software Interf.. - Patterson, Hennessy - 1994
20   Roll-forward checkpointing scheme: A novel fault-tolerant ar.. - Pradhan, Vaidya - 1994
18   Analytic models for rollback and recovery strategies in data.. - Chandy, Browne et al. - 1975
15   Fail-safe PVM: A portable package for distributed programmin.. - León, Fisher et al. - 1993
15   Performance analysis of checkpointing strategies - Tantawi, Ruschitzka - 1984
11   Comparative analysis of different models of checkpointing an.. - Nicola, van Spanje - 1990
10   Performance of rollback recovery systems under intermittent .. - Gelenbe, Derochette - 1978
6   Optimal message logging protocols - Alvisi, Marzullo - 1994
5   Another two-level failure recovery scheme: Performance impac.. - Vaidya - 1994
4   Analysis of checkpointing schemes for multiprocessor systems - Ziv, Bruck - 1993
3   A case for multi-level distributed recovery schemes - Vaidya - 1994
3   Analysis of an improved distributed checkpointing algorithm - Garg, Wong - 1993
3   Efficient checkpointing over local area network - Ziv, Bruck - 1994
2   A model for roll-back recovery with multiple checkpoints - Gelenbe - 1976



Documents on the same site (http://disys.korea.ac.kr/~kibom/Recovery/ftpaper.html):
Rapid Prototyping of Parallel Fault Tolerant Systems - Nixon, Birkinshaw, Croll.. (1994)
Recovery in Multicomputers with Finite Error Detection.. - Krishna, Vaidya, Pradhan (1994)
Distributed Recovery Units: An Approach for Hybrid and.. - Nitin Vaidya (1993)
