
Algorithm-Based Diskless Checkpointing for Fault Tolerant Matrix Operations (1995)  (13 citations)
James S. Plank, Youngbae Kim, Jack J. Dongarra




Abstract: This paper is an exploration of diskless checkpointing for distributed scientific computations. With the widespread use of the "Network Of Workstation" (NOW) platform for distributed computing, long-running scientific computations need to tolerate the changing and often faulty nature of NOW environments. We present high-performance implementations of several algorithms for distributed scientific computing, including Cholesky factorization, LU factorization, QR factorization, and preconditioned ...

Context of citations to this paper:

.... tasks to other machines) stable storage can be implemented by disks or by replicated copies in the memories of multiple machines [17, 19]. Some systems perform checkpointing transparently to the application, often on top of PVM [6, 5, 9, 13] or MPI [18]. Other systems...

.... tasks to other machines) stable storage can be implemented by disks or by replicated copies in the memories of multiple machines [11]. Some systems perform checkpointing transparently to the application, often on top of PVM [4] or MPI [12]. Other systems rely on...

Cited by:
Fault-tolerant Parallel Applications with Dynamic Parallel.. - Gerlach, Hersch (2005)
Compiler Assisted Generation of Error Detecting Parallel.. - Roy-Chowdhury, Banerjee
System Checkpointing using Reflection and Program Analysis - Whaley (2001)

Similar documents (at the sentence level):
50.0%:   Algorithm-Based Diskless Checkpointing for Fault Tolerant.. - Plank, Kim, Dongarra (1995)
25.1%:   Fault Tolerant Matrix Operations for Networks of.. - Plank, Kim, Dongarra (1997)

Active bibliography (related documents):
0.5:   Generic Programming for High Performance Numerical Linear.. - Siek, Lumsdaine, Lee (1998)
0.5:   Interposed Request Routing for Scalable Network Storage - Anderson, Chase, Vahdat (2000)
0.5:   Tuning Strassen's Matrix Multiplication for Memory.. - Thottethodi, Chatterjee.. (1998)

Similar documents based on text:
0.2:   Diskless Checkpointing - Plank, Li, Puening (1997)
0.2:   Fault Tolerant Matrix Operations for Networks of.. - Kim, Plank, Dongarra (1997)
0.2:   Diskless Nodes HOW-TO document for Linux - Nemkin, Vasudevan

Related documents from co-citation:
5:   Fail-safe PVM: A portable package for distributed programming with transparent r.. - Leon, Fisher et al. - 1993
3:   Transparent fault-tolerance in parallel Orca programs - Kaashoek - 1992
3:   Fault tolerant matrix operations for networks of workstations using multiple che.. - Kim, Plank et al. - 1997

BibTeX entry:

James S. Plank, Youngbae Kim, and Jack J. Dongarra. Algorithm-Based Diskless Checkpointing for Fault Tolerant Matrix Operations. In FTCS-25, Pasadena, CA, June 1995. http://citeseer.ist.psu.edu/plank95algorithmbased.html

@inproceedings{ plank95algorithmbased,
    author = "James S. Plank and Youngbae Kim and Jack J. Dongarra",
    title = "Algorithm-Based Diskless Checkpointing for Fault-Tolerant Matrix Operations",
    pages = "351--360",
    year = "1995",
    url = "citeseer.ist.psu.edu/plank95algorithmbased.html" }





Documents on the same site (http://hpc220-4.etl.go.jp/mirrors/netlib/utk/people/JackDongarra/papers.html):
Determining the Idle Time of a Tiling: New Results - Desprez, Dongarra, Rastello, .. (1997)
The Performance of PVM on MPP Systems - Casanova, Dongarra, Jiang
Network-Enabled Solvers and the NetSolve Project - Casanova, Dongarra, Moore (1998)
