Abstract: Recently, an algorithm-based approach using diskless
checkpointing has been developed to provide fault tolerance
for high-performance matrix operations. With this approach,
since fault tolerance is incorporated into the matrix
operations, the matrix operations become resilient to any
single processor failure or change with low overhead. In
this paper, we present a technique called multiple checkpointing
to enable the matrix operations to tolerate a certain
set of multiple processor failures by ...
Context of citations to this paper:
.... or load balancing (e.g. [27,35]) or modified algorithms for performing certain specific computations in a fault tolerant manner (e.g. [7,20,30]). While the effectiveness of these techniques has been demonstrated experimentally, none of them have made a large impact on the...
.... dependent strategies for incorporating fault tolerance have already received attention in the scientific computing community; see, e.g. [21]. These approaches rely primarily on the use of diskless checkpointing, a significant improvement over traditional approaches. The nature...
Cited by:
A Diskless Checkpointing Algorithm for Super-scale.. - Engelmann, Geist (2003)
Asynchronous Parallel Pattern Search For Nonlinear.. - Hough, Kolda, Torczon (2000)
Design, Implementations and Robustness in Parallel.. - Roucairol, Cung, Yahfoufi (2000)
Similar documents (at the sentence level):
68.5%: Fault Tolerant Matrix Operations for Networks of.. - Kim, Plank, Dongarra (1997)
30.7%: Fault Tolerant Matrix Operations Using Checksum and Reverse.. - Kim (1996)
18.8%: Fault Tolerant Matrix Operations for Parallel and Distributed.. - Kim (1996)
Similar documents based on text:
0.4: Algorithm-Based Diskless Checkpointing for Fault Tolerant.. - Plank, Kim, Dongarra (1995)
0.3: Survivability of Multiple Fiber Duct Failures - Schupke, Autenrieth, Fischer (2001)
0.2: Diskless Checkpointing - Plank, Li, Puening (1997)
Related documents from co-citation:
5: MIST: PVM with Transparent Migration and Checkpointing - Casas, Clark et al. - 1995
5: Checkpointing SPMD applications on transputer networks - Silva, Veer et al. - 1994
4: Consistent Checkpoints of PVM Applications - Stellner - 1994
BibTeX entry:
Y. Kim, J. S. Plank, and J. J. Dongarra. Fault tolerant matrix operations for networks of workstations using multiple checkpointing. In High Performance Computing on the Information Superhighway, HPC Asia '97, pages 460--465, Seoul, Korea, April 1997. http://citeseer.ist.psu.edu/article/kim97fault.html
@inproceedings{kim97fault,
  author = "Y. Kim and J. S. Plank and J. J. Dongarra",
  title = "Fault tolerant matrix operations for networks of workstations using multiple checkpointing",
  booktitle = "High Performance Computing on the Information Superhighway, HPC Asia '97",
  pages = "460--465",
  address = "Seoul, Korea",
  month = apr,
  year = "1997",
  url = "http://citeseer.ist.psu.edu/article/kim97fault.html" }
Citations (may not include all citations):
2: DOME: Parallel programming - Arabe, Beguelin et al.
Documents on the same site (http://netlib2.cs.utk.edu/utk/people/JackDongarra/papers.html):
Message-Passing Performance of Various Computers - Dongarra, Dunigan (1995)
High-Performance Computing in Industry - Strohmaier, Dongarra
Determining the Idle Time of a Tiling: New Results - Desprez, Dongarra, Rastello, .. (1997)