Hazim Shafi
IBM Research, 11501 Burnet Road, M/S 9460, Austin, TX 78758



View or download:
colorado.edu/jkbweb/papers...Raptor.pdf
From: colorado.edu/jkbweb/Papers

Abstract: This paper presents Raptor, an SDSM cluster management system based on checkpoint/recovery and thread migration. Raptor decouples the runtime system and application data from application threads, allowing efficient load balancing, resource allocation, and rollback recovery. The system has two important features. First, it reduces checkpoint overhead by saving only application-specific data that cannot be recreated at recovery time. Second, by integrating thread migration capability both ...

Active bibliography (related documents):
1.1:   Efficient User-Level Thread Migration and Checkpointing.. - Hazim Abdel-Shafi Evan
0.7:   Compile/Run-time Support for Thread Migration - Jiang, Chaudhary (2002)
0.5:   Process Migration based on Gobelins Distributed Shared .. - Vallee, Morin.. (2002)

BibTeX entry:

@misc{ burnet-hazim,
  author = "Hazim Shafi",
  url = "citeseer.ist.psu.edu/756237.html" }

Citations (may not include all citations):
587   PVM: A Framework for Parallel Distributed Computing - Sunderam - 1990
566   Condor: A Hunter of Idle Workstations - Litzkow, Livny et al. - 1988
496   SPLASH: Stanford Parallel Applications for Shared-Memory - Singh, Weber et al. - 1991
406   TreadMarks: Distributed Shared Memory on Standard Workstatio.. - Keleher, Dwarkadas et al. - 1994
361   Reliable Communication in the Presence of Failures - Birman, Joseph - 1987
305   The NAS Parallel Benchmarks - Bailey, Barton et al. - 1991
230   Cilk: An Efficient Multithreaded Runtime System - Blumofe, Joerg et al. - 1995
117   Libckpt: Transparent Checkpointing under Unix - Plank, Beck et al. - 1995
57   Lazy Release Consistency for Distributed Shared Memory - Keleher - 1995
43   Thread Migration and its Applications in Distributed Shared .. - Itzkovitz, Schuster et al. - 1998
38   Detours: Binary Interception of Win32 Functions - Hunt, Brubacher - 1999
37   Ariadne: Architecture of a Portable Threads System Supportin.. - Mascarenhas, Rego - 1996
35   A Recoverable Distributed Shared Memory: Integrating Coheren.. - Kermarrec, Cabillic et al. - 1995
30   Application Program Interface - Review, OpenMP - 2002
28   Lightweight Logging for Lazy Release Consistent Distributed .. - Costa, Guedes et al. - 1996
26   The Performance of Consistent Checkpointing in Distributed S.. - Cabillic, Muller et al. - 1995
23   A Case for Two-Level Distributed Recovery Schemes - Vaidya - 1995
20   MYOAN: An Implementation of the KOAN Shared Virtual Memory o.. - Cabillic, Priol et al. - 1994
16   Transparent Adaptive Parallelism on NOWs using OpenMP - Scherer, Lu et al. - 1999
14   Advanced Windows - Richter - 1997
8   NT-SwiFT: Software.. - Huang, Chung - 1998
6   Efficient Fine-Grain Thread Migration with Active Threads - Weissman, Gomes et al. - 1998
6   A Performance Study of Sequential I/O on Windows NT - Riedel, van Ingen et al. - 1998
4   A Transparent Checkpoint Facility on NT - Srouji, Schuster et al. - 1998
3   Implementation and Evaluation of ICARE: An Efficient Recover.. - Kermarrec, Morin et al. - 1998
2   Thread Migration and Communication Minimization in DSM Syste.. - Thitikamol, Keleher - 1999
2   Thread Migration in Distributed Memory Multicomputers - Milton - 1998

Documents on the same site (http://ecadw.colorado.edu/jkbweb/Papers.html):
The Performance Value of Shared Network Caches in Clustered.. - John Bennett (1995)
Fault Tolerant Algorithms and Architectures for Robotics - Hamilton, Bennett.. (1994)
Willow: A Scalable Shared-Memory Multiprocessor - Bennett, Dwarkadas.. (1992)
