| Z. M. Kedem and K. V. Palem. Transformations for the automatic derivation of resilient parallel programs. In IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, pages 15--25, 1992. |
....addresses these issues. The research leading to Charlotte started as theoretical work where provable methods for executing parallel computations on abstract asynchronous processors were developed [40, 38, 2]. The outline of the virtual machine interface to the actual system was proposed in [37]. Theoretical results were then interpreted in the context of networks of workstations in [20]. The above were significantly extended and validated in the Calypso [4] system which provides a virtual machine interface and a run time system targeting homogeneous networks of workstations. This ....
Z. M. Kedem and K. Palem. Transformations for the Automatic Derivation of Resilient Parallel Programs. In Proceedings of the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
....with limited success. Mainly due to the fact that it compounds the overheads and these high overheads have to be paid even when there is no failure. 3. The Formal Foundations This section presents some of the formal results which form the basis of our design. These results have been published in [AKPR93, Ked92, KP92, KPRR92, KPRS91, KPRS93, KPS90, Rab83, Rab89]. The formal results are developed in the context of abstract machines modeling some key properties of realistic highly parallel machines. These results lead to precise provably correct and efficient techniques for execution of parallel computations in a fault-tolerant manner on these abstract ....
.... The first Idempotent Execution strategy (not relying on evasion), the Certified Touch All technique, and the related Eager Scheduling for it, were presented by Kedem, Palem, and Spirakis [KPS90]. Additional improvements were presented by Kedem, Palem, Raghunathan, and Spirakis [KPRS91]. See also [Ked92, KP92, KPRS93]. The first asynchronous parallel execution (including the underlying asynchronous clock construction) was presented by Kedem, Palem, Rabin, and Raghunathan [KPRR92]. An improved construction was presented by Aumann and Rabin [AR93]. Dispersed variables and fingerprinting were presented by Rabin ....
Z. Kedem and K. Palem. Transformations for the Automatic Derivation of Resilient Parallel Programs. In IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, pages 15--25, 1992.
....points, again, even when there is no failure. That is, the speed of the computation is often limited by that of the slowest processor. 3. The Formal Foundations This section presents some of the formal results which form the basis of our design. These results have been published in [AKPR93, Ked92, KP92, KPRR92, KPRS91, KPRS93, KPS90, Rab83, Rab89]. The formal results are developed in the context of abstract machines modeling some key properties of realistic highly parallel machines. These results lead to precise provably correct and efficient techniques for execution of parallel computations in a fault tolerant manner on these abstract ....
....to it by M. To accomplish this, several new techniques are used, which interact in subtle ways. These techniques are: A Self Referential Logical Clock, Eager Scheduling, Idempotent Execution, Certified Touch All, Evasive Memory Layout, The Information Dispersal Algorithm and Fingerprinting [KPS90, KPRS91, Ked92, KPRR92, KP92, KPRS93, AKPR93, Rab81, Rab89]. In a simplified form, the execution is a sequence of parallel steps, logically numbered 1, 2, etc. The Logical Clock maintains the current step number. A free processor of M eagerly schedules itself by grabbing a copy of a thread segment whose execution has not been completed in ....
[Article contains additional citation context not shown here]
Z. Kedem and K. Palem. Transformations for the Automatic Derivation of Resilient Parallel Programs. In IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, pages 15--25, 1992.
....Fault tolerant features are an integral part of our system, without additional fault detection and recovery mechanisms like checking points and roll back. There is no additional overhead if no failures actually occurred. 1.3 Contributions Our work is a continuation of research reported in [29, 20, 4, 39, 37, 35, 36, 38, 40]. We developed several unique features in our system: Novel techniques to handle nested parallelism and synchronization Several new techniques are developed in our system in order to cope with nested parallelism. These techniques include nested two phase idempotent execution strategy and ....
....computing and distributed systems related to the system. Chapter 8 summarizes our work. Chapter 2 Key Concepts and Techniques In this chapter, we will discuss the key concepts and techniques used in our system. Some of these concepts and techniques were developed in previous research [29, 20, 4, 39, 37, 35, 36, 38, 40], including the abstract execution model, two phase idempotent execution strategy, and eager scheduling. However, the introduction of the new features in our system like nested parallelism and synchronization makes these techniques inadequate. We will present both the original ideas and the ....
[Article contains additional citation context not shown here]
Z. M. Kedem and K. V. Palem. Transformations for the automatic derivation of resilient parallel programs. In IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, pages 15--25, 1992.
No context found.
Z. M. Kedem and K. Palem. Transformations for the automatic derivation of resilient parallel programs. In Proceedings of the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
....addresses these issues. The research leading to Charlotte started as theoretical work where provable methods for executing parallel computations on abstract asynchronous processors were developed [25, 24, 2]. The outline of the virtual machine interface to an actual system was proposed in [23]. Theoretical results were then interpreted in the context of networks of workstations in [12]. The above were significantly extended and validated in the Calypso [5] system, which provides a virtual machine interface and a runtime system targeting homogeneous networks of workstations. This ....
Z. M. Kedem and K. Palem. Transformations for the Automatic Derivation of Resilient Parallel Programs. In Proceedings of the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
....explicitly addresses these issues. The research leading to Charlotte started as theoretical work where provable methods for executing parallel computations on abstract asynchronous processors were developed [20, 19, 1]. The outline of the virtual machine interface to an actual system was proposed in [18]. Theoretical results were then interpreted in the context of networks of workstations in [11]. The above were significantly extended and validated in the Calypso [4] system, which provides a virtual machine interface and a runtime system targeting homogeneous networks of workstations. This ....
Z. M. Kedem and K. Palem, Transformations for the automatic derivation of resilient parallel programs, In Proc. IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
No context found.
Z. M. Kedem and K. Palem. Transformations for the automatic derivation of resilient parallel programs. In Proceedings of the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
....in addition to other performance benefits that we shall see later. 2 Previous and Related Work Calypso has its roots in results by us and by our colleagues addressing fault tolerance, parallel program execution on fault prone and asynchronous abstract machines, and distributed systems [30, 31, 11, 26, 12, 24, 14, 21, 23, 22, 4, 25, 3, 15, 13]. The research leading to Calypso started as formal work which developed provable methods for executing parallel computations, initially on abstract machines with crash-failing processors, and later on abstract machines with asynchronous processors. An outline of a network of workstations based ....
Z. Kedem and K. Palem. Transformations for the automatic derivation of resilient parallel programs. In Proc. IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
No context found.
Z. Kedem and K. Palem, "Transformations for the Automatic Derivation of Resilient Parallel Programs," Proc. of the workshop on Fault Tolerance in Parallel and Distributed Systems, 1992, to appear.
....and not simulations. The complete source code of the program is shown on page 14. 2 Previous Work Calypso has its roots in results by us and by our colleagues addressing fault tolerance, parallel program execution on fault prone asynchronous abstract machines, and distributed systems [37, 38, 16, 32, 17, 30, 20, 27, 29, 28, 4, 31, 3, 21, 19]. The research leading to Calypso started as formal work which developed provable methods for executing parallel computations, initially on abstract machines with crash failing processors, and later on abstract machines with asynchronous processors. An outline of a network of workstations based ....
Z. Kedem and K. Palem. Transformations for the automatic derivation of resilient parallel programs. In IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, pages 15--25, 1992.
No context found.
Z. Kedem and K. Palem. Transformations for the automatic derivation of resilient parallel programs. In Proc. of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.
No context found.
Z. Kedem and K. Palem. Transformations for the automatic derivation of resilient parallel programs. In Proc. of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, 1992.