| D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, Mar. 1995. |
....[5] A number of systems (based on algorithms that perform a set of tasks) have been developed to incorporate fault tolerance into distributed applications. They use various techniques such as random scheduling of tasks [11], scheduling based on Manager-Worker models ([2]), and stable tuple spaces ([1]). This paper focuses on an analytical performance model ([14]) and on a prototype ([15]) for analyzing systems that perform a fixed set of tasks (i.e. no arrival process). The former is modeled as a transient M G P queue while the latter is based on an asynchronous algorithm. The model and the ....
D. Bakken, R. Schlichting, "Supporting Fault-Tolerant Parallel Programming in Linda", IEEE, Vol. 6, No. 3, 1995.
....practicality in developing distributed systems. When constructing an analytic performance model, it is imperative that one focus on a particular application domain. In the field of high performance scientific computing, applications that are based on processing a bag of tasks form a large domain [2, 3, 27]. These applications are also referred to as iterative, grid and data parallel. Some examples are: Simulation, Image Processing, Discrete Optimization, Transformation, and Computational Geometry. A task can easily be defined in these applications. For example, in image processing, classifying a ....
D. Bakken, R. Schlichting, "Supporting Fault-Tolerant Parallel Programming in Linda", IEEE, Vol. 6, No. 3, March 1995.
....processes. Their client-server approach is significantly different from that of LIME, as we target mobile ad hoc applications with implicit access to the remote data of other hosts, and support mobile agents. Distributed Linda implementations have been studied extensively for fault tolerance [36, 2] and data availability [27]. The main disadvantage with these approaches is their need for high degrees of connectivity among the hosts of the distributed portions of the tuple spaces, a property inherently not present in the mobile environment. One of the first applications of Linda to mobility ....
D.E. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 1994.
....is a new technology that goes beyond the possibilities offered by message passing. Its advantages concerning its conceptually higher abstraction of the underlying hardware, and its advantages concerning caching and replication have intensively been discussed in scientific literature (see e.g. [1 3]) An obvious tendency towards virtual shared memory replacing or accompanying client server technology can be observed. CeRse concepts CORSO is a layered software component for the development of robust and parallel applications that supports the virtual shared memory paradigm. It has been ....
D. E. Bakken, Supporting fault-tolerant parallel programming in LINDA, Ph.D. thesis, University of Arizona, Department of Computer Science, 1994.
....but also ones which are tailor-made for requirements such as speed, security and fault tolerance. For the above reasons, many researchers concluded that it was necessary for Linda to embrace multiple distinct tuple spaces and a number of prototype solutions have appeared, as described below [Bakken,94] [Carriero,94] [Douglas,95] [Hupfer,90] [Minsky,94]. FT Linda is a variant of Linda designed to support fault tolerant applications through properties such as tuple stability, multiple operation atomicity and strong semantics [Bakken,94]. The FT Linda model features a collection of ....
.... prototype solutions have appeared, as described below [Bakken,94] [Carriero,94] [Douglas,95] [Hupfer,90] [Minsky,94]. FT Linda is a variant of Linda designed to support fault tolerant applications through properties such as tuple stability, multiple operation atomicity and strong semantics [Bakken,94]. The FT Linda model features a collection of processors connected by a network that have no physically shared memory. FT Linda supports multiple tuple spaces which can be of either shared or private scope. Shared tuple spaces are accessible by multiple processes while private tuple spaces ....
D. E. Bakken, "Supporting Fault-Tolerant Parallel Programming in Linda", Ph.D. Thesis, Department of Computer Science, University of Arizona, Tucson, Arizona 85271, U.S., 8th August 1994.
....should maintain a consistent view of a distributed data space, which is a classical problem. Full P2P systems devoted to file storage and retrieval have implemented broker fault tolerance based on redundancy. Failure-resilient distributed data spaces at the programming level have been defined in [8] and in JavaSpaces. 4.1 Volatile Workers. At the worker level, a GCS has to ensure that the computation will make some progress, as long as functional resources are available. However, defining what is a functional resource is somewhat blurred in such systems. The most traditional way is to ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Trans. on Parallel and Distributed Systems, 6(3):287--302, 95.
....in a reliable storage. In the event of a process failure, either the affected process or the whole computation is restarted from its last checkpointed state. In the field of high performance scientific computing, applications that are based on processing a bag of tasks form a large domain [3, 4, 11]. Such applications are also referred to as iterative, grid and data parallel. Some examples are: Simulation, Image Processing, Discrete Optimization, Transformation, and Computational Geometry. (A preliminary version of this paper appeared in [13].) A task can easily be defined in these ....
....will be less, however the dynamic load balancing phase will not be very efficient. Therefore, one can structure an application as in case 1 to make the dynamic load balancing phase efficient and use r 1 to reduce the overhead of communication). 2 Related Work. The FT Linda programming model [3] allows fault tolerant applications to be written using stable tuple spaces (TS) and atomic execution of tuple space operations. The TSs are replicated in order to tolerate processor failures and they are updated using atomic multicast. This model initially stores the bag of tasks in the TS. ....
D. Bakken, R. Schlichting, "Supporting Fault-Tolerant Parallel Programming in Linda," IEEE, Vol. 6, No. 3, March 1995.
....parallel computations; for special problems much simpler solutions exist [6]. However, parallel programming models that are more abstract than message passing should also make it simpler to deal with fault tolerance. While this is not yet completely true for the coordination language Linda [1], the functional programming model provides this potential to a large degree [7]. Distributed Maple runs programs in the imperative language of Maple, but its parallel programming model is essentially functional: it provides the ability to spawn function applications as concurrent tasks and to ....
D. E. Bakken and R. D. Schlichting. Supporting Fault-Tolerant Parallel Programming in Linda. IEEE TPDS, 6(3):287-302, March 1995.
....Piranha [24] is built on top of Linda that dynamically balances system load across available machines. Piranha, like the fish, aggressively harnesses idle machines' resources during program execution. However, it does not handle failures. Extensions to handle failures are implemented in FT Linda [5] and PLinda [32]. 7.1.3 Memory Coherence Models of the Shared Memory. Memory consistency is an important aspect in shared memory systems that deals with the question: what is the correct result when multiple tasks read and write to the same memory location? Maintaining coherent shared memory ....
....fault tolerance separately and provide it as an add-on feature. Fault tolerance techniques include checkpointing, replication, and migration. Systems that provide fault tolerance features with these techniques are CIRCUS [17], LOCUS [47], Clouds [19], Fail safe PVM [44], PLinda [32] and FT Linda [5]. These systems often provide fault tolerance features independently of other system functions and require user intervention when failures are present. Calypso [6], Chime [51] and our system belong to a different group in which load balancing and fault tolerance are naturally supported by the ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. Technical Report TR93-18, The University of Arizona, 1993.
....we can maintain the description of all the tasks and all the results collected from the completed tasks in reliable storage and design processes to execute each task atomically. In fact, such a scheme has already been demonstrated in fault tolerant parallel computing systems such as FT Linda [4] and PLinda [38]. Using workstations connected by LANs or even WANs, large scale high performance parallel processing is possible for these problems because computation basically consists of a large number of mostly independent tasks. However, it is difficult for the end user to find ....
....Upon disagreement, the minority is ignored. 1.3.3 Fault tolerant Programming Languages Various fault tolerant programming languages have been developed to ease the task of constructing fault tolerant programs. Examples are Argus[47] Avalon[26] Fault tolerant Concurrent C[19] FT Linda[4], Orca[41] and FT SR[56] In general, these fault tolerant programming languages are distinguished by what program structuring paradigms they support since they all assume the fail stop processor failure model. Argus and Avalon support the object action model. Reliability and concurrency control ....
D. E. Bakken and R. D. Schlichting. Supporting fault tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 1994.
....been addressed separately from the issues of parallel processing. There have been three major mechanisms: checkpointing, replication and process groups. Such approaches have been implemented in CIRCUS [Coo85] LOCUS [PWC 81] and Clouds [DLA 90] Isis [BJ87] Fail safe PVM [LFS93] FT Linda [BS93], and Plinda [AS91] However, all these systems add significant overhead, even when there is no failure. More recently several prominent projects have similar goals to us. These include the NOW [Pat 95] project, the HPC [MMB 94] project, The Cilk project [BL97] and the Dome [NAB 95] project. All ....
D. Bakken and R. Schlichting. Supporting Fault-Tolerant Parallel Programming in Linda. Technical Report TR93-18, The University of Arizona, 1993.
....input data and a way to identify which software should be used to process that data. Again, both NetSolve and Ninf comply. A farming job is one composed of a large number of independent requests that may be serviced simultaneously. This is sometimes referred to as the bag of tasks model [24, 25]. Farming jobs fall into the class of embarrassingly parallel programs, for which it is very clear how to partition the jobs for parallel programming environments. Many important classes of problems, such as Monte Carlo simulations (e.g. [26]) and parameter space searches (e.g. [7]) fall into ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, March 1995.
....when reclaimed by their owners. In fact, our daemons are modeled after Piranha. But unlike all of the previous systems, Calypso can mask process crashes. There have been several proposals to provide fault tolerance, mostly by augmenting an existing system. They include FT PVM [32], FT Linda [5], PLinda [19] and Orca [20]. A notable exception is DOME [1] that incorporated fault tolerance and load balancing from the onset. These systems provide fault tolerance by using well known mechanisms: checkpointing the data, logging messages, and using reliable atomic broadcasts. In contrast, ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. Technical Report TR93-18, The University of Arizona, 1993.
....extensively in the database community. A good general reference is [6]. However, these mechanisms are generally perceived as costly and require the use of file I/O (logging), which is often prohibitive in the global computing setting. One example of such a relatively costly approach is FT Linda [2]. The original Linda definition ([35], see also Section 5) does not consider fault tolerance mechanisms. FT Linda is a version of Linda that addresses these concerns by providing two kinds of enhancements: stable tuple spaces, in which tuple values are guaranteed to persist across failures, and ....
D. E. Bakken and R. D. Schlichting. Supporting Fault-Tolerant Parallel Programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, Mar. 1995.
.... This has been explored by researchers for transparent runtime libraries that implement distributed shared memory [8,11,22,39] and for programs that make explicit use of data structures with shared memory semantics [5,33]. Relatedly, there has been research on fault tolerant shared tuple spaces [3] and other models of parallel programming such as farming [36], master-slave [4] and coarse-grained dataflow [14] that are more restrictive than general message passing, and facilitate the addition of fault tolerance and computation migration. A great strength of NetSolve is that if a server ....
D. E. Bakken and R. D. Schlichting. Supporting Fault-Tolerant Parallel Programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, March 1995.
....Operating Systems (RTOS) There has been much less done in the area of fault tolerant parallel processing systems. Most of the work has concentrated on fault tolerant hardware, e.g. fault-tolerant networks and system reconfiguration after a fault. There has been some though, for example, FT Linda [3], PLinda [13], Orca [14], Calypso [4] and Fail safe PVM [15]. These systems use a combination of well known mechanisms such as replication, transactions, message logging, or checkpoints and rollbacks to provide fault tolerance. In Mentat, regular objects do not hold state and thus the overhead ....
D. Bakken and R. Schlichting, "Supporting fault-tolerant parallel programming in Linda," Technical Report TR93-18, The University of Arizona, 1993.
.... [KS90,KS91] making the tuple space and processes working on it recoverable, and transaction based or transaction style like language extensions enabling the programmer to define a sequence of tuple space operations as an atomic operation which will be evaluated completely or not at all [BDE94,BS93] In LiPS version 2.4, we follow the approach of [KS90,KS91] Its main advantages allowing (efficient) independent checkpoint generation and not extending the set of tuple space operations have been the pros of our decision. One main drawback, namely the additional cost for message logging, ....
Bakken D. E. and Schlichting R.D. Supporting Fault-Tolerant Parallel Programming in Linda. Technical Report 93.18, Department of Computer Science, The University of Arizona, 6 1993.
....in the system design is a N Fault Tolerant Tuple Space Machine. Our current approach to its implementation is presented next together with its runtime data. 2 Related work There are several approaches to integrate different levels of fault tolerance into tuple space based applications. Following [BDE94] these approaches can be divided into extensions to the tuple space runtime system [Xu 88,LX89,CKM92,PTHR93] making the tuple space fault tolerant, resilient data and processes [KS90,KS91] making the ....
.... processes [KS90,KS91] making the tuple space and processes working on it recoverable, and transaction based or transaction style like language extensions enabling the programmer to define a sequence of tuple space operations as an atomic operation which will be evaluated completely or not at all [BDE94,BS93] In LiPS version 2.4, we follow the approach of [KS90,KS91] Its main advantages allowing (efficient) independent checkpoint generation and not extending the set of tuple space operations have been the pros of our decision. One main drawback, namely the additional cost for message ....
Bakken D. E. Supporting Fault-Tolerant Parallel Programming in Linda. PhD thesis, The University of Arizona, 6 1994. Department of Computer Science.
.... [KS90,KS91] making tuple space and processes working on it recoverable, and transaction based or transaction style like language extensions enabling the programmer to define a sequence of tuple space operations as an atomic operation which will be evaluated completely or not at all [BDE94,BS93] In LiPS version 2.4 we follow the approach to resilient data and processes. A more detailed description of the design and the implementation of this concept is given in [Set95] 3 Generative Communication In order to implement distributed applications, a programmer must be supplied with ....
Bakken D. E. and Schlichting R.D. Supporting Fault-Tolerant Parallel Programming in Linda. Technical Report 93.18, Department of Computer Science, The University of Arizona, 6 1993.
....well suited system design. The last section presents the design of our Fault Tolerant Tuple Space Machine along with its integration into the LiPS system. 2 Related work There are different approaches to integrate different levels of fault tolerance into tuple space based applications. Following [BDE94] these approaches can be divided into extensions to the tuple space runtime system [Xu 88,LX89,CKM92,PTHR93] making tuple space fault tolerant, resilient data and processes [KS90,KS91] making tuple space and processes working on it recoverable, and transaction based or transaction style like ....
.... and processes [KS90,KS91] making tuple space and processes working on it recoverable, and transaction based or transaction style like language extensions enabling the programmer to define a sequence of tuple space operations as an atomic operation which will be evaluated completely or not at all [BDE94,BS93] In LiPS version 2.4 we follow the approach to resilient data and processes. A more detailed description of the design and the implementation of this concept is given in [Set95] 3 Generative Communication In order to implement distributed applications, a programmer must be supplied with ....
Bakken D. E. Supporting Fault-Tolerant Parallel Programming in Linda. PhD thesis, The University of Arizona, 6 1994. Department of Computer Science.
....when reclaimed by their owners. In fact, our daemons are modeled after Piranha. But unlike all of the previous systems, Calypso can mask process crashes. There have been several proposals to provide fault tolerance, mostly by augmenting an existing system. They include FT PVM [32], FT Linda [5], PLinda [19] and Orca [20]. A notable exception is DOME [1] that incorporated fault tolerance and load balancing from the onset. These systems provide fault tolerance by using well known mechanisms: checkpointing the data, logging messages, and using reliable atomic broadcasts. In contrast, ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. Technical Report TR93-18, The University of Arizona, 1993.
....is impossible, because there is no point in time when the Linda system can decide that a tuple will no longer be referenced. Fault tolerance has not been considered, thus a system or network failure might compromise the tuple space. Many attempts have been made to overcome these limitations [2], [27]. However, as more functionality is added to a sound model, its design becomes less pure. The Coordination Kernel [12], [31] offers the shared data paradigm by means of communication objects. It provides advanced transactions and thus reliability and software fault tolerance. Inter process ....
.... f 104 (comm ToDoEntry request) new; 105 request.task = trip; 106 g commit; 107 (chairman office ; request; root.chair) insert to do entry( 108 p[0] INDEP) process; 109 (pres1 grant.office; trip) travel grant( p[1] INDEP) process; 110 (pres2 grant.office; trip) travel grant( p[2]; INDEP) process; 111 (pres3 grant.office; trip) travel grant( p[3] INDEP) process; 112 (LOCAL; p; request) wait for local; p[4] INDEP) process; 113 (LOCAL; p; trip) wait for grant; p[5] INDEP) process; 114 (LOCAL; p; trip) wait for none; p[6] INDEP) process; 115 g 116 trans (comm ....
D. E. Bakken. Supporting Fault-Tolerant Parallel Programming in LINDA. PhD thesis, University of Arizona, Department of Computer Science, August 1994. TR 94-23.
....Examples include seismic computations [1] and materials science [13]. Typically the computation is controlled by a single master process and the data manipulated by the computation is located on a single disk with all I/O being performed by the master. A fault tolerant version of this structure [6, 2, 10] allows cheap recovery since only the particular task affected by a machine failure needs to be recovered. It is possible to increase capacity and bandwidth of storage at a single machine using RAID techniques [5] but in some computations the data manipulated outstrips the capacity of a single ....
D. E. Bakken. Supporting Fault-Tolerant Parallel Programming in Linda. PhD thesis, The University of Arizona, Aug. 1994.
....process replication. The technique of using a collection of extra processors to provide fault tolerance with no reliance on disk comes from Plank and Li [29] and is unique to this work. There are efforts to provide programming platforms for heterogeneous computing which can adapt to changing load [2, 12, 17]. In all of these however, the programmer must make his or her program conform to the programming model of the platform. None are garden-variety message passing environments like PVM. There has been much research on algorithm based fault tolerance for matrix operations on parallel platforms where ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Trans. on Par. and Dist. Sys., 6(3):287--302, Mar 1995.
....have been proposed in the literature. Some of the more interesting ones include: having multiple tuple spaces [11], more powerful tuple space operations such as collect [4] and copy collect [16], specifying access patterns on tuples [6], persistent tuple spaces [3], fault tolerant tuple spaces [1], and open Linda [14]. In this paper we will use the extension of multiple tuple spaces. This paper discusses Blossom, a C based implementation of Linda with extensions. Since we are using only the C compiler and not a Linda compiler, the syntax of Blossom differs slightly from that of Linda. ....
Bakken, D. E., and Schlichting, R. D. Supporting Fault-Tolerant Parallel Programming in Linda. IEEE Transactions on Parallel and Distributed Systems 6, 3 (1995), 287--302.
....Plank and Li [35] and is unique to this work. There are efforts to provide programming platforms for heterogeneous computing that can adapt to changing load. These can be divided into two groups: those presenting new paradigms for parallel programming that facilitate fault tolerance migration [2, 3, 15, 20], and migration tools based on consistent checkpointing [9, 37, 41]. In the former group, the programmer must make his or her program conform to the programming model of the platform. None are garden-variety message passing environments like PVM or MPI. Those in the latter group achieve ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, March 1995.
....collection is impossible, because there is no point in time when the Linda system can decide that a tuple will no longer be used. Fault tolerance has not been considered, thus a system or network failure might compromise the tuple space. Many attempts have been made to overcome these limitations [2, 24]. However, as more functionality is added to a sound model, its design becomes less pure. 4.3.3 The Coordination Kernel CoKe is a successor of V PL in that sense, that it extracted the coordination features of V PL to make them available for other languages. The Coordination Kernel [12, 34] ....
D. E. Bakken. Supporting Fault-Tolerant Parallel Programming in LINDA. PhD thesis, University of Arizona, Department of Computer Science, August 1994. TR 94-23.
....[16] Tuple space based systems such as Linda [8] provide a higher level programming model, but it is unconventional and requires programmers to marshal/unmarshal data. Limited support for load balancing exists in Linda's derivative, Piranha [12]. Extensions to handle fault tolerance are proposed [5, 13]. Distributed shared memory systems employ a high level programming model, but do not provide architectural support for load balancing and fault tolerance. CC [10] addresses task parallelism at a high level and is close to our programming. However, CC divides the C extension into two parts, ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. The University of Arizona, 1993.
....to disk or on process replication. Some efforts are underway to provide programming platforms for heterogeneous computing that can adapt to changing load. These efforts can be divided into two groups: those presenting new paradigms for parallel programming that facilitate fault tolerance migration [1, 2, 8, 11], and migration tools based on consistent checkpointing [5, 27, 31] They cannot handle processor failures or revocation due to availability, without checkpointing to a central disk. 8. Conclusions and Future Work We have presented a new technique for executing certain scientific computations on ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. ACM Transactions on Computer Systems, 7(1):1--24, Feb 1989.
....such as in PLinda. However, Paradise does not support explicit mechanisms (e.g. continuation committing) for making processes resilient, or tunable fault tolerance mechanisms, nor does it give the same correctness guarantee as PLinda. There are two fault tolerant Linda variant systems, FT Linda [2] and MOM [7] For tuple space reliability, FT Linda assumes a set of replicated tuple spaces connected together by an ordered atomic broadcast network. FT Linda also provides a restricted form of transaction mechanism called atomic guarded statements for processes. MOM supports checkpointing for ....
D. Bakken and R. Schlichting. Supporting Fault-tolerant Parallel Programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, March 1995.
....faults, and the ISIS system is widely used. A prototype PASO system built on top of ISIS is currently running at Yale. In previous work, Xu and Liskov [28] discuss the use of the virtual partition algorithm to maintain the consistency of tuple replicas in the tuple space. Bakken and Schlichting [4, 5] assume a reliable tuple space, and propose a new atomic tuple swap operator that can be used to build reliable applications of a certain type (bag of task applications). Anderson and Shasha's [2] work on Persistent Linda includes support for transactions, but doesn't focus on the problem of ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. Technical Report TR93-18, Univ. Arizona Dept. Computer Science, 1993.
....[25] and is unique to this work. Some efforts are underway to provide programming platforms for heterogeneous computing that can adapt to changing load. These efforts can be divided into two groups: those presenting new paradigms for parallel programming that facilitate fault tolerance migration [2, 3, 10, 15], and migration tools based on consistent checkpointing [6, 27, 31] In the former group, the programmer must make a program conform to the programming model of the platform. None are garden variety message passing environments such as PVM or MPI. Those in the latter group achieve transparency, ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. ACM Transactions on Computer Systems, 7(1):1--24, Feb 1989.
...., and so P will reconstruct the state when it committed Tn . 8 Related work. As it happens, PLinda is one of the older attempts to put fault tolerance in Linda [1]. However, others have beat us to implementation. So, this section compares PLinda with other work as if the other work were earlier. In [2,3], Bakken and Schlichting present FT Linda, a variant of Linda that addresses fault tolerance. For tuple space reliability, FT Linda assumes a set of replicated tuple spaces connected together by an ordered atomic broadcast network. It supports a restricted form of transaction mechanism, called ....
D. E. Bakken and R. D. Schlichting. Supporting fault tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 1994. To appear.
....are harder still. The parallel processing systems mentioned above have tried to incorporate fault tolerance using a variety of methods. FT PVM [LFS93] is an extension of PVM which uses checkpointing to keep track of partial executions and restarts the computation in case of failure. FT Linda [BS93] uses multiple Linda servers and a costly replicated update mechanism to keep them in synchrony. While ISIS was not designed for parallel processing, it can be used to run parallel programs in a fault tolerant manner by using ISIS services for each thread. The overhead of this approach would be ....
D. Bakken and R. Schlichting. Supporting Fault-Tolerant Parallel Programming in Linda. Technical Report TR93-18, The University of Arizona, 1993.
.... To provide fault tolerance services using such checkpointing techniques, researchers have implemented several process migration or replication tools on top of the existing parallel programming environments [CCK 95, PL94d, Ste96] In addition, efforts such as Fail Safe PVM [LFS93] FT Linda [BS89] and Pact [Mai93] have dealt with fault tolerance in existing parallel programming environments. Detailed discussion of these various efforts is beyond the scope of this dissertation. Instead, in this section, we examine in detail the existing work related to fault tolerant matrix operations in ....
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. ACM Transactions on Computer Systems, 7(1):1--24, Feb 1989.
....such as in PLinda. However, Paradise does not support explicit mechanisms (e.g. continuation committing) for making processes resilient, or tunable fault tolerance mechanisms, nor does it give the same correctness guarantee as PLinda. There are two fault tolerant Linda variant systems, FT Linda [2] and MOM [7] For tuple space reliability, FT Linda assumes a set of replicated tuple spaces connected together by an ordered atomic broadcast network. FT Linda also provides a restricted form of transaction mechanism called atomic guarded statements for processes. MOM supports checkpointing for ....
D. Bakken and R. Schlichting. Supporting Fault-tolerant Parallel Programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, March 1995.
....tuplespaces thus treating a tuplespace as a fundamental object of the model. 24] describes a new tuplespace data type and an operation to create an object of this data type (i.e. a tuplespace) Interesting motivation and examples of applicability of multiple tuplespaces is also described. [32, 4] also utilize flavors of the multiple tuplespace concept. Presently, there seems to be no clear semantic definition of multiple tuplespaces or how they can should be implemented. Persistent tuplespaces Another interesting extension is having the tuplespace persist beyond the lifetime of an ....
....of the Linda compiler, the modification of the user's program in order to run inside the Deli environment, the tasks and [footnote 6: Providing fault tolerance introduces a series of additional implementation issues; moreover, a fault tolerant implementation must first satisfy the issues described here. See [4, 26, 50] for in-depth treatments of fault tolerant Linda implementations.] [Figure 1: High level system state before and after Deli bootstrapping.] implementation of tuplespace managers, tuple and ....
David E. Bakken and Richard D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, March 1995.
....of all possible kinds of communication structures [4] determines the elegance of the Linda model. For the following discussion we will refer only to the original Linda model, without taking into account recent proposals to extend Linda (e.g. guards, atomicity and fault tolerance [1], or structuring of the name space [13]). A text node is represented as a tuple ( text , lid, line, flag deleted , lid left , lid right ). The tuple ( root , id) points to the root of the text. If two users want to insert text at the same point at the same time, the following will happen. The ....
D. E. Bakken. Supporting Fault-Tolerant Parallel Programming in LINDA. PhD thesis, University of Arizona, Department of Computer Science, August 1994. TR 94-23.
....Operating Systems (RTOS), there has been much less done in the area of fault tolerant parallel processing systems. Most of the work has concentrated on fault tolerant hardware, e.g. fault tolerant networks and system reconfiguration after a fault. There has been some though, for example, FT Linda [4], PLinda [15], Orca [16], Calypso [5], and Fail safe PVM [17]. These systems use a combination of well known mechanisms such as replication, transactions, message logging, or checkpoints and rollbacks to provide fault tolerance. Mentat differs from these systems in that its underlying ....
D. Bakken and R. Schlichting, "Supporting fault-tolerant parallel programming in Linda," Technical Report TR93-18, The University of Arizona, 1993.
....locality management. However, it does not implement a fault tolerant manager, described in [19], which relied on dispersal and evasion. We now briefly summarize other related work. A large body of experimental results exists in the attempt to make parallel programs run on distributed hardware [22, 33, 40, 5, 14, 8, 10, 6, 7]. These systems can be loosely divided into two types, those that depend on a message passing scheme and those that use some form of global address spaces. Many systems provide message passing, or Remote Procedure Call facility built on top of a message passing. These include PVM [40, 22], Orca ....
....have generally been addressed separately from the issues of parallel processing. There have been three major mechanisms: checkpointing, replication, and process groups. Such approaches have been implemented in CIRCUS [15], LOCUS [36], and Clouds [20], Isis [12, 39, 11], FT PVM [33], FT Linda [5], and PLinda [2, 26]. However, all these systems add significant overhead, even when there is no failure. More recently several prominent projects have similar goals to us. These include the NOWs project at Berkeley, the HPC project and the Dome project at CMU. All these projects however use ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. Technical Report TR93-18, The University of Arizona, 1993.
....support for mobile computing. This section provides an overview of these three micro protocol suites. 5.1 Atomic Multicast. The atomic multicast service is a customized version designed for the runtime system of a fault tolerant version of the Linda coordination language [ACG86] called FT Linda [BS95]. Linda is a language for parallel programming based on tuple space (TS), a communication abstraction defined as a bag that can hold data elements called tuples. Processes use TS to communicate and synchronize by depositing and withdrawing tuples from a TS. However, Linda as originally defined ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distr. Syst., 6(3):287--302, March 1995.
....support for mobile computing. This section provides an overview of these three micro protocol suites. 5.1 Atomic Multicast. The atomic multicast service is a customized version designed for the runtime system of a fault tolerant version of the Linda coordination language [ACG86] called FT Linda [BS95]. Linda is a language for parallel programming based on tuple space (TS), a communication abstraction defined as a bag that can hold data elements called tuples. Processes use TS to communicate and synchronize by depositing and withdrawing tuples from a TS. However, Linda as originally defined ....
D. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Trans. on Parallel and Distr. Syst., 6(3):287--302, Mar 1995.
....be configured into a system is addressed in [19, 21, 22] for different types of network services, including membership and group RPC. The use of this approach for constructing a customized atomic multicast protocol for a version of the Linda coordination language with fault tolerance extensions [2] is described in [17]. The prototype implementation described in this paper illustrates the feasibility of extending the x-kernel to support this two level model of composition. In the prototype, messages arrive at a composite protocol and generate events that result in handlers in the appropriate ....
D. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Trans. on Parallel and Distr. Syst., 6(3):287--302, March 1995.
....machines. Several applications involving replicating processing have been built using Consul, including a replicated directory service and a distributed word game. The implementation of a fault tolerant version of the Linda coordination language [1] based on Consul is also nearing completion [3]. This paper makes two main contributions. The first is presentation of new algorithms for implementing a variety of these fundamental fault tolerant services. In particular, Consul provides novel realizations of the following: • Consistent ordering of application requests to maintain replica ....
....application, a fault tolerant version of Linda, is currently nearing completion. Consul is being used in this application to construct the language runtime system, and in particular, to implement stable tuple spaces by replicating the data using the state machine approach. Details can be found in [3]. This section reports on the performance of various protocols in Consul and the overheads they impose on the overall performance of the system. All the numbers reported here have been taken from the replicated directory object application running on a collection of Sun 3/75 workstations connected ....
D. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. Technical Report TR 93-18, Dept of Computer Science, University of Arizona, Tucson, AZ, 1993.
No context found.
D. E. Bakken and R. D. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 6(3):287--302, Mar. 1995.
No context found.
D.E. Bakken and R. Schlichting. Supporting fault-tolerant parallel programming in Linda. IEEE Transactions on Parallel and Distributed Systems, 1994.
No context found.
D. Bakken and R. Schlichting. Supporting Fault-Tolerant Parallel Programming in Linda. Technical report, University of Arizona, Tucson, U.S.A., 1993.
No context found.
D. Bakken, R. Schlichting, "Supporting Fault-Tolerant Parallel Programming in Linda", IEEE, Vol. 6, No. 3, March 1995.
No context found.
D. Bakken, R. Schlichting. "Supporting Fault-Tolerant Parallel Programming in Linda", Technical Report 93-18, Dept. of Computer Science, Univ. of Arizona, June 1993.
No context found.
Bakken, D.E. "Supporting fault-tolerant parallel programming in Linda." Ph.D. dissertation (TR 94-23), Department of Computer Science, University of Arizona, 173 pgs, August 1994.