G. Barnes. A method for implementing lock-free shared-data structures. In SPAA '93, pages 261--270, 1993.
....that allow easy implementation of lock free objects. A universal construction is used to automatically generate lock free implementations of arbitrary objects from their sequential implementations. Universal constructions were proposed first by Herlihy [36] and improved later by others [2, 7, 20, 37, 42]. To implement a lock free object using a universal construction, a programmer first writes code for a sequential implementation of that object. This code is then embedded within a retry loop that is automatically generated by the universal construction. A lock free implementation that is based on ....
....semantics of SC are undefined if task T_p has not previously executed a LL instruction. 2.4.3 Universal Constructions of Lock Free Objects In recent years, several groups of researchers have presented methods for automatically transforming sequential object implementations into lock free ones [20, 35, 36, 37, 73]. These methods are called universal constructions. A universal construction relieves the object designer of the need to reason about concurrency, thereby greatly simplifying the task of providing a correct lock free implementation for a particular shared object. We now outline two universal ....
G. Barnes. A method for implementing lock-free shared data structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Architectures and Algorithms, pages 261--270. ACM, 1993.
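The retry loop described in the first context above can be sketched as follows. This is a minimal illustration in the copy-the-whole-object style, not Barnes's method and not the construction of any specific cited paper. `AtomicRef`, `apply_lock_free`, and `push` are hypothetical names, and the internal `threading.Lock` only simulates the atomicity that a hardware compare-and-swap would provide.

```python
import copy
import threading


class AtomicRef:
    """Simulated atomic reference (assumption: the lock stands in for hardware CAS)."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Install `new` only if the reference still points at `expected`.
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


def apply_lock_free(root, sequential_op):
    """Retry loop wrapped around an unmodified sequential operation."""
    while True:
        old = root.load()
        private = copy.deepcopy(old)       # work on a private copy
        result = sequential_op(private)    # plain sequential code
        if root.compare_and_swap(old, private):
            return result                  # our copy became the current version
        # Another operation won the race; retry against the new version.


def push(item):
    """Build a sequential 'push' operation on a list used as a stack."""
    def op(stack):
        stack.append(item)
        return len(stack)
    return op


root = AtomicRef([])
threads = [threading.Thread(target=apply_lock_free, args=(root, push(i)))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Copying the entire object on every attempt is the overhead that several of the contexts below attribute to early universal constructions.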
.... As mentioned earlier, Lamport demonstrated that, in a sequentially consistent memory, atomic reads and writes can be implemented from non atomic reads and writes without mutual exclusion [94, 100, 101]. Since then, extensive research has been conducted in lock free and wait free synchronization [8, 11, 17, 58, 63, 64, 65, 66, 75, 78, 119, 126, 127, 128, 138, 149, 158, 164]. Lock free and wait free operation implementations consist of code that typically executes multiple atomic statements and does not involve mutual exclusion. The correctness conditions for lock free and wait free implementations are necessarily more complicated than for mutual exclusion based ....
....and maintaining the copies is quite large for large objects. While optimizations have been proposed for reducing this overhead, the process of doing so is quite complex. Further, concurrent access to the object is not allowed. Barnes presented a mechanism allowing concurrent access to the object [11]. The proposal required the object be protected by a number of locks. Operations on the object acquire locks associated with relevant parts of the object in such a way that processes can help each other to perform operations and release locks. Barnes's technique is not wait free. Since locks are ....
Greg Barnes. Method for Implementing Lock-Free Shared Data Structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 261--270, June 1993.
.... to allow multiple threads to work on a data structure concurrently without a lock [21]. Herlihy gave a theoretical framework for constructing wait free objects [12, 11]. Software lock free schemes using lock free data structures have been proposed to address the inherent limitations of locking [12, 38, 4, 27]. Lock free schemes provide optimistic concurrency without requiring a critical section or software wait on a lock. These schemes often require more complex operations than critical sections and rely on programmers to write appropriate code. Programmers have to reason about correctness in the ....
....respect to its lock based counterparts. Speculative Lock Elision [30] dynamically elides lock acquire and release operations from an execution stream but requires lock acquisitions in the presence of conflicts. Improving the performance of software non blocking schemes has been studied previously [27, 4, 38]. Software proposals have been made to make lock based critical sections non blocking [37] and thread scheduling that is aware of blocking locks [18, 28]. Database concurrency control. Transactions are well understood and studied in database literature [10]. The use of timestamps for resolving ....
G. Barnes. Method for implementing lock-free shared data structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 261--270, June 1993.
.... synchronization operations on single memory locations, such as compare and swap (CAS) are not expressive enough to support design of efficient non blocking algorithms [9, 10, 12] and software emulations of stronger primitives from weaker ones are still too complex to be considered practical [1, 4, 7, 8, 21]. In response, industry is currently examining the idea of supporting .... [Table residue: comparison of deque implementations. Columns: array with centralized access (see [9]); array used as circular buffer (see [2]); linked list with tagged pointers (see [2]); Snark, with garbage collection (this paper). Row "Left and right accesses interfere": yes, no, no, no. Row "Fixed ....]
.... for comparison, it also shows code for the simpler CAS operation (which is not used in the algorithms presented here). For either operation, the sequence of suboperations is assumed to be executed atomically, either through hardware support [12, 18, 19] or through a non blocking software emulation [7, 21]. A CAS operation examines one memory location and compares its contents to an expected old value. If the contents match, then the contents are replaced with a specified new value and an indication of success is returned; otherwise the contents are unchanged and an indication of failure is ....
G. Barnes. A method for implementing lock-free shared data structures. In Proc. 5th ACM Symp. Parallel Algorithms and Architectures, pages 261--270, June 1993.
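The CAS behaviour described in the context above (compare one location against an expected old value, replace on a match, report success or failure) can be sketched as below. This is a software simulation only: `Word` and `cas` are hypothetical names, and the lock merely stands in for the atomicity that the text assumes comes from hardware or a non-blocking emulation.

```python
import threading


class Word:
    """One simulated memory location supporting CAS."""

    def __init__(self, value):
        self.value = value
        self._atomic = threading.Lock()  # assumption: models atomic execution

    def cas(self, expected, new):
        """Replace the contents with `new` only if they equal `expected`."""
        with self._atomic:
            if self.value == expected:
                self.value = new
                return True   # indication of success
            return False      # contents unchanged, indication of failure


w = Word(7)
first = w.cas(7, 8)    # contents match the expected old value
second = w.cas(7, 9)   # contents are now 8, so the comparison fails
```

After these two calls the location holds 8: the first CAS succeeded and the second left the contents unchanged.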
....primitives. Blocking techniques, however, have several drawbacks including decreased concurrency, the potential for deadlock, and certain undesirable scheduling side effects (i.e., priority inversion and convoying). An alternative approach is to use non blocking synchronization techniques (e.g., (Barnes, 1993), (Greenwald et al., 1996), (Herlihy, 1988), (Herlihy, 1991), (Massalin et al., 1991), (Prakash et al., 1991), (Valois, 1995)). Non blocking techniques do not suffer from these problems since processes optimistically execute concurrently and therefore never wait on one another. As such, they offer ....
....(LL) and Store Conditional (SC) instruction pair. Our new algorithms for insertion and deletion into a shared linked list, which are presented in Section 4, utilize only the CAS and DCAS instructions. 2.2 Generic Non Blocking Data Structures A number of so called universal methods (e.g., (Barnes, 1993), (Herlihy, 1993), (Herlihy et al., 1993), (Herlihy et al., 1987), (Prakash et al., 1991)) for constructing non blocking data structures of any type have been discussed in the literature. (Lamport, 1983) described the first lock free algorithm for the problem of managing a single writer, ....
[Article contains additional citation context not shown here]
G. Barnes (1993). A Method for Implementing Lock-Free Shared Data Structures. Proceedings of the 5th International Symposium on Parallel Algorithms and Architectures, pp. 261-270.
....sequentialize the access. Furthermore locks have the problem that if the process with the lock is blocked (e.g. swapped out by the OS or dies) then all processes can become blocked. To avoid this problem many non blocking (or lock free) implementations of data structures have been suggested [1, 2, 9, 10, 17, 18, 24, 25]. As with the versions that use ....
.... there has been considerable work on non blocking (or lock free) data structures [10], which only require that some user's request will complete in a bounded number of steps (although any particular user can be delayed indefinitely). Examples of work on non blocking data structures include [1, 2, 9, 10, 11, 17, 18, 24, 25]. Most of these implementations still fully sequentialize access to the data structure. Moreover, they often require unbounded memory [9], or the use of atomic operations on two or more words of memory (such as a double compare and swap or transactional memory [12, 20]). Such operations are ....
G. Barnes. A method for implementing lock-free shared data structures. In Proc. 5th ACM Symp. on Parallel Algorithms and Architectures, pages 261--270, June 1993.
.... synchronization operations on single memory locations, such as compare and swap (CAS) are not expressive enough to support design of efficient non blocking algorithms [11, 12, 16] and software emulations of stronger primitives from weaker ones are still too complex to be considered practical [1, 4, 7, 8, 24]. In response, industry is currently examining the idea of supporting stronger synchronization operations in hardware. A leading candidate among such operations is double compare and swap (DCAS), a CAS performed atomically on two memory locations. However, before such a primitive can be ....
....structure. In summary, we believe that through the design of linearizable lock free implementations of classical data structures such as deques, we will be able to understand better the power of the DCAS abstraction, and whether one should continue the effort to provide support for implementing it [1, 4, 7, 8, 11, 12, 24] on concurrent hardware and software processing platforms. The next section presents our computation model and a formal specification of the deque data structure. 2 Modeling DCAS and Deques Our paper deals with implementing a deque on a shared memory multiprocessor machine, using the DCAS ....
[Article contains additional citation context not shown here]
Barnes, G. A method for implementing lock-free shared data structures. In Proceedings of the 5th ACM Symposium on Parallel Algorithms and Architectures (June 1993), pp. 261--270.
....of non interfering methods. There is a large literature on concurrent objects on which plenty of methods are invoked in parallel. Most previous works, however, focus on exposing parallelism between nonexclusive methods and pay little attention to the performance degradation in contended objects [Bar93, CD90, Chi91, CRPD96, TY96, YET98]. Reader writer locks, which forbid multiple concurrent writes and allow multiple concurrent reads, are also useless for implementing objects on which multiple exclusive method invocations contend. Our study deals with exclusive methods and is diagonal to theirs. ....
Greg Barnes. A Method for Implementing Lock-Free Shared Data Structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA '93), pages 261--270, Velen, Germany, June 1993.
....then some process (not necessarily p) will eventually complete its operation. Lock free implementations are hard to design and prove correct. Consequently, instead of inventing separate implementations for different types of objects, recent research has focussed mostly on universal constructions [11, 21, 12, 18, 23, 8, 13, 16, 1, 22, 5, 4, 6, 2, 20]. A (non blocking or wait free) universal construction for n processes is an algorithm that, when instantiated with the sequential implementation of any type T, becomes a (non blocking or wait free) implementation of a type T object that can be accessed concurrently by n processes [12]. In this ....
Barnes, G. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures (1993), pp. 261--270.
....are free of the pitfalls, such as convoying, priority inversion, and deadlocks, that afflict lock based implementations. However, wait free implementations are notoriously hard to design and prove correct. To overcome this difficulty, recent research has focussed on universal constructions [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 14, 19, 20, 21, 22]. A universal construction for n processes is an algorithm that, when instantiated with the sequential implementation of any type T, becomes a wait free implementation of a type T object that can be accessed concurrently by n processes [9]. Thus, once we have an efficient universal construction ....
.... first universal construction are due to Herlihy [8, 9]. The use of LL and SC in universal constructions was also introduced by Herlihy [10]. Of the recent universal constructions, some focus on improving the time complexity when concurrent operations access disjoint parts of the implemented object [2, 5, 6, 7, 11, 19, 21, 22]. Others focus on reducing the worst case time complexity [1]. Afek, Weisberger, and Weisman present polynomial time implementations of some objects at Level 2 of the consensus hierarchy, including fetch add and swap objects, from 2 consensus objects [3]. Some fast type specific implementations are ....
Barnes, G. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures (1993), pp. 261--270.
....and a high level of concurrency. That is, operations that access disjoint parts of the data structure, or are widely separated in time, should not interfere with each other. For example, operations on disjoint sets of components can proceed independently, and avoid concurrency control overhead [Bar93, IR94, TSP92, ST95, AM95a]. When there is considerable contention, conflicting operations could form long chains and complex webs, transitively affecting operations that are otherwise disjoint. Extremely fast algorithms might be designed that disentangle these webs, and require coordination only among local neighborhoods ....
....considered in a timing based model [AT93]. A search for efficient implementations proceeded. Barnes presented algorithms with the idea of simulating the update without affecting the shared memory, and then applying the changes atomically using a non blocking implementation of a multi word RMW [Bar93]. This idea was developed further by Israeli and Rappoport [IR94]. Turek, Shasha, and Prakash explored similar constructions using an atomic compare swap primitive [TSP92]. These non blocking algorithms avoid copying the entire data structure, and allow disjoint access: operations in disjoint ....
[Article contains additional citation context not shown here]
G. Barnes. A Method for Implementing Lock-Free Shared Data Structures. In Proceedings of the 5th ACM Symposium on Parallel Algorithms and Architectures, 1993.
....that T_r has preempted T_q as illustrated in Figure 4.15. Because T_q has been preempted, the value of Save[q, c] cannot change during the execution of the Read procedure. Also, it must be the case that Status[q] ≠ 2 holds .... [Figure 4.16 residue: MWCAS operations m1 = (2, [x, y], [0, 1], [0, 2]), m2 = (2, [x, z], [0, 3], [0, 4]), m3 = (1, [x], [0], [5]); initially x=0, y=1, z=3; panel (a) outcomes: succeeds, succeeds; panel (b) outcomes: succeeds, fails, fails.] Figure 4.16: (a) Two overlapping MWCAS operations m1 and m2. Parameters are shown as (number of words, [list of ....
....of Save[q, c] cannot change during the execution of the Read procedure. Also, it must be the case that Status[q] ≠ 2 holds .... Figure 4.16: (a) Two overlapping MWCAS operations m1 and m2. Parameters are shown as (number of words, [list of words accessed], [list of old values], [list of new values]). The only potential conflict is on word x, ....
G. Barnes. A method for implementing lock-free shared data structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Architectures and Algorithms, pages
....constructions that allow operations that access nonoverlapping parts of the object to be executed concurrently without affecting each other's performance [3, 6, 8, 10]. Israeli and Rappoport call this property disjoint access parallelism [6]. In cases where there is little parallelism available to be exploited, the overhead associated with the complicated mechanisms used to achieve disjoint access parallelism is not justified. The goal in this paper is to ....
G. Barnes. A method for implementing lock-free shared data structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 261--270, 1993.
....invokes an operation on the implementation and repeatedly takes steps, its operation will eventually complete, regardless of the speeds of other processes. Many recent wait free implementations are based on a shared memory that supports a pair of synchronization operations, known as LL and SC [33, 9, 19, 22, 1, 31, 4, 3, 7, 2, 30]. These operations work as follows. LL(a) returns the value at location a. SC(a, v) either changes the value at location a to v and returns true, or has no effect on location a and returns false. Correspondingly, we say the SC is successful or unsuccessful. Specifically, if process P applies ....
Barnes, G. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures (1993), pp. 261--270.
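The LL and SC operations described in the context above can be sketched in software as follows. The context is truncated before it states when SC succeeds, so this sketch assumes the usual rule: an SC by a process succeeds only if no successful SC has been applied to the location since that process's latest LL. `LLSCWord` and its fields are hypothetical names, the lock only simulates atomicity, and returning false for an SC without a preceding LL is an arbitrary choice, since other contexts on this page call that case undefined.

```python
import threading


class LLSCWord:
    """One simulated memory location supporting LL and SC."""

    def __init__(self, value):
        self._value = value
        self._version = 0      # counts successful SCs (assumed bookkeeping)
        self._links = {}       # process id -> version observed at its last LL
        self._atomic = threading.Lock()

    def ll(self, pid):
        """Load-linked: return the value and remember when pid read it."""
        with self._atomic:
            self._links[pid] = self._version
            return self._value

    def sc(self, pid, new):
        """Store-conditional: write only if nothing was stored since pid's LL."""
        with self._atomic:
            if self._links.pop(pid, None) != self._version:
                return False   # unsuccessful SC, location unchanged
            self._value = new
            self._version += 1
            return True        # successful SC


a = LLSCWord(0)
p_seen = a.ll("P")
q_seen = a.ll("Q")
q_ok = a.sc("Q", q_seen + 1)   # first SC after both LLs
p_ok = a.sc("P", p_seen + 1)   # Q's successful SC intervened
final = a.ll("P")
```

Here Q's SC succeeds and P's SC fails, so the increment by P would have to be retried from a fresh LL.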
....elsewhere [10] and are long and tedious, so we do not present them again here. However, we do give a high level, intuitive proof sketch in an appendix. 1 The multi word operations considered here access a single variable that spans multiple words. The multi word operations considered in [2, 4, 9, 13] are more general: they can access multiple variables, each stored in a separate word. These operations could therefore be used to implement the operations we require. However, we implement the single variable, multi word operations used by our construction directly because they admit simpler and ....
....each stored in a separate word. These operations could therefore be used to implement the operations we require. However, we implement the single variable, multi word operations used by our construction directly because they admit simpler and more efficient implementations than those considered in [2, 4, 9, 13]. 2 Preliminaries Our algorithms are designed for use in shared memory multiprocessors that provide load linked (LL), validate (VL), and store conditional (SC) instructions. We assume that the SC operation does not fail spuriously. In some hardware implementations of LL and SC, SC can fail ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....of the execution speeds of other processes. Herlihy's universal wait free object construction has been regarded as very expensive to implement, and many researchers (including Herlihy) have proposed more efficient techniques based on a weaker synchronization technique called non blocking [5, 8, 15]. A non blocking concurrent system guarantees that at least one process can complete any operation in a finite number of steps, regardless of the execution speeds of other processes. There has also been work which focuses on providing hardware support [6] and operating system support [21] ....
....the logically distinct private versions. 3. store conditional the logically distinct version. If this attempt fails, restart by going back to step 1. 2.2 Barnes's Approach Barnes's work attempts to derive lock free (nonblocking) object implementations from sequential implementations [8]. His primary goal is to increase the copying efficiency in Herlihy's approach. Barnes employs a 2 part method: cooperative technique and caching method. The shared data is partitioned into disjoint sets called cells and each thread tries to claim the cells it will need to perform its sequential ....
Barnes, G. A Method for Implementing Lock-free Shared Data Structures. In Proceedings of ACM SPAA '93 Conference (June 1993, Velen, Germany) pp. 261-270.
....first provably efficient work stealing algorithm [15] and implementation [14] is fairly recent, however. The idea of non blocking and wait free synchronization was developed by Herlihy [29]. There has been a long line of work attempting to make the idea more practical via universal constructions [11, 28], useful primitives [2, 3, 39], and specific data objects [3, 36, 45]. In fact, our non blocking implementation of work stealing uses the bounded tags technique of [39]. Nevertheless, to this day, few applications or systems have been built with non blocking synchronization. Of notable exception ....
Greg Barnes. A method for implementing lock-free shared data structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 261--270, Velen, Germany, June 1993.
....help as many processes as possible with each operation, and by choosing processes to help in such a way that all processes 1 The multi word operations considered here access a single variable that spans multiple words. Thus, they are not the same as the multi word operations considered in [1, 2, 5, 6], which access multiple variables, each stored in a separate word. The multi word operations we consider admit simpler and more efficient implementations than those considered in [1, 2, 5, 6]. shared var X: record pid: 0..N-1; tag: 0..1 end; BUF: array[0..N-1, 0..1] of ....
....variable that spans multiple words. Thus, they are not the same as the multi word operations considered in [1, 2, 5, 6], which access multiple variables, each stored in a separate word. The multi word operations we consider admit simpler and more efficient implementations than those considered in [1, 2, 5, 6]. shared var X: record pid: 0..N-1; tag: 0..1 end; BUF: array[0..N-1, 0..1] of array[0..W-1] of wordtype; initially X = (0, 0) and BUF[0, 0] = initial value of the implemented variable V; private var curr: record pid: 0..N-1; tag: 0..1 end; i: 0..W-1; j: ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....3. We end the paper with concluding remarks in Section 4. Due to space limitations, we defer detailed proofs to the full paper. 1 The multi word operations considered here access a single variable that spans multiple words. Thus, they are not the same as the multi word operations considered in [1, 2, 5, 6], which access multiple variables, each stored in a separate word. The multi word operations we consider admit simpler and more efficient implementations than those considered in [1, 2, 5, 6]. shared var X: record pid: 0..N-1; tag: 0..1 end; BUF: array[0..N-1, 0..1] of array[0..W ....
....variable that spans multiple words. Thus, they are not the same as the multi word operations considered in [1, 2, 5, 6], which access multiple variables, each stored in a separate word. The multi word operations we consider admit simpler and more efficient implementations than those considered in [1, 2, 5, 6]. shared var X: record pid: 0..N-1; tag: 0..1 end; BUF: array[0..N-1, 0..1] of array[0..W-1] of wordtype; initially X = (0, 0) and BUF[0, 0] = initial value of the implemented variable V; private var curr: record pid: 0..N-1; tag: 0..1 end; i: 0..W-1; j: 0..1 ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....TRUE else return FALSE END ATOMIC Figure 1: The Compare Swap synchronization primitive. primitive is one that can solve the consensus problem [7] for any number of processes; Compare Swap is a universal primitive. The first universal method was given by Herlihy [13]; many others followed [1, 4, 11, 22, 26]. However, it has become increasingly apparent that universal methods suffer from several sources of inefficiency, such as wasted parallelism, excessive copying, and generally high overhead. In addition to the universal methods, algorithms have also been developed for lock free objects that are ....
G. Barnes. A method for implementing lock-free shared data structures. In Proceedings of the Fifth Symposium on Parallel Algorithms and Architectures, pages 261--270, 1993.
....first provably efficient work stealing algorithm [15] and implementation [14] is fairly recent, however. The idea of non blocking and wait free synchronization was developed by Herlihy [30]. There has been a long line of work attempting to make the idea more practical via universal constructions [11, 29], useful primitives [2, 3, 41], and specific data objects [3, 38, 49]. In fact, our non blocking implementation of work stealing uses the bounded tags technique of [41]. Nevertheless, to this day, few applications or systems have been built with non blocking synchronization. Of notable exception ....
Greg Barnes. A method for implementing lock-free shared data structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 261--270, Velen, Germany, June 1993.
....many researchers. Both general transformation methods and specific concurrent implementations of various data structure have been proposed [Her91, Plo89, Her90, Her93] Those methods demonstrate the potential of wait free and nonblocking algorithms, but are too inefficient to use in practice. In [Bar93], algorithms with the idea of simulating the update without affecting the shared memory and then to apply the changes atomically using a non blocking implementation of a multi word Read modify write are presented. This idea was developed further in [IR94] Similar constructions using an atomic ....
Barnes, G. A Method for Implementing Lock-Free Shared Data Structures. In Proceedings of the 5th ACM Symposium on Parallel Algorithms and Architectures, 1993.
....finish the bookkeeping activities done in this loop. 3.2 Implementing MWCAS and READ We now show how to efficiently implement the MWCAS and READ primitives used in the previous subsection. Unfortunately, MWCAS is exceedingly difficult to implement efficiently in truly asynchronous systems [1, 6, 9, 22]. The most efficient known wait free implementation [1] requires Θ(N^3 M) time complexity to implement M words that can be accessed by N tasks. Fortunately, as shown by Anderson and Ramamurthy in [3], a W word MWCAS can be implemented on a real time uniprocessor in only O(W) time (which ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....finish the bookkeeping activities done in this loop. 3.2 Implementing MWCAS We now show how to efficiently implement the MWCAS and Read primitives used in the previous subsection. Unfortunately, MWCAS is exceedingly difficult to implement efficiently in truly asynchronous systems [1, 7, 11, 22]. The most efficient known wait free implementation [1] requires Θ(N^3 M) time complexity to implement M words that can be accessed by N tasks. Fortunately, as shown by Anderson and Ramamurthy in [3], a W word MWCAS can be implemented on a real time uniprocessor in only O(W) time. In the ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....primitive is impractical to provide in hardware, so it must be implemented in software. For our purposes, such an implementation should be lock free or wait free. Unfortunately, previous lock free and wait free implementations of primitives like MWCAS have rather high worst case time complexity [1, 6, 11, 20]. Thus, they are of limited utility in real time systems. One of the main contributions of this paper is to show that a wait free MWCAS primitive can be implemented efficiently if one assumes a priority based task scheduler. Our implementation of MWCAS was inspired by recent results of Ramamurthy, ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....Section 3 is a wait free implementation of MWCAS from LL, SC, and VL. By a straightforward generalization of the one word case, MWCAS can in turn be used to implement LL, VL, and MWSC (see Table 1). The problem of implementing such multi word primitives has been considered previously by Barnes [2], by Israeli and Rappoport [7], and by Shavit and Touitou [8]. However, the implementations presented in these papers are only lock free. A process in our implementation attempts to lock, in a wait free manner, each of the words that it accesses. A similar (albeit only lock free) approach is ....
.... announce the parameters and state of its operation, so that another process can continue to execute a partially completed operation. LL, VL, and SC operations are used to ensure that each stage of each operation is executed exactly once. Techniques similar to this one have been used previously [2, 7, 8]. However, these implementations are only lock free, not wait free, so operations are not guaranteed to complete. We employ a technique that allows a process to detect concurrent operations with which it potentially interferes, and to help complete such operations. If a process is interfered with ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....primitive is impractical to provide in hardware, so it must be implemented in software. For our purposes, such an implementation should be lock free or wait free. Unfortunately, previous lock free and wait free implementations of primitives like MWCAS have rather high worst case time complexity [1, 5, 9, 18]. Thus, they are of limited utility in real time systems. One of the main contributions of this paper is to show that a wait free MWCAS primitive can be implemented efficiently if one assumes a priority based uniprocessor task scheduler. Our implementation of MWCAS was inspired by recent results of ....
.... A Read operation and a W word MWCAS operation can be implemented in a wait free manner from CAS with O(1) and O(W) time complexity, respectively, in [Figure 5 residue: MWCAS operations m1 = (2, [x, y], [0, 1], [0, 2]), m2 = (2, [x, z], [0, 3], [0, 4]), m3 = (1, [x], [0], [5]); initially x=0, y=1, z=3; panel (a) outcomes: succeeds, succeeds; panel (b) outcomes: succeeds, fails, fails.] Figure 5. (a) Two overlapping MWCAS operations m1 and m2. Parameters are (number of words, [list of words accessed], [list of old values], [list of new values]). The only potential conflict is on word x, which ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
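The MWCAS parameters listed in the figure caption of the context above (words accessed, old values, new values) suggest the following sketch of the operation's sequential semantics, replayed on the figure's values. It is not the wait-free implementation the citing paper describes: `mwcas` is a hypothetical helper, the lock only models atomic execution, and the order in which the operations are applied in panel (b), with m3 first, is an assumption made to match the listed outcomes.

```python
import threading

_atomic = threading.Lock()  # assumption: models atomic execution of one MWCAS


def mwcas(memory, words, old_values, new_values):
    """Update every word only if every word holds its expected old value."""
    with _atomic:
        if any(memory[w] != old for w, old in zip(words, old_values)):
            return False                      # at least one mismatch: no change
        for w, new in zip(words, new_values):
            memory[w] = new
        return True


# Panel (a): m1 and m2 overlap on x, but both leave x at 0.
mem_a = {"x": 0, "y": 1, "z": 3}
a_m1 = mwcas(mem_a, ["x", "y"], [0, 1], [0, 2])
a_m2 = mwcas(mem_a, ["x", "z"], [0, 3], [0, 4])

# Panel (b): m3 changes x to 5 first (assumed order), invalidating m1 and m2.
mem_b = {"x": 0, "y": 1, "z": 3}
b_m3 = mwcas(mem_b, ["x"], [0], [5])
b_m1 = mwcas(mem_b, ["x", "y"], [0, 1], [0, 2])
b_m2 = mwcas(mem_b, ["x", "z"], [0, 3], [0, 4])
```

In panel (a) both operations succeed because neither changes the shared word x; in panel (b) the write to x makes the old-value check of the other two operations fail.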
....on the part of the object programmer, provides only lock free implementations, and results in no advantage for some commonly used objects. Furthermore, these constructions do not allow concurrent operations to execute in parallel. Below we briefly describe efforts to address these problems. Barnes [5] recognizes the importance of allowing operations to execute in parallel where possible. He presents a mechanism in which an object is protected by a number of locks. Operations on the object acquire the locks associated with affected parts of the object in such a way that processes can help ....
....(SC) instructions, and then, if successful, modifies each word in accordance with the implemented operation. If, in attempting to lock its words, a process p finds a word already locked by another process q, then p attempts to help q to complete its operation. In contrast to the implementations of [2, 5, 7], STM and the implementation presented here do not continue this helping recursively. That is, if p fails to complete q's operation because of another process r, then p does not start to help r, but instead causes q to start its operation again. This policy ensures that p helps only enough that a ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993.
....large variable is stored across several memory words. [Footnote 1: The multi-word operations considered here access a single variable that spans multiple words. Thus, they are not the same as the multi-word operations considered in [1, 3, 7, 11], which access multiple variables, each stored in a separate word. The ....] In the first part of the paper, we show how to efficiently implement these operations using the usual single-word LL, SC, and VL primitives. We present an implementation in which LL and SC operations on a W-word variable take O(W) time and VL takes constant time. Our implementation allows LL ....
....i is a private variable of process p. N is the total number of processes. The semantics of VL and SC are undefined if process p has not executed a LL instruction since p's most recent SC. [Footnote text: .... multi-word operations we consider admit simpler and more efficient implementations than those considered in [1, 3, 7, 11].] The correctness condition used in the proof presented in [8] is that of linearizability [4]. A linearizable implementation ensures that, in every run, the partial order over operations (footnote 2) can be extended to a total order that is consistent with the sequential semantics of the implemented ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....In particular, many machines provide either CAS or LL/SC, but not both. Furthermore, most hardware implementations of the LL/SC instructions do not fully implement the semantics expected and assumed by algorithm designers. As a result, several non-blocking algorithms developed recently (e.g., [2, 3, 4, 7, 10, 14]) are not directly applicable on current multiprocessors (footnote 1). The results presented here eliminate this gap by providing time- and space-efficient, wait-free implementations of the primitives required by such algorithms, using primitives that are commonly implemented in hardware. Specifically, we ....
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
....to arbitrary delays. On such machines, it is desirable for algorithms to be wait-free, meaning that each thread makes progress independent of the other threads executing on the machine. We present a wait-free algorithm to implement heaps. The algorithms are similar to the general approach given in [4], with optimizations that allow many threads to work on the heap simultaneously, while still guaranteeing a strong serializability property. 1. Introduction. We are interested in designing efficient data structures and algorithms for shared-memory multiprocessors. Processors on these machines may ....
G. Barnes. A method for implementing lock-free shared data structures. In Proceedings of the 1993 ACM Symposium on Parallel Algorithms and Architectures, pages 261--270, Velen, Germany, June 1993.
No context found.
G. Barnes. A method for implementing lock-free shared-data structures. In SPAA '93, pages 261--270, 1993.
No context found.
Greg Barnes. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 261-270, Velen, Germany, June 30-July 2, 1993. SIGACT and SIGARCH. Extended abstract.
No context found.
Greg Barnes. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 261--270. ACM Press, June 1993.
No context found.
G. Barnes. A Method for Implementing Lock-Free Shared Data Structures. ACM Symposium on Parallel Algorithms and Architectures, pp. 261--270, 1993.
No context found.
Greg Barnes. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 261-270, Velen, Germany, June 30-July 2, 1993. SIGACT and SIGARCH. Extended abstract.
No context found.
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", ACM Symposium on Parallel Algorithms and Architectures, pp. 261--270, June 1993.
No context found.
G. Barnes. A method for implementing lock-free shared data structures. In Proceedings of the 5th ACM Symposium on Parallel Algorithms and Architectures, pages 261--270, June 1993.
No context found.
G. Barnes. A Method for Implementing Lock-Free Shared Data Structures. ACM Symposium on Parallel Algorithms and Architectures, pp. 261--270, 1993.
No context found.
Greg Barnes. A method for implementing lock-free shared data structures. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 261--270. ACM Press, June 1993.
No context found.
G. Barnes, "A Method for Implementing Lock-Free Shared Data Structures", Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993, pp. 261-270.
No context found.
Greg Barnes. A Method for Implementing Lock-Free Shared Data Structures. In Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA '93), pages 261--270, 1993.