L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-conscious synchronization. ACM Trans. on Computer Systems, 15(1):3--40, Feb. 1997.
....requires lock acquisitions in the presence of conflicts. Improving the performance of software non-blocking schemes has been studied previously [27, 4, 38]. Software proposals have been made to make lock-based critical sections non-blocking [37] and thread scheduling that is aware of blocking locks [18, 28]. Database concurrency control. Transactions are well understood and studied in the database literature [10]. The use of timestamps for resolving conflicts and ordering transactions in database systems has been well studied [5, 32]. Optimistic concurrency control (OCC) was proposed as an alternative ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-conscious synchronization. ACM Transactions on Computer Systems, 15(1):3--40, 1997.
....variant is only livelock-free. However, the authors argue that starvation should be very unlikely in practice. 10 Other related algorithms. A number of researchers have proposed extensions to the three algorithms covered so far that support process priorities or that tolerate process preemptions [6, 21, 31, 38, 46, 55, 58, 59]. Priorities can be supported either by requiring the spin queue to be priority-ordered, or by requiring each process in its exit section to completely scan the queue to find the highest-priority waiting process. In the former case, a process must have the ability to scan the queue and insert its ....
L. Kontothanassis, R. Wisniewski, and M. Scott. Scheduler-conscious synchronization. ACM Transactions on Computer Systems, 15(1):3-40, February 1997.
....spinning was concluded not to be useful for barriers, and immediate blocking was preferred. 1 Interactions of the synchronization algorithm with the operating system kernel have also been proposed. For example, this can be used to ensure that a process holding a lock will not be preempted [8]. In the case of barriers, the main contribution of this work was the observation that if the number of processes exceeds the number of processors p, only the last p processes should spin. Thus the spinning time is selected dynamically by each process as it reaches the barrier, rather than being a ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott, "Scheduler-conscious synchronization". ACM Trans. Computer Systems 15(1), pp. 3-40, February 1997.
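The barrier observation in the excerpt above (when processes outnumber the p processors, only the last p arrivals should spin) can be illustrated with a small sketch. This is a hypothetical illustration, not code from the cited paper: the function name `should_spin`, the 1-based arrival numbering, and the choice to let every process spin when processes do not outnumber processors are all assumptions.

```python
def should_spin(arrival_rank: int, n_processes: int, n_processors: int) -> bool:
    """Decide whether the process arriving `arrival_rank`-th (1-based) spins.

    Assumption: when processes outnumber processors, only the last
    `n_processors` arrivals spin; earlier arrivals block immediately.
    """
    if not 1 <= arrival_rank <= n_processes:
        raise ValueError("arrival_rank must be in 1..n_processes")
    if n_processes <= n_processors:
        # Assumed case: each process can hold a processor, so spinning is allowed.
        return True
    # Only the final `n_processors` arrivals spin.
    return arrival_rank > n_processes - n_processors
```

With 8 processes on 4 processors, arrivals 1 through 4 would block and arrivals 5 through 8 would spin, so the decision is made dynamically at arrival time rather than fixed in advance.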
....program should be able to both resume preempted computation, should it lose a processor, and utilize a newly granted idle processor. The literature provides a wealth of solutions for attacking the performance bottlenecks that arise from the interference between multiprocessing and multiprogramming [1, 3, 6]. However, little effort has been spent on the transparent integration of these solutions and multithreading programming models in a generic, model-independent manner. Existing frameworks for efficient multiprogrammed execution either pose stringent requirements on the multithreading model, or depend ....
L. Kontothanassis, R. Wisniewski, and M. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 15(1):3-40, February 1997.
....philosophy of informing algorithms may be applicable in a wide range of algorithms and architectures. 1 Introduction A typical class of applications suffering severe performance degradation in the presence of multiprogramming are multithreaded applications with frequently synchronizing threads [6]. It has been shown that the problem of poor performance under multiprogramming originates [Footnotes: This work has been supported by the Hellenic General Secretariat of Research and Technology (G.S.R.T.), research program 99 E 566. This work has been carried out while the second author was with the ....]
[Footnotes: ....Research and Technology (G.S.R.T.), research program 99 E 566. This work has been carried out while the second author was with the High Performance Information Systems Laboratory, University of Patras, Greece.] ....mainly from the poor scheduling of synchronizing threads on physical processors [4] [6]. Idling time at synchronization points may constitute a significant fraction of execution time if one or more of the following conditions hold: a) a thread is preempted while holding a lock, b) a preempted lock waiter thread remains preempted after it has been granted the lock by the previous ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-conscious synchronization. ACM Transactions on Computer Systems, 15(1):3--40, February 1997.
....for thread-based execution. However, as a proprietary implementation, SGI's MPI design is not documented and its source code is not available to the public. Also, their design uses busy waiting when a process is waiting for events [Salo 1998], which is not desirable for multiprogrammed environments [Kontothanassis et al. 1997; Ousterhout 1982]. Lock-free studies in [Anderson 1990; Arora et al. 1998; Herlihy 1991; Lumetta and Culler 1998; Massalin and Pu 1991] restrict their queue models to be either FIFO or FILO. These models are not sufficient for MPI point-to-point communication, and sometimes too general with ....
....communication protocol, which is specifically designed for threaded MPI execution and will be presented in the next section. Our broadcasting queue management is based on previous lock-free FIFO queue studies [Herlihy 1991; Massalin and Pu 1991]. During event waiting, we adopt a spin-block strategy [Kontothanassis et al. 1997; Ousterhout 1982] when a thread needs to wait for certain events. 5. LOCK-FREE MANAGEMENT FOR POINT-TO-POINT COMMUNICATION Previous lock-free techniques [Arora et al. 1998; Herlihy 1991; Lumetta and Culler 1998; Massalin and Pu 1991] are normally designed for FIFO or FILO queues, which are too ....
KONTOTHANASSIS, L. I., WISNIEWSKI, R. W., AND SCOTT, M. L. 1997. Scheduler-conscious synchronization. ACM Trans. Comput. Syst. 15, 1 (Feb.), 3--40.
....as kernel-mediated IPC, and has advantages for asynchronous IPC, specialized protocols, and multiprocessors, then it is overall a win. However, we do not consider this sufficiently good since synchronous IPC on .... [Footnote 2: The race conditions could be resolved with an extended operating system interface [7]. For example, SymUnix [2] allows user processes to indicate when they are executing a critical section, and should therefore not be preempted.] [Code listing residue: void Send( Msg msg, Msg ans ) and void Receive( Msg msg ), with fragments while( enqueue( Q[srv] msg ), while( dequeue( Q[srv] msg ), sleep( 1 ) ....]
L.I. Kontothanassis, R.W. Wisniewski, and M.L. Scott. Scheduler Conscious Synchronization. Technical Report TR 550, Dept. Comp. Sci., University of Rochester, Rochester, NY, 1994.
....one of information and coordination (or lack thereof) [395]. The proposed solutions supply the required information in various ways. For example, it has been suggested that threads be able to request to be temporarily immune to preemption, to prevent cases where the preempted thread holds a lock [181, 265, 632, 320]. This is implemented by setting a special flag in user space just before acquiring a lock. Having the flag in user space avoids the overhead of a kernel call. The kernel checks this flag before preempting a thread, and if it is set, gives the thread some more time. However, no guarantees are ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott, "Scheduler-conscious synchronization". ACM Trans. Comput. Syst. 15(1), pp. 3--40, Feb 1997.
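The preemption-immunity flag described in the excerpt above (set in user space just before acquiring a lock, checked by the kernel before preempting) can be modelled with a toy simulation. This is a minimal sketch under stated assumptions, not an actual kernel interface: `ToyThread`, `ToyKernel`, `enter_critical`, and `leave_critical` are hypothetical names, and the policy of granting exactly one extension is an assumption, since the excerpt only says the thread gets "some more time" without guarantees.

```python
from dataclasses import dataclass


@dataclass
class ToyThread:
    no_preempt: bool = False  # flag the thread sets in "user space"
    grace_used: bool = False  # whether the toy kernel already extended its slice


class ToyKernel:
    def try_preempt(self, thread: ToyThread) -> bool:
        """Return True if the thread is preempted now, False if it gets extra time."""
        if thread.no_preempt and not thread.grace_used:
            # Assumed policy: honour the flag once, with no guarantee beyond that.
            thread.grace_used = True
            return False
        thread.grace_used = False
        return True


def enter_critical(thread: ToyThread) -> None:
    """Set the flag just before acquiring a lock; no kernel call is needed."""
    thread.no_preempt = True


def leave_critical(thread: ToyThread) -> None:
    """Clear the flag after releasing the lock."""
    thread.no_preempt = False
    thread.grace_used = False
```

In this model a thread inside a critical section survives the first preemption attempt and is preempted on the second, which mirrors the point that the flag is a request rather than a guarantee.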
....There is actually more overhead in spinning and blocking: 1) spinning may be useless if the caller thread that executes the wakeup event is not currently being scheduled; 2) blocking may suffer a cache refresh penalty due to context switch. The previous work on scheduler-conscious synchronization [4, 17] has considered using OS scheduling information to guide lock and barrier implementations. There is also work on OS scheduling to exploit cache affinity [30]. We combine these two ideas together and extend them for the MPI runtime system. The unique aspect of our situation is that scheduling ....
....OS changes, while our approach requires a very small change in the current commercial OS interface to expose the kernel scheduling decision, or no changes if we can use a user-level central monitor to control all parallel jobs. Our scheduler-conscious work is motivated by an earlier study in [17]; their work was focused on locks and barriers, while our strategy is built on a two-level thread scheme and considers cache affinity. The work on multi-threading for parallelizing compilers is studied in [20, 21, 28, 31], and their threads are targeted at compiler-generated fine-grained ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 1997.
....SGI design uses undocumented low-level functions and hardware support specific to the SGI architecture, which may not be general or suitable for other machines. Also, their design uses busy waiting when a process is waiting for events [31], which is not desirable for multiprogrammed environments [23, 28]. Lock-free studies in [5, 6, 20, 25, 26] either restrict their queue model to be FIFO or FILO, which are not sufficient for MPI point-to-point communication, or are too general with unnecessary overhead for MPI. A lock-free study for MPICH is conducted in a version for the NEC shared-memory ....
....measurement shows that plain memory access is 20 times faster than compare-and-swap and 17 times faster than read-modify-write. Our broadcasting queue management is based on previous lock-free FIFO queue studies [20, 26]. Finally, in our design and implementation, we adopt a spin-block strategy [23, 28] when a thread needs to wait for certain events. In the next section, we will discuss our point-to-point communication protocol, which is specifically designed for threaded MPI execution. 5 Lock-free Management for Point-to-point Communication Previous lock-free techniques [6, 20, 25, 26] are normally ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 1997.
....locks and barriers; 2) a non-blocking synchronization algorithm is used (Section 3.2); 3) the kernel guarantees that a process is not preempted while executing in a critical section [17]; 4) the lock holder polls the status of its peer processes and avoids passing the lock to a preempted process [10]. The latter approach is also called scheduler-conscious synchronization. Following the first approach, we embedded two multiprogramming-conscious mechanisms in the synchronization algorithms for locks and barriers, namely immediate blocking and competitive spinning. With immediate blocking, an ....
....point is at most twice the cost of a context switch [9]. This is the strategy that we employ in the implementation of competitive spinning in our experiments with spin locks. For barriers, we set the spinning interval to be proportional to the number of processes that have not yet arrived at the barrier [10]. Non-blocking synchronization algorithms with concurrent objects have an inherent immunity to the undesirable effects of multiprogramming. The reason is that preemption of a process that attempts to update a concurrent object does not prevent other processes from proceeding and updating the ....
L. Kontothanassis, R. Wisniewski and M. Scott, Scheduler Conscious Synchronization, ACM Trans. on Computer Systems, 15(1), 1997.
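The competitive spinning strategy mentioned in the excerpts above (spin for a bounded interval, then block) can be sketched as follows. This is a hedged illustration rather than the citing authors' implementation: the function name `spin_then_block` and the parameter `spin_budget_s` are assumptions, with `spin_budget_s` standing in for an interval on the order of a context switch; for barriers the excerpt suggests scaling it with the number of processes still to arrive.

```python
import threading
import time


def spin_then_block(event: threading.Event, spin_budget_s: float) -> str:
    """Wait for `event`: poll it for up to `spin_budget_s` seconds, then block.

    Returns "spun" if the event arrived while polling, "blocked" otherwise.
    """
    deadline = time.monotonic() + spin_budget_s
    while True:
        if event.is_set():
            return "spun"  # the wait ended during the spin phase
        if time.monotonic() >= deadline:
            break  # spin budget exhausted, fall through to blocking
    event.wait()  # give up the processor until the event is signalled
    return "blocked"
```

A usage sketch: a waiter whose event is already set returns "spun" without blocking, while a waiter with a zero budget and a late signal returns "blocked" once the signal arrives.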
....directly uses undocumented low-level functions and hardware support specific to the SGI architecture, which may not be general or suitable for other machines. Also, their design uses busy waiting when a process is waiting for events [21], which is not desirable for multiprogrammed environments [13, 18]. Lock-free studies in [4, 5, 12, 15, 16] either restrict their queue model to be FIFO or stack, which are not sufficient for MPI point-to-point communication, or are too general with unnecessary overhead for MPI. Thus our second goal is to design an efficient communication protocol for MPI threads by ....
....shows that plain memory access is 20 times faster than compare-and-swap and 17 times faster than read-modify-write. Our broadcasting queue management is based on the previous FIFO-based lock-free research [12, 16]. Finally, in our design and implementation, we adopt a spin-block strategy [13, 18] when a thread needs to wait for certain events. In the next section, we will discuss our point-to-point communication protocol, which is specifically designed for threaded MPI execution. 5 Lock-Free Management for Point-to-Point Communication Previous lock-free techniques [5, 12, 15, 16] are ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 1997.
....algorithm, the contention can also be overcome with a simple, lock-free centralised algorithm. As well as minimising the latency of barrier synchronisation, other researchers have addressed the effect of workload imbalances on the performance of barrier synchronisation. For example, [14, 20] develop algorithms that cooperate with the scheduler to ensure good performance in the presence of multiprogramming (i.e. more processes running on a machine than processors). The work described here ignores this phenomenon by just considering the scenario of the number of processes being less ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-conscious synchronization. ACM Transactions on Computer Systems, February 1997.
....locks and barriers; 2) a non-blocking synchronization algorithm is used (Section 3.2); 3) the kernel guarantees that a process is not preempted while executing in a critical section [14]; 4) the lock holder polls the status of its peer processes and avoids passing the lock to a preempted process [8]. The latter approach is also called scheduler-conscious synchronization. Following the first approach, we embedded two multiprogramming-conscious mechanisms in the synchronization algorithms for locks and barriers, namely immediate blocking and competitive spinning. With immediate blocking, an ....
....point is at most twice the cost of a context switch [7]. This is the strategy that we employ in the implementation of competitive spinning in our experiments with spin locks. For barriers, we set the spinning interval to be proportional to the number of processes that have not yet arrived at the barrier [8]. Non-blocking synchronization algorithms with concurrent objects have an inherent immunity to the undesirable effects of multiprogramming. The reason is that preemption of a process that attempts to update a concurrent object does not prevent other processes from proceeding and updating the ....
L. Kontothanassis, R. Wisniewski and M. Scott, Scheduler Conscious Synchronization, ACM Trans. on Computer Systems, 15(1), 1997.
....SGI design uses undocumented low-level functions and hardware support specific to the SGI architecture, which may not be general or suitable for other machines. Also, their design uses busy waiting when a process is waiting for events [27], which is not desirable for multiprogrammed environments [19, 24]. Lock-free studies in [4, 5, 18, 21, 22] either restrict their queue model to be FIFO or FILO, which are not sufficient for MPI point-to-point communication, or are too general with unnecessary overhead for MPI. A lock-free study for MPICH is conducted in a version for the NEC shared-memory ....
....measurement shows that plain memory access is 20 times faster than compare-and-swap and 17 times faster than read-modify-write. Our broadcasting queue management is based on previous lock-free FIFO queue studies [18, 22]. Finally, in our design and implementation, we adopt a spin-block strategy [19, 24] when a thread needs to wait for certain events. In the next section, we will discuss our point-to-point communication protocol, which is specifically designed for threaded MPI execution. 5 Lock-free Management for Point-to-point Communication Previous lock-free techniques [5, 18, 21, 22] are ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 1997.
....of communicating kernel scheduler interventions to the user-level scheduler through signals or software interrupts [1, 21]. Inopportune process preemptions from the operating system were also addressed in the context of synchronization, both with scheduler-oblivious [8] and scheduler-conscious [10] techniques. More recently, the kernel-user communication path for dynamic application adaptability was realized through a shared arena in the IRIX operating system [2]. The shared arena is a pinned region of memory shared between the kernel and user programs, which allows communication between ....
L. Kontothanassis, R. Wisniewski and M. Scott, Scheduler Conscious Synchronization, ACM Transactions on Computer Systems, Vol. 15(1), pp. 3--40, February 1997.
....may be too costly. Our work includes basic mechanisms that allow parts of operating systems to be more conscious of what other parts are doing. For instance, a synchronization mechanism may be chosen on the basis of information about what the scheduler is going to do to the processes involved [Kontothanassis, Wisniewski, and Scott 1994, 1995]. Performance monitoring tools are needed to analyze how time is spent among processes in a complex parallel application [Wisniewski and Stevens 1995]. Last, actual artificial intelligence applications such as parallel planning in a real-time environment have allowed us to learn about ....
Kontothanassis, L.I., R.W. Wisniewski, and M.L. Scott, "Scheduler-conscious synchronization," TR 550, Computer Science Dept., U. Rochester, December 1994.
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-conscious synchronization. ACM Trans. on Computer Systems, 15(1):3--40, Feb. 1997.
....in a system with one thread per processor, but it fails on multiprogrammed systems, in which a neighbor thread may be preempted, and thus unable to cooperate. The problem of preemption in critical sections has received considerable attention over the years. Alternative strategies include avoidance [6, 10, 14], recovery [3, 4], and tolerance [9, 16]. The latter approach is appealing for commercial applications because it does not require modification of the kernel interface: if a thread waits too long for a lock, it assumes that the lock holder has been preempted. It abandons its attempt, yields the ....
....synchronization on large multiprogrammed systems. In future work we hope to evaluate our algorithms in the context of commercial OLTP codes. We also plan to develop variants that block in the scheduler on timeout [9, 16], cooperate with the scheduler to avoid preemption while in a critical section [6, 10], or adapt dynamically between test-and-set and queue-based locking in response to observed contention [11]. In a related vein, we are developing a tool to help verify the correctness of locking algorithms by transforming source code automatically into input for a model checker. ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 15(1):3--40, February 1997.
....a lock for logging every event would add significant overhead to the event logging. More important, however, are the consequences of a process getting preempted while holding the lock. Although such an occurrence may be infrequent, it can not only cause extremely bad average-case performance [5] but can also lead to large and unexpected delays, making it unacceptable for real-time applications. Although Wisniewski et al. [11] have proposed solutions for handling multiprogrammed synchronization, and Craig [2] and Takada and Sakamura [7] have proposed similar real-time solutions, they are too ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler conscious synchronization. Technical Report 550, Department of Computer Science, University of Rochester, Rochester, NY, Dec 1994.
....technique to deal with preemption not only of the process holding a lock, but of processes waiting in the lock's queue as well. Preempting and scheduling processes in an order inconsistent with their order in the lock's queue can degrade performance dramatically. Kontothanassis et al. [12, 27, 28] present scheduler-conscious versions of the ticket lock, the MCS lock [15], and Krieger et al.'s reader-writer lock [13]. These algorithms detect the descheduling of critical processes using handshaking and/or a widened kernel-user interface. The proposals of Black and of Anderson et al. require ....
....we recommend (1) that hardware always include a universal atomic primitive, and (2) that kernel interfaces provide a mechanism for preemption-safe locking. For small-scale machines, the Symunix interface appears to work well [7]. For larger machines, a more elaborate interface may be appropriate [12]. [Figure residue: panel title "Quicksort queue"; x-axis "Processors" (1 to 11); series: ordinary lock, preemption-safe lock, MS non-blocking] ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. TR 550, Computer Science Department, University of Rochester, December 1994. Submitted for publication.
....technique to deal with preemption not only of the process holding a lock, but of processes waiting in the lock's queue as well. Preempting and scheduling processes in an order inconsistent with their order in the lock's queue can degrade performance dramatically. Kontothanassis et al. [18] present preemption-safe (or scheduler-conscious) versions of the ticket lock, the MCS lock [25], and Krieger et al.'s reader-writer lock [19]. These algorithms detect the descheduling of critical processes using handshaking and/or a widened kernel-user interface, and use this information to ....
....we recommend (1) that hardware always include a universal atomic primitive, and (2) that kernel interfaces provide a mechanism for preemption-safe locking. For small-scale machines, the Symunix interface [8] appears to work well. For larger machines, a more elaborate interface may be appropriate [18]. We have presented a concurrent queue algorithm that is simple, non-blocking, practical, and fast. It appears to be the algorithm of choice for any queue-based application on a multiprocessor with a universal atomic primitive. Also, we have presented a two-lock queue algorithm. Because it is ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. In ACM Transactions on Computer Systems, February 1997.
....technique to deal with preemption not only of the process holding a lock, but of processes waiting in the lock's queue as well. Preempting and scheduling processes in an order inconsistent with their order in the lock's queue can degrade performance dramatically. Kontothanassis et al. [17] present preemption-safe (or scheduler-conscious) versions of the ticket lock, the MCS lock [24], and Krieger et al.'s reader-writer lock [18]. These algorithms detect the descheduling of critical processes using handshaking and/or a widened kernel-user interface, and use this information to ....
....we recommend (1) that hardware always include a universal atomic primitive, and (2) that kernel interfaces provide a mechanism for preemption-safe locking. For small-scale machines, the Symunix interface [9] appears to work well. For larger machines, a more elaborate interface may be appropriate [17]. We have presented a concurrent queue algorithm that is simple, non-blocking, practical, and fast. It appears to be the algorithm of choice for any queue-based application on a multiprocessor with a universal atomic primitive. Also, we have presented a two-lock queue algorithm. Because it is ....
L. I. Kontothanassis, R. W. Wisniewski, and M. L. Scott. Scheduler-Conscious Synchronization. ACM Transactions on Computer Systems, 15(1), February 1997.
....exclusion, reader-writer locks, and barriers in turn. Space constraints preclude the inclusion of actual code. Readers are encouraged to access pseudocode at http: www.cs.rochester.edu u scott synch pseudocode ps and sc.html. Earlier versions appear in the technical report version of this paper [16]. Both pseudocode and C source code are available from ftp: ftp.cs.rochester.edu pub packages sched conscious synch. All of our algorithms work well in a dynamic hardware-partitioned environment, an environment widely believed to provide the best combination of throughput and response time for ....
KONTOTHANASSIS, L. I., WISNIEWSKI, R. W., and SCOTT, M. L. Scheduler-Conscious Synchronization. TR 550, Computer Science Department, University of Rochester, December 1994. Available as ftp://ftp.cs.rochester.edu/pub/papers/systems/94.tr550.Scheduler conscious synchronization.ps.gz.
L. Kontothanassis, R. Wisniewski, and M. Scott. Scheduler-conscious synchronization. ACM Transactions on Computer Systems, 15(1):3-40, February 1997.