40 citations found.
John Zahorjan, Edward D. Lazowska, and Derek L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.

This paper is cited in the following contexts:

Scheduling and Resource Management Techniques for Multiprocessors - Black (1990)   (25 citations)

....system concurrent applications requires Strong Discouragement support, and that Handoff Scheduling is a powerful optimization if the information it requires is available. These are worst case results, but are indicative of the relative performance of the hints. Results reported by Zahorjan, et al.[59, 60] confirm the need for mechanisms to improve synchronization with non running threads. This research investigated the interaction of multiprocessor locks with scheduling disciplines for applications with more threads than processors. An important result is that descheduling a thread that holds a ....

John Zahorjan, Edward D. Lazowska, and Derek L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Processors. Technical Report 89-07-03, Department of Computer Science and Engineering, University of Washington, Seattle, WA, 1989.


An Analysis of Software Interface Issues for SMT Processors - Redstone (2002)   (1 citation)

....testing if the event has occurred. On both a superscalar and multiprocessors, spinning can waste processor resources due to the opportunity cost of not context switching to another thread. Researchers have measured and developed techniques addressing the costs of spinning on these processors [37, 4, 40, 42, 5, 50, 10, 36, 90]; we do not investigate them here. Spinning can exact a larger performance cost on SMT, because all threads share pipeline resources. We term pipeline resources as all resources shared between contexts that are necessary to execute instructions, that is, all shared resources except the caches ....

....while a processor holds the lock; only one occurs when a processor writes the flag of another processor. In addition to increased bus contention, spinning also wastes resources due to the opportunity cost of not context switching to a thread that can perform useful work. A few papers such as [40, 42, 90, 89] investigate how to best choose when to context switch to another thread and how to schedule threads to reduce spinning cost. These techniques can work synergistically with the techniques to remove spinning evaluated in this chapter. Several studies examine inter thread interactions on ....

ZAHORJAN, J., LAZOWSKA, E. D., AND EAGER, D. L. The effect of scheduling discipline on spin overhead in shared memory parallel systems. IEEE Transactions on Parallel and Distributed Systems 2, 2 (April 1991).


Processor Management Policies for Multiprocessors - Yu (1994)

....jobs occupy the system for an extended period of time. This monopoly of the system results in large queueing delay for small jobs. These problems can be alleviated by multiprogramming. Multiprogramming in a shared memory multiprocessor environment has been studied by several researchers [31] [39]. All these studies mainly focus on the assignment of free processors to processes without considering the communication requirements. This is a valid approach for small shared memory systems since these machines are normally characterized by constant path length. However, arbitrary selection of ....

....The second cost is due to the preemption of a lock holding process. While the lock holding process is in the ready queue after being preempted, other processes of the same job may start spinning. This leads to process thrashing and once it happens, high lock contention may continue for a long time [39]. As much as 60% of processor time can be wasted due to process thrashing [37]. Coscheduling [31] and gang scheduling [37] have been suggested to alleviate this problem where the overhead drops considerably. Cm*, NYU Ultracomputer, BBN TC2000 and Silicon Graphics's IRIX employ variations of the ....

[Article contains additional citation context not shown here]

J.Zahorjan, E.D.Lazowska and D.L.Eager, "The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems," IEEE Trans. Parallel and Distributed Systems, Vol.2 pp.180-198, Apr.1991.


Scheduling Multiprocessor Tasks - an Overview - Drozdowski (1996)   (21 citations)

....or the communication system. Coscheduling (called also gang scheduling) is a scheduling policy proposed to avoid these difficulties [50]. Coscheduling consists in granting simultaneously (in the same time quantum) the processors to the threads of the same application. It has been demonstrated in [49, 84] that coscheduling performs well in a wide range of conditions and for various models of parallel applications. Thus, coscheduling is postulated in parallel systems. Note, that coscheduled applications are multiprocessor tasks because more than one processor is used simultaneously. Parallel ....

....to imagine a scheduler able to handle, analyze and optimize so big structures. Furthermore, threads of the application are indistinguishable for the scheduler. Hence, without additional information from the application it is not able to give a priority to important threads (e.g. holding a lock) [84]. Note, that the DAG is highly data dependent and can be precisely known after the execution rather than before. From the above we conclude that it is reasonable for the operating system to control only the number of processors granted to a parallel application and leave the control of threads to ....

J.Zahorjan, E.D.Lazowska, D.L.Eager, "The effect of scheduling discipline on spin overhead in shared memory parallel systems", IEEE Transactions on Parallel and Distributed Systems 2/2 (1991) 180-198.


Application Development using Compositional Performance Analysis - Rifkin (1999)

.... (for example, see Mao et al.'s heuristics for on line algorithms for the single machine scheduling problem [MKR95]) or simulation methods (for example, see Zahorjan et al.'s stochastic modeling of the effect of scheduling disciplines on spin overhead in shared memory parallel systems [ZLE91]). To provide solutions to real world scheduling problems, restrictions on the parallel program and the machine representations can be relaxed. For example, since the problem of choosing the data partitioning and distribution to achieve the optimal performance is NP complete, we are more ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


Job Scheduling in Multiprogrammed Parallel Systems - Feitelson (1997)   (16 citations)

....for simultaneous execution, e.g. through gang scheduling. For example, users may expect that threads that are spawned together actually execute in parallel, and therefore busy waiting can be used to synchronize them. If gang scheduling is not used, this can have grave performance implications [198, 631]. Moreover, if the scheduling is not preemptive at all, it could lead to deadlock [84]. A related issue is the granularity of the interactions. In general, granularity refers to amount of computation performed between successive interactions [336]. Thus in fine grain interactions, interactions ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager, "The effect of scheduling discipline on spin overhead in shared memory parallel systems". IEEE Trans. Parallel & Distributed Syst. 2(2), pp. 180--198, Apr 1991.


Affinity Scheduling of Unbalanced Workloads - Saskatoon (1993)

....policies employing coordinated processor reallocation, processor reallocation is performed in concert with the application, while in the uncoordinated reallocation, the operating system may remove a processor without interaction with the affected application. Previous work using modeling [66] [67] has shown that uncoordinated preemption can lead to very poor performance. First, many parallel applications use synchronization primitives that require spin waiting on a variable (remaining in a tight loop in which the variable's value is read and tested) until it is set. If the thread that ....

J. Zahorjan, E. D. Lazowska, D. L. Eager, "The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems", IEEE Transactions on Parallel and Distributed Systems, Vol. 2, No. 2 (April 1991), pp. 180-198.


Thread Prioritization: A Thread Scheduling Mechanism for.. - Fiske, Dally (1995)   (3 citations)

....that the flag on which each thread spins while waiting in the queue is locally allocated and generates no global traffic while spinning. Scheduling threads waiting for a lock raises a number of issues. It is clear that once a lock is acquired the thread owning the lock should not be swapped out [26, 17]. This is because all other threads waiting for the lock will be unable to make progress until the lock is released, and so performance can be seriously degraded. In the case of the queue lock there is an additional factor to be considered: the order in which threads are going to acquire the lock ....

John Zahorjan, Edward D. Lazowska, and Derek L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


Implicit Coscheduling: Coordinated Scheduling with Implicit.. - Arpaci-Dusseau (1998)   (5 citations)

....cannot be compared to that of spin waiting with coordinated scheduling. As we discuss in Chapter 4, the key to good performance with local scheduling and two phase waiting is to achieve coordinated scheduling. With coordinated scheduling, the waiting algorithm will react to smaller waiting times [176, 177]. With smaller waiting times, spinning for some amount of time before blocking improves performance relative to blocking immediately. 3.3.2 Explicit Coscheduling To improve the performance of fine grain parallel applications, coscheduling ensures that communicating processes are scheduled ....

John Zahorjan, Edward D. Lazowska, and Derek L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


Concurrent Update on Multiprogrammed Shared Memory.. - Michael, Scott (1996)   (2 citations)

....by mutual exclusion locks. In order to achieve acceptable response time and high utilization, most multiprocessors are multiprogrammed by time slicing processors among processes. The performance of mutual exclusion locks in parallel applications degrades on time slicing multiprogrammed systems [29] due to preemption of processes holding locks. Any other processes busy waiting on the lock are then unable to perform useful work until the preempted process is rescheduled and subsequently releases the lock. Alternative multiprogramming schemes to time slicing have been proposed in order to ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


Static and Dynamic Processor Scheduling Disciplines in.. - Menascé, Saha (1995)   (1 citation)

....shared among multiple tasks. In this case, there are three factors that can adversely affect performance of multiprogrammed parallel systems, namely: overhead of context switching between multiple tasks, synchronization primitives that require spin waiting on a variable, and low cache hit ratio [19, 22, 27, 47, 53]. In addition to being static or dynamic, scheduling disciplines for multiprogramming can be either preemptive or non preemptive. The utility of preemption was observed in situations when a parallel workload exhibits a high variability in processor demands and low variability in number of tasks ....

Zahorjan, J., Lazowska, E., and Eager, D. The effect of scheduling discipline on spin overhead in shared memory multiprocessors. Technical Report 89-07-03, Department of Computer Science, University of Washington, July 1989.


Coscheduling Based on Run-Time Identification of Activity.. - Feitelson, Rudolph   (26 citations)

.... its use can be extended to large distributed memory machines as well [26, 24]. A number of papers have evaluated the performance implications of coscheduling and gang scheduling, and compared them with dynamic partitioning and other scheduling policies for multiprogrammed multiprocessors [10, 28, 22, 16, 35]. The Mach scheduler also allows processors to be allocated to jobs dynamically, but performs the scheduling of activities on these processors in the kernel rather than leaving it to the application [1]. Like the other dynamic partitioning schemes, it does not require a one to one relation between ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager, "The effect of scheduling discipline on spin overhead in shared memory parallel systems". IEEE Trans. Parallel & Distributed Syst. 2(2), pp. 180--198, Apr 1991.


Adaptive Operating System Abstractions: A Case Study of.. - Mukherjee, Schwan (1994)

....of various kernel components [2, 3, 4, 27] and applications on synchronization. For example, Zahorjan, Lazowska and Eager [61] first examine the extent to which multiprogramming and data dependencies in an application complicate user level decisions to spin or block, and they also evaluate in [62] how the overhead of spinning is affected by various scheduling policies. The structure and design of configurable locks borrows from our own past work [23] and from notions commonly used in object oriented operating systems like Choices [14, 13, 46] and Renaissance [48]. Some degree of lock ....

....not disjoint in functionality and/or performance. For example, Ousterhout [42] demonstrates that co scheduling can considerably decrease interthread communication/synchronization overheads. Similarly, lock configurations and threads' cache affinities significantly impact thread scheduling overheads [62] and policies, which in turn, can affect the overheads of thread exception handling [39]. Moreover, in NUMA multiprocessors, memory management policy impacts on IPC overheads and scheduling policy. In response to these characteristics, the configurable micro kernel (the C Kernel) being ....

Zahorjan, J., Lazowska, E., and Eager, D. The effect of scheduling discipline on spin overhead in shared memory parallel systems. IEEE Transactions on Parallel and Distributed Systems 2, 2 (April 1991), 180--98.


Scheduler-Conscious Synchronization - Kontothanassis, Wisniewski, Scott (1994)   (19 citations)

....implementation cost. or to minimize contention. The algorithm's performance is then vulnerable not only to preemption of the process in the critical section, but also to preemption of processes near the head of the waiting list: the algorithm may give the lock to a process that is not running [48]. Similarly, a barrier algorithm may keep processes in a tree, in order to replace O(n) serialized operations on a counter with O(log n) operations on the longest path in the tree. But then processes must execute their portions of the barrier algorithm in the order imposed by the tree. If the ....

....factor of optimal worst case performance. the length of critical sections, rather than to cope with preemption. Competitive spinning works best when the behavior of a lock does not change rapidly with time, so that past behavior is an appropriate indicator of future behavior. Zahorjan et al. [46, 48] present a formal model of spin wait times. For lock based applications in which all processes on a given processor belong to the same application, they show that performance problems can be avoided if the operating system simply partitions processes among processors and allows the application to ....

[Article contains additional citation context not shown here]

J. Zahorjan, E. D. Lazowska, and D. L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


Non-Blocking Algorithms and Preemption-Safe Locking on.. - Michael, Scott (1998)   (17 citations)

....exclusion locks. In order to achieve acceptable response time and high utilization, most multiprocessors are multiprogrammed by time slicing processors among processes. The performance of mutual exclusion locks in parallel applications degrades significantly on time slicing multiprogrammed systems [44] due to the preemption of processes holding locks. Any other processes busy waiting on the lock are then unable to perform useful work until the preempted process is rescheduled and subsequently releases the lock. Alternative multiprogramming schemes to time slicing have been proposed in order to ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


A Survey of Multiprocessor Operating System Kernels - Mukherjee, Schwan, Gopinath (1993)   (1 citation)

....methods used. However, for applications that have sizable working sets that fit into the cache, process control performs better than gang scheduling. For the applications considered, the performance gains due to hand off scheduling and processor affinity are shown to be small. In [266] the authors study the effects of two environmental factors, multiprogramming and data dependent execution times, on spinning overhead of parallel applications, and how the choice of scheduling discipline can be used to reduce the amount of spinning in each case. Specifically, they conclude that ....

....effects of other kernel components [8, 9, 10, 99] as well as applications on synchronization. For example, Zahorjan, Lazowska and Eager [265] first examine the extent to which multiprogramming and data dependencies in an application complicate a user's decision to spin or block, then evaluate [266] how the overhead of spinning is affected by various scheduling policies. Read Write Locks: A read write lock allows either multiple readers or a single writer to enter a critical section at the same time. The waiting processes may either spin or block depending on whether the lock is implemented ....

J. Zahorjan, E. Lazowska, and D. Eager. The effect of scheduling discipline on spin overhead in shared memory parallel systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--98, April 1991.


Multi-Application Support in a Parallel Program Performance Tool - Bruce Irvin (1993)   (3 citations)

....are scheduled at the same time, and a process that is busy waiting will use its entire time quantum before releasing its processor. Therefore, other processes remain blocked until the end of the time quantum before running. The problems with this type of always spin barrier are well understood [16], and several solutions have been proposed to fix them. One solution is to use barriers that block after only a small amount of spinning [3], and the other is to co schedule the processes of each application [11]. We implemented the latter alternative and the results are summarized in the fourth ....

....[Figure 8: Multi Application Metric Table.] The results show that when all of the processes of an application are scheduled together, the cost of always spin barriers is reduced and the elapsed time of each application is reduced substantially. The table's data also confirm the prediction [16] that waiting time at spin locks is not significantly affected by competing processes. Our co scheduler is a simple server that allows processes to register themselves with an application identifier. The server then uses UNIX signals to schedule all processes with identical application ....

J. Zahorjan, E. D. Lazowska and D. L. Eager, The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Processors, University of Washington Technical Report 89-07-03, July 1989.


A Cooperative Approach to Two-Phase Waiting - Arpaci-Dusseau, Culler

....machines. When a message arrives at a workstation, the message cannot be handled until the destination process is scheduled. Since the sending process may have to wait a long time for the destination to be scheduled, local scheduling can perform orders of magnitude worse than explicit coscheduling [2, 10, 14, 24, 31, 32]. In the presence of long completion times, the optimal behavior of local scheduling is to block immediately. Simulations have shown that two phase waiting with a spin time equal to the local context switch cost performs two times worse than blocking immediately, since very few of the communication ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--


Processor Allocation Policies for Message-Passing Parallel.. - Mccann (1994)   (49 citations)

....rather than a per thread one. This highlights the need to view scheduling as a two level procedure, separating the kernel level processor allocation decisions from the application level decisions of which user level threads to execute on each processor. Work by Ousterhout [32], Zahorjan et al. [47, 48] and Gupta et al. [20] shows that single level schedulers, where a kernel schedules all threads of all applications (usually from a single queue of ready threads) provide poor performance to individual jobs. These schedulers also provide unfair service when applications contain different numbers ....

....that, as in uniprocessor systems, average response time can be improved by allocating resources preferentially to smaller jobs, and that in the absence of a priori job characterizations, allocating an equal fraction of total processing power to each job is an effective heuristic. Zahorjan et al. [47, 48] use modelling to examine the effect of scheduling discipline on spinning overhead and show that preemption of processors must be done in a coordinated manner. If a scheduling policy preempts a processor from a running job, and that processor was executing a thread that held a spin lock, then ....

J. Zahorjan, E. Lazowska, and D. Eager. The effects of scheduling discipline on spin overhead in shared memory parallel systems. IEEE Transactions on Parallel and Distributed Systems, (2)2:180--189, April 1991.


Algorithms for Scalable Synchronization on Shared-Memory.. - Mellor-Crummey, Scott (1991)   (25 citations)

....with exponential backoff will allow latecomers to acquire the lock when processes that arrived earlier are not running. In this situation the test and set lock may be preferred to the FIFO alternatives. Additional mechanisms can ensure that a process is not preempted while actually holding a lock [47, 52]. All of the spin lock algorithms we have considered require some sort of fetch and Φ instructions. The test and set lock of course requires test and set. The ticket lock requires fetch and increment. The MCS lock requires fetch and store, and benefits from compare and swap. Anderson's ....

J. Zahorjan, E. D. Lazowska, and D. L. Eager. The effect of scheduling discipline on spin overhead in shared memory parallel processors. Technical Report TR-89-07-03, Computer Science Department, University of Washington, July 1989.


Scheduler Activations: Effective Kernel Support for.. - Anderson, Bershad.. (1992)   (281 citations)  Self-citation (Lazowska)

....are running on top of kernel threads, however, time slicing can lead to problems. For example, a kernel thread could be preempted while its user level thread is holding a spin lock; any user level threads accessing the lock will then spin wait until the lock holder is rescheduled. Zahorjan et al. [28] have shown that time slicing in the presence of spin locks can result in poor performance. As another example, a kernel thread running a user level thread could be preempted to allow another kernel thread to run that happens to be idling in its user level scheduler. Or a kernel thread running a ....

....yet addressed is that a user level thread could be executing in a critical section at the instant when it is blocked or preempted. There are two possible ill effects: poor performance (e.g. because other threads continue to test an application level spin lock held by the preempted thread) [28], and deadlock (e.g. the preempted thread could be holding the ready list lock; if so, deadlock would occur if the upcall attempted to place the preempted thread onto the ready list). Problems can occur even when critical sections are not protected by a lock. For example, FastThreads uses ....

ZAHORJAN, J., LAZOWSKA, E., AND EAGER, D. The effect of scheduling discipline on spin overhead in shared memory multiprocessors. IEEE Trans. Parallel Distrib. Syst. 2, 2 (Apr. 1991), 180--198.


Towards Scalable Multiprocessor Virtual Machines - Uhlig, LeVasseur, Skoglund.. (2004)   (1 citation)

No context found.

John Zahorjan, Edward D. Lazowska, and Derek L. Eager. The effect of scheduling discipline on spin overhead in shared memory parallel systems. IEEE Transactions on Parallel and Distributed Systems, 2(2):180--198, April 1991.


Scheduler Activations: Effective Kernel Support for.. - Anderson, Bershad.. (1992)   (281 citations)

No context found.

Zahorjan, J., Lazowska, E., and Eager, D. The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 2(2):180-198, April 1991.


A Scalable Multi-Discipline, Multiple-Processor Scheduling.. - James Barton Nawaf (1995)   (11 citations)

No context found.

J. Zahorjan et al., "The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Multiprocessors", IEEE Transactions on Distributed Systems, April 1991.

CiteSeer.IST - Copyright Penn State and NEC