33 citations found.
Feitelson D, Rudolph L. Coscheduling based on run-time identification of activity working sets. International Journal of Parallel Programming 1995; 23(2):135--160.


This paper is cited in the following contexts:


Thread Scheduling for Multiprogrammed Multiprocessors - Arora, Blumofe, Plaxton (2001)   (41 citations)  (Correct)

....on P dedicated processors. Such scheduling algorithms dynamically map threads onto the processors with the goal of achieving P-fold speedup. Though such algorithms will work in some multiprogrammed environments, in particular those that employ static space partitioning [15, 30] or coscheduling [18, 30, 33], they do not work in the multiprogrammed environments being supported by modern shared memory multiprocessors and operating systems [9, 15, 17, 23]. The problem lies in the assumption that a fixed collection of processors are fully available to perform a given computation. This research is ....

....bounds on time and space. The practical application and possible adaptation of this idea to multiprogrammed environments is an open question. Prior work that has considered multiprogrammed environments has focused on the kernel level scheduler. With coscheduling (also called gang scheduling) [18, 33], all of the processes belonging to a computation are scheduled simultaneously, thereby giving the computation the illusion of running on a dedicated machine. Interestingly, it has recently been shown that in networks of workstations coscheduling can be achieved with little or no modification to ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on runtime identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


SWAP: A Scheduler With Automatic Process Dependency Detection - Zheng, Nieh (2004)   (1 citation)  (Correct)

....to resolve priority inversion in general and do not resolve priority inversions that arise due to process dependencies that are not explicitly identified in advance. Co-scheduling mechanisms have been developed to improve the performance of parallel applications in parallel computing environments [14, 15, 16, 17]. These mechanisms try to schedule cooperating threads or processes belonging to the same parallel application to run concurrently. This reduces busy waiting and context switching overhead and improves the degree of parallelism that can be used by the application. Because many of these ....

D. G. Feitelson and L. Rudolph, "Coscheduling Based on Run-Time Identification of Activity Working Sets," International Journal of Parallel Programming, vol. 23, pp. 136--160, April 1995.


Dynamic Coscheduling on Workstation Clusters - Sobalvarro, Pakin, Weihl, Chien (1998)   (40 citations)  (Correct)

....consist of thread collections, explicitly indicating which threads should be coscheduled. A variety of implicit schemes which do not require explicit programmer annotation have been explored. On distributed memory systems, the need for coscheduling has typically been associated with communication [17, 6, 3, 16]. Feitelson's Runtime Activity Working Set Identification (RAWSI) monitors the communication between processes or threads to determine their rate of communication. Working sets of processes (which require coscheduling) are identified based on their rate of communication. RAWSI collects the ....

....Mechanisms for Achieving Coscheduling Once thread clusters have been identified, a mechanism for coscheduling must be used. In many systems, particularly those with shared memory, a gang scheduler which has the capability to achieve coordinated context switches across processors has been assumed [8, 12, 4, 6]. Such systems replace the basic process scheduler in the operating system, and schedule the related threads across the processing nodes. These schedulers can achieve high system efficiency on regular parallel applications, but have difficulty in selecting alternate jobs to run when processes ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on run-time identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


Using Runtime Measured Workload Characteristics in.. - Nguyen, Vaswani.. (1996)   (31 citations)  (Correct)

....and (2) McCann et al.'s scheduler attempts to reallocate processors at a much finer grain than does ours. Thus, the effectiveness of their scheduler is dependent on the existence of application idle periods that are long relative to processor reallocation overheads. Feitelson and Rudolph [8] take a similar approach to ours, proposing to dynamically gather information about communicating sets of processes in an attempt to relax the constraints of co-scheduling. Sobalvarro and Weihl [25] also propose several ways to use runtime identification of sets of communicating processes to relax ....

D. G. Feitelson and L. Rudolph. Coscheduling Based on Runtime Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):135--160, Apr. 1995.


Modeling Communication Locality in Multiprocessors - Salisbury, Chen, Melhem (1999)   (2 citations)  (Correct)

....Interprocessor communication is required to coordinate tasks executing on different processors. The exploitation of locality through assignment of tasks to processors and the impact of data distribution on communication for cache coherence have been examined in numerous studies, including [7, 18, 20]. Communication in parallel programs often involves simultaneous, synchronized communication between processors and repetitive communication patterns arising from looping programs. Different loops may incorporate common algorithms for processor coordination, and hence have similar communication ....

D. Feitelson and L. Rudolph. Coscheduling based on runtime identification of activity working sets. International Journal of Parallel Programming, 23(2):135--159, April 1995.


Demand-based Coscheduling of Parallel Jobs on Multiprogrammed.. - Sobalvarro (1997)   (63 citations)  (Correct)

....speedup. If process control as it is described in [24] were used as the only means of timesharing a multiprocessor, we would expect that such applications would show poor performance when the job load was high. 2.4 Runtime Activity Working Set Identification. Feitelson and Rudolph describe in [9] an algorithm called runtime activity working set identification for scheduling parallel programs on a timeshared multiprocessor (we shall call this algorithm RAWSI, for brevity's sake). While demand-based coscheduling was developed independently from RAWSI, the two have significant ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on run-time identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


Dynamic Coscheduling on Workstation Clusters - Sobalvarro (1998)   (40 citations)  (Correct)

....consist of thread collections, explicitly indicating which threads should be coscheduled. A variety of implicit schemes which do not require explicit programmer annotation have been explored. On distributed memory systems, the need for coscheduling has typically been associated with communication [17, 6, 3, 16]. Feitelson's Runtime Activity Working Set Identification (RAWSI) monitors the communication between processes or threads to determine their rate of communication. Working sets of processes (which require coscheduling) are identified based on their rate of communication. RAWSI collects the ....

....Mechanisms for Achieving Coscheduling Once thread clusters have been identified, a mechanism for coscheduling must be used. In many systems, particularly those with shared memory, a gang scheduler which has the capability to achieve coordinated context switches across processors has been assumed [8, 12, 4, 6]. Such systems replace the basic process scheduler in the operating system, and schedule the related threads across the processing nodes. These schedulers can achieve high system efficiency on regular parallel applications, but have difficulty in selecting alternate jobs to run when processes block, ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on run-time identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


Improving Parallel Job Scheduling Using Runtime Measurements - daSilva, Scherson   (Correct)

....final remarks. 2 Previous Work In [1] Arpaci-Dusseau, Culler and Mainwaring use information available at run time (in this case the number of incoming messages) to decide if a task should continue to spin or block in the pairwise cost-benefit analysis in the implicit coscheduling algorithm. In [14], Feitelson and Rudolph used runtime information to identify activity working sets, i.e. the set of activities (tasks) that should be scheduled together, through the monitoring of the utilization pattern of communication objects by the activities. Their work can be considered complementary to ours ....

D. Feitelson and L. Rudolph. Coscheduling Based on Runtime Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):135--160, 1995.


Using Runtime Measured Workload Characteristics in Parallel.. - Thu Nguyen Raj (1996)   (31 citations)  (Correct)

....and (2) McCann et al.'s scheduler attempts to reallocate processors at a much finer grain than does ours. Thus, the effectiveness of their scheduler is dependent on the existence of application idle periods that are long relative to processor reallocation overheads. Feitelson and Rudolph [8] take a similar approach to ours, proposing to dynamically gather information about communicating sets of processes in an attempt to relax the constraints of co-scheduling. Sobalvarro and Weihl [25] also propose several ways to use runtime identification of sets of communicating processes to relax ....

D. G. Feitelson and L. Rudolph. Coscheduling Based on Runtime Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):135--160, Apr. 1995.


Implicit Coscheduling: Coordinated Scheduling with Implicit.. - Arpaci-Dusseau (1998)   (5 citations)  (Correct)

....its time context switching between processes; the resulting performance for fine grain parallel applications is not acceptable. Over the years, numerous researchers have found that the slowdown of local scheduling may be orders of magnitude worse than ideal for frequently communicating processes [7, 35, 55, 56, 70, 99]. To resolve the performance inefficiencies of local scheduling, coscheduling [134] or gang scheduling, explicitly schedules the cooperating processes from a single job simultaneously across processors. A strict round robin global schedule of processes is constructed, and a global context switch ....

....when the waiting time for an event is high relative to the cost of relinquishing the processor, better system throughput is achieved if processes block immediately. Previous research has investigated the trade-off of blocking immediately and spinwaiting on applications performing only barriers [55, 56]. In their analysis, the waiting time of the barrier is strictly the amount of load imbalance in the application, where load imbalance is the time between the arrival of the first and last process to reach a barrier synchronization; the cost of relinquishing the processor is equal to the time for a ....

[Article contains additional citation context not shown here]

Dror G. Feitelson and Larry Rudolph. Coscheduling Based on Run-Time Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):136--160, April 1995.


Hood: A User-Level Threads Library for Multiprogrammed.. - Blumofe, Papadopoulos (1998)   (1 citation)  (Correct)

....Similarly, users expect multiprocessor compute servers to support multiprogrammed work loads that include parallel applications. Unfortunately, unless parallel applications are coscheduled [25] or subject to process control [27] they display poor performance in such multiprogrammed environments [5, 12, 13, 14, 16]. As an alternative to coscheduling or process control, in this paper we present Hood, a C user level threads library, implemented entirely at user level with no operating system modifications, whose scheduler achieves efficient performance under multiprogramming. Hood runs on shared memory ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on runtime identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


The Performance of Work Stealing in Multiprogrammed.. - Blumofe, Papadopoulos (1998)   (5 citations)  (Correct)

....Similarly, users expect multiprocessor compute servers to support multiprogrammed work loads that include parallel applications. Unfortunately, unless parallel applications are coscheduled [40] or subject to process control [44] they display poor performance in such multiprogrammed environments [10, 17, 18, 19, 26]. As an alternative to coscheduling or process control, in this paper we investigate the use of dynamic, user level, thread scheduling in order to achieve efficient performance under multiprogramming. We show that a non blocking implementation of the well known and provably efficient ....

.... of kernel level resources, specifically processes [26, 32, 35, 40, 41, 44, 47]. A number of studies have compared various process scheduling strategies, and all have concluded that the traditional time sharing, priority based local scheduler found in most operating systems is inadequate [10, 17, 18, 19, 26]. In addition, all of these studies have concluded that some form of coscheduling or space partitioning with process control offers the best solution. Coscheduling [40], which is a generalization of gang scheduling, attempts to run all of the processes of any given parallel program concurrently ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on runtime identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


Thread Scheduling for Multiprogrammed Multiprocessors - Arora, Blumofe, Plaxton (1998)   (41 citations)  (Correct)

....on P dedicated processors. Such scheduling algorithms dynamically map threads onto the processors with the goal of achieving P-fold speedups. Though such algorithms will work in some multiprogrammed environments, in particular those that employ static space partitioning [13, 26] or coscheduling [15, 26, 29], they do not work in the multiprogrammed environments being supported by modern shared memory multiprocessors and operating systems [9, 13, 14, 20]. The problem This research is supported in part by the Defense Advanced Research Projects Agency (DARPA) under Grant F30602 97 1 0150 from the U.S. ....

....upper bounds on time and space. The practical application and possible adaptation of this idea to multiprogrammed environments is an open question. Prior work that has considered multiprogrammed environments has focused on the kernel level scheduler. With coscheduling (also called gang scheduling) [15, 29], all of the processes belonging to a computation are scheduled simultaneously, thereby giving the computation the illusion of running on a dedicated machine. Interestingly, it has recently been shown that coscheduling can be achieved with little or no modification to existing multiprocessor ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on runtime identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


Scheduling with Implicit Information in Distributed.. - Arpaci-Dusseau, Culler.. (1998)   (9 citations)  (Correct)

....the coscheduling of client-server applications and requires pessimistic assumptions about which processes communicate with one another. Simulations have shown that communicating processes can be identified at run time, but local schedulers still must agree on a common schedule and context switch time [16]. Finally, explicit coscheduling of parallel programs interacts poorly with interactive jobs and with jobs performing I/O [3, 5, 13, 22]. Alternatively, with dynamic or implicit coscheduling, independent schedulers on each workstation coordinate parallel jobs through local events that occur ....

D. G. Feitelson and L. Rudolph. Coscheduling Based on RunTime Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):136--160, April 1995.


Hood: A User-Level Thread Library for Multiprogramming.. - Papadopoulos (1998)   (Correct)

....Similarly, users expect multiprocessor compute servers to support multiprogrammed work loads that include parallel applications. Unfortunately, unless parallel applications are coscheduled [42] or subject to process control [48] they display poor performance in such multiprogrammed environments [10, 17, 19, 20, 27]. As an alternative to coscheduling or process control, Hood employs dynamic, user level, thread scheduling and achieves efficient performance under multiprogramming. Hood is a C user level threads library for parallel programming targeted for shared memory multiprocessors. It supports the ....

.... of kernel level resources, specifically processes [27, 34, 37, 42, 43, 48, 51]. A number of studies have compared various process scheduling strategies, and all have concluded that the traditional time sharing, priority based local scheduler found in most operating systems is inadequate [10, 17, 19, 20, 27]. In addition, all of these studies have concluded that some form of coscheduling or space partitioning with process control offers the best solution. Coscheduling [42], which is a generalization of gang scheduling, attempts to run all of the processes of any given parallel program concurrently ....

Dror G. Feitelson and Larry Rudolph. Coscheduling based on runtime identification of activity working sets. International Journal of Parallel Programming, 23(2):135--160, April 1995.


Theory and Practice in Parallel Job Scheduling - Feitelson, Rudolph.. (1994)   (60 citations)  Self-citation (Feitelson Rudolph)   (Correct)

.... job an appropriate number of processors to make it operate at a near optimal ratio of execution time to efficiency [16]. With the knowledge of how many processors each job uses, policies for packing the jobs into frames for gang scheduling are investigated by Feitelson [18]. Feitelson and Rudolph [22] describe a discipline in which processes that communicate frequently are identified, and it is assured that the corresponding threads are all activated at the same time. Similar schemes in which co-scheduling is triggered by communication events were described by Sobalvarro and Weihl [83] and by ....

D. G. Feitelson and L. Rudolph, "Coscheduling based on runtime identification of activity working sets". Intl. J. Parallel Programming 23(2), pp. 135--160, Apr 1995.


Implications of I/O for Gang Scheduled Workloads - Walter Lee Matthew (1997)   (1 citation)  Self-citation (Rudolph)   (Correct)

....problem of deciding whether to gang schedule based on indirect measurements. A fully dynamic solution to gang scheduling includes the identification of gang members at run time. Sobalvarro [17] uses individual message arrivals as cues to the identification of a gang, while Feitelson and Rudolph [10] monitor the rate at which shared communication objects are being accessed to determine whether and which processes need to be ganged. Given the processes that make up each job, our system monitors communication rate between job members to identify those jobs that require coscheduling versus ....

D. G. Feitelson and L. Rudolph. Coscheduling Based on Runtime Identification of Activity Working Sets. In International Journal of Parallel Programming, pages 135--160, April 1995.


Job Scheduling in Multiprogrammed Parallel Systems - Feitelson (1997)   (16 citations)  Self-citation (Feitelson)   (Correct)

....the job to exceed the number of PEs in the system. The criterion for grouping is simple: threads that interact at fine granularity should be scheduled together, and therefore they should be grouped into a gang. This can be done based on runtime observation of interactions among threads [196, 541]. Alternatively, it has been suggested that syntactic program structures can be used to define gangs. Relevant structures are parfor or parbegin in languages like ParC [57], PAR in Occam [285], and the spawn system call in Symunix [182]. Thus, when a set of threads are spawned together by a ....

D. G. Feitelson and L. Rudolph, "Coscheduling based on runtime identification of activity working sets". Intl. J. Parallel Programming 23(2), pp. 135--160, Apr 1995.


Job Scheduling in Multiprogrammed Parallel Systems - Feitelson (1997)   (16 citations)  Self-citation (Feitelson)   (Correct)

....into a gang (footnote 13: Ousterhout used the term process working set [268] based on the analogy with the working set of memory pages that must be simultaneously resident in primary memory for efficient computing). This can be done based on runtime observation of interactions among threads [119, 326]. Alternatively, it has been suggested that syntactic program structures can be used to define gangs. Relevant structures are parfor or parbegin in languages like ParC [35], PAR in Occam [174], and the spawn system call in Symunix [106]. Thus, when a set of threads are spawned together by a ....

D. G. Feitelson and L. Rudolph, "Coscheduling based on runtime identification of activity working sets". Intl. J. Parallel Programming 23(2), pp. 135--160, Apr 1995.


Parallel Job Scheduling: Issues and Approaches - Feitelson, Rudolph (1995)   (26 citations)  Self-citation (Feitelson Rudolph)   (Correct)

.... and research prototypes [32, 13, 19, 22]. An interesting variant of gang scheduling is based on the observation that coordinated scheduling is only needed if the job's threads interact frequently [16]. Therefore the rate of interaction can be used to drive the grouping of threads into gangs [17, 43]. Other variants include coscheduling, which attempts to schedule a large subset of the gang if it is impossible to schedule all the threads at once [32], and family scheduling, which allows more threads than processors and uses a second level of internal time slicing [7]. The price of gang ....

D. G. Feitelson and L. Rudolph, "Coscheduling based on runtime identification of activity working sets". Intl. J. Parallel Programming 23(2), pp. 135--160, Apr 1995.


Theory and Practice in Parallel Job Scheduling - Feitelson, Rudolph.. (1997)   (60 citations)  Self-citation (Feitelson Rudolph)   (Correct)

.... by Feitelson [19]. Sobalvarro and Weihl describe a discipline in which processor pairs that communicate frequently are identified, and it is assured that the corresponding threads are all activated at the same time [84]. A similar scheme for shared memory was described by Feitelson and Rudolph [23]. Taking system load and minimum and maximum parallelism of each job into account as well, still higher throughputs can be sustained [78]. Chiang et al. show that use of knowledge of some job characteristics plus permission to use a single preemption per job allows run to completion policies to ....

D. G. Feitelson and L. Rudolph, "Coscheduling based on runtime identification of activity working sets". Intl. J. Parallel Programming 23(2), pp. 135--160, Apr 1995.


Loosely Coordinated Coscheduling In The Context Of . . . - Sodan (2005)   (Correct)

No context found.

Feitelson D, Rudolph L. Coscheduling based on run-time identification of activity working sets. International Journal of Parallel Programming 1995; 23(2):135--160.


Implicit Coscheduling: Coordinated Scheduling with Implicit.. - Arpaci-Dusseau (1998)   (5 citations)  (Correct)

No context found.

Dror G. Feitelson and Larry Rudolph. Coscheduling Based on Run-Time Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):136--160, April 1995.


SWAP: A Scheduler with Automatic Process Dependency Detection - Zheng, Nieh (2004)   (1 citation)  (Correct)

No context found.

Dror G. Feitelson and Larry Rudolph. Coscheduling Based on Run-Time Identification of Activity Working Sets. International Journal of Parallel Programming, 23(2):136--160, April 1995.


Modeling Communication Locality in Multiprocessors - Salisbury, Chen, Melhem (1999)   (2 citations)  (Correct)

No context found.

D. Feitelson and L. Rudolph, Coscheduling based on runtime identification of activity working sets, Internat. J. Parallel Programming 23 (1995), 135--159.



CiteSeer.IST - Copyright Penn State and NEC