58 citations found.
M. T. Vandevoorde and E. S. Roberts. WorkCrews: An abstraction for controlling parallelism. Internat. J. Parallel Programming 17(4), 347--366, Aug. 1988.

This paper is cited in the following contexts:

Thread Scheduling for Multiprogrammed Multiprocessors - Arora, Blumofe, Plaxton (2001)   (41 citations)  (Correct)

....is efficient with respect to both space and communication [8]. Moreover, when coupled with dag-consistent distributed shared memory, work stealing is also efficient with respect to page faults [6]. For these reasons, work stealing is practical and variants have been implemented in many systems [7, 19, 20, 24, 34, 38]. For general multithreaded computations, other scheduling algorithms have also been shown to be simultaneously efficient with respect to time and space [4, 5, 13, 14]. Of particular interest here is the idea of deriving parallel depth-first schedules from serial schedules [4, 5], which produces ....

Mark T. Vandevoorde and Eric S. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


An Architecture for Highly Concurrent, Well-Conditioned Internet.. - Welsh   (Correct)

....processed. This is accomplished using cohort scheduling, which batches the execution of related aspects of request processing. For example, requests within a stage are processed in last-in-first-out order, increasing the likelihood that recently processed request data is in the cache. Work Crews [137] is another system that makes use of structured event queues and limited numbers of threads to manage concurrency. Work Crews deals with the issue of exploiting potential (compute-bound) parallelism within a multiprocessor system while avoiding the creation of large numbers of threads. A Work Crew ....

M. Vandevoorde and E. Roberts. Work crews: An abstraction for controlling parallelism. Technical Report Research Report 42, Digital Equipment Corporation Systems Research Center, February 1988.


Executing Multithreaded Programs Efficiently - Blumofe (1995)   (12 citations)  (Correct)

....by having each processor work as if it is the only one (i.e., in serial, depth-first order) and having idle processors steal threads from others, space requirements and communication requirements should be curbed. Since then, many researchers have implemented variants on this strategy [41, 42, 44, 50, 67, 70, 77, 81, 84, 94, 103]. Cilk's work-stealing scheduler is very similar to the schedulers in some of these other systems, though Cilk's algorithm uses randomness and is provably efficient. Many multithreaded programming languages and runtime systems are based on heuristic scheduling techniques. Though systems such as ....

....one heuristic and one algorithmic. First, to lower communication costs, we would like to steal large amounts of work, and in a tree-structured computation, shallow threads are likely to spawn more work than deep ones. This heuristic notion is the justification cited by earlier researchers [20, 42, 52, 77, 103] who proposed stealing work that is shallow in the spawn tree. We cannot, however, prove that shallow threads are more likely to spawn work than deep ones. What we prove in Section 5.3 is the following algorithmic property. The threads that are on the critical path in the dag are always at the ....

Mark T. Vandevoorde and Eric S. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


SEDA: An Architecture for Well-Conditioned, Scalable.. - Welsh, Culler, Brewer (2001)   (15 citations)  (Correct)

....(DDS) [20] layer also makes use of a structured event processing framework. In DDS, storage servers emulate asynchronous network and disk I/O interfaces by making use of fixed-size thread pools, and software components are composed using either explicit event queues or implicit upcalls. Work Crews [56] and the TSS/360 queue scanner [35] are other examples of systems that make use of structured event queues and limited numbers of threads to manage concurrency. In each of these systems, the use of an event queue decouples the execution of two components, which improves modularity and robustness. ....

M. Vandevoorde and E. Roberts. Work crews: An abstraction for controlling parallelism. Technical Report Research Report 42, Digital Equipment Corporation Systems Research Center, February 1988.
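The pattern named in the excerpt above, an explicit event queue feeding a limited number of threads, can be sketched roughly as follows. This is an illustrative sketch only, not code from SEDA, DDS, or WorkCrews; `run_stage`, `handler`, and `num_threads` are hypothetical names chosen for the example.

```python
import queue
import threading

def run_stage(handler, events, num_threads=2):
    """Push `events` through one stage served by a fixed-size pool of threads."""
    inbox = queue.Queue()      # the event queue that decouples producer and stage
    results = queue.Queue()

    def worker():
        while True:
            item = inbox.get()
            if item is None:   # sentinel: tells this thread to shut down
                break
            results.put(handler(item))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for e in events:
        inbox.put(e)
    for _ in threads:          # one sentinel per worker thread
        inbox.put(None)
    for t in threads:
        t.join()

    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)         # completion order is nondeterministic, so sort
```

The producer only ever touches the queue, so the number of threads can be changed without altering the code that submits events.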


Software Support for Distributed and Parallel Computing - Freeh (1996)   (Correct)

....all other processors' lists and takes filaments from the processor with the most filaments. Newly forked filaments are put on the tail of a processor's local list and filaments are stolen from the front of the list because the largest units of work tend to be at the front of processor lists [54]. Therefore, a stolen filament will likely keep the processor busy for a while. Another optimization for fork-join filaments is pruning, which is analogous to the implicit coarsening used for iterative filaments. When enough work has been created to keep all processors busy, forks are turned into ....

....cannot efficiently support thread-per-point decompositions for problems like Jacobi, as Lin [39] and others have noted. Filaments can execute the fine-grain execution model efficiently in many cases, and when it needs to switch to a coarse-grain execution model, it does so at run time. WorkCrews [54] supports fork-join parallelism on small-scale, shared-memory multiprocessors. WorkCrews introduced the concepts of pruning and of ordering queues to favor larger threads. Filaments has borrowed these ideas in its implementation of fork-join threads. TAM [17] is a compiler-controlled threaded ....

M. Vandevoorde and E. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.
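The stealing policy described in the first excerpt above (new filaments go on the tail of the local list; an idle processor takes from the front of the fullest list) might look like the following minimal sketch. It is an illustration under assumptions, not the Filaments implementation; `pick_victim` and `steal` are hypothetical names, and each processor's list is modeled as a `collections.deque`.

```python
from collections import deque

def pick_victim(lists, thief):
    """Index of the other processor holding the most filaments, or None."""
    candidates = [i for i in range(len(lists)) if i != thief and lists[i]]
    if not candidates:
        return None
    return max(candidates, key=lambda i: len(lists[i]))

def steal(lists, thief):
    """Take the filament at the front (oldest end) of the fullest list."""
    victim = pick_victim(lists, thief)
    if victim is None:
        return None
    return lists[victim].popleft()

# Forking appends to the tail, so the front holds the oldest filaments,
# which the excerpt says tend to be the largest units of work.
lists = [deque(["a1", "a2", "a3"]), deque(["b1"]), deque()]
stolen = steal(lists, thief=2)
```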


Pthreads for Dynamic and Irregular Parallelism - Narlikar, Blelloch (1998)   (4 citations)  (Correct)

....lightweight threads packages written for shared-memory machines. In particular, we are interested in implementing a scheduler that efficiently supports dynamic and irregular parallelism. 2.1 Scheduling lightweight threads. A variety of lightweight, user-level threads systems have been developed [6, 11, 14, 15, 25, 29, 33, 37, 40, 45, 53], including mechanisms to provide coordination between the kernel and the user-level threads library [2, 49, 31]. Although the main goal of the threads schedulers in previous systems has been to achieve good load balancing and/or locality, a large body of work has also focused on developing ....

....thread b, which is the child of thread a. .... of $S_1$, by maintaining per-processor stacks of ready threads. When a processor runs out of threads on its own stack, it picks another processor at random, and steals from the bottom of its stack. Various other systems use a similar work-stealing strategy [25, 33, 37, 53] to control the parallelism. In previous work [35] we presented a new, provably space-efficient scheduling algorithm that uses a shared parallel stack and provides a space bound of $S_1 + O(p \cdot D)$ for a program with a critical path length of $D$. The algorithm prioritizes threads according to ....

M. T. Vandevoorde and E. S. Roberts. WorkCrews: an abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.
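The work-stealing discipline in the second excerpt above (each processor pops from the top of its own stack; an idle processor picks a random victim and steals from the bottom) can be simulated in a single thread as a rough sketch. This is not the scheduler from any of the cited papers; `Worker`, `run`, and the round-robin stepping are assumptions made for illustration, and tasks here do not spawn further tasks.

```python
import random
from collections import deque

class Worker:
    """One simulated processor with its own stack of ready tasks."""
    def __init__(self, wid):
        self.wid = wid
        self.ready = deque()   # right end = top of stack, left end = bottom

    def push(self, task):
        self.ready.append(task)

    def pop_local(self):
        # The owner works LIFO: take the most recently pushed task.
        return self.ready.pop() if self.ready else None

    def steal_from(self, victim):
        # A thief takes the oldest task, at the bottom of the victim's stack.
        return victim.ready.popleft() if victim.ready else None

def run(workers, rng):
    """Step the workers round-robin; return the (worker id, task) execution log."""
    log = []
    while any(w.ready for w in workers):
        for w in workers:
            task = w.pop_local()
            if task is None:
                victim = rng.choice([v for v in workers if v is not w])
                task = w.steal_from(victim)
            if task is not None:
                log.append((w.wid, task))
    return log

workers = [Worker(0), Worker(1)]
for name in "abcd":
    workers[0].push(name)
log = run(workers, random.Random(0))
```

With two workers the victim choice is forced, so worker 0 runs its newest tasks while worker 1 takes the oldest ones from the other end.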


Scheduler Activations: Effective Kernel Support for.. - Anderson, Bershad.. (1992)   (281 citations)  (Correct)

....[page header: Scheduler Activations: Effective Kernel Support, p. 55; ACM Transactions on Computer Systems, Vol. 10, No. 1, February 1992] ....multiprogramming and I/O). As a result, user-level threads have ultimately been implemented on top of the kernel threads of both Mach (C Threads [8]) and Topaz (WorkCrews [24]). User-level threads are built on top of kernel threads exactly as they are built on top of traditional processes; they have exactly the same performance, and they suffer exactly the same problems. The parallel programmer, then, has been faced with a difficult dilemma: employ user-level threads, ....

....of the applications that use it, since different applications can be linked with different user-level thread libraries. As an example, most kernel thread systems implement preemptive priority scheduling, even though many parallel applications can use a simpler policy such as first-in-first-out [24]. These factors would not be important if thread management operations were inherently expensive. Kernel trap overhead and priority scheduling, for instance, are not major contributors to the high cost of UNIX-like processes. However, the cost of thread operations can be within an order of ....

VANDEVOORDE, M., AND ROBERTS, E. WorkCrews: An abstraction for controlling parallelism. Int. J. Parallel Program. 17, 4 (Aug. 1988), 347--366.


Obtaining Efficient Single-Processor Performance From.. - Lowenthal, Greene (1999)   (Correct)

....performance of fine-grain, parallel recursive programs. Initially there was a large focus on implementing efficient futures (see [WC93] for example). Another early focus was allowing thread creation until there is enough parallelism; at that point, regular procedure calls were made [VR88]. However, either (1) the success of this is dependent on run-time behavior, and it is possible to get stuck executing an entire recursive subtree sequentially, or (2) an (expensive) check must be made before each recursive call to see if it should be done sequentially. In the last several ....

M. T. Vandevoorde and E. S. Roberts. WorkCrews: an abstraction for controlling parallelism. International Journal of Parallel Programming.
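The idea attributed to [VR88] in the excerpt above, creating threads until there is enough parallelism and then falling back to ordinary procedure calls, can be illustrated with a small single-threaded model. This is a sketch under assumptions rather than the WorkCrews interface: `Crew`, `spawn`, and the fixed `limit` threshold are hypothetical, and the check inside `spawn` corresponds to the per-call test that the excerpt lists as drawback (2).

```python
class Crew:
    """Toy model: fork while the task queue is short, otherwise call inline."""
    def __init__(self, limit):
        self.limit = limit        # hypothetical threshold for "enough parallelism"
        self.queue = []           # tasks created but not yet run
        self.forks = 0
        self.inline_calls = 0
        self.total = 0

    def spawn(self, fn, depth):
        # This test runs before every recursive call.
        if len(self.queue) < self.limit:
            self.forks += 1
            self.queue.append((fn, depth))
        else:
            self.inline_calls += 1
            fn(self, depth)       # regular procedure call, no task created

def count_leaves(crew, depth):
    """Count the leaves of a complete binary tree of the given depth."""
    if depth == 0:
        crew.total += 1
        return
    crew.spawn(count_leaves, depth - 1)   # may be forked or run inline
    count_leaves(crew, depth - 1)         # parent continues with the other half

def run(crew, depth):
    count_leaves(crew, depth)
    while crew.queue:                     # drain the tasks that were forked
        fn, d = crew.queue.pop()
        fn(crew, d)
    return crew.total

crew = Crew(limit=2)
leaves = run(crew, 4)
```

Every internal node of the tree makes exactly one `spawn` call, so `forks + inline_calls` equals the number of internal nodes regardless of the threshold.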


Job Scheduling in Multiprogrammed Parallel Systems - Feitelson (1997)   (16 citations)  (Correct)

....kernel [209]. This in turn is reminiscent of many concurrent object-oriented languages using a process model, where method invocation may be viewed as executing a chore [118]. In WorkCrews new chores are created in a lazy fashion, so as to reduce overhead if they end up being executed locally [597]. Alternatively, chores can be created by a compiler [466, 465, 414]. A common extension of independent chores is the concept of virtual PEs, the difference being that the different code fragments can interact with each other with the mapping to physical PEs hidden in the runtime system ....

....new thread and execute it. The communication registers are also used to join threads upon termination [296, 578, sect. 3.4]. Single PE allocation can also be used beneficially in distributed systems with local queues rather than a central queue. This usually goes under the name work stealing [597, 411, 71, 70]. As in the centralized systems, PEs volunteer to join in the computation on their own initiative when they find themselves idle; the difference is that they have to search for work in the local queues of other PEs. This approach is used in the Cilk and Phish systems [70, 71] and some other ....

[Article contains additional citation context not shown here]

M. T. Vandevoorde and E. S. Roberts, "WorkCrews: an abstraction for controlling parallelism". Intl. J. Parallel Programming 17(4), pp. 347--366, Aug 1988.


Virtual Topologies: A New Concurrency Abstraction for.. - Philbin, Jagannathan, ..   (Correct)

....with the value yielded by another thread T 2 , we can map T 1 and T 2 on the same virtual processor. In fine grained programs where processors are busy most of the time, the ability to schedule data dependent threads on the same processor leads to opportunities for improved thread granularity [10, 14, 23]. 5. Dynamic Topologies: Certain algorithms have a process structure that unfolds as the computation progresses; adaptive tree algorithms [12] are a good example. These algorithms are best executed on topologies that permit dynamic creation and destruction of virtual processors. The remainder of ....

M. Vandevoorde and E. Roberts. WorkCrews: An Abstraction for Controlling Parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


Space-Efficient Scheduling of Multithreaded Computations - Blumofe, Leiserson (1998)   (30 citations)  (Correct)

.... ) $O(T_1/P)$. In attempting to expose parallelism, however, schedulers often end up exposing more parallelism than the computer can actually exploit, and since each living thread requires the use of a certain amount of memory, such schedulers can easily overrun the memory capacity of the machine [15, 22, 24, 39, 43]. To date, the space requirements of multithreaded computations have been managed with heuristics or not at all [14, 15, 22, 24, 26, 32, 39, 43]. In this paper, we use algorithmic techniques to address the problem of managing storage for multithreaded computations. Our goal is to develop ....

....actually exploit, and since each living thread requires the use of a certain amount of memory, such schedulers can easily overrun the memory capacity of the machine [15, 22, 24, 39, 43]. To date, the space requirements of multithreaded computations have been managed with heuristics or not at all [14, 15, 22, 24, 26, 32, 39, 43]. In this paper, we use algorithmic techniques to address the problem of managing storage for multithreaded computations. Our goal is to develop scheduling algorithms that expose sufficient parallelism to obtain linear speedup but without exposing so much parallelism that the space requirements ....

M. T. Vandevoorde and E. S. Roberts, WorkCrews: An abstraction for controlling parallelism, Internat. J. Parallel Programming, 17 (1988), pp. 347--366.


Scheduling Threads for Low Space Requirement and Good Locality - Narlikar (1999)   (6 citations)  (Correct)

....thread creation and scheduling are typically local operations, they incur low overhead and contention. Further, threads close together in the computation graph are often scheduled on the same processor, resulting in good locality. Several systems have used work stealing to provide high performance [11, 17, 18, 20, 26, 39, 42, 44]. When each processor treats its own ready queue as a LIFO stack (that is, adds or removes threads from the top of the stack) and steals from the bottom of another processor's stack, the scheduler successfully throttles the excess parallelism [8, 39, 41, 44]. For fully strict computations, such a ....

....high performance [11, 17, 18, 20, 26, 39, 42, 44]. When each processor treats its own ready queue as a LIFO stack (that is, adds or removes threads from the top of the stack) and steals from the bottom of another processor's stack, the scheduler successfully throttles the excess parallelism [8, 39, 41, 44]. For fully strict computations, such a mechanism was proved to require $p \cdot S_1$ space on $p$ processors, where $S_1$ is the serial, depth-first space requirement [9]. A computation with $W$ work (total number of operations) and $D$ depth (length of the critical path) was shown to require $W/p + O(D)$ time ....

[Article contains additional citation context not shown here]

M. T. Vandevoorde and E. S. Roberts. WorkCrews: an abstraction for controlling parallelism. Intl. J. Parallel Programming, 17(4):347--366, August 1988.


Hood: A User-Level Threads Library for Multiprogrammed.. - Blumofe, Papadopoulos (1998)   (1 citation)  (Correct)

....by $P_A = \min\{P,\ 8 - P_A(\text{cycler})\}$. Again, we find that one number, the normalized number of processes, predicts the utilization behavior. Moreover, it does so even when the program runs on a set of processors that grows and shrinks over time. 6 Related Work. Though numerous threads libraries [11, 19, 24, 29] and multithreaded languages [3, 15, 17] have been developed, we are not aware of any besides Hood that do not suffer from the performance cliff. Like Hood, some of these libraries and languages are based on the two-level scheduling model, with user-level threads scheduled onto a pool of ....

Mark T. Vandevoorde and Eric S. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


The Importance of Locality in Scheduling and Load Balancing for.. - Keckler (1994)   (1 citation)  (Correct)

....some of their work. Thus only otherwise idle processors participate in any load balancing. Lazy task creation [23] uses work stealing. The local task queue is threaded through the execution stack and tasks are only added to the local task queue using the future construct of Mul-T [16]. WorkCrews [35] and optimizations to Qlisp [27] maintain local task queues and employ work stealing when processors become idle. All three of these schemes employ forms of lazy task creation to increase task granularity and reduce the overhead associated with task creation. 3 Case Studies. This section presents ....

Vandevoorde, M. T., and Roberts, E. S. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming 17, 4 (1988), 347--366.


A Migratable User-Level Process Package for PVM - Konuru (1995)   (5 citations)  (Correct)

....communicate with each other, the communication can potentially reduce the skew that may occur because of the non-preemptive scheduling. Further, using a simple non-preemptive scheduling policy rather than a sophisticated preemptive policy has been shown to be sufficient for most parallel computations [VR88]. For these reasons, the ULP library is expected to implement non-preemptive scheduling of ULPs except when it receives a ULP migration event. In the event the currently executing ULP is to be migrated, then the ULP library, to maintain unobtrusiveness, can preempt the ULP and migrate it to the ....

M. Vandevoorde and E. Roberts. Workcrews: An abstraction for controlling parallelism. Int. J. Parallel Programming, 17(4):347--366, August 1988.


Space-Efficient Scheduling of Multithreaded Computations - Blumofe, Leiserson (1993)   (30 citations)  (Correct)

....MIT Laboratory for Computer Science with additional support from a National Science Foundation Graduate Fellowship. MIT Laboratory for Computer Science, 545 Technology Square, Cambridge, Massachusetts, 02139 (cel mit.edu) .... memory capacity of the machine [15, 22, 24, 39, 43]. To date, the space requirements of multithreaded computations have been managed with heuristics or not at all [14, 15, 22, 24, 26, 32, 39, 43]. In this paper, we use algorithmic techniques to address the problem of managing storage for multithreaded computations. Our goal is to develop ....

....for Computer Science, 545 Technology Square, Cambridge, Massachusetts, 02139 (cel mit.edu) .... memory capacity of the machine [15, 22, 24, 39, 43]. To date, the space requirements of multithreaded computations have been managed with heuristics or not at all [14, 15, 22, 24, 26, 32, 39, 43]. In this paper, we use algorithmic techniques to address the problem of managing storage for multithreaded computations. Our goal is to develop scheduling algorithms that expose sufficient parallelism to obtain linear speedup, but without exposing so much parallelism that the space requirements ....

M. T. Vandevoorde and E. S. Roberts, WorkCrews: An abstraction for controlling parallelism, International Journal of Parallel Programming, 17 (1988), pp. 347--366.


A Customizable Substrate for Concurrent Languages - Jagannathan, Philbin (1992)   (13 citations)  (Correct)

....TG does. Users can parameterize thread state to inform the TC if a thread can steal or not; sting provides interface procedures for this purpose. Figure 4: Dynamics of thread stealing. Dashed lines indicate dataflow constraints, solid lines specify thread transitions. Like load-based inlining [33] or lazy task creation [24], stealing throttles process creation. Unlike these other techniques, however, stealing also improves locality. Locality is increased because a stolen thread is run using the TCB of a currently evaluating thread; consequently, the stack and heap of this TCB remains in the ....

M. Vandevoorde and E. Roberts. WorkCrews: An Abstraction for Controlling Parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


The Performance of Work Stealing in Multiprogrammed.. - Blumofe, Papadopoulos (1998)   (5 citations)  (Correct)

....system of which we are aware. Finally, we point out that our use of work stealing and non-blocking synchronization builds upon a long history in both areas, though they did not meet until now. The idea of work stealing goes back to 1981 [16] and has been used in many systems and applications since [20, 21, 27, 42, 46]. The first provably efficient work-stealing algorithm [15] and implementation [14] is fairly recent, however. The idea of non-blocking and wait-free synchronization was developed by Herlihy [29]. There has been a long line of work attempting to make the idea more practical via universal ....

Mark T. Vandevoorde and Eric S. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


Lazy Threads: Implementing a Fast Parallel Call - Goldstein (1996)   (8 citations)  (Correct)

No context found.

M. T. Vandevoorde and E. S. Roberts. WorkCrews: An abstraction for controlling parallelism. Internat. J. Parallel Programming 17(4), 347--366, Aug. 1988.


Babylon V2.0: Support for Distributed, Parallel and Mobile.. - van Heiningen   (Correct)

No context found.

M. T. Vandevoorde and E. S. Roberts. Workcrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


LISP AND SYMBOLIC COMPUTATION: An International Journal.. - Ts Scheme Distributed   (Correct)

No context found.

M. Vandevoorde and E. Roberts. WorkCrews: An Abstraction for Controlling Parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


Dynamic Language - Parallelization By Requirements   (Correct)

No context found.

M. T. Vandevoorde and E. S. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347--366, 1988.


SOFTWARE---PRACTICE AND EXPERIENCE, VOL. 21(12).. - Analysis Seshadri And   (Correct)

No context found.

M. T. Vandevoorde and E. Roberts, `Workcrews: an abstraction for controlling parallelism', Technical Report 42, Digital Equipment Corporation Systems Research Center, 1989.


Scheduler Activations: Effective Kernel Support for.. - Anderson, Bershad.. (1992)   (281 citations)  (Correct)

No context found.

Vandevoorde, M. and Roberts, E. WorkCrews: An Abstraction for Controlling Parallelism. International Journal of Parallel Programming, 17(4):347--366, August 1988.


CiteSeer.IST - Copyright Penn State and NEC