9 citations found.
R. D. Blumofe and D. Papadopoulos. The performance of work stealing in multiprogrammed environments (extended abstract). In SIGMETRICS '98/PERFORMANCE '98: Proceedings of the 1998.

This paper is cited in the following contexts:
Pthreads for Dynamic and Irregular Parallelism - Narlikar, Blelloch (1998)   (4 citations)  (Correct)

....The resulting program is architecture independent, since the parallelism is not statically mapped to a fixed number of processors. This is particularly useful in a multiprogramming environment, where the number of processors available to the computation may vary over the course of its execution [10, 52]. • Since the number of threads expressed is much larger than the number of processors, the threads can be effectively load balanced by the implementation. The programmer does not need to implement a load balancing strategy for every application that cannot be mapped statically. • A ....

R. D. Blumofe and D. Papadopoulos. The performance of work stealing in multiprogrammed environments, November 1997. Draft submitted for publication, available from http://www.cs.utexas.edu/users/rdb/papers.html.


CTK: Configurable Object Abstractions for Multiprocessors - Silva et al. (1997)   (Correct)

....and the runtime specification of the synchr vs. asynchr attribute value is interpreted by that policy. The example of a configurable object used throughout this paper is that of a distributed queue. The performance of many parallel programs and operating system services (e.g. schedulers[6]) has been shown sensitive to the implementation of the queue abstractions they use. The particular parallel program studied in this paper is a parallel branch and bound code, which uses a priority queue to store its subproblems that remain to be solved. The performance of this program strongly ....

R. D. Blumofe and D. Papadopoulos. The Performance of Work Stealing in Multiprogrammed Environments. Technical report, University of Texas at Austin, Department of Computer Science, May 1998.


Balanced PRAM Simulations via Moving Threads and Hashing - Leppänen   (Correct)

....fixed handler. The active messages approach is designed for current message-passing multiprocessors, and it aims to improve the efficiency of message handling and to minimize communication overhead. Another approach to balance the workload of processors is work stealing [Blumofe, Leiserson 1994; Blumofe, Papadopoulos 1998], where processors needing work steal threads from other processors. Stealing requires ownership; in the moving threads approach, processors do not own threads. In the MuSE (Multithreaded Scheduling Environment) runtime environment [Leberecht 1996] the idea of moving threads (tokens of dataflow ....

R.D. Blumofe and D. Papadopoulos. The Performance of Work Stealing in Multiprogrammed Environments, 1998. Manuscript, submitted for publication.


Pthreads for Dynamic Parallelism - Narlikar, Blelloch (1998)   (Correct)

....The resulting program is architecture independent, since the parallelism is not statically mapped to a fixed number of processors. This is particularly useful in a multiprogramming environment, where the number of processors available to the computation may vary over the course of its execution [9]. • Since the number of threads expressed is much larger than the number of processors, the threads can be effectively load balanced by the implementation. The programmer does not need to implement a load balancing strategy for each application that cannot be mapped statically. • The ....

R. D. Blumofe and D. Papadopoulos. The performance of work stealing in multiprogrammed environments, November 1997. Draft submitted for publication, available from http://www.cs.utexas.edu/users/rdb/papers.html.


Thread Scheduling for Multiprogrammed Multiprocessors - Arora, Blumofe, Plaxton (2001)   (41 citations)  Self-citation (Blumofe)   (Correct)

.... Though such algorithms will work in some multiprogrammed environments, in particular those that employ static space partitioning [15, 30] or coscheduling [18, 30, 33], they do not work in the multiprogrammed environments being supported by modern shared-memory multiprocessors and operating systems [9, 15, 17, 23]. The problem lies in the assumption that a fixed collection of processors are fully available to perform a given computation. This research is supported in part by the Defense Advanced Research Projects Agency (DARPA) under Grant F30602-97-1-0150 from the U.S. Air Force Research Laboratory. In ....

....specializes to match the $O(T_1/P + T_\infty)$ bound established earlier for fully strict computations executing in dedicated environments. Our non-blocking work stealer has been implemented in a prototype C++ threads library called Hood [10], and numerous performance studies have been conducted [9, 10]. These studies show that application performance conforms to the $O(T_1/P_A)$ bound and that the constant hidden in the big-Oh notation is small, roughly 1. Moreover, these studies show that non-blocking data structures and the use of yields are essential in practice. If any of these ....

[Article contains additional citation context not shown here]

Robert D. Blumofe and Dionisios Papadopoulos. The performance of work stealing in multiprogrammed environments (extended abstract). In Proceedings of the 1998.


The Data Locality of Work Stealing - Acar, Blelloch, Blumofe (2000)   (2 citations)  Self-citation (Blumofe)   (Correct)

....work stealing improves the performance of work stealing up to 80%. 1 Introduction Many of today's parallel applications use sophisticated, adaptive algorithms which are best realized with parallel programming systems that support dynamic, lightweight threads such as Cilk [8], Nesl [5], Hood [10], and many others [3, 16, 17, 21, 32]. The core of these systems is a thread scheduler that balances load among the processes. In addition to a good load balance, however, good data locality is essential in obtaining high performance from modern parallel systems. Several researches have studied ....

.... with locality-guided work stealing give encouraging results, showing that for certain applications the performance is very close to that of static partitioning in dedicated mode (i.e. when the user can lock down a fixed number of processors) but does not suffer a performance cliff problem [10] in multiprogrammed mode (i.e. when processors might be taken by other users or the OS). Figure 1 shows a graph comparing work stealing, locality-guided work stealing, and static partitioning for a simple over-relaxation algorithm on a 14-processor Sun Ultra Enterprise. The over-relaxation ....

[Article contains additional citation context not shown here]

Robert D. Blumofe and Dionisios Papadopoulos. The performance of work stealing in multiprogrammed environments. Technical Report TR-98-13, The University of Texas at Austin, Department of Computer Sciences, May 1998.


Hood: A User-Level Threads Library for Multiprogrammed.. - Blumofe, Papadopoulos (1998)   (1 citation)  Self-citation (Blumofe Papadopoulos)   (Correct)

....of the priocntl call. Likewise, a thief calls yield only after it has made enough unsuccessful steal attempts to amortize the cost of the yield call. Algorithmic and empirical analysis have both revealed the necessity of non-blocking synchronization and yields in eliminating the performance cliff [10]. If mutual exclusion is used instead of non-blocking synchronization, then a process that gets swapped out by the kernel while holding a lock can prevent other processes from making progress. Yields are needed to prevent the following anomaly. A process may get swapped out while in the middle of ....

Robert D. Blumofe and Dionisios Papadopoulos. The performance of work stealing in multiprogrammed environments. Technical Report TR-98-13, The University of Texas at Austin, Department of Computer Sciences, May 1998.


Thread Scheduling for Multiprogrammed Multiprocessors - Arora, Blumofe, Plaxton (1998)   (41 citations)  Self-citation (Blumofe)   (Correct)

.... Though such algorithms will work in some multiprogrammed environments, in particular those that employ static space partitioning [13, 26] or coscheduling [15, 26, 29], they do not work in the multiprogrammed environments being supported by modern shared-memory multiprocessors and operating systems [9, 13, 14, 20]. The problem This research is supported in part by the Defense Advanced Research Projects Agency (DARPA) under Grant F30602-97-1-0150 from the U.S. Air Force Research Laboratory. In addition, Greg Plaxton is supported by the National Science Foundation under Grant CCR-9504145. Multiprocessor ....

....have $P_A = P$, and our bound for multiprogrammed environments specializes to match the $O(T_1/P + T_\infty)$ bound established earlier for fully strict computations executing in dedicated environments. Our non-blocking work stealer has been implemented and numerous performance studies have been conducted [9]. These studies show that application performance conforms to the $O(T_1/P_A + T_\infty P/P_A)$ bound and that the constant hidden in the big-Oh notation is small, roughly 1. Moreover, these studies show that nonblocking data structures and the use of yields are essential in practice. If any of these ....

[Article contains additional citation context not shown here]

Robert D. Blumofe and Dionisios Papadopoulos. The performance of work stealing in multiprogrammed environments (extended abstract). In Proceedings of the 1998 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, Poster Session, Madison, Wisconsin, June 1998.


Flux: A Language for Programming High-Performance Servers - Brendan Burns Kevin   (Correct)

No context found.

R. D. Blumofe and D. Papadopoulos. The performance of work stealing in multiprogrammed environments (extended abstract). In SIGMETRICS '98/PERFORMANCE '98: Proceedings of the 1998.

CiteSeer.IST - Copyright Penn State and NEC