C. D. Polychronopoulos and D. J. Kuck, "Guided self-scheduling: a practical scheduling scheme for parallel supercomputers," IEEE Transactions on Computers, vol. C-36, no. 12, pp. 1425-1439, December 1987.
....resides in a single repository, and that the time required to transfer that data is a significant factor. Efficiently managing the resulting computation is a difficult and challenging problem, given the heterogeneous attributes of the underlying components. This problem is well recognized [3, 18, 21, 16, 13, 26, 36, 23, 7, 2], and there is a large body of applied research, see for instance [5, 10, 1, 17, 33] providing various practical approaches toward a solution. An added complexity is that resources in these environments exhibit dynamic performance characteristics and availability, and a number of the ....
....for coping with underlying changes in the computing platform. Of course, there are many questions about speed of adapting and stability that must be addressed in future work. 5. Related Work The question of scheduling independent tasks onto heterogeneous sets of resources is a well known problem [18, 21, 16, 13, 26, 36, 23, 7, 2, 6, 9, 28, 4, 39], which has been studied with various sets of assumptions concerning both the application and the computing platform. Our work departs from previous approaches in that we develop a distributed, autonomous scheduling strategy. The major advantage of our approach is that it accommodates large scale ....
C. Polychronopoulos and D. Kuck. Guided self-scheduling: a practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, 36:1425--1439, 1987.
....of this programming model, and previous studies of barrier performance include [2] and [5]. Loop scheduling methods have an extensive literature, with most authors reporting performance studies, though the emphasis is on comparing algorithms rather than implementations (see, for example, [4], [8], [9]). The remainder of this paper is organised as follows: Section II describes the techniques used to perform the overhead measurements, and Section III presents the results of the measurements on the three different systems. These results are analysed in Section IV and conclusions drawn in ....
Polychronopoulos, C. D. and Kuck, D. J. (1987) Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers, IEEE Transactions on Computers, C-36(12), pp. 1425-1439.
....positions among processors. This random load distribution policy distributes work fairly evenly. As shown in Table 4.2, the percentage of total execution time a processor on the average spends idle is small. Thus, employing more sophisticated scheduling techniques, such as self scheduling [22], is not warranted. 4.2.4 Prefetching In our experiments with prefetching, the execution time of the Improved DTABQ implementation can be improved at most 3.8 for problem size 6. We experimented with prefetching data in three places. First, when generating extensions from positions in the ....
C. Polychronopoulos and D. Kuck. "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers." IEEE Transactions on Computers, December 1987. pp. 1425-1439.
....as before) that avoids redundant locks of the global list. How to divide the activities between the representatives Equal division may not be optimal if the activities are not identical. A better approach is to always allocate 1/P of the remaining activities. Polychronopoulos and Kuck [3] showed theoretically that, independently of the startup time of the P processors that are scheduled under this policy, all processors finish executing a parallel construct within B units of time difference from each other (where B is the execution time of one activity). 3 Measurements on a ....
....on MIMD computers. Envelopes are a transparent instrument to keep simplicity and generality of the parallel programming language. One of the novel aspects of this work is that envelopes deal with dependencies among activities at run time, and not just independent tasks as most previous work [3, 2]. The experiments performed with typical program structures show that envelopes actually support both coarse and fine grain activities automatically with high efficiency. In order to minimize overhead, there is no task migration between PEs. We also take advantage of locality, by means of ....
C. D. Polychronopoulos and D. J. Kuck, "Guided self scheduling: a practical scheduling scheme for parallel supercomputers". IEEE Trans. Comput. C-36(12), pp. 1425--1439, Dec 1987.
....operation over the execution time of K iterations, resulting in less synchronization overhead. Uniform sized chunking has a greater potential for imbalance than self scheduling, however, as processors finish within K iterations of each other in the worst case. Guided self scheduling [Polychronopoulos and Kuck, 1987] is a dynamic algorithm that changes the size of chunks at run time, allocating large chunks of iterations at the beginning of a loop so as to reduce synchronization overhead, while allocating small chunks towards the end of the loop to balance the workload. Under guided self scheduling each ....
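The chunk-size behaviour described in this excerpt (large chunks first, small chunks last) can be illustrated with a minimal sketch. It assumes the rule, stated in one of the other excerpts on this page, that each request receives 1/P of the remaining iterations, rounded up here so that every request gets at least one iteration. The function name `gss_chunks` and the example values N = 100, P = 4 are hypothetical and do not come from any of the cited papers.

```python
def gss_chunks(n_iterations, n_processors):
    """Return the successive chunk sizes handed out to requesting processors.

    Assumption: each request receives ceil(remaining / P) iterations.
    """
    chunks = []
    remaining = n_iterations
    while remaining > 0:
        # Integer ceiling of remaining / n_processors.
        chunk = (remaining + n_processors - 1) // n_processors
        chunks.append(chunk)
        remaining -= chunk
    return chunks


if __name__ == "__main__":
    # 100 iterations shared by 4 processors.
    print(gss_chunks(100, 4))
    # prints [25, 19, 14, 11, 8, 6, 5, 3, 3, 2, 1, 1, 1, 1]
```

The sequence shrinks roughly geometrically and ends in single-iteration chunks, which is what lets the last few requests even out the finishing times.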
....12 that TRAPEZOID performs 10-15% worse than both AFS and GSS on this application. The reason for this can be traced to the load balancing properties of TRAPEZOID. When all iterations take the same time to execute, processors finish within one iteration of each other under guided self scheduling [Polychronopoulos and Kuck, 1987]. Under TRAPEZOID, processors finish within several iterations of each other [Tzen and Ni, 1993]. When an iteration takes a long time to complete, the imbalance introduced by the trapezoid algorithm can be noticeable. Although the trapezoid algorithm requires fewer accesses to the work queue, the ....
C. D. Polychronopoulos and D. J. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers," IEEE Transactions on Computers, C-36(12), December 1987.
....to generate more efficient code. Note that this construct does not add functionality, it is useful only for optimizations. The implementation can choose to balance the load between the P activities statically, by allocating n/P iterates to each, or dynamically, by using chunked self scheduling [11, 12, 13]. This is a further optimization that does not change the semantics. It was decided to add a special construct to the language instead of just adding a hint to the compiler, as found in many commercial parallel languages, because of the semantic implications. lparfor explicitly implies that the ....
C. D. Polychronopoulos and D. J. Kuck, "Guided self scheduling: a practical scheduling scheme for parallel supercomputers". IEEE Trans. Comput. C-36, 1425-1439 (1987).
.... 2 Loop Scheduling In recent papers, Bull [2] and Bull et al. [3] (see also Ford et al. [6]) have proposed a loop scheduling algorithm, termed Feedback Guided Dynamic Loop Scheduling (FGDLS). The major difference between FGDLS and other scheduling algorithms, such as guided self scheduling (see [9]) and affinity scheduling (see [7]), results from the assumption that the workload is changing only slowly from one execution of a loop to the next, so that observed timing information from the current execution of the loop can, and should, be used to guide the scheduling of the next execution of ....
....W t is approximately bisected in its first dimension; subsequent splitting points further approximately bisect W t in the other dimensions. 4.2 Loop Coalescing This algorithm proceeds by coalescing the M parallel loops into a single loop of length NPOINTS1 * NPOINTS2 * ... * NPOINTSM (see [1] or [9] for details of loop coalescing). For example, for the case M = 4 the coalesced loop is given by: DO SEQUENTIAL J = 1, NSTEPS DO PARALLEL K = 1, NPOINTS1 * NPOINTS2 * NPOINTS3 * NPOINTS4 K1 = (K - 1) / (NPOINTS4 * NPOINTS3 * NPOINTS2) + 1 K2 = MOD((K - 1) / (NPOINTS4 * NPOINTS3), NPOINTS2) + 1 K3 = ....
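The index recovery in this excerpt, where the original loop indices are rebuilt from the single coalesced index K by integer division and MOD, can be sketched for a general number of loops as follows. This is an illustration under stated assumptions and not the cited paper's code: indices are taken to be 1-based as in the Fortran fragment, the last loop is assumed to vary fastest, and the function name `decoalesce` and the extents used in the example are hypothetical.

```python
from itertools import product


def decoalesce(k, extents):
    """Map a 1-based coalesced index k to 1-based indices (k1, ..., kM).

    Assumption: row-major order, i.e. the last loop varies fastest.
    """
    offset = k - 1  # switch to a 0-based offset for div/mod arithmetic
    indices = []
    for extent in reversed(extents):
        indices.append(offset % extent + 1)  # position within this loop
        offset //= extent                    # carry over to the next outer loop
    return tuple(reversed(indices))


if __name__ == "__main__":
    extents = (2, 3, 4, 5)  # stand-ins for NPOINTS1..NPOINTS4
    total = 2 * 3 * 4 * 5
    recovered = [decoalesce(k, extents) for k in range(1, total + 1)]
    # The coalesced loop visits the same index tuples, in the same order,
    # as the original nest of four loops.
    nested = list(product(range(1, 3), range(1, 4), range(1, 5), range(1, 6)))
    print(recovered == nested)
```

Because the coalesced loop has a single index, any one-dimensional scheduling scheme can hand out chunks of K directly without treating the nest level by level.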
C. D. Polychronopoulos and D. J. Kuck, (1987) Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers, IEEE Transactions on Computers, C-36(12), pp. 1425--1439.
....made whenever application or system characteristics are changed. The remaining allocation strategies we will discuss do not require this additional application specific information for calculating allocation sizes. Guided self scheduling (GSS) is a strategy introduced by Polychronopoulos and Kuck [62] which allocates work units in groups with an exponentially decreasing size. The following expression gives the allocation function for the GSS strategy, where s, N , and P have the same meaning as in the expression for the FSC allocation function. A(s, N,P ) max # # # # # # # # N # ....
....have identified work unit allocation as the scheduling function which is the primary means for applications to adapt to uncertainty and inaccuracy in acquiring different performance parameter values. A number of researchers, including Hagerup [39], Kruskal and Weiss [48], Polychronopoulos and Kuck [62], Tzen and Ni [81], Hummel, Schmidt, Uma, and Wein [44], and Lucco [54], have introduced individual work allocation strategies which have been shown through either analysis or simulation to produce benefits under different sets of assumptions and conditions. To our knowledge, there has not been an ....
Polychronopoulos, C. D., and Kuck, D. J. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers 36, 12 (Dec. 1987), 1425--1439.
....running in a wide area network environment connecting workstations at UCSD and UTK. In addition to a simple one time fixed allocation (FIXED) strategy, other tested work distribution strategies include: Self Scheduling (SS) [48], Fixed Size Chunking (FSC) [49], Guided Self Scheduling (GSS) [50], Trapezoidal Self Scheduling (TSS) [51], and Factoring (FAC2) [52]. FIXED, SS, and FSC are examples of allocation strategies which apply the same allocation block sizes throughout an application run, while GSS, TSS, and FAC2 are examples of strategies which utilize decreasing block sizes as an ....
C. D. Polychronopoulos and D. J. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers," IEEE Transactions on Computers, vol. 36, no. 12, pp. 1425-1439, Dec. 1987.
....loop contains nothing but invocations of parallel versions of methods, the compiler generates parallel loop code instead of code that serially spawns each iteration of the loop. The generated code can then apply standard parallel loop execution techniques; it currently uses guided self scheduling [Polychronopoulos and Kuck 1987]. 5.2 Suppressing Excess Concurrency In practice, parallel execution inevitably generates overhead in the form of synchronization and task management overhead. If the compiler exploits too much concurrency, the resulting overhead may overwhelm the performance benefits of parallel execution. The ....
POLYCHRONOPOULOS, C. AND KUCK, D. 1987. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers 36, 12 (Dec.), 1425--1439.
....a significant improvement in performance. Our goal is to leverage the flexibility afforded by the SDSM system to effect load balancing in autonomous environments. Load balancing and/or locality management has been extensively studied by many researchers, especially in the context of loop scheduling [12, 18, 27, 23, 15, 8]. All these studies deal with the issue of either load balance or locality or both but not with scheduling issues. Ioannidis et al. [12] propose a method of assigning loops to each of the processes based on the observed relative power and the locality of data (in order to minimize steady state ....
C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: a practical scheduling scheme for parallel supercomputers. In IEEE Transactions on Computers, December 1987.
....task and are idle. The goal of self scheduling is to try to have the processors finish at the same time. This technique works particularly well for parallel loops with a high execution variance among different iterations. Variations of this technique have been proposed such as tapering [58][69] in which the system adaptively adjusts the size of the work chunk that is assigned based on problem characteristics. If there is little variance in the computation, assigning larger chunks of work is more efficient due to the overhead of work assignment. Self scheduling could be viewed as a ....
C. Polychronopoulos and D. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers," IEEE Transactions on Computers, Vol. C-36(12), December 1987.
....the 3 representative applications used in this study. tion boundaries; the applications examine and adjust to the number of available processors each time they begin an iteration, but do not do so while executing any one iteration. It is clearly possible to do much more dynamic scheduling, e.g. [20, 1, 6, 14]; we did not do so because of the very large incremental implementation cost relative to our more restrictive change, and because we expect that STEQUI would perform even better when jobs are more responsive to changes in their allocations. Of the three policies, STEQUI reallocates processors ....
C. Polychronopoulos and D. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers, C-36(12):1425--1439, Dec. 1987.
.... towards schemes of scheduling the loop iterations at run time (dynamic loop scheduling) in an attempt to balance each processor's workload in cases where each loop iteration does not perform the same amount of work; some of these schemes include self scheduling [11], guided self scheduling [10], trapezoid self scheduling [12], and affinity scheduling [9]. Apart from the run time overhead incurred by these schemes, their main disadvantage, in the context of automatic parallelisation, is that they deprive the compiler of any information regarding the mapping phase. The availability of ....
....derivable. 4 Evaluation and Experimental Results In order to evaluate the partitioning methods described in the previous section we conducted experiments on the KSR1, a 32 processor virtual shared memory parallel computer, using the code shown in figure 2 which is based on loop nest L1 used in [10]. In our experiments, N1 = 100, N2 = 50, N3 = 4, L1 = 30, and L2 = 30; the subroutine X( performs a number of multiplications equal to the value of the parameter passed to it. Two sets of experiments were conducted; in the first set DOALL I1=1,N1 DOALL I2=1,N2 DOALL I3=1,N3 CALL X(W1) IF ....
C. D. Polychronopoulos, D. J. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers", IEEE Transactions on Computers, 36(12), Dec. 1987, pp. 1425--1439.
C. D. Polychronopoulos and D. J. Kuck, "Guided self-scheduling: a practical scheduling scheme for parallel supercomputers," IEEE Transactions on Computers, vol. C-36, no. 12, pp. 1425-1439, December 1987.
C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, C-36(12):1425--1439, December 1987.
C. D. Polychronopoulos and D. J. Kuck, "Guided self-scheduling: A practical scheduling scheme for parallel supercomputers," IEEE Transactions on Computers, vol. 36, no. 12, pp. 1425--1439, 1987.
C. D. Polychronopoulos and D. J. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers," IEEE Transactions on Computers, vol. 36, no. 12, pp. 1425--1439, Dec. 1987.
C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: a practical scheduling scheme for parallel supercomputers. In IEEE Transactions on Computers, December 1987.
C.D. Polychronopoulos and D.J. Kuck, "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers," IEEE Trans. Computers, vol. 36, no. 12, pp. 1425-1439, Dec. 1987.
C. Polychronopoulos and D. Kuck. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, pages 1425--1439, December 1987.
C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, 36(12):1425--1439, December 1987.
C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: a practical scheduling scheme for parallel supercomputers. In IEEE Transactions on Computers, December 1987.
C. D. Polychronopoulos and D. J. Kuck. "Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers". IEEE Transactions on Computers, C-36(12), December 1987.
C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, C-36(12), December 1987.