| P. Krueger, T.-H. Lai, and V. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Trans. on Parallel and Distributed Systems, 5(5):488--497, 1994. |
....time. There has been considerable prior research into each of the two topics of i) scheduling of parallel jobs [1,2,6,10,14,19,20,23,24,26] and ii) contiguous node allocation strategies [3,4,5,11,13,27]. There have also been a few studies that have considered both these issues in combination [12,15,18]. However, only [15] addresses the impact of contiguous node allocation schemes in conjunction with a job scheduling policy that takes fairness into consideration by use of an FCFS (First Come First Served) scheduling policy. In [15] contiguous and non-contiguous node allocation schemes for ....
....in [17]. [16] presents a strategy that minimizes network contention due to both communication and I/O traffic. All the above studies focus exclusively on the topic of contiguous node allocation schemes, but do not address the issue of job scheduling onto the parallel systems. The studies in [12,18] focus on job scheduling techniques for improving the performance of hypercube computers. In [12] the roles of processor allocation and job scheduling in improving the performance of hypercube computers are compared. A new scheduling algorithm is proposed for improving the average response time ....
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V.A. Dixit-Radiya, "Job Scheduling Is More Important than Processor Allocation for Hypercube Computers", IEEE Trans. Parallel and Distributed Systems, vol. 5, no. 5, pp. 488-497, May 1994.
....scheduling model and provide several applications for the model. O(n log n) algorithms are developed for some of the problems formulated and some others are shown to be NP hard. 1 Introduction The problem of scheduling a multiprocessor computer system has received considerable attention [3, 4, 5, 6, 7, 8, 9, 10]. In this paper, we develop a model to schedule a parallel computer system in which the parallel computer operates under control of a host processor. The host processor is referred to as the master processor and the processors in the parallel computer are referred to as the slave processors. The ....
P. Krueger, T. Lai, and V. Dixit-Radiya, Job scheduling is more important than processor allocation for hypercube computers, IEEE Trans. on Parallel & Distributed Systems, 5, 5, 488-497, 1994.
.... truck that delivers the raw material for the job) need not be the same as the one that postprocesses job i (i.e. the truck that brings back the finished goods corresponding to this job). While the problem of scheduling multiprocessor computer systems has received considerable attention [3], [4], [10], [12], [14], [15], [18], [22], it appears that the master-slave model has not been studied prior to the work of Sahni [19]. It is interesting to note that the master-slave scheduling model may be regarded as a variant of the job shop (see [1], [2] for a definition of a job shop as well as for ....
P. Krueger, T. Lai, and V. Dixit-Radiya, Job scheduling is more important than processor allocation for hypercube computers, IEEE Trans. on Parallel & Distributed Systems, 5, 5, 488-497, 1994.
.... truck that delivers the raw material for the job) need not be the same as the one that postprocesses job i (i.e. the truck that brings back the finished goods corresponding to this job). While the problem of scheduling multiprocessor computer systems has received considerable attention [3], [4], [10], [12], [14], [15], [18], [21], it appears that the master-slave model has not been studied prior to the work of Sahni [19]. It is interesting to note that the master-slave scheduling model may be regarded as a variant of the job shop (see [1], [2] for a definition of a job shop as well as for ....
....makespan schedule can be found in O(n log n) time. Fast polynomial-time algorithms to obtain minimum-makespan schedules in which the pre- and postprocessing orders are the same (or reverse) and a job may wait between the completion of one task and the start of the next are also developed in [10]. For no-wait scheduling, the single-master master-slave model and the coupled-task model of [17] are identical. Orman and Potts [17] show that many versions of this latter problem are strongly NP-hard. These results carry over to the no-wait master-slave model. The outline of the rest of this ....
P. Krueger, T. Lai, and V. Dixit-Radiya, Job scheduling is more important than processor allocation for hypercube computers, IEEE Trans. on Parallel & Distributed Systems, 5, 5, 488-497, 1994.
....first may be expected to lead to better performance [16]. But what if there is a correlation between size and running time? If this is an inverse correlation, we find a win-win situation: the larger jobs are also shorter, so packing them first is statistically similar to using SJF (shortest job first) [47]. But if size and runtime are correlated, and large jobs run longer, scheduling them first may cause significant delays for subsequent smaller jobs, leading to dismal average performance [53]. System Correlation CTC SP2 0:029 KTH SP2 0.011 SDSC SP2 0.145 LANL CM 5 0.211 SDSC Paragon 0.305 ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488-497, May 1994.
....manner to submitted jobs. But this approach suffers from fragmentation, where free processors cannot meet the requirements of the next job, and therefore remain idle until additional ones become available. As a result system utilization is typically in the range of 50-80% [21, 16, 8, 11, 15]. It is well known that the best solutions for this problem are to use dynamic partitioning [20] or gang scheduling [6]. However, these schemes have practical limitations. The only efficient and widely used implementation of gang scheduling was the one on the CM-5 Connection Machine; other commercial ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488-497, May 1994.
....of processors becomes available, the job is executed, and the processors are dedicated to it until it terminates or is killed. This scheme is called variable partitioning [3]. Allocating partitions on a FCFS basis results in severe fragmentation, and typical utilization of such systems is 50-80% [6, 7, 9, 12]. Two solutions that have been proposed to this problem, dynamic partitioning [11, 1] and gang scheduling [4], are difficult to implement and do not enjoy much use. A far simpler approach is to use a non-FCFS policy when allocating partitions, for example by allowing small jobs from the back of the ....
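The fragmentation effect described in this excerpt can be illustrated with a small simulation. This is only a sketch under stated assumptions: all jobs are queued at time 0, each job is a hypothetical `(size, runtime)` pair, no job is larger than the machine, and the `allow_skip` variant (start any queued job that fits) is a simplified stand-in for a non-FCFS policy, not the algorithm of the cited paper.

```python
import heapq

def simulate(jobs, total, allow_skip):
    """Simulate variable partitioning on `total` processors.

    jobs:       list of (size, runtime) pairs, queued at time 0 in arrival order
                (hypothetical workload; every size must be <= total)
    allow_skip: False -> strict FCFS, the head of the queue blocks all others
                True  -> any queued job that fits may start (illustrative only)
    Returns (makespan, utilization).
    """
    assert all(size <= total for size, _ in jobs)
    queue = list(jobs)
    running = []                      # min-heap of (finish_time, size)
    free = total
    now = 0
    work = sum(size * runtime for size, runtime in jobs)
    while queue or running:
        # Start every job the policy permits at the current time.
        i = 0
        while i < len(queue):
            size, runtime = queue[i]
            if size <= free:
                free -= size
                heapq.heappush(running, (now + runtime, size))
                queue.pop(i)
            elif allow_skip:
                i += 1                # look further back in the queue
            else:
                break                 # FCFS: head does not fit, so everyone waits
        # Advance to the next completion and release its processors.
        now, size = heapq.heappop(running)
        free += size
        while running and running[0][0] == now:
            free += heapq.heappop(running)[1]
    return now, work / (total * now)

jobs = [(6, 10), (4, 10), (2, 10), (4, 10)]
print(simulate(jobs, 8, allow_skip=False))
print(simulate(jobs, 8, allow_skip=True))
```

On this toy workload strict FCFS leaves two processors idle while the 4-processor job waits at the head of the queue, whereas letting the 2-processor job start early fills the machine.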
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488--497, May 1994.
....(FCFS) manner to submitted jobs. But, this approach suffers from fragmentation, where free processors cannot meet the requirements of the next job and therefore remain idle until additional ones become available. As a result, system utilization is typically in the range of 50-80 percent [21], [16], [8], [11], [15]. It is well known that the best solutions for this problem are to use dynamic partitioning [20] or gang scheduling [6]. However, these schemes have practical limitations. The only efficient and widely used implementation of gang scheduling was the one on the CM-5 Connection ....
P. Krueger, T.-H. Lai, and V.A. Dixit-Radiya, "Job Scheduling Is More Important than Processor Allocation for Hypercube Computers," IEEE Trans. Parallel and Distributed Systems, vol. 5, no. 5, pp. 488-497, May 1994.
....workloads that conform to the different models leads to drastically different results. Consider a workload that is composed of jobs that use power-of-two processors. In this case a reasonable scheduling algorithm is to cycle through the different sizes, because the jobs of each size pack well together [16]. This works well for negatively correlated and even uncorrelated workloads, but is bad for positively correlated workloads [16, 17]. The reason is that under a positive correlation the largest jobs dominate the machine for a long time, blocking out all others. As a result, the average response ....
....that use power-of-two processors. In this case a reasonable scheduling algorithm is to cycle through the different sizes, because the jobs of each size pack well together [16]. This works well for negatively correlated and even uncorrelated workloads, but is bad for positively correlated workloads [16, 17]. The reason is that under a positive correlation the largest jobs dominate the machine for a long time, blocking out all others. As a result, the average response time of all other jobs grows considerably. But which model actually reflects reality? Again, evaluation results depend on the selected ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488-497, May 1994.
....compare many real workload traces and synthetic workload models, both analytically and through simulation, observing their effects on the performance evaluation of several classes of scheduling and allocation strategies. Included were two scheduling algorithms: First Come First Served and ScanUp [16], a multi-level queuing algorithm, and three static allocation strategies: First Fit [24], Frame Sliding [3], and Paging [17]. The real traces were captured from four production machines in use for scientific computing at research labs and supercomputer sites around the world (two IBM SP2s, an ....
....of processors that is a function of the number of free processors in the system and of characteristics of the job. algorithms. Of special interest to our study is Krueger's work on scheduling and allocation performance under workloads exhibiting negative correlations between job size and runtimes [16]. As we shall see, our work further explains some of the phenomena he observed. The only recent study that we know of that has focused on the experimental methodology itself, i.e. the effect choice of workload has on scheduling performance results, is that of Chiang et al. [2]. They compare the ....
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V. A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....job onto it [3, 4, 12, 20]. In most cases, however, processor utilization is far from optimal because of fragmentation. Krueger et al. proposed a new job scheduling scheme, called Scan, and found that job scheduling order, not mapping, is more important to achieve higher processor utilization [11]. All methods, however, are batch scheduling and an interactive programming environment cannot be provided. CM-5 [18] and Paragon [10] provide time-sharing scheduling. In CM-5, partitioning can only be changed at system bootup time, and in Paragon (OSF/1) partitioning and the partition in which a ....
P. Krueger, T.-H. Lai, and V. A. Dixit-Radiya. Job Scheduling Is More Important than Processor Allocation for Hypercube Computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, 1994.
....linear programming has been proposed. The reported good experimental results for the second method have been explained by a particular topology of the criterial function. For $P \mid cube_j \mid C_{\max}$, the Largest Dimension First (LDF) heuristic with tight performance ratio $2 - 1/m$ is proposed in [85]. In [62] an experimental study is reported for dynamic scheduling (i.e. the set of tasks is not known in advance) for problem $P \mid cube_j \mid \sum C_j$. It is observed that even sophisticated processor allocation strategies alone cannot guarantee good performance. A set of Scan strategies is proposed which ....
P. Krueger, T.-H. Lai, V.A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers", IEEE Transactions on Parallel and Distributed Systems 5/5 (1994) 488-497.
....class of contiguous allocation strategies restricts the nodes allocated to a given job to form a convex shape. Fig. 1 shows an example. Performance suffers significantly due to processors being wasted because of internal and external fragmentation. Utilizations of only 34% to 66% are reported [9, 18, 8, 10]. In contrast, the class of non-contiguous allocation strategies allocates nodes that are dispersed throughout the system. Fig. 3 shows examples. They experience no fragmentation and thus outperform contiguous strategies, reaching utilizations of up to 78% [10, 15, 14]. To further improve the ....
....machine. [Figure 2: External contention for the link between (6,0) and (7,0). Legend: unallocated node; node allocated to job A; node allocated to job B; link affected by job A; link affected by job B.] Using First Come, First Served scheduling, utilizations of only 34% to 66% are reported [9, 18, 8, 10] due to serious fragmentation problems. This is unfavorable and contradicts the goal of high throughput over a stream of jobs. 2.2 Non-contiguous allocation strategies and contention. To utilize unallocated nodes that are not necessarily contiguous, non-contiguous processor allocation strategies ....
P. Krueger, T. Lai, and V. A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....compare many real workload traces and synthetic workload models, both analytically and through simulation, observing their effects on the performance evaluation of several classes of scheduling and allocation strategies. Included were two scheduling algorithms: First Come First Served and ScanUp [15], a multi-level queueing algorithm, and three static allocation strategies: First Fit [23], Frame Sliding [3], and Paging [16]. The real traces were captured from four production machines in use for scientific computing at research labs and supercomputer sites around the world (two IBM SP2s, an ....
....and processor allocation algorithms have offered insights into the effects of workload characteristics on those algorithms. Of special interest to our study is Krueger's work on scheduling and allocation performance under workloads exhibiting negative correlations between job size and runtimes [15]. As we shall see, our work further explains some of the phenomena he observed. The only recent study that we know of that has focused on the experimental methodology itself, i.e. the effect choice of workload has on scheduling performance results, is that of Chiang et al. [2]. They compare the ....
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V. A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....first come first serve (FCFS) manner to submitted jobs. But this approach suffers from fragmentation, where free processors cannot meet the requirements of the next job, and therefore remain idle until additional ones become available. As a result system utilization is typically in the range of 50-80% [21, 16, 8, 11, 15]. It is well known that the best solutions for this problem are to use dynamic partitioning [20] or gang scheduling [6]. However, these schemes have practical limitations. The only efficient and widely used implementation of gang scheduling was the one on the CM-5 Connection Machine; other ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488-497, May 1994.
....that execute the job is the same as that requested by the job. A job scheduler needs to select a job to dispatch in appropriate order so as to execute multiple jobs efficiently. Many job scheduling algorithms, e.g. conventional FCFS (First Come First Served), LJF [1], Backfilling [2, 3], Scan [4], etc., have been proposed, and the performance of these algorithms has been evaluated. Many previous performance evaluation works assumed that characteristics of parallel jobs, or a parallel workload, followed a simple mathematical model. However, recent analysis of real workload logs, which are ....
....investigated performance of job scheduling algorithms under more realistic workloads, which have the above job size characteristics [5, 6, 7, 9]. For instance, Lo, Mache and Windisch compared the performance of job scheduling algorithms under various workload models. They showed that the ScanUp algorithm [4] performed well, or increased processor utilization, as the proportion of jobs requesting power-of-two processors in the workload increased. However, the mechanisms by which job size characteristics affect the performance of job scheduling algorithms are not yet clear. This paper presents ....
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V. A. Dixit-Radiya. Job Scheduling Is More Important than Processor Allocation for Hypercube Computers. IEEE Trans. on Parallel and Distributed Systems, 5(5):488--497, 1994.
....higher resource utilization, than FCFS. However, the fragmentation can still be quite large [363]. Despite the intuitive appeal of some of these scheduling policies, studies indicate that they do not necessarily perform better than a straightforward FCFS strategy in a partitioned environment [331, 330] [footnote 7]. Moreover, systems using these schemes tend to saturate under lighter loads than FCFS. Using uniform requests for subcubes of different sizes in a hypercube as a concrete example, saturation may occur at loads as low as 40-50% of capacity [330]. One simple improvement that has been ....
....FCFS strategy in a partitioned environment [331, 330] [footnote 7]. Moreover, systems using these schemes tend to saturate under lighter loads than FCFS. Using uniform requests for subcubes of different sizes in a hypercube as a concrete example, saturation may occur at loads as low as 40-50% of capacity [330]. One simple improvement that has been suggested is to change the selection algorithm on-line, depending on the characteristics of the workload. Thus a decreasing order of jobs will be used to improve utilization when large batch jobs are dominant, but small jobs will be scheduled first when the ....
[Article contains additional citation context not shown here]
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488--497, May 1994.
....first may be expected to cause less fragmentation, and therefore higher resource utilization, than FCFS. Despite the intuitive appeal of some of these scheduling policies, studies indicate that they do not necessarily perform better than a straightforward FCFS strategy in a partitioned environment [203, 202] [footnote 5]. Moreover, systems using these schemes tend to saturate under lighter loads than FCFS. Using uniform requests for subcubes of different sizes in a hypercube as a concrete example, saturation may occur at loads as low as 40-50% of capacity [202]. [Footnote 5: The more optimistic results in [230, 217] ....
....FCFS strategy in a partitioned environment [203, 202] [footnote 5]. Moreover, systems using these schemes tend to saturate under lighter loads than FCFS. Using uniform requests for subcubes of different sizes in a hypercube as a concrete example, saturation may occur at loads as low as 40-50% of capacity [202]. [Footnote 5: The more optimistic results in [230, 217] are due to a model in which threads are independent, kept in a global queue, and PEs are allocated singly. In such a model, there is no loss to fragmentation.] [Figure: PEs versus time; labels: now, execution time, reservation for large job, terminated, expected ....]
[Article contains additional citation context not shown here]
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488--497, May 1994.
....jobs are submitted via a queuing system such as NQS. This approach results in severe fragmentation, because processors which cannot fulfill the demands of the next job in the queue must remain idle until more processors are freed. FCFS-based schedulers show a typical system utilization of 50-80% [6, 10, 12, 16]. Two solutions have been proposed to this problem, but both suffer from practical limitations. The first is dynamic partitioning [15], in which jobs may gain or lose some of their processors dynamically during their lifetime. Jobs may also be halted in favor of other jobs, and renewed later, ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488--497, May 1994.
....and partitions may be formed using arbitrary subsets of processors. A rigid job is submitted for execution along with a specification of the number of processors that it requires. The scheduler then creates a partition of that size and schedules the job to execute within that partition [53,32,20,1,9,31,33]. With moldable jobs, it is the scheduler that selects the partition size [44]. Evolving and malleable jobs require partitions that are not only flexible but can also change dynamically at runtime. This places an added burden both on the programmer, who must write application code that requests ....
....This scheme has been called variable partitioning or pure space sharing in the literature. Despite its simplicity and the resulting drawbacks in terms of responsiveness, fragmentation, and reliability, this scheme is widely used. It is especially common on large distributed-memory machines [53,27,20,9,31]. The reason is that it gets the job done, albeit not optimally, but with relatively little investment in system development. In an industry where time to market is a crucial element of success, this is a true virtue [1]. As a result, users sometimes have to revert to signup sheets as the actual ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488--497, May 1994.
....directly, and to batch jobs that are submitted via a queueing system such as NQS. But this approach suffers from fragmentation, where processors cannot meet the requirements of the next queued job and therefore remain idle. As a result system utilization is typically in the range of 50-80% [12, 9, 4, 7]. It is well known that the best solutions for this problem are to use dynamic partitioning [11] or gang scheduling [3]. However, these schemes have practical limitations. The only efficient and widely used implementation of gang scheduling was the one on the CM-5 Connection Machine; other ....
P. Krueger, T-H. Lai, and V. A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers". IEEE Trans. Parallel & Distributed Syst. 5(5), pp. 488--497, May 1994.
....it requests. External fragmentation exists when a sufficient number of processors are available to satisfy a request, but they cannot be allocated contiguously. Experimental evidence has shown that little improvement in performance can be realized by refinements of contiguous allocation algorithms [15]. As a result, recent research efforts have focused on the choice of job scheduling policies and their impact on contiguous allocation schemes. Our research takes a different approach to overcoming the limitations of contiguous allocation. We are investigating processor allocation algorithms which ....
....strategies, Frame Sliding, First Fit and Best Fit, which were proposed by other researchers. Variations of these conventional algorithms, as well as an interesting algorithm with similarities to MBS, have been implemented on real mesh and hypercube systems. Work by Phillip Krueger et al. [15] describes the performance limitations of all contiguous allocation schemes and thus motivates our investigation of non-contiguous approaches. 2-D Buddy. The two-dimensional buddy strategy, a generalization of the one-dimensional binary buddy system for memory management [26], [14], is proposed by ....
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V. A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....high. When servicing a dynamic workload under certain job scheduling policies, significant pockets of wasted processors, too small for most jobs, become scattered around the system and lead to impaired utilization of the system's processor resources. In binary hypercubes, Krueger, Lai and Radiya [6] have shown that the maximum utilization attainable by contiguous allocation for uniform workloads under FCFS scheduling is 58%, and we have shown that the maximum for 2D mesh architectures is only 46% [8]. Significant improvement is achievable by the use of improved job scheduling policies. ....
....FCFS scheduling policy is used. This is due to the nature of the contiguity constraint and will be studied quantitatively in Section 6, where we will compare many allocation strategies by simulation. Improved performance requires exploration of other alternatives, including scheduling policies [6] and the approach we propose: non-contiguous allocation. 5 New Non-contiguous Strategies. Non-contiguous processor allocation algorithms overcome the fragmentation drawback of contiguous strategies by allowing a job's allocation to be dispersed between many non-adjacent regions of the topology ....
Phillip Krueger, Ten-Hwang Lai, and Vibha A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....recognition ability and time complexity. The buddy allocation scheme is simple and has low time complexity. In spite of its low subcube recognition ability, it has been shown that the buddy strategy performs well in hypercube systems when compared with the perfect-recognition allocation schemes [9]. The scheduling schemes proposed for non-real-time hypercubes include scan [9] and lazy [10]. Krishnamurti and Narahari have proposed a preemptive scheduling scheme for non-real-time, partitionable parallel architectures [8]. Hong and Leung have proposed an on-line scheduling algorithm for ....
....and has low time complexity. In spite of its low subcube recognition ability, it has been shown that the buddy strategy performs well in hypercube systems when compared with the perfect-recognition allocation schemes [9]. The scheduling schemes proposed for non-real-time hypercubes include scan [9] and lazy [10]. Krishnamurti and Narahari have proposed a preemptive scheduling scheme for non-real-time, partitionable parallel architectures [8]. Hong and Leung have proposed an on-line scheduling algorithm for real-time tasks with one common deadline [5]. Sahni has shown that when the common ....
P. Krueger, T. H. Lai, and V. A. Radiya, "Job Scheduling is More Important than Processor Allocation for Hypercube Computers," IEEE Trans. on Parallel and Distributed Systems, May 1994.
....to detect free nodes and to decide whether the free nodes can form a submesh for the execution of a job. Allocation algorithms with better recognition ability for available submeshes can improve the chance of assigning a job into the system and reduce the waiting delay. However, as studies [8, 9] showed, significant performance improvement cannot be obtained by refining the conventional allocation algorithms. A multicomputer is typically underutilized when operated in a dynamic environment because of fragmentation problems. Fragmentation occurs when there are free nodes in the system but ....
....comm so that the turnaround time of a job in Equation 1 can benefit from shorter queuing delay without being penalized for the communication latency. Experimental study indicates that the queuing delay can be greatly reduced, resulting in a high $\Delta T_{queue}$, if the fragmentation can be reduced [8, 9, 10, 11]. Design of faster switching devices and wormhole routing techniques have made the message-passing latency insensitive to the communication distance, making $\Delta T_{comm}$ negligible. Non-contiguous allocation is hence a very attractive alternative for processor allocation. Non-contiguous allocations ....
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V. A. Dixit-Radiya, "Job Scheduling is More Important than Processor Allocation for Hypercube Computers," IEEE Trans. on Parallel and Distributed Systems, 5, pp. 488-497, May 1994.
....only be made based on the past and current arrivals of the tasks. Task scheduling strategies can be generally classified as eager scheduling and lazy scheduling. Eager scheduling attempts to schedule the tasks whenever there are free processors available. For example, the SCAN policy proposed in [2] classifies the tasks according to the number of processors requested. When there are tasks finishing execution and releasing processors, the scheduler examines the classes in a round-robin fashion. All tasks in the same class are allocated before the scheduler moves on to the next class. One major ....
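The class-based, round-robin behaviour described in this excerpt can be sketched as follows. This is a minimal illustration of one reading of the excerpt, not the algorithm as published: the function name `scan_dispatch`, the data layout, and the rule that the scheduler waits (rather than moving on) while the current class still has a task that does not fit are all assumptions.

```python
from collections import deque

def scan_dispatch(classes, sizes, current, free):
    """One dispatch pass of a Scan-like policy (simplified, hypothetical sketch).

    classes: dict mapping requested size -> deque of waiting task names
    sizes:   list of class sizes in round-robin scan order
    current: index into `sizes` of the class currently being served
    free:    number of free processors
    Returns (started, current, free), where `started` lists (task, size) pairs.
    """
    started = []
    empty_seen = 0                    # consecutive empty classes visited
    while empty_seen < len(sizes):
        size = sizes[current]
        queue = classes[size]
        if queue:
            if size > free:
                break                 # class not drained yet: wait for more releases
            started.append((queue.popleft(), size))
            free -= size
            empty_seen = 0
        else:
            current = (current + 1) % len(sizes)   # class drained, move to the next
            empty_seen += 1
    return started, current, free

classes = {1: deque(["a", "b"]), 2: deque(["c"]), 4: deque(["d", "e"])}
print(scan_dispatch(classes, [1, 2, 4], current=0, free=8))
```

Calling the function again whenever processors are released continues the scan from the class where the previous pass stopped.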
Krueger, P., Lai, T.H., Dixit-Radiya, V.A.: Job scheduling is more important than processor allocation for hypercube computers. IEEE Trans. on Parallel and Distributed Systems 5 (1994)
....by experimentation on the Paragon under SUNMOS, an operating system that provides high communication performance. It has been argued that the number of processors assigned to the application is more important than the partition geometry or the process mapping within the assigned partition [LKD94]. In this paper, we have demonstrated that the data and/or process placement scheme can become a source of internal network contention and limit workload scalability, even for a small number of processors. Future work includes studying the impact on the workload execution time of external ....
P. Krueger, T. Lai, V.A. Dixit-Radiya, "Job scheduling is more important than processor allocation for hypercube computers," IEEE Trans. on Parallel and Distributed Systems, Vol. 5, No. 5, pp 488-497, May 1994.
....than the Simple Buddy scheme. The simulations show that there is hardly any gain from using a more exhaustive scheme, as predicted by our efficiency metric. Hence we would expect that future research should be addressed towards devising more efficient scheduling strategies as shown by Krueger [4]. The rest of the paper is organized as follows: Section 2 presents some background information about the Star Graph and some useful theorems are derived. Relevant properties of the k-ary n-cube are discussed in section 3. The efficiency metric is introduced in section 4. The simulation results ....
....uniform and uniformly decreasing. The uniform distribution uses a constant probability $p(r) = 1/(n-1)$ for the request size, while the uniformly decreasing distribution uses probability $p(r) = c/r$ such that $\sum_{r=1}^{n-1} c/r = 1$. The utilization of the system can be determined by [4]: $$\frac{\text{mean request size} \times \text{mean allocation rate} \times \text{mean residence time}}{\text{total number of nodes in the Star Graph}}$$ We compared a Simple Buddy and an Extended Buddy scheme. The Extended Buddy required n times the storage as compared to the Simple Buddy scheme. Their respective recognition ....
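The two request-size distributions and the utilization expression quoted in this excerpt can be checked numerically. The sketch below assumes that request sizes r range over 1, ..., n-1 (as the normalization condition suggests); the function names and the sample numbers in the last line are hypothetical and are not taken from the cited paper.

```python
from fractions import Fraction

def uniform_pmf(n):
    """Uniform request-size distribution: p(r) = 1/(n-1) for r = 1..n-1."""
    return {r: Fraction(1, n - 1) for r in range(1, n)}

def decreasing_pmf(n):
    """Uniformly decreasing distribution: p(r) = c/r, with c chosen so the
    probabilities over r = 1..n-1 sum to 1."""
    c = 1 / sum(Fraction(1, r) for r in range(1, n))
    return {r: c / r for r in range(1, n)}

def utilization(mean_request_size, mean_allocation_rate,
                mean_residence_time, total_nodes):
    """Utilization = (mean request size x mean allocation rate
    x mean residence time) / total number of nodes."""
    return (mean_request_size * mean_allocation_rate
            * mean_residence_time) / total_nodes

print(sum(decreasing_pmf(5).values()))   # exact rational sum of the probabilities
print(utilization(4, 0.5, 10, 40))       # made-up inputs, for illustration only
```

Using exact fractions makes it easy to confirm that the constant c normalizes the decreasing distribution without floating-point rounding.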
[Article contains additional citation context not shown here]
P. Krueger, T. Lai, and V. Radiya, "Job scheduling is more important than processor allocation for hypercube computers," IEEE Trans. Parallel and Distrib. Systems., vol. 5, pp. 488 -- 497, May 1994.
....supercomputing sites. The class of contiguous allocation strategies restricts the nodes allocated to a given job to form a convex shape. Performance suffers significantly due to processors being wasted because of internal and external fragmentation. Utilizations of only 34% to 66% are reported [2, 10, 1, 3]. The class of non-contiguous allocation strategies allocates nodes that are dispersed throughout the system. They experience no fragmentation and thus outperform contiguous strategies, reaching utilizations of 78% [3, 8, 7]. To further improve the performance of non-contiguous strategies, it is ....
P. Krueger, T. Lai, and V. A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....it requests. External fragmentation exists when a sufficient number of processors are available to satisfy a request, but they cannot be allocated contiguously. Experimental evidence has shown that little improvement in performance can be realized by refinements of contiguous allocation algorithms [5]. As a result, recent research efforts have focused on the choice of scheduling policies and their impact on contiguous allocation schemes. Our research takes a different approach to overcoming the limitations of contiguous allocation. We are investigating processor allocation algorithms which ....
....results. Section 6 summarizes our results and discusses future work. 2 Previous Research Work The Multiple Buddy Strategy proposed in this paper is an extension of the 2-D Buddy Strategy. Our simulations compare the performance of MBS with Frame Sliding, First Fit and Best Fit. The Krueger paper [5] describes the performance limitations of all contiguous allocation schemes and thus motivates our investigation of non-contiguous approaches. The two-dimensional buddy strategy, a generalization of the one-dimensional binary buddy system for memory management, is proposed by Li and Cheng [6] for ....
[Article contains additional citation context not shown here]
Phillip Krueger, Ten-Hwang Lai, and Vibha A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
....ability. On the other hand, schemes like MSS and PCgraph have the complete subcube recognition ability at the cost of high time complexity. Comparison of all the hypercube allocation policies shows that the performance improvement due to better subcube recognition ability is not significant [9,12]. The difference between a policy having perfect subcube recognition ability and the simple buddy scheme is minimal. This is mainly because of the first come first serve (FCFS) discipline used for job scheduling. In a dynamic environment, FCFS scheduling may not efficiently utilize the system. ....
....logical to focus attention on efficient scheduling schemes to improve system performance while keeping the allocation complexity minimal. Little attention has been paid towards scheduling of jobs in a distributed system like hypercube. The first effort in this direction was by Krueger et al. [12]. They propose a scheme called scan, which segregates the jobs and maintains a separate queue for each possible cube dimension. This eliminates the blocking problem associated with the FCFS scheme. The queues are served similar to the C-SCAN used in disk scheduling. It is shown that a significant ....
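The idea described in this excerpt, one queue per requested cube dimension with the queues visited in a circular sweep, can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the algorithm as published: the class name `ScanScheduler`, its methods, and the rule of draining the current queue before advancing are all hypothetical, and subcube allocation itself is omitted.

```python
from collections import deque


class ScanScheduler:
    """Hypothetical sketch: one FIFO queue per requested cube dimension,
    served in a circular sweep over the dimensions."""

    def __init__(self, max_dim):
        # queues[d] holds jobs that request a d-dimensional subcube
        self.queues = [deque() for _ in range(max_dim + 1)]
        self.current = 0  # dimension whose queue is currently being served

    def submit(self, job, dim):
        self.queues[dim].append(job)

    def next_job(self):
        """Return (job, dim) from the current queue if it is non-empty,
        otherwise from the next non-empty queue in circular order.
        Returns None when every queue is empty."""
        n = len(self.queues)
        for step in range(n):
            d = (self.current + step) % n
            if self.queues[d]:
                self.current = d
                return self.queues[d].popleft(), d
        return None
```

Because jobs of different sizes wait in different queues, a large request at the head of one queue does not hold up smaller requests in the others, which is the blocking problem the excerpt attributes to FCFS.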
[Article contains additional citation context not shown here]
Krueger, P., Lai, T. H., Radiya, V. A. Job Scheduling is More Important than Processor Allocation for Hypercube Computers. IEEE Trans. on Parallel and Distributed Systems, May 1994, pp. 488-497.
....runs very high. When servicing a dynamic workload under FCFS scheduling, significant pockets of wasted processors become scattered around the system. These regions of wasted processors lead to impaired utilization of the system's processor resources. In binary hypercubes, Krueger, Lai and Radiya [7] have shown that the maximum utilization attainable by contiguous allocation for uniform workloads is 58%. For the mesh architectures, we have shown in [9] that the maximum utilization attainable by contiguous allocation is 46%. Despite this drawback, contiguous strategies remain the most commonly ....
....allocation strategies for the k-ary n-cube. to the nature the contiguity constraint and will be studied quantitatively in Section 6, where we will compare many allocation strategies by simulation. Improved performance requires exploration of other alternatives, including scheduling policies [7] and the approach we propose: non-contiguous allocation. Table 1 compares the recognition capability and complexity of the contiguous strategies presented. 5 New Non-contiguous Strategies Non-contiguous processor allocation algorithms overcome the fragmentation drawback of contiguous strategies ....
[Article contains additional citation context not shown here]
Phillip Krueger, Ten-Hwang Lai, and Vibha A. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 5(5):488--497, May 1994.
No context found.
P. Krueger, T.-H. Lai, and V. Dixit-Radiya. Job scheduling is more important than processor allocation for hypercube computers. IEEE Trans. on Parallel and Distributed Systems, 5(5):488--497, 1994.
No context found.
P. Krueger, T. H. Lai, and V. A. Radiya, "Job Scheduling is More Important than Processor Allocation for Hypercube Computers," IEEE Transactions on Parallel and Distributed Systems, vol. 5, pp. 488 -- 497, May 1994.