Cathy McCann and John Zahorjan. Processor Allocation Policies for Message-Passing Parallel Computers. In Proceedings of the 1994.
....most cases, however, it is more difficult to support malleability in the way an application is written. One way of attaining a limited form of malleability is by creating as many threads in a job as the largest number of processors that would ever be used, and then using multiplexing (or folding [51,38]) to have the job execute on a lesser number of processors. Alternatively, a job can be made malleable by inserting application specific code at particular synchronization points to repartition the data in response to any change in processor allocation. The latter approach is somewhat more ....
....of the difference in performance can be obtained by using knowledge of job characteristics, and assigning non-preemptive priorities to certain job classes for admission to fixed partitions. (Footnote: Actually, EASY only guarantees that the first job in the queue will not be delayed.) McCann and Zahorjan [51] found that efficiency-preserving scheduling using folding allowed performance to remain much better than with equipartitioning (EQUI) as load increases. Padhye and Dowdy [60] compare the effectiveness of treating jobs as moldable to that of exploiting their malleability by folding. They find ....
C. McCann and J. Zahorjan, "Processor allocation policies for message passing parallel computers". In SIGMETRICS Conf. Measurement & Modeling of Comput. Syst., pp. 19--32, May 1994.
....current practice, supercomputer schedulers accept rigid requests [17, 20, 22, 27], and thus much of the research available in the literature assumes jobs to be rigid, e.g. [1, 2, 14, 21, 33]. Closer to our own work, there have been studies on processor allocation [3, 8, 10, 12, 16, 19, 23, 25, 28, 29, 30, 31]. Processor allocation consists of enabling the supercomputer scheduler to select how many processors to allocate to a parallel job based on information about the characteristics of the job (e.g. sequential fraction, average parallelism, and maximum parallelism) and/or ....
....to be fully moldable in the sense that they can use any number of processors, and users typically do not provide request times. Strategies that use knowledge about the job have been proposed [3, 8, 12, 25, 28, 30]. Adaptation to the system load has also been investigated before [3, 19, 23]. Downey has studied whether the job at the head of a FIFO queue should delay its start-up to use more processors [10]. Non-work-conserving strategies were also evaluated by Rosti et al. [29]. The results of these efforts indicate that performance improves when processor allocation takes into account job ....
C. McCann and J. Zahorjan. "Processor allocation policies for message-passing parallel computers". Proc. of the 1994.
....[9] has developed a method for delay prediction that may be more useful. There has been a lot of research on how applications can use different numbers of processors in order to adjust to current load conditions. Two main categories are adaptive partitioning [5, 7, 8] and dynamic partitioning [4, 6]. Adaptive partitioning algorithms decide job partition sizes before the jobs start. Dynamic partitioning can change a job's partition size while it is running. Most of these studies use response time as the metric and assume that all jobs can run on any partition size with the same ....
C. McCann and J. Zahorjan, "Processor Allocation Policies for Message-Passing Parallel Computers," in Proceedings of the 1994.
....in our analysis of dynamic partitioning. The times at which jobs arrive to the system are defined by the distributions $A_i$, $0 \le i \le N$, which are dependent upon the number of jobs $i$ in the system. These arrival times are most often modeled by a Poisson distribution in the research literature [4, 42, 21, 22, 33, 17, 32, 25]. We thus assume that jobs come .... (Footnote 3) The details of exactly how the dynamic partitioning policy allocates processors to jobs when $i$ does not evenly divide $P$, as well as the service rates for each of these cases, are easily incorporated in our model (see Sections 2 and 3, and [37, 38]). Here we make ....
C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Proc. ACM SIGMETRICS Conf., 19--32, 1994.
....can in general limit and/or eliminate the potential system performance benefits. We have been exploring several variants of dynamic partitioning to address these issues. One strategy for decreasing the overheads associated with dynamic equipartitioning is to use the folding approach found in [24], which reduces the number of reconfigurations performed under the greedy dynamic policy, at the expense of a less equitable allocation of the nodes among the competing jobs. Another approach consists of using the equipartitioning method to equally divide the nodes among the jobs in the system ....
C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 19--32, May 1994.
....multiprocessor architecture, and special requirements of certain application workloads are among the factors affecting the effectiveness of the processor allocation strategies for multiprogramming parallel systems. Several processor allocation policies have been proposed in the literature [Oust82, MEB88, Sev89, TG89, PD89, LV90, DCDP90, ZM90, GST91, GTU91, MEB91, ZB91, MVZ93, SST93, RSDSC94, Sev94, CMV94, MZ94]. Each of these policies is designed to perform well under certain conditions. In this paper preemptive and non-preemptive space-sharing policies are considered. Under preemptive policies, parallel programs can be stopped during execution to allow for resource redistribution according to changing ....
....are considered. Under preemptive policies, parallel programs can be stopped during execution to allow for resource redistribution according to changing system loads. Time-sharing policies are inherently preemptive in nature. Several space-sharing preemptive policies have also been proposed [TG89, MZ94]. Performance analysis of preemptive space-sharing scheduling policies has been largely based on simulation studies and Markovian analysis of small systems. For a simulation study to be accurate and realistic, detailed knowledge of various parameters of the system under consideration is necessary. ....
C. McCann, J. Zahorjan, "Processor allocation policies for message-passing parallel computers," Proc. ACM SIGMETRICS, 1994, pp. 19-32.
....strategies for multiprogramming parallel systems. In this paper, effects of some of these factors on the performance of dynamic and adaptive space sharing policies are investigated empirically. Several dynamic and adaptive processor allocation policies have been discussed in the literature [MEB88, TG89, PD89, LV90, DCDP90, ZM90, GTU91, ZB91, MVZ93, SST93, RSDSC94, SEV94, CMV94, MZ94]. Performance analysis of dynamic space sharing scheduling policies has been largely based on simulation studies and Markovian analysis of small systems. For a simulation study to be accurate and realistic, detailed knowledge of various parameters of the system under consideration is necessary. ....
....policies have been done mainly for shared-memory architectures [GTU91, TG89]. In this paper, experimental analysis of dynamic processor partitioning policies for a message-passing architecture is presented. To this end, two dynamic processor partitioning policies based on those discussed in [MZ94] and one adaptive policy presented in [RSDSC94] are implemented on the Intel Paragon. Two workload programs are used to compare the behavior of the implemented scheduling policies. One is a synthetic workload program designed to emulate various speedup curves and other characteristics of common ....
C. McCann, J. Zahorjan, "Processor allocation policies for message-passing parallel computers," Proc. ACM SIGMETRICS, 1994, pp. 19-32.
....the needs of both interactive applications belonging to workstation owners and multiple batch applications trying to scavenge idle cycles from the cluster. One of the scheduling policies that has been shown to have good performance in dedicated parallel environments is dynamic space sharing [6, 13, 17]. Under dynamic space sharing, the processors of a parallel system are divided into disjoint partitions that are allocated to individual jobs. However, the number of processors allocated to a partition can be changed dynamically in response to events such as new job arrivals or job departures. In ....
....applications, and presents measurement results for the cost of dynamic reconfiguration. Section 5 describes future work and conclusions. [Section 2: Motivation for Dynamic Spacesharing] In the last few years, several studies have examined job scheduling strategies for dedicated parallel environments [6, 13, 11, 7, 17]. Policies that employ dynamic space sharing have been shown to outperform other policies such as static space sharing and gang scheduling with static partitioning for many workloads. In addition to its advantages in dedicated environments, dynamic space sharing has some natural advantages in ....
C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Proceedings of 1994 ACM Sigmetrics Conference, pages 19--32, Nashville, May 1994.
....running on each processor. Various techniques can be used to minimize the context switch overheads associated with this approach, as long as threads do not synchronize too frequently [GTU91]. In message-passing systems, threads can be folded to keep the work assigned to each processor balanced [MZ94]. Second, the operating system can cooperate with the application when changing processor allocation so that the application can repartition and/or redistribute the data and change the number of active processes accordingly [TG89]. This allows the application to continue to execute efficiently ....
Cathy McCann and John Zahorjan. Processor allocation policies for message-passing parallel computers. In Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 19--32, 1994.
....establishments, resource management of these machines becomes a more pressing issue. Job scheduling is an important aspect of this work, and much research on the topic of job scheduling for parallel computers has been done in recent years. Some have focused on distributed-memory machines [7, 8, 9, 11, 14], others on more general systems [3, 6, 10, 15]. Many different strategies (static vs. dynamic, time sharing vs. space sharing, etc.) have been investigated and compared. All these studies make underlying assumptions about the workload in order to quantify their results. Some of these studies use ....
C. McCann and J. Zahorjan. "Processor Allocation Policies for Message-Passing Parallel Computers". In Proceedings of ACM SIGMETRICS Conference, 1994. 22(1): p. 19-32.
....to a dynamically changing number of PEs is possible in other programming models as well, but it requires significant effort on the side of the program developer. For example, jobs written for distributed-memory machines can adjust to a changing number of PEs by redistributing their data structures [422, 171, 396]. In some cases, such redistribution is supported by the system (albeit with the intention of redistribution for different phases of the computation, not for changes in the available PEs) [23, 378]. As this involves considerable overhead, it is imperative that the frequency of reconfigurations be ....
....large partitions [620, 278]. Efficiency can also become an issue. Consider a job that is written so that the work is divided into 8 equal pieces (e.g. 8 chores). Running such a job on a partition of 7 PEs would not accrue any benefits over a partition of only 4 PEs, leading to waste of resources [396]. A possible solution is to use a folding policy rather than an equipartition policy (Fig. 15). This policy assumes that the total number of PEs is a power of two, and that all applications are designed to execute optimally on the full machine. When a new job arrives, the largest .... (Footnote 8) Acquisition of ....
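The 8-chores observation in the context above can be checked with a small calculation. The sketch below is illustrative only and is not taken from any of the cited papers: it assumes the chores are indivisible, take equal time, and that each PE runs one chore at a time with no overhead. The function name `makespan_in_chore_units` is hypothetical.

```python
import math


def makespan_in_chore_units(num_chores: int, num_pes: int) -> int:
    """Rounds needed to finish equal-sized, indivisible chores.

    Assumption (not from the source): each PE runs one chore per round,
    so the job finishes after ceil(num_chores / num_pes) rounds.
    """
    if num_chores <= 0 or num_pes <= 0:
        raise ValueError("num_chores and num_pes must be positive")
    return math.ceil(num_chores / num_pes)


if __name__ == "__main__":
    # Compare a few partition sizes for a job split into 8 chores.
    for pes in (4, 7, 8):
        print(pes, "PEs ->", makespan_in_chore_units(8, pes), "rounds")
```

Under these assumptions, 4 PEs and 7 PEs both need two rounds, so the three extra PEs in the larger partition do not shorten the job; only the full 8 PEs bring it down to one round.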
C. McCann and J. Zahorjan, "Processor allocation policies for message passing parallel computers". In SIGMETRICS Conf. Measurement & Modeling of Comput. Syst., pp. 19--32, May 1994.
....to a dynamically changing number of PEs is possible in other programming models as well, but it requires significant effort on the side of the program developer. For example, jobs written for distributed-memory machines can adjust to a changing number of PEs by redistributing their data structures [260, 97, 239]. As this involves considerable overhead, it is imperative that the frequency of reconfigurations be kept low in practice: otherwise the price of reconfiguration might even outweigh the benefits of changing the PE allocation [272, 97, 327]. Alternatively, a model where reconfigurations are only ....
....different job classes [259]. Efficiency can also become an issue. Consider a job that is written so that the work is divided into 8 equal pieces (e.g. 8 chores). Running such a job on a partition of 7 PEs would not accrue any benefits over a partition of only 4 PEs, leading to waste of resources [239]. A possible solution is to use a folding policy rather than an equipartition policy (Fig. 13). This policy assumes that the total number of PEs is a power of two, and that all applications are designed to execute optimally on the full machine. When a new job arrives, the largest current partition ....
C. McCann and J. Zahorjan, "Processor allocation policies for message passing parallel computers". In SIGMETRICS Conf. Measurement & Modeling of Comput. Syst., pp. 19--32, May 1994.
....between processes makes the decision process more intricate. It needs to consider, at least, the communication patterns of the whole application, as well as the global state of the system. Most studies in this field have been done either for distributed systems (LAN) [9, 16] or for parallel machines [19, 17]. Only a few have been made for large systems (Utopia [25] or Condor Flock [12]), and often with severe limitations. Now, LB systems have to face another challenge: increasing connectivity, together with widely supported communication libraries such as PVM [3] or MPI [24], allow the development of ....
Catherine M. McCann. Processor Allocation Policies for Message-Passing Parallel Computers. PhD thesis, University of Washington, September 1994.
....write application code that requests and adapts to such changes, and on the scheduler, that must handle the re-allocation decisions and coordinate them with the applications. One common heuristic for dynamic partitioning is to strive for equal-sized partitions (usually called equipartitioning) [35]. The problem with this approach is that it might require all jobs to be interrupted whenever something changes. An alternative is to use folding [35]. With folding, the number of processors allocated to a job can only grow or shrink by factors of 2. That is, the partition size may be halved or ....
....with the applications. One common heuristic for dynamic partitioning is to strive for equal-sized partitions (usually called equipartitioning) [35]. The problem with this approach is that it might require all jobs to be interrupted whenever something changes. An alternative is to use folding [35]. With folding, the number of processors allocated to a job can only grow or shrink by factors of 2. That is, the partition size may be halved or doubled. When a partition is halved, the runtime system may choose to simply fold over the application, and time slice two tasks on each processor. ....
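The folding idea described in the context above (halve the partition and time-slice two tasks per processor) can be pictured with a short sketch. This is an illustration under stated assumptions, not the mechanism of the cited paper: the modulo placement rule, the requirement that the task count be a multiple of the processor count, and the name `fold_tasks` are all hypothetical choices made for the example.

```python
from typing import Dict, List


def fold_tasks(num_tasks: int, num_processors: int) -> Dict[int, List[int]]:
    """Place tasks on processors so that every processor gets the same count.

    Assumption (not from the source): task t is placed on processor
    t % num_processors, and num_tasks is a multiple of num_processors.
    """
    if num_processors <= 0 or num_tasks % num_processors != 0:
        raise ValueError("num_tasks must be a positive multiple of num_processors")
    placement: Dict[int, List[int]] = {p: [] for p in range(num_processors)}
    for task in range(num_tasks):
        placement[task % num_processors].append(task)
    return placement


if __name__ == "__main__":
    # An 8-task job on its full partition, then folded onto half of it.
    print(fold_tasks(8, 8))  # one task per processor
    print(fold_tasks(8, 4))  # two tasks time-sliced on each processor
```

Because the partition only changes by a factor of 2, each processor in the halved partition ends up with exactly two tasks, which keeps the load balanced without repartitioning the application's data.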
C. McCann and J. Zahorjan, "Processor allocation policies for message passing parallel computers". In SIGMETRICS Conf. Measurement & Modeling of Comput. Syst., pp. 19--32, May 1994.
....by considering the heterogeneity and time-sharing effects [11]. Scheduling policies on multiprocessor/multicomputer systems are classical topics, and have been studied by many people on different types of systems, such as shared-memory systems [8, 10], distributed-memory hypercube [3] and mesh [6]. Since these policies/algorithms are designed for a dedicated homogeneous system, they may not be directly applicable to heterogeneous NOW systems. In order to effectively manage the interaction between parallel jobs and user jobs and make good use of the resources of a heterogeneous NOW, a ....
C. M. McCann, Processor allocation policies for message-passing parallel computers, Ph.D. Dissertation, University of Washington, 1994.
....most cases, however, it is more difficult to support malleability in the way an application is written. One way of attaining a limited form of malleability is by creating as many threads in a job as the largest number of processors that would ever be used, and then using multiplexing (or folding [52, 39]) to have the job execute on a lesser number of processors. Alternatively, a job can be made malleable by inserting application specific code at particular synchronization points to repartition the data in response to any change in processor allocation. The latter approach is somewhat more ....
....characteristics, and assigning non-preemptive priorities to certain job classes for admission to fixed partitions [57]. McCann and Zahorjan found that efficiency-preserving scheduling using folding allowed performance to remain much better than with equipartitioning (EQUI) as load increases [52]. Padhye and Dowdy compare the effectiveness of treating jobs as moldable to that of exploiting their malleability by folding [61]. They find that the former approach suffices unless jobs are irregular (i.e. evolving) in their pattern of resource consumption. Similarly, in the context of ....
C. McCann and J. Zahorjan, "Processor allocation policies for message passing parallel computers". In SIGMETRICS Conf. Measurement & Modeling of Comput. Syst., pp. 19--32, May 1994.
....has long been an active area of research. The problem is typically decomposed into two steps: allocation, where individual processes are placed on processors, and dispatching, where those processes are scheduled over time. The allocation step for parallel jobs has been investigated in detail [31, 33, 66, 70, 101, 118, 119, 120, 126, 139, 149]. Given the popular single-program multiple-data (SPMD) parallel programming model, it has generally been found that the response time and throughput of the workload are best when competing processes from different jobs can share the same processor [35, 53, 82, 105, 144, 148, 173]. In this work, ....
....can be used. Pure space sharing techniques that achieve good response time and throughput require that applications are malleable to the number of available processors, a non trivial programming task. A large number of studies have focused on the allocation step of parallel job scheduling [31, 33, 66, 70, 101, 118, 119, 120, 126, 139, 149]. In this dissertation, we focus on the second step of time sharing processes over time. Two popular methods exist for timesharing competing processes: local scheduling and explicit coscheduling. When processes are locally scheduled by the operating system on each workstation, frequently ....
Cathy McCann and John Zahorjan. Processor Allocation Policies for Message-Passing Parallel Computers. In Proceedings of the 1994 ACM SIGMETRICS Conference, pages 19--32, February 1994.
....service their tasks in a round-robin or priority-based scheme [Oust82, LV90]. Space-sharing policies may be preemptive if the partition allocated to an application is allowed to change during its execution. The number of processors assigned to an application changes depending on the system load [TG89, MZ94] or on changes in the application's parallelism [ZM90]. Preemptive space-sharing policies are also known as dynamic policies [TG89, DCDP90, ZM90]. Hybrid partitioning schemes are also preemptive, where the multiprocessor is divided into possibly non-disjoint partitions. In each partition, tasks of ....
C. McCann, J. Zahorjan, "Processor allocation policies for message-passing parallel computers," Proc. ACM SIGMETRICS, 1994, pp. 19-32.
....affiliation Université Paris 6 et Paris 7 systems can hardly be used efficiently by the user. Automatic Load Balancing (LB) mechanisms help users by automatically dispatching application processes, but most studies have been done either for distributed systems (LAN) [1] or for parallel machines [3]. Only a few have been made for large systems (Utopia or Condor Flock) or hybrid systems (Stardust), and often with severe limitations. Classical algorithms are not suitable for hybrid and large-scale systems due to the wide variety and large number of resources. The major problem of actual LB ....
Catherine M. McCann. Processor Allocation Policies for Message-Passing Parallel Computers. PhD thesis, University of Washington, September 1994.
....number of nodes allocated to a job can also be modified throughout its execution. In fact, dynamic partitioning can provide the best system performance for a wide variety of application workloads, especially those with lower efficiencies and/or those with less variable service time requirements [45, 49, 14, 26, 31, 27]. This is because the dynamic policy can maintain very efficient node utilizations by adjusting node allocation according to workload changes. On the other hand, studies [31, 40, 18] have demonstrated that the overheads associated with dynamic partitioning in distributed environments can ....
....application enters or exits all nodes are equally allocated between the currently executing partitions. This instance of FDP exhibits the basic overheads of the various instances of FDP used in our system. We also compare the overheads with the basic folding strategy without rotation proposed in [27]. According to the latter strategy, the arrival of a job causes the system to split the largest sub partition into two equal sub partitions, one for the old application and one for the new. The completion of an application causes the system to grant the nodes that became available to the smallest ....
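The arrival rule of the basic folding strategy described in the context above (split the largest sub-partition into two equal halves, one for the old application and one for the new) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the dictionary representation, the tie-breaking rule (the first job found among equally large partitions), and the name `admit_job` are assumptions. The completion rule is truncated in the snippet, so it is deliberately not modeled here.

```python
from typing import Dict


def admit_job(partitions: Dict[str, int], new_job: str) -> Dict[str, int]:
    """Return the allocation after a job arrival under the split-largest rule.

    Assumptions (not from the source): partitions maps job name -> node count,
    at least one job is already running, the largest partition has an even
    size, and ties are broken in favor of the first job in the mapping.
    """
    if not partitions:
        raise ValueError("this sketch assumes at least one running job")
    if new_job in partitions:
        raise ValueError("job is already running")
    largest = max(partitions, key=partitions.get)
    size = partitions[largest]
    if size < 2 or size % 2 != 0:
        raise ValueError("largest partition cannot be split into equal halves")
    updated = dict(partitions)
    updated[largest] = size // 2  # the old application keeps one half
    updated[new_job] = size // 2  # the new application receives the other half
    return updated


if __name__ == "__main__":
    allocation = {"A": 16}
    allocation = admit_job(allocation, "B")
    allocation = admit_job(allocation, "C")
    print(allocation)
```

Note that only one running job is disturbed per arrival, which is the source of the lower reconfiguration overhead; the price is that partition sizes can become unequal, as the third arrival in the usage example shows.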
C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 19--32, May 1994.
....executed, while the second determines when VPs must be executed. [Section 3: Prior Work on Spatial and Temporal Scheduling] There is a plethora of work on scheduling in the literature. However, almost all of the work is in the context of bus-based multiprocessor systems and a fork-and-join programming model [22, 4, 5, 13, 14, 15, 23, 6, 21]. Perhaps the only commonality between the existing literature and the problem at hand is the word "scheduling". The conclusions of the scheduling papers and books are primarily artifacts of the dynamically varying number of threads, and the disk access latencies associated with switching between ....
C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Performance Evaluation Review, pages 19--32, May, 1994.
....scheduling scheme for small-scale uniform memory access (UMA) multiprocessors, provided applications are coded in a style that can tolerate dynamic changes in the number of processors at runtime. Under certain circumstances, its use can be extended to large distributed-memory machines as well [26, 24]. A number of papers have evaluated the performance implications of coscheduling and gang scheduling, and compared them with dynamic partitioning and other scheduling policies for multiprogrammed multiprocessors [10, 28, 22, 16, 35]. The Mach scheduler also allows processors to be allocated to ....
C. McCann and J. Zahorjan, "Processor allocation policies for message passing parallel computers". In SIGMETRICS Conf. Measurement & Modeling of Comput. Syst., pp. 19--32, May 1994.
....be scheduled, and for each the up to $J$ segments of the current frontier must be searched to find the min seg. Thus, BUDDY executes efficiently. The quality of BUDDY's schedules is captured by Theorem 1, which, due to space limitations, we state here without formal proof. The reader is referred to [10] for the complete proof. Theorem 1: The BUDDY algorithm produces an optimal schedule under the restrictions that $N$, $J$, and allocations $A_j$ for all jobs $j$ are powers of two, and that all jobs are allocated equal resources. (Footnote 5) While the formal proof that BUDDY is optimal is quite long, its ....
....Since that number must divide $N$, and since we are considering jobs in strict order of non-decreasing minimum node requirement, there are up to $\sqrt{N}$ possibilities to be tested. We state here the major result regarding the quality of the schedules produced by EQUI EPOCH. The reader is referred to [10] for the details of the proof. Theorem 2: When $N = 2^n$, for some integer $n \ge 0$, EQUI EPOCH is optimal among epoch scheduling policies that guarantee equal resource allocations. Informally, EQUI EPOCH is optimal in this case because it is never advantageous to schedule fewer jobs in the epoch ....
Catherine M. McCann. Processor Allocation Policies for Message-Passing Parallel Computers. PhD thesis, University of Washington, 1994.
....allocation to each runnable job in order to minimize average response time. [Section 1: Introduction] Much of the work on scheduling policies for multiprogrammed multiprocessors has focused on how many processors to allocate to each runnable job without considering the memory requirements of those jobs [7, 14, 15, 6, 17, 18, 13, 8, 2, 9, 16]. In this paper we consider jobs whose memory requirements imply a lower bound on the amount of machine resource they can be allocated for execution. The interaction of processor scheduling and job memory usage has been considered in [12]. However, they examined a paging environment, with the ....
....may not be able to respond to changes in their allocations. In particular, if an application is capable of partitioning its data and computation into pieces only at load time (or even earlier), allocation of an incompatible number of processors can significantly degrade application efficiency [15, 9]. We do consider this problem in detail in examining our policies. (Footnote 1) In fact, static disciplines also face possibly severe problems at job arrivals. For instance, if a job arrives to an idle system and is allocated the full machine, this will have a negative impact on jobs arriving soon after ....
C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Proceedings of ACM SIGMETRICS Conference, pages 19--32, May 1994.
No context found.
Cathy McCann and John Zahorjan. Processor Allocation Policies for Message-Passing Parallel Computers. In Proceedings of the 1994.