Job Scheduling in Multiprogrammed Parallel Systems
, 1997
Cited by 176 (16 self)
Scheduling in the context of parallel systems is often thought of in terms of assigning tasks in a program to processors, so as to minimize the makespan. This formulation assumes that the processors are dedicated to the program in question. But when the parallel system is shared by a number of users, this is not necessarily the case. In the context of multiprogrammed parallel machines, scheduling refers to the execution of threads from competing programs. This is an operating system issue, involved with resource allocation, not a program development issue. Scheduling schemes for multiprogrammed parallel systems can be classified as one or two leveled. Single-level scheduling combines the allocation of processing power with the decision of which thread will use it. Two level scheduling decouples the two issues: first, processors are allocated to the job, and then the job's threads are scheduled using this pool of processors. The processors of a parallel system can be shared i...
Predicting Application Run Times Using Historical Information
, 1997
Cited by 130 (14 self)
We present a technique for deriving predictions for the run times of parallel applications from the run times of "similar" applications that have executed in the past. The novel aspect of our work is the use of search techniques to determine those application characteristics that yield the best definition of similarity for the purpose of making predictions. We use four workloads recorded from parallel computers at Argonne National Laboratory, the Cornell Theory Center, and the San Diego Supercomputer Center to evaluate the effectiveness of our approach. We show that on these workloads our techniques achieve predictions that are between 14 and 60 percent better than those achieved by other researchers; our approach achieves mean prediction errors that are between 40 and 59 percent of mean application run times.
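The idea of predicting from "similar" past jobs can be illustrated with a minimal sketch. It assumes a fixed definition of similarity (a hypothetical template of job fields such as `user` and `executable`) and uses the mean of matching run times as the prediction. The abstract's search over candidate characteristics is not reproduced here; the field names and the `predict_runtime` helper are illustrative assumptions only.

```python
from collections import defaultdict
from statistics import mean


def predict_runtime(history, job, template=("user", "executable")):
    """Predict a run time as the mean run time of past jobs that match
    `job` on every field named in `template` (an assumed similarity rule)."""
    groups = defaultdict(list)
    for past in history:
        key = tuple(past[field] for field in template)
        groups[key].append(past["runtime"])
    similar = groups.get(tuple(job[field] for field in template))
    if not similar:
        return None  # no similar job has been seen yet
    return mean(similar)


# Hypothetical history records; run times are in seconds.
history = [
    {"user": "alice", "executable": "cfd", "nodes": 16, "runtime": 100.0},
    {"user": "alice", "executable": "cfd", "nodes": 32, "runtime": 140.0},
    {"user": "bob", "executable": "md", "nodes": 16, "runtime": 900.0},
]
new_job = {"user": "alice", "executable": "cfd", "nodes": 16}

coarse = predict_runtime(history, new_job)
fine = predict_runtime(history, new_job, template=("user", "executable", "nodes"))
```

Changing `template` changes which past jobs count as similar, which is the knob a search procedure would tune against prediction error.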
Using run-time predictions to estimate queue wait times and improve scheduler performance
- Scheduling Strategies for Parallel Processing
, 1999
Cited by 85 (1 self)
On many computers, a request to run a job is not serviced immediately but instead is placed in a queue and serviced only when resources are released by preceding jobs. In this paper, we build on run-time prediction techniques that we developed in previous research to explore two problems. The first problem is to predict how long applications will wait in a queue until they receive resources. We show that run-time estimates can be used for this and that using our run-time estimates results in more accurate wait-time predictions than when the run-time prediction techniques of other researchers are used. The second problem we investigate is improving scheduling performance. We use run-time predictions to improve the performance of the least work first and backfill scheduling algorithms. We find that using our run-time predictor results in lower mean wait times for the workloads with higher offered loads when compared to alternative run-time predictors.
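One plausible way to turn run-time predictions into wait-time predictions is to simulate the scheduler forward in time. The sketch below assumes a plain FCFS policy on a machine with a fixed node count; the abstract does not state which policy or procedure the authors simulate, so the function name `predict_waits_fcfs` and its inputs are illustrative assumptions.

```python
import heapq


def predict_waits_fcfs(now, total_nodes, running, queue):
    """Simulate FCFS with predicted run times.

    running: list of (predicted_end_time, nodes) for jobs already executing.
    queue:   list of (nodes, predicted_runtime) in arrival order.
    Returns the predicted wait (from `now`) of each queued job.
    """
    events = list(running)
    heapq.heapify(events)  # earliest predicted completion first
    free = total_nodes - sum(nodes for _, nodes in running)
    t = now
    waits = []
    for nodes, runtime in queue:
        if nodes > total_nodes:
            raise ValueError("job requests more nodes than the machine has")
        # Advance time until enough nodes have been released.
        while free < nodes:
            end, released = heapq.heappop(events)
            t = max(t, end)
            free += released
        waits.append(t - now)
        free -= nodes
        heapq.heappush(events, (t + runtime, nodes))
    return waits


# Hypothetical 10-node machine, fully occupied at time 0.
waits = predict_waits_fcfs(
    now=0,
    total_nodes=10,
    running=[(50, 6), (20, 4)],
    queue=[(4, 100), (8, 10), (2, 5)],
)
```

The quality of the result depends entirely on the run-time predictions fed in, which is the dependency the abstract evaluates.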
Utilization and Predictability in Scheduling the IBM SP2 with Backfilling
- In Proceedings of the 1st Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing (IPPS/SPDP-98), pages 542–547, Los Alamitos
, 1998
Cited by 77 (8 self)
Scheduling jobs on the IBM SP2 system is usually done by giving each job a partition of the machine for its exclusive use. Allocating such partitions in the order that the jobs arrive (FCFS scheduling) is fair and predictable, but suffers from severe fragmentation, leading to low utilization. This motivated Argonne National Lab, where the first large SP1 was installed, to develop the EASY scheduler. This scheduler, which has since been adopted by many other SP2 sites, uses aggressive backfilling: small jobs are moved ahead to fill in holes in the schedule, provided they do not delay the first job in the queue. We show that a more conservative approach, in which small jobs move ahead only if they do not delay any job in the queue, produces essentially the same benefits in terms of utilization. Our conservative scheme has the added advantage that queueing times can be predicted in advance, whereas in EASY the queueing time is unbounded.
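The aggressive rule ("do not delay the first job in the queue") can be sketched as a single admission test. The formulation below, based on a "shadow time" for the head job and the nodes left over at that time, is a commonly used way to express the rule and is an assumption here, not code from the EASY scheduler; all names are hypothetical.

```python
def can_backfill_easy(now, free_nodes, running, head_nodes, candidate):
    """Return True if `candidate` may start now without delaying the head job.

    running:    list of (estimated_end_time, nodes) for executing jobs.
    head_nodes: nodes requested by the first queued job (assumed not to fit now).
    candidate:  (nodes, estimated_runtime) of a job further back in the queue.
    """
    # Shadow time: earliest moment the head job can start, per the estimates.
    available = free_nodes
    shadow_time = None
    extra_nodes = 0
    for end_time, nodes in sorted(running):
        available += nodes
        if available >= head_nodes:
            shadow_time = end_time
            extra_nodes = available - head_nodes  # nodes the head job will not need
            break
    if shadow_time is None:
        raise ValueError("head job can never start on this machine")

    cand_nodes, cand_runtime = candidate
    if cand_nodes > free_nodes:
        return False  # does not fit in the current hole
    ends_before_shadow = now + cand_runtime <= shadow_time
    uses_only_extra = cand_nodes <= extra_nodes
    return ends_before_shadow or uses_only_extra


# Hypothetical 10-node machine: 7 nodes busy, 3 free, head job needs 9.
running = [(50, 2), (100, 5)]
short_job = can_backfill_easy(0, 3, running, 9, (3, 80))   # ends before time 100
long_wide = can_backfill_easy(0, 3, running, 9, (3, 200))  # would delay the head job
long_thin = can_backfill_easy(0, 3, running, 9, (1, 200))  # fits in the spare node
```

A conservative variant would apply a comparable check against the planned start of every queued job rather than only the first, which is what makes its queueing times predictable.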
Backfilling Using System-Generated Predictions Rather Than User Runtime Estimates
- IEEE Trans. Parallel & Distributed Syst.
, 2007
Gang Scheduling with Memory Considerations
- in Proc. of the 14th Intl. Parallel and Distributed Processing Symp., 2000
Cited by 59 (2 self)
A major problem with time slicing on parallel machines is memory pressure, as the resulting paging activity damages the synchronism among a job’s processes. An alternative is to impose admission controls, and only admit jobs that fit into the available memory. Despite suffering from delayed execution, this leads to better overall performance by preventing the harmful effects of paging and thrashing.
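Memory-based admission control can be sketched in a few lines. The version below assumes a single shared memory pool and a first-fit pass over the queue; the abstract does not specify the admission order or how memory is accounted per node, so both are simplifying assumptions, as is the `admit_jobs` name.

```python
def admit_jobs(queue, memory_total):
    """Admit queued jobs, in order, whenever their memory demand still fits.

    queue: list of (job_name, memory_needed) in arrival order.
    Returns (admitted_names, waiting_names).
    """
    free = memory_total
    admitted, waiting = [], []
    for name, memory_needed in queue:
        if memory_needed <= free:
            admitted.append(name)
            free -= memory_needed
        else:
            waiting.append(name)  # delayed rather than allowed to cause paging
    return admitted, waiting


# Hypothetical demands in GB on a machine with 8 GB available to jobs.
admitted, waiting = admit_jobs([("a", 4), ("b", 5), ("c", 3)], memory_total=8)
```

Job `b` is delayed even though the machine could time-slice it, which is the trade-off the abstract describes: later start, but no paging among the admitted jobs.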
Benchmarks and Standards for the Evaluation of Parallel Job Schedulers
, 1999
Cited by 57 (11 self)
The evaluation of parallel job schedulers hinges on the workloads used. It is suggested that this be standardized, in terms of both format and content, so as to ease the evaluation and comparison of different systems. The question remains whether this can encompass both traditional parallel systems and metacomputing systems. This paper is based on a panel on this subject that was held at the workshop, and the ensuing discussion; its authors are both the panel members and participants from the audience. Naturally, not all of us agree with all the opinions expressed here...
The Impact of More Accurate Requested Runtimes on Production Job Scheduling Performance
- In Job Scheduling Strategies for Parallel Processing
, 2002
Cited by 57 (3 self)
The question of whether more accurate requested runtimes can significantly improve production parallel system performance has previously been studied for the FCFS-backfill scheduler, using a limited set of system performance measures. This paper examines the question for higher performance backfill policies, heavier system loads as are observed in current leading edge production systems such as the large Origin 2000 system at NCSA, and a broader range of system performance measures. The new results show that more accurate requested runtimes can improve system performance much more significantly than suggested in previous results. For example, average slowdown decreases by a factor of two to six, depending on system load and the fraction of jobs that have the more accurate requests. The new results also show that (a) nearly all of the performance improvement is realized even if the more accurate runtime requests are a factor of two higher than the actual runtimes, (b) most of the performance improvement is achieved when test runs are used to obtain more accurate runtime requests, and (c) in systems where only a fraction (e.g., 60%) of the jobs provide approximately accurate runtime requests, the users that provide the approximately accurate requests achieve even greater improvements in performance, such as an order of magnitude improvement in average slowdown for jobs that have runtime up to fifty hours.
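The abstract reports results in terms of average slowdown without defining it. The sketch below assumes the usual definition, response time (wait plus run time) divided by run time, averaged over jobs; the function names and sample values are illustrative.

```python
def slowdown(wait_time, run_time):
    """Response time divided by run time (assumed definition)."""
    return (wait_time + run_time) / run_time


def average_slowdown(jobs):
    """jobs: list of (wait_time, run_time) pairs in the same time unit."""
    return sum(slowdown(wait, run) for wait, run in jobs) / len(jobs)


# Three hypothetical 10-minute jobs that waited 0, 30 and 90 minutes.
avg = average_slowdown([(0, 10), (30, 10), (90, 10)])
```

Because the run time is in the denominator, short jobs that wait a long time dominate the average, which is why the measure is sensitive to how well a backfill policy serves them.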
Supporting priorities and improving utilization of the IBM SP2 scheduler using slack-based backfilling
- In Proceedings of the 13th International Parallel Processing Symposium
, 1999