Results 1–10 of 16
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
Cited by 326 (5 self)
Categories and Subject Descriptors: Modes of Computation — parallelism and concurrency. General Terms: Algorithms, Design, Performance, Theory. Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs. This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Authors: Y.-K. Kwok (The University of Hong Kong) and I. Ahmad (The Hong Kong University of Science and Technology). ACM Computing Surveys, Vol. 31, No. 4, December 1999.
Benchmarking and Comparison of the Task Graph Scheduling Algorithms
, 1999
Cited by 106 (2 self)
The problem of scheduling a parallel program represented by a weighted directed acyclic graph (DAG) to a set of homogeneous processors for minimizing the completion time of the program has been extensively studied. The NP-completeness of the problem has stimulated researchers to propose a myriad of heuristic algorithms. While most of these algorithms are reported to be efficient, it is not clear how they compare against each other. A meaningful performance evaluation and comparison of these algorithms is a complex task that must take into account a number of issues. First, most scheduling algorithms are based upon diverse assumptions, making direct performance comparison largely meaningless. Second, there is no standard set of benchmarks for examining these algorithms. Third, most algorithms are evaluated using small problem sizes, and, therefore, their scalability is unknown. In this paper, we first provide a taxonomy for classifying various algorithms into distinct categories a...
Benchmarking the task graph scheduling algorithms
 in IPPS/SPDP
, 1998
Cited by 47 (2 self)
The problem of scheduling a weighted directed acyclic graph (DAG) to a set of homogeneous processors to minimize the completion time has been extensively studied. The NP-completeness of the problem has instigated researchers to propose a myriad of heuristic algorithms. While these algorithms are individually reported to be efficient, it is not clear how effective they are and how well they compare against each other. A comprehensive performance evaluation and comparison of these algorithms entails addressing a number of difficult issues. One issue is that a large number of scheduling algorithms are based upon radically different assumptions, making their comparison on a unified basis a rather intricate task. Another is that there is no standard set of benchmarks that can be used to evaluate and compare these algorithms. Furthermore, most algorithms are evaluated using small problem sizes, and it is not clear how their performance scales with the problem size. In this paper, we first provide a taxonomy for classifying various algorithms into different categories according to their assumptions and functionalities. We then propose a set of benchmarks which are of diverse structures, are not biased towards a particular scheduling technique, and still allow variations in important parameters. We have evaluated 15 scheduling algorithms and compared them using the proposed benchmarks. Based upon the design philosophies and principles behind these algorithms, we interpret the results and discuss why some algorithms perform better than others.
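The scheduling problem these benchmarks target can be made concrete with a small sketch. The following is an illustrative Python implementation, not taken from any of the surveyed papers, of a simple static list scheduler in the spirit of b-level-priority algorithms such as HLFET: tasks are prioritized by their static b-level (longest compute-plus-communication path to an exit node) and greedily placed on the processor offering the earliest start time. All task names, weights, and communication costs in usage are made-up examples.

```python
def blevel(dag, weights, comm):
    """Static b-level: longest path (compute + communication) from each
    task to an exit node of the DAG."""
    memo = {}
    def bl(t):
        if t not in memo:
            succs = dag.get(t, [])
            memo[t] = weights[t] + max(
                (comm[(t, s)] + bl(s) for s in succs), default=0)
        return memo[t]
    return {t: bl(t) for t in weights}

def list_schedule(dag, weights, comm, n_procs):
    """Greedy list scheduling: highest b-level first, onto the processor
    that allows the earliest start time. Descending b-level order is a
    valid topological order when all task weights are positive."""
    preds = {t: [] for t in weights}
    for u, succs in dag.items():
        for v in succs:
            preds[v].append(u)
    prio = blevel(dag, weights, comm)
    finish, proc_of = {}, {}
    proc_free = [0] * n_procs
    for t in sorted(weights, key=lambda t: -prio[t]):
        best = None
        for p in range(n_procs):
            # Data from a predecessor on another processor pays a comm delay.
            ready = max((finish[u] + (0 if proc_of[u] == p else comm[(u, t)])
                         for u in preds[t]), default=0)
            start = max(ready, proc_free[p])
            if best is None or start < best[0]:
                best = (start, p)
        start, p = best
        finish[t] = start + weights[t]
        proc_of[t] = p
        proc_free[p] = finish[t]
    return finish, proc_of
```

For a diamond-shaped DAG a→{b, c}→d with weights {a: 2, b: 3, c: 1, d: 2} and the communication costs below, two processors yield a makespan of 8, with c offloaded to the second processor.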
Analysis, Evaluation, and Comparison of Algorithms for Scheduling Task Graphs on Parallel Processors
 In Proceedings of the Second International Symposium on Parallel Architectures, Algorithms, and Networks
, 1996
Cited by 38 (5 self)
In this paper, we survey algorithms that allocate a parallel program represented by an edge-weighted directed acyclic graph (DAG), also called a task graph or macro-dataflow graph, to a set of homogeneous processors, with the objective of minimizing the completion time. We analyze 21 such algorithms and classify them into four groups. The first group includes algorithms that schedule the DAG to a bounded number of processors directly; these are called the bounded number of processors (BNP) scheduling algorithms. The algorithms in the second group schedule the DAG to an unbounded number of clusters and are called the unbounded number of clusters (UNC) scheduling algorithms. The algorithms in the third group schedule the DAG using task duplication and are called the task duplication based (TDB) scheduling algorithms. The algorithms in the fourth group perform allocation and mapping on arbitrary processor network topologies; these are called the arbitrary processor network (APN) scheduling algorithms. The design philosophies and principles behind these algorithms are discussed, and the performance of all of the algorithms is evaluated and compared on a unified basis using various scheduling parameters.
A Parallel Algorithm for Compile-Time Scheduling of Parallel Programs on Multiprocessors
 PACT'97
, 1997
Cited by 11 (1 self)
In this paper, we propose a parallel randomized algorithm, called Parallel Fast Assignment using Search Technique (PFAST), for scheduling parallel programs represented by directed acyclic graphs (DAGs) at compile-time. The PFAST algorithm has O(e) time complexity, where e is the number of edges in the DAG. This linear-time algorithm works by first generating an initial solution and then refining it using a parallel random search. Using a prototype computer-aided parallelization and scheduling tool called CASCH, the algorithm is found to outperform numerous previous algorithms while taking dramatically smaller execution times. The distinctive feature of this research is that, instead of simulations, our proposed algorithm is evaluated and compared with other algorithms using the CASCH tool with real applications running on the Intel Paragon. The PFAST algorithm is also evaluated with randomly generated DAGs for which optimal schedules are known. The algorithm generated optimal solutions for a majority of the test cases and close-to-optimal solutions for the others. The proposed algorithm is the fastest scheduling algorithm known to us and is an attractive choice for scheduling under running-time constraints.
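The generate-then-refine structure described above can be sketched in a few lines. This is a hypothetical, serial illustration of random-search refinement of a task-to-processor mapping; PFAST itself runs the search in parallel and uses its own neighborhood moves, which are not reproduced here. The fixed execution order, the DAG, and all weights are made-up example inputs.

```python
import random

def makespan(order, preds, weights, comm, assign):
    """Evaluate a task->processor mapping: tasks run in a fixed
    precedence-respecting order; cross-processor edges pay a comm delay."""
    finish, proc_free = {}, {}
    for t in order:
        ready = max((finish[u] + (0 if assign[u] == assign[t] else comm[(u, t)])
                     for u in preds[t]), default=0)
        start = max(ready, proc_free.get(assign[t], 0))
        finish[t] = start + weights[t]
        proc_free[assign[t]] = finish[t]
    return max(finish.values())

def random_search(order, preds, weights, comm, n_procs, iters=200, seed=0):
    """Start from a random assignment, then repeatedly move one random
    task to a random processor, keeping the move only if the makespan
    does not get worse."""
    rng = random.Random(seed)
    assign = {t: rng.randrange(n_procs) for t in order}  # initial solution
    best = makespan(order, preds, weights, comm, assign)
    for _ in range(iters):
        t = rng.choice(order)                 # random move: reassign one task
        old = assign[t]
        assign[t] = rng.randrange(n_procs)
        cost = makespan(order, preds, weights, comm, assign)
        if cost <= best:
            best = cost                       # keep non-worsening moves
        else:
            assign[t] = old                   # revert worsening moves
    return best, assign
```

On a three-task chain with unit weights and heavy communication costs, the search tends to collapse all tasks onto one processor, since any cross-processor edge pays the full delay.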
Mapping Heavy Communication Grid-Based Workflows onto Grid Resources Within An SLA Context Using Metaheuristics
 International Journal of High Performance Computing Applications (IJHPCA)
, 2007
Cited by 9 (1 self)
Service Level Agreements (SLAs) are currently one of the major research topics in grid computing. Among the many system components for SLA-related grid jobs, the SLA mapping mechanism has received widespread attention. It is responsible for assigning the sub-jobs of a workflow to a variety of grid resources in a way that meets the user's deadline and costs as little as possible. Owing to the distinctive workload and resource characteristics, mapping a heavy-communication workflow within an SLA context gives rise to a complicated combinatorial optimization problem. This paper presents the application of various metaheuristics and suggests a possible approach to solving this problem. Performance measurements deliver evaluation results on the quality and efficiency of each method.
Performance Comparison of Algorithms for Static Scheduling of DAGs to Multiprocessors
 in Second Australasian Conference on Parallel and Real-time Systems
, 1995
Cited by 6 (3 self)
In this paper, we evaluate and compare algorithms for scheduling and clustering. These algorithms allocate a parallel program represented by an edge-weighted directed acyclic graph (DAG), also called a task graph or macro-dataflow graph, to a set of homogeneous processors to minimize the completion time. We examine several classes of such algorithms and compare the performance of a class known as the arbitrary processor network (APN) scheduling algorithms. We discuss the design philosophies and principles behind these algorithms and assess their merits and deficiencies. Experimental results have been obtained by testing the algorithms with a large number of test cases. Global and pairwise comparisons are made within each group, whereby these algorithms are ranked according to their performance. 1. Introduction: Scheduling and mapping of computations onto processors is one of the crucial components of a parallel processing environment. Since the sched...
On Parallelization of Static Scheduling Algorithms
 IEEE Trans. Software Eng
, 1997
Cited by 5 (1 self)
Most static scheduling algorithms that schedule parallel programs represented by directed acyclic graphs (DAGs) are sequential. This paper discusses the essential issues in the parallelization of static scheduling and presents two efficient parallel scheduling algorithms. The proposed algorithms have been implemented on an Intel Paragon machine, and their performance has been evaluated. These algorithms produce high-quality schedules and are much faster than existing sequential and parallel algorithms. 1. Introduction: Static scheduling utilizes knowledge of problem characteristics to reach a globally optimal, or near-optimal, solution. Although many people have conducted their research in various manners, they all share a similar underlying idea: take a directed acyclic graph representing the parallel program as input and schedule it onto the processors of a target machine to minimize the completion time. This is an NP-complete problem in its general form [7]. Therefore, many heuristic algor...
A performance study of multiprocessor task scheduling algorithms
, 2008
Cited by 5 (0 self)
Multiprocessor task scheduling is an important and computationally difficult problem. A large number of algorithms have been proposed, representing various trade-offs between the quality of the solution and the computational complexity and scalability of the algorithm. Previous comparison studies have frequently operated with simplifying assumptions, such as independent tasks, artificially generated problems, or the assumption of zero communication delay. In this paper, we propose a comparison study with realistic assumptions. Our target problems are two well-known problems of linear algebra: LU decomposition and Gauss–Jordan elimination. Both algorithms are naturally parallelizable but have heavy data dependencies. The communication delay is explicitly considered in the comparisons. In our study, we consider nine scheduling algorithms that, to the best of our knowledge, are frequently used: min–min, chaining, A*, genetic algorithms, simulated annealing, tabu search, HLFET, ISH, and DSH with task duplication. Based on experimental results, we present a detailed analysis of the scalability, advantages, and disadvantages of each algorithm.
A Comparison of Clustering and Scheduling Techniques for Embedded Multiprocessor Systems
, 2003
Cited by 2 (1 self)
In this paper we extensively explore and illustrate the effectiveness of the two-phase decomposition of scheduling (into clustering and cluster-scheduling, or merging) and mapping of task graphs onto embedded multiprocessor systems. We describe efficient and novel partitioning (clustering) and scheduling techniques that aggressively streamline interprocessor communication and can be tuned to exploit the significantly longer compilation time available to embedded system designers.
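The clustering phase of such a two-phase decomposition can be illustrated with a deliberately simplified sketch. Classical edge-zeroing clustering checks the critical path before accepting each merge; the version below, which is not from the paper, just greedily merges the endpoints of the heaviest remaining communication edges (zeroing their cost) until the number of clusters matches the processor count. The task names and edge weights are made-up examples.

```python
def find(parent, x):
    """Union-find root lookup with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def cluster(tasks, edges, n_procs):
    """edges: {(u, v): comm_weight}. Greedily merge the endpoints of the
    heaviest remaining edge, zeroing its communication cost, until only
    n_procs clusters remain. Returns a task -> cluster-root mapping."""
    parent = {t: t for t in tasks}
    n_clusters = len(tasks)
    for (u, v), _w in sorted(edges.items(), key=lambda e: -e[1]):
        if n_clusters <= n_procs:
            break
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[rv] = ru          # zero the edge: u and v now co-located
            n_clusters -= 1
    return {t: find(parent, t) for t in tasks}
```

In the second phase, each resulting cluster would be assigned to a physical processor and the tasks within each processor ordered, which is where the merging and scheduling techniques compared in the paper come in.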