Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
Cited by 142 (4 self)
Devices]: Modes of Computation - Parallelism and concurrency. General Terms: Algorithms, Design, Performance, Theory. Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs. This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Authors' addresses: Y.-K. Kwok, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong; email: ykwok@eee.hku.hk; I. Ahmad, Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. ACM Computing Surveys, Vol. 31, No. 4, December 1999.
Benchmarking and Comparison of the Task Graph Scheduling Algorithms
, 1999
Cited by 67 (2 self)
The problem of scheduling a parallel program represented by a weighted directed acyclic graph (DAG) to a set of homogeneous processors for minimizing the completion time of the program has been extensively studied. The NP-completeness of the problem has stimulated researchers to propose a myriad of heuristic algorithms. While most of these algorithms are reported to be efficient, it is not clear how they compare against each other. A meaningful performance evaluation and comparison of these algorithms is a complex task and it must take into account a number of issues. First, most scheduling algorithms are based upon diverse assumptions, making the performance comparison rather purposeless. Second, there does not exist a standard set of benchmarks to examine these algorithms. Third, most algorithms are evaluated using small problem sizes, and, therefore, their scalability is unknown. In this paper, we first provide a taxonomy for classifying various algorithms into distinct categories a...
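The problem setting in this abstract (a node- and edge-weighted DAG mapped onto homogeneous processors to minimize completion time) can be illustrated with a minimal list-scheduling sketch. It is not one of the algorithms benchmarked in the paper: the priority rule (bottom level), the append-only processor timeline, and the names `list_schedule` and `makespan` are assumptions chosen for illustration, and all task weights are assumed to be positive.

```python
from collections import defaultdict


def list_schedule(weights, edges, num_procs):
    """Greedy list scheduling sketch.

    weights: {task: computation cost}, all costs assumed positive.
    edges:   {(u, v): communication cost}, paid only when u and v
             run on different processors.
    Returns {task: (processor, start, finish)}.
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for (u, v) in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Bottom level: longest path (computation plus communication)
    # from a task to an exit task, including the task itself.
    blevel = {}

    def bl(t):
        if t not in blevel:
            tail = max((edges[(t, s)] + bl(s) for s in succs[t]), default=0)
            blevel[t] = weights[t] + tail
        return blevel[t]

    for t in weights:
        bl(t)

    proc_free = [0] * num_procs
    placed = {}
    # With positive weights, descending bottom level is a topological order.
    for t in sorted(weights, key=lambda task: -blevel[task]):
        best = None
        for p in range(num_procs):
            # Data from a predecessor on another processor arrives later.
            ready = max(
                (placed[u][2] + (0 if placed[u][0] == p else edges[(u, t)])
                 for u in preds[t]),
                default=0,
            )
            start = max(ready, proc_free[p])
            if best is None or start < best[1]:
                best = (p, start)
        p, start = best
        placed[t] = (p, start, start + weights[t])
        proc_free[p] = placed[t][2]
    return placed


def makespan(placed):
    """Completion time of the whole schedule."""
    return max(finish for _, _, finish in placed.values())
```

Calling `list_schedule` with a small fork-join graph and comparing `makespan` for one and two processors shows how communication costs limit the benefit of adding processors.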
Communication contention in task scheduling
- IEEE TRANS. PARALLEL DISTRIBUTED SYSTEMS
, 2005
Cited by 19 (1 self)
Task scheduling is an essential aspect of parallel programming. Most heuristics for this NP-hard problem are based on a simple system model that assumes fully connected processors and concurrent interprocessor communication. Hence, contention for communication resources is not considered in task scheduling, yet it has a strong influence on the execution time of a parallel program. This paper investigates the incorporation of contention awareness into task scheduling. A new system model for task scheduling is proposed, allowing us to capture both end-point and network contention. To achieve this, the communication network is reflected by a topology graph for the representation of arbitrary static and dynamic networks. The contention awareness is accomplished by scheduling the communications, represented by the edges in the task graph, onto the links of the topology graph. Edge scheduling is theoretically analyzed, including aspects like heterogeneity, routing, and causality. The proposed contention-aware scheduling preserves the theoretical basis of task scheduling. It is shown how classic list scheduling is easily extended to this more accurate system model. Experimental results show the significantly improved accuracy and efficiency of the produced schedules.
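The idea of scheduling communications onto links can be sketched as follows. This is a simplified illustration, not the paper's model: the function name `schedule_edge`, the dictionary `link_free`, and the append-only booking policy are assumptions, and routing is taken as given rather than computed from a topology graph.

```python
def schedule_edge(link_free, route, ready_time, duration):
    """Book one communication on every link of `route`; return its arrival time.

    link_free maps a link name to the time at which that link becomes idle.
    Simplifying assumptions (hypothetical, for illustration only):
    - bookings are appended after the last booking on a link (no gap filling);
    - on each link, the transfer neither starts nor finishes earlier than it
      did on the previous link of the route (causality).
    """
    start = ready_time
    finish = ready_time
    for link in route:
        start = max(start, link_free.get(link, 0))
        finish = max(start + duration, finish)
        link_free[link] = finish
    return finish
```

In a contention-free model, two messages that become ready at the same time and share a link would both arrive after `duration`; here the second one waits until the link is idle. A list scheduler could use the returned arrival time in place of a fixed edge weight when computing a task's earliest start.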
An Incremental Genetic Algorithm Approach to Multiprocessor Scheduling
- IEEE Transactions on Parallel and Distributed Systems
, 2004
Cited by 18 (0 self)
We have developed a genetic algorithm (GA) approach to the problem of task scheduling for multiprocessor systems. Our approach requires minimal problem-specific information and no problem-specific operators or repair mechanisms. Key features of our system include a flexible, adaptive problem representation and an incremental fitness function. Comparison with traditional scheduling methods indicates that the GA is competitive in terms of solution quality if it has sufficient resources to perform its search. Studies in a nonstationary environment show the GA is able to automatically adapt to changing targets. Index Terms: Genetic algorithm, task scheduling, parallel processing.
An Algorithm for Automatically Obtaining Distributed and Fault-Tolerant Static Schedules
- In International Conference on Dependable Systems and Networks, DSN’03
, 2003
Cited by 15 (2 self)
Our goal is to automatically obtain a distributed and fault-tolerant embedded system: distributed because the system must run on a distributed architecture; fault-tolerant because the system is critical. Our starting point is a source algorithm, a target distributed architecture, some distribution constraints, some indications on the execution times of the algorithm operations on the processors of the target architecture, some indications on the communication times of the data-dependencies on the communication links of the target architecture, a number of fail-silent processor failures that the obtained system must tolerate, and finally some real-time constraints that the obtained system must satisfy. In this article, we present a scheduling heuristic which, given all these inputs, produces a fault-tolerant, distributed, and static schedule of the algorithm on the architecture, with an indication of whether or not the real-time constraints are satisfied. The algorithm we propose consists of an active replication strategy based on a list scheduling heuristic: for each operation, at least one more replica than the number of failures to be tolerated is scheduled, each on a different processor, and the replicas run in parallel so that the system tolerates at most that number of failures. Simulation results show that, owing to the strategy used to schedule operations, the proposed heuristic improves the performance of our method, both in the absence and in the presence of failures. Keywords: Fault Tolerance in Distributed and Real-Time Systems, Safety-Critical Systems, software implemented fault-tolerance, multi-component architectures, distribution heuristics.
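The replication step described above can be sketched as follows. This is not the paper's heuristic: the selection rule used here (earliest finish time) is a stand-in chosen for brevity, and `place_replicas`, `proc_free`, and `exec_time` are hypothetical names. Communication times and real-time constraints are ignored.

```python
import heapq


def place_replicas(proc_free, exec_time, ready_time, failures):
    """Place failures + 1 replicas of one operation on distinct processors.

    proc_free: {processor: time at which it becomes idle} (updated in place).
    exec_time: {processor: execution time of the operation on it}.
    Selection rule (an assumption, not the paper's): pick the processors
    on which the replica would finish earliest.
    Returns a list of (processor, start, finish).
    """
    replicas = failures + 1
    if replicas > len(exec_time):
        raise ValueError("not enough processors to tolerate that many failures")

    candidates = []
    for proc, cost in exec_time.items():
        start = max(ready_time, proc_free.get(proc, 0))
        candidates.append((start + cost, proc, start))

    # Each candidate names a different processor, so the chosen
    # replicas are guaranteed to run on distinct processors.
    chosen = heapq.nsmallest(replicas, candidates)
    for finish, proc, _ in chosen:
        proc_free[proc] = finish
    return [(proc, start, finish) for finish, proc, start in chosen]
```

Because every replica runs on a different processor, the operation still completes on at least one processor as long as no more than `failures` fail-silent processors fail.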
Complexity Results for Throughput and Latency Optimization of Replicated and Data-parallel Workflow
- ALGORITHMICA
, 2007
Cited by 15 (12 self)
Mapping applications onto parallel platforms is a challenging problem, even for simple application patterns such as pipeline or fork graphs. Several antagonistic criteria should be optimized for workflow applications, such as throughput and latency (or a combination). In this paper, we consider a simplified model with no communication cost, and we provide an exhaustive list of complexity results for different problem instances. Pipeline or fork stages can be replicated in order to increase the throughput by sending consecutive data sets onto different processors. In some cases, stages can also be data-parallelized, i.e. the computation of one single data set is shared between several processors. This leads to a decrease of the latency and an increase of the throughput. Some instances of this simple model are shown to be NP-hard, thereby exposing the inherent complexity of the mapping problem. We provide polynomial algorithms for other problem instances. Altogether, we provide solid theoretical foundations for the study of mono-criterion or bi-criteria mapping optimization problems.
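The difference between replicating a stage and data-parallelizing it can be made concrete with a small calculation. The sketch below assumes a single stage, identical processors, and no communication cost; the function names and the use of period (the inverse of throughput) are choices made here for illustration, not notation taken from the paper.

```python
from fractions import Fraction


def replicated(work, speed, k):
    """One stage replicated on k identical processors (round-robin).

    Consecutive data sets go to different processors, so a result is
    produced k times as often, but each data set still takes work/speed.
    Returns (period, latency).
    """
    latency = Fraction(work, speed)
    period = latency / k
    return period, latency


def data_parallel(work, speed, k):
    """One stage whose work on each data set is split across k processors.

    Every data set finishes k times sooner, which shortens both the
    period and the latency. Returns (period, latency).
    """
    t = Fraction(work, speed * k)
    return t, t
```

For the same stage and the same number of processors, both options reach the same period, but only data-parallelism shortens the latency, which is why the two criteria can pull a mapping in different directions.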
On Parallelizing the Multiprocessor Scheduling Problem
- IEEE Trans. on Parallel and Distributed Systems
, 1999
Cited by 10 (1 self)
Existing heuristics for scheduling a node- and edge-weighted directed task graph to multiple processors can produce satisfactory solutions but incur high time complexities, which tend to grow worse in more realistic environments with relaxed assumptions. Consequently, these heuristics do not scale well and cannot handle problems of moderate sizes. A natural approach to reducing complexity while aiming for a similar or potentially better solution is to parallelize the scheduling algorithm. This can be done by partitioning the task graphs and concurrently generating partial schedules for the partitioned parts, which are then concatenated to obtain the final schedule. The problem, however, is nontrivial, as there exist dependencies among the nodes of a task graph which must be preserved for generating a valid schedule. Moreover, the time clock for scheduling is global for all the processors (that are executing the parallel scheduling algorithm), making the inherent parallelism invisible. In...
Fault Tolerant Scheduling of Precedence Task Graphs on Heterogeneous Platforms
, 2007
Cited by 7 (3 self)
Fault tolerance and latency are important requirements in several applications which are time critical in nature: such applications require guarantees in terms of latency, even when processors are subject to failures. In this paper, we propose a fault tolerant scheduling heuristic for mapping precedence task graphs on heterogeneous systems. Our approach is based on an active replication scheme, capable of supporting ε arbitrary fail-silent (fail-stop) processor failures, hence valid results will be provided even if ε processors fail. We focus on a bi-criteria approach, where we aim at minimizing the latency given a fixed number of failures supported in the system, or the other way round. Major achievements include a low complexity and a drastic reduction of the number of additional communications induced by the replication mechanism. Experimental results demonstrate that our heuristics, despite their lower complexity, outperform their direct competitor, the FTBAR scheduling algorithm [8].
Advanced Reservation-based Scheduling of Task Graphs on Clusters
- In Proc. of the 13th Intl. Conference on High Performance Computing (HiPC
, 2006
Cited by 5 (2 self)
A Task Graph (TG) is a model of a parallel program that consists of many subtasks that can be executed simultaneously on different processing elements. Subtasks exchange data via an interconnection network. The dependencies between subtasks are described by means of a Directed Acyclic Graph. Unfortunately, due to their characteristics, scheduling a TG requires dedicated or uninterruptible resources. Moreover, scheduling a TG by itself results in low resource utilization because of the dependencies among the subtasks. Therefore, in order to solve the above problems, we propose a scheduling approach for TGs by using advance reservation in a cluster environment. In addition, to improve resource utilization, we also propose a scheduling solution by interweaving one or more TGs within the same reservation block and/or backfilling with independent jobs.
On task scheduling accuracy: Evaluation methodology and results
- Journal of Supercomputing
, 2004
Cited by 4 (2 self)
Many heuristics based on the directed acyclic graph (DAG) have been proposed for the static scheduling problem. Most of these algorithms apply a simple model of the target system that assumes fully connected processors, a dedicated communication sub-system and no contention for the communication resources. Only a few algorithms consider the network topology and the contention for the communication resources. This article evaluates the accuracy of task scheduling algorithms and thus the appropriateness of the applied models. An evaluation methodology is proposed and applied to a representative set of scheduling algorithms. The obtained results show a significant inaccuracy of the produced schedules. Analyzing these results is important for the development of more appropriate models and more accurate scheduling algorithms.

