Results 1 - 10 of 21
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
Cited by 326 (5 self)
Devices]: Modes of Computation - Parallelism and concurrency. General Terms: Algorithms, Design, Performance, Theory. Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs. This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Authors' addresses: Y.K. Kwok, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong; email: ykwok@eee.hku.hk; I. Ahmad, Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. 2000 ACM 0360-0300/99/1200-0406 $5.00. ACM Computing Surveys, Vol. 31, No. 4, December 1999.
Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing
 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
, 2002
Cited by 255 (0 self)
Efficient application scheduling is critical for achieving high performance in heterogeneous computing environments. The application scheduling problem has been shown to be NP-complete in general cases as well as in several restricted cases. Because of its key importance, this problem has been extensively studied and various algorithms have been proposed in the literature, mainly for systems with homogeneous processors. Although there are a few algorithms in the literature for heterogeneous processors, they usually require significantly high scheduling costs and they may not deliver good quality schedules with lower costs. In this paper, we present two novel scheduling algorithms for a bounded number of heterogeneous processors with the objective of simultaneously achieving high performance and fast scheduling time: the Heterogeneous Earliest-Finish-Time (HEFT) algorithm and the Critical-Path-on-a-Processor (CPOP) algorithm. The HEFT algorithm selects the task with the highest upward rank value at each step and assigns the selected task to the processor that minimizes its earliest finish time, using an insertion-based approach. The CPOP algorithm, on the other hand, uses the summation of upward and downward rank values for prioritizing tasks. Another difference is in the processor selection phase, which schedules the critical tasks onto the processor that minimizes the total execution time of the critical tasks. In order to provide a robust and unbiased comparison with the related work, a parametric graph generator was designed to generate weighted directed acyclic graphs with various characteristics. The comparison study, based on both randomly generated graphs and the graphs of some real applications, shows that our scheduling algorithms significantly surpass previous approaches in terms of both quality and cost of schedules, which are mainly presented with schedule length ratio, speedup, frequency of best results, and average scheduling time metrics.
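The two HEFT phases described in this abstract (prioritizing by upward rank, then insertion-based earliest-finish-time placement) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based inputs, and the use of one average communication cost per edge are all hypothetical choices made for the example, and computation costs are assumed to be positive so that decreasing upward rank is a valid topological order.

```python
from statistics import mean


def upward_ranks(succ, comp, comm):
    """Upward rank = average computation cost of the task plus the longest
    (communication + rank) path through any successor.
    succ: task -> list of successors (assumed input format)
    comp: task -> list of execution costs, one per processor
    comm: (u, v) -> average communication cost of edge u -> v"""
    ranks = {}

    def rank(task):
        if task not in ranks:
            tail = max((comm[(task, s)] + rank(s) for s in succ[task]), default=0.0)
            ranks[task] = mean(comp[task]) + tail
        return ranks[task]

    for task in succ:
        rank(task)
    return ranks


def earliest_slot(busy, ready, duration):
    """Insertion policy: first idle gap on a processor (busy is sorted by
    start time) that begins no earlier than `ready` and fits `duration`."""
    start = ready
    for slot_start, slot_end in busy:
        if start + duration <= slot_start:
            break
        start = max(start, slot_end)
    return start


def heft(succ, comp, comm):
    """Return (ranks, placed), where placed maps task -> (processor, start, finish)."""
    ranks = upward_ranks(succ, comp, comm)
    pred = {t: [] for t in succ}
    for u, children in succ.items():
        for v in children:
            pred[v].append(u)
    n_proc = len(next(iter(comp.values())))
    busy = [[] for _ in range(n_proc)]
    placed = {}
    # Highest upward rank first (assumes positive computation costs).
    for task in sorted(succ, key=lambda t: -ranks[t]):
        best = None
        for p in range(n_proc):
            # Data from a predecessor on the same processor costs nothing.
            ready = max(
                (placed[u][2] + (0 if placed[u][0] == p else comm[(u, task)])
                 for u in pred[task]),
                default=0.0,
            )
            start = earliest_slot(busy[p], ready, comp[task][p])
            finish = start + comp[task][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[task] = best
        busy[best[0]].append((best[1], best[2]))
        busy[best[0]].sort()
    return ranks, placed
```

On a four-task diamond graph with two processors, the sketch ranks the entry task highest and places the tasks so that the exit task finishes at time 7; CPOP would differ in both the priority (upward plus downward rank) and the handling of critical-path tasks, which this sketch does not attempt.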
The relative performance of various mapping algorithms is independent of sizable variances in runtime predictions
 in 7th IEEE Heterogeneous Computing Workshop (HCW ’98)
, 1998
Cited by 94 (9 self)
In this paper we study the performance of four mapping algorithms. The four algorithms include two naive
Efficient Scheduling of Arbitrary Task Graphs to Multiprocessors using A Parallel Genetic Algorithm
 JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1997
Cited by 44 (7 self)
Given a parallel program represented by a task graph, the objective of a scheduling algorithm is to minimize the overall execution time of the program by properly assigning the nodes of the graph to the processors. This multiprocessor scheduling problem is NP-complete even with simplifying assumptions, and becomes more complex under relaxed assumptions such as arbitrary precedence constraints, and arbitrary task execution and communication times. The present literature on this topic is a large repertoire of heuristics that produce good solutions in a reasonable amount of time. These heuristics, however, have restricted applicability in a practical environment because they have a number of fundamental problems including high time complexity, lack of scalability, and no performance guarantee with respect to optimal solutions. Recently, genetic algorithms (GAs) have been widely reckoned as a useful vehicle for obtaining high quality or even optimal solutions for a broad range of combinato...
An Integrated Technique for Task Matching and Scheduling onto Distributed Heterogeneous Computing Systems
 J. of Par. and Dist. Comp
, 2002
Cited by 37 (0 self)
This paper presents a problem-space genetic algorithm (PSGA)-based technique for efficient matching and scheduling of an application program that can be represented by a directed acyclic graph, onto a mixed-machine distributed heterogeneous computing (DHC) system. PSGA is an evolutionary technique that combines the search capability of genetic algorithms with a known fast problem-specific heuristic to provide the best-possible solution to a problem in an efficient manner as compared to other probabilistic techniques. The goal of the algorithm is to reduce the overall completion time through proper task matching, task scheduling, and inter-machine data transfer scheduling in an integrated fashion. The algorithm is based on a new evolutionary technique that embeds a known problem-specific fast heuristic into genetic algorithms (GAs). The algorithm is robust in the sense that it explores a large and complex solution space in smaller CPU time and uses less memory space as compared to traditional GAs. Consequently, the proposed technique schedules an application program with a comparable schedule length in a very short CPU time, as compared to GA-based heuristics. The paper includes a performance comparison showing the viability and effectiveness of the proposed technique through comparison with existing
Static and adaptive distributed data replication using genetic algorithms
, 2004
Cited by 33 (11 self)
Fast dissemination and access of information in large distributed systems, such as the Internet, has become a norm of our daily life. However, undesired long delays experienced by end-users, especially during peak hours, continue to be a common problem. Replicating some of the objects at multiple sites is one possible solution for decreasing network traffic. The decision of what to replicate where requires solving a constraint optimization problem which is NP-complete in general. Such problems are known to stretch the capacity of a Genetic Algorithm (GA) to its limits. Nevertheless, we propose a GA to solve the problem when the read/write demands remain static and experimentally prove the superior solution quality obtained compared to an intuitive greedy method. Unfortunately, the static GA approach involves high running time and may not be useful when read/write demands continuously change, as is the case with breaking news. To tackle such cases we propose a hybrid GA that takes as input the current replica distribution and computes a new one using knowledge about the network attributes and the changes that have occurred. Keeping in view more pragmatic scenarios in today’s distributed information environments, we evaluate these algorithms with respect to the storage capacity constraint of each site as well as variations in the popularity of objects, and also examine the trade-off between running time and solution quality.
Scheduling workflow applications on processors with different capabilities
, 2006
Cited by 26 (1 self)
Efficient scheduling of workflow applications represented by weighted directed acyclic graphs (DAG) on a set of heterogeneous processors is essential for achieving high performance. The optimization problem is NP-complete in general. A few heuristics for scheduling on heterogeneous systems have been proposed recently. However, few of them consider the case where processors have different capabilities. In this paper, we present a novel list-scheduling-based algorithm to deal with this situation. The algorithm (SDC) has two distinctive features. First, the algorithm takes into account the effect of Percentage of Capable Processors (PCP) when assigning the task node weights. For two task nodes with the same average computation cost, our weight assignment policy tends to give a higher weight to the task with the smaller PCP. Second, during the processor selection phase, the algorithm adjusts the effective Earliest Finish Time strategy by incorporating the average communication cost between the current scheduling node and its children. A comparison study shows that our algorithm performs better than related work overall.
Robust task scheduling in nondeterministic heterogeneous computing systems
 in Proc. IEEE Intl. Conf. Cluster Computing
, 2006
Cited by 14 (4 self)
The paper addresses the problem of matching and scheduling of a DAG-structured application to both minimize the makespan and maximize the robustness in a heterogeneous computing system. Due to the conflict between the two objectives, it is usually impossible to achieve both goals at the same time. We give two definitions of robustness of a schedule based on tardiness and miss rate. Slack is proved to be an effective metric to be used to adjust the robustness. We employ the ε-constraint method to solve the bi-objective optimization problem where minimizing the makespan and maximizing the slack are the two objectives. Overall performance of a schedule considering both makespan and robustness is defined such that users have the flexibility to put emphasis on either objective. Experimental results are presented to validate the performance of the proposed algorithm.
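The ε-constraint idea mentioned in this abstract turns the bi-objective problem into a family of single-objective ones: minimize the makespan subject to the slack being at least ε, then vary ε. The sketch below illustrates only that reformulation. It assumes a hypothetical, already-generated pool of candidate schedules summarized as (makespan, slack) pairs; how the paper actually searches the schedule space is not stated in the abstract and is not modeled here.

```python
def epsilon_constraint(candidates, eps):
    """Return the candidate with the smallest makespan among those whose
    slack is at least eps, or None if no candidate satisfies the constraint.
    candidates: iterable of (makespan, slack) pairs (assumed summary format)."""
    feasible = [c for c in candidates if c[1] >= eps]
    return min(feasible, key=lambda c: c[0]) if feasible else None


def sweep(candidates, eps_values):
    """Solve one constrained problem per eps value; raising eps trades a
    longer makespan for a more robust (higher-slack) schedule."""
    return {eps: epsilon_constraint(candidates, eps) for eps in eps_values}
```

Sweeping ε over a range of slack thresholds traces out the trade-off between the two objectives, which is one way a user could place emphasis on either makespan or robustness.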
Resource Management of Highly Configurable Tasks
Cited by 12 (0 self)
In this paper we present an extension to our QoS optimization algorithm, QRAM [7][11], that can improve optimization time by several orders of magnitude when managing highly configurable tasks. A highly configurable task is one with a large number of QoS dimensions and/or a large number of quality levels on those dimensions. For example, an application that has ten QoS dimensions with ten quality levels each will have 10^10 setpoints, or ways in which it can be configured. While the existing QRAM algorithm has been shown to be a very effective resource management tool, it must still explicitly perform computations on all of the setpoints for each task. For tasks with 10^10 setpoints or more, this is clearly impractical. The key idea presented here is a new approximation algorithm for the concave majorant step in QRAM. By using this algorithm in a filtering step, the best performing subset of the setpoints can be quickly found without explicitly examining all of the setpoints. The idea is validated using a phased array radar system as an example application.
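For orientation, the concave majorant of a set of setpoints can be pictured as the upper convex hull of their (resource, utility) points. The sketch below computes that hull exactly with a standard monotone-chain scan; it is not the approximation algorithm the abstract proposes (which the abstract does not describe), and the representation of a setpoint as a single (resource, utility) pair is an assumption made for the example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left (counter-clockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def concave_majorant(setpoints):
    """Return the vertices of the upper convex hull of (resource, utility)
    points, ordered by increasing resource. Setpoints below the hull are
    dominated by a combination of hull vertices and are dropped."""
    # Keep only the best utility for each distinct resource amount.
    best = {}
    for resource, utility in setpoints:
        best[resource] = max(best.get(resource, utility), utility)
    hull = []
    for point in sorted(best.items()):
        # Pop the last vertex while it lies on or below the line to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], point) >= 0:
            hull.pop()
        hull.append(point)
    return hull
```

This exact scan has to sort and visit every setpoint, which is precisely what becomes impractical at 10^10 setpoints and what motivates a filtering approximation.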
A Parallel Algorithm for Compile-Time Scheduling of Parallel Programs on Multiprocessors
 PACT'97
, 1997
Cited by 11 (1 self)
In this paper, we propose a parallel randomized algorithm, called Parallel Fast Assignment using Search Technique (PFAST), for scheduling parallel programs represented by directed acyclic graphs (DAGs) at compile time. The PFAST algorithm has time complexity O(e), where e is the number of edges in the DAG. This linear-time algorithm works by first generating an initial solution and then refining it using a parallel random search. Using a prototype computer-aided parallelization and scheduling tool called CASCH, the algorithm is found to outperform numerous previous algorithms while taking dramatically smaller execution times. The distinctive feature of this research is that, instead of simulations, our proposed algorithm is evaluated and compared with other algorithms using the CASCH tool with real applications running on the Intel Paragon. The PFAST algorithm is also evaluated with randomly generated DAGs for which optimal schedules are known. The algorithm generated optimal solutions for a majority of the test cases and close-to-optimal solutions for the others. The proposed algorithm is the fastest scheduling algorithm known to us and is an attractive choice for scheduling under running time constraints.