Results 1–10 of 25
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
Abstract

Cited by 326 (5 self)
General Terms: Algorithms, Design, Performance, Theory. Additional Key Words and Phrases: automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs. This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Published in ACM Computing Surveys, Vol. 31, No. 4, December 1999.
Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors
 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
, 1996
Abstract

Cited by 163 (16 self)
In this paper, we propose a static scheduling algorithm for allocating task graphs to fully-connected multiprocessors. We discuss six recently reported scheduling algorithms and show that each possesses one drawback or another which can lead to poor performance. The proposed algorithm, which is called the Dynamic Critical-Path (DCP) scheduling algorithm, is different from the previously proposed algorithms in a number of ways. First, it determines the critical path of the task graph and selects the next node to be scheduled in a dynamic fashion. Second, it rearranges the schedule on each processor dynamically in the sense that the positions of the nodes in the partial schedules are not fixed until all nodes have been considered. Third, it selects a suitable processor for a node by looking ahead to the potential start times of the remaining nodes on that processor, and schedules relatively less important nodes to the processors already in use. A global as well as a pairwise comparison is c...
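The abstract's list-scheduling setting can be made concrete with a toy scheduler. This is not the authors' DCP algorithm: it uses static bottom-level priorities instead of a dynamically recomputed critical path, it ignores communication costs, and the function names and example DAG are invented for illustration.

```python
# Sketch of critical-path-driven list scheduling on fully connected,
# homogeneous processors. NOT the DCP algorithm from the paper:
# priorities are static bottom levels and communication is free.

def bottom_level(dag, weights, node, memo):
    """Longest compute path from `node` to an exit node (inclusive)."""
    if node not in memo:
        succs = dag.get(node, [])
        memo[node] = weights[node] + (
            max(bottom_level(dag, weights, s, memo) for s in succs)
            if succs else 0)
    return memo[node]

def schedule(dag, weights, num_procs):
    """Greedy list scheduler: repeatedly pick the ready node with the
    largest bottom level and place it where it can start earliest."""
    blevel = {}
    for n in weights:
        bottom_level(dag, weights, n, blevel)
    preds = {n: set() for n in weights}
    for u, succs in dag.items():
        for v in succs:
            preds[v].add(u)
    finish = {}                    # node -> finish time
    proc_free = [0] * num_procs    # per-processor ready time
    unscheduled = set(weights)
    while unscheduled:
        ready = [n for n in unscheduled if preds[n].issubset(finish)]
        node = max(ready, key=blevel.get)
        data_ready = max((finish[p] for p in preds[node]), default=0)
        proc = min(range(num_procs),
                   key=lambda i: max(proc_free[i], data_ready))
        start = max(proc_free[proc], data_ready)
        finish[node] = start + weights[node]
        proc_free[proc] = finish[node]
        unscheduled.remove(node)
    return max(finish.values())    # makespan
```

For example, a diamond DAG A→{B, C}→D with node weights 1, 2, 3, 1 schedules with makespan 5 on two processors under this greedy placement.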
Efficient Memoization for Dynamic Programming with Ad-Hoc Constraints
Abstract

Cited by 15 (13 self)
We address the problem of effective reuse of subproblem solutions in dynamic programming. In dynamic programming, a memoed solution of a subproblem can be reused for another if the latter’s context is a special case of the former. Our objective is to generalize the context of the memoed subproblem so that more subproblems can be considered subcases and hence enhance reuse. Towards this, we propose a generalization of context that 1) does not admit solutions better than the subproblem’s optimal, yet 2) requires that subsumed subproblems preserve the optimal solution. In addition, we also present a general technique to search for at most k ≥ 1 optimal solutions. We provide experimental results on resource-constrained shortest path (RCSP) benchmarks and a program’s exact worst-case execution time (WCET) analysis.
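One simple reading of subsumption-based reuse on RCSP: memoize, per node, the budget a subproblem was solved under together with the optimum's actual resource usage, and reuse a stored optimum for a tighter budget whenever its usage still fits. The stored cost is feasible under the tighter budget and cannot be beaten there, so reuse is sound. The sketch below assumes an acyclic graph; the data structures are invented here, not the paper's.

```python
import math

def min_cost_path(adj, src, dst, budget, memo=None):
    """Resource-constrained shortest path with subsumption-based reuse.
    adj: node -> list of (next_node, edge_cost, edge_resource).
    A memo entry (solved_budget, cost, usage) at `node` is reused for a
    query budget b whenever usage <= b <= solved_budget: the stored
    optimum is feasible for b and optimal for the looser budget, hence
    optimal for b as well. Returns (cost, resource_usage)."""
    if memo is None:
        memo = {}
    if src == dst:
        return 0, 0
    for solved_budget, cost, usage in memo.get(src, []):
        if usage <= budget <= solved_budget:
            return cost, usage            # reuse a subsuming entry
    best = (math.inf, math.inf)
    for nxt, c, r in adj.get(src, []):
        if r <= budget:
            sub_cost, sub_use = min_cost_path(adj, nxt, dst, budget - r, memo)
            if c + sub_cost < best[0]:
                best = (c + sub_cost, r + sub_use)
    memo.setdefault(src, []).append((budget, best[0], best[1]))
    return best
```

With a shared memo, a query under budget 3 can reuse a solution computed under budget 4 only at nodes where the stored optimum actually consumed at most 3 units, which is exactly the generalization-of-context idea in miniature.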
A Parallel Algorithm for Compile-Time Scheduling of Parallel Programs on Multiprocessors
 PACT'97
, 1997
Abstract

Cited by 11 (1 self)
In this paper, we propose a parallel randomized algorithm, called Parallel Fast Assignment using Search Technique (PFAST), for scheduling parallel programs represented by directed acyclic graphs (DAGs) at compile time. The PFAST algorithm has O(e) time complexity, where e is the number of edges in the DAG. This linear-time algorithm works by first generating an initial solution and then refining it using a parallel random search. Using a prototype computer-aided parallelization and scheduling tool called CASCH, the algorithm is found to outperform numerous previous algorithms while taking dramatically smaller execution times. The distinctive feature of this research is that, instead of simulations, our proposed algorithm is evaluated and compared with other algorithms using the CASCH tool with real applications running on the Intel Paragon. The PFAST algorithm is also evaluated with randomly generated DAGs for which optimal schedules are known. The algorithm generated optimal solutions for a majority of the test cases and close-to-optimal solutions for the others. The proposed algorithm is the fastest scheduling algorithm known to us and is an attractive choice for scheduling under running-time constraints.
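The "initial solution plus random-search refinement" structure can be sketched as follows. This is a serial toy, not PFAST itself (which is parallel and more careful about its neighborhood moves); communication costs are ignored and all names are invented.

```python
import random

def makespan(dag_order, preds, weights, assign, num_procs):
    """Evaluate a processor assignment: nodes fire in topological order,
    each starting when its processor and all predecessors are done
    (communication cost ignored for brevity)."""
    proc_free = [0] * num_procs
    finish = {}
    for n in dag_order:
        start = max([proc_free[assign[n]]] + [finish[p] for p in preds[n]])
        finish[n] = start + weights[n]
        proc_free[assign[n]] = finish[n]
    return max(finish.values())

def random_search(dag_order, preds, weights, num_procs, iters=2000, seed=0):
    """Refinement sketch: start from an all-on-one-processor assignment,
    then repeatedly move a random node to a random processor, keeping the
    move only if the makespan does not get worse."""
    rng = random.Random(seed)
    assign = {n: 0 for n in dag_order}       # trivial initial solution
    best = makespan(dag_order, preds, weights, assign, num_procs)
    for _ in range(iters):
        n = rng.choice(dag_order)
        old = assign[n]
        assign[n] = rng.randrange(num_procs)
        cand = makespan(dag_order, preds, weights, assign, num_procs)
        if cand <= best:
            best = cand
        else:
            assign[n] = old                  # undo worsening move
    return best, assign
```

Even this crude neighborhood quickly spreads independent tasks across processors; the linear-time flavor comes from evaluating each candidate with a single pass over the DAG.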
Automated Partitioning Design in Parallel Database Systems
Abstract

Cited by 11 (1 self)
In recent years, Massively Parallel Processors (MPPs) have gained ground, enabling the processing of vast amounts of data. In such environments, data is partitioned across multiple compute nodes, which results in dramatic performance improvements during parallel query execution. To evaluate certain relational operators in a query correctly, data sometimes needs to be repartitioned (i.e., moved) across compute nodes. Since data movement operations are much more expensive than relational operations, it is crucial to design a suitable data partitioning strategy that minimizes the cost of such expensive data transfers. A good partitioning strategy strongly depends on how the parallel system will be used. In this paper we present a partitioning advisor that recommends the best partitioning design for an expected workload. Our tool recommends which tables should be replicated (i.e., copied to every compute node) and which ones should be distributed according to specific column(s) so that the cost of evaluating similar workloads is minimized. In contrast to previous work, our techniques are deeply integrated with the underlying parallel query optimizer, which results in more accurate recommendations in a shorter amount of time. Our experimental evaluation using a real MPP system, Microsoft SQL Server 2008 Parallel Data Warehouse, with both real and synthetic workloads shows the effectiveness of the proposed techniques and the importance of deep integration of the partitioning advisor with the underlying query optimizer.
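A drastically simplified version of the replicate-vs-distribute decision can be sketched with a toy cost model (invented here; the paper's advisor is integrated with the query optimizer and far more accurate): replicating a table costs a full copy per extra node, while distributing it incurs a repartitioning cost for every join that is not on the chosen distribution key.

```python
from collections import Counter

def recommend_layout(tables, joins, num_nodes):
    """Toy partitioning advisor.
    tables: {name: row_count}; joins: [(table, column, frequency), ...]
    listing, per table, the column it is joined on and how often.
    Returns {table: ("replicate", None) or ("distribute", column)}."""
    plan = {}
    for name, size in tables.items():
        usage = Counter()
        for tbl, col, freq in joins:
            if tbl == name:
                usage[col] += freq
        # Candidate distribution key: the most frequently joined column.
        best_col, _ = usage.most_common(1)[0] if usage else (None, 0)
        # Every join NOT on the chosen key repartitions this table.
        shuffle_cost = size * sum(f for t, c, f in joins
                                  if t == name and c != best_col)
        replicate_cost = size * (num_nodes - 1)   # one-time full copy
        if best_col is not None and shuffle_cost < replicate_cost:
            plan[name] = ("distribute", best_col)
        else:
            plan[name] = ("replicate", None)
    return plan
```

With this model, a large fact table joined consistently on one key gets distributed on that key, while a small dimension table joined on many different columns gets replicated, mirroring the qualitative recommendation described above.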
Mapping Heavy Communication Grid-Based Workflows onto Grid Resources Within an SLA Context Using Metaheuristics
 International Journal of High Performance Computing Applications (IJHPCA)
, 2007
Abstract

Cited by 9 (1 self)
Service Level Agreements (SLAs) are currently one of the major research topics in grid computing. Among the many system components for SLA-related grid jobs, the SLA mapping mechanism has received widespread attention. It is responsible for assigning sub-jobs of a workflow to a variety of grid resources in a way that meets the user's deadline and costs as little as possible. Given the distinctive workload and resource characteristics, mapping a heavy-communication workflow within an SLA context gives rise to a complicated combinatorial optimization problem. This paper presents the application of various metaheuristics and suggests a possible approach to solve this problem. Performance measurements deliver evaluation results on the quality and efficiency of each method.
Fast Rescheduling of Multi-Rate Systems for HW/SW Partitioning Algorithms
, 2005
Abstract

Cited by 3 (2 self)
In modern designs for heterogeneous systems, with their extreme requirements on power consumption, execution time, silicon area, and time-to-market, the HW/SW partitioning problem is among the most challenging. Usually its formulation, based on task or process graphs with complex communication models, is intractable. Moreover, most partitioning problems embed another NP-hard problem at their core: a huge number of valid schedules exist for a single partitioning solution. Powerful heuristics for the partitioning problem rely on list scheduling techniques to solve this scheduling problem. This paper is based on a rescheduling algorithm that performs better than popular list scheduling techniques and still preserves linear complexity by reusing former schedules. A sophisticated communication model is introduced, and the rescheduling algorithm is modified to serve multi-core architectures with linear runtime.
A Local Dominance Procedure for Mixed-Integer Linear Programming
, 2007
Abstract

Cited by 2 (0 self)
Among the hardest Mixed-Integer Linear Programming (MILP) problems, the ones that exhibit a symmetric nature are particularly important in practice, as they arise in both theoretical and practical combinatorial optimization problems. A theoretical concept that generalizes the notion of symmetry is that of dominance. This concept, although known for a long time, is typically not used in general-purpose MILP codes, due to the intrinsic difficulties arising when the classical definitions are used in a completely general context. In this paper we study a general-purpose dominance procedure proposed in the 1980s by Fischetti and Toth that overcomes some of the main drawbacks of the classical dominance criteria. Both theoretical and practical issues concerning this procedure are considered, and important improvements are proposed. Computational results on a testbed of hard (single and multiple) knapsack problems are reported, showing that the proposed method can lead to considerable speedups when embedded in a general-purpose MILP solver.
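Dominance in its textbook form (discarding any partial solution that another one weakly beats) is easy to see on knapsack, the testbed mentioned above. The sketch below is a standard dominance filter for 0/1 knapsack, not the Fischetti-Toth procedure the paper studies.

```python
def knapsack_with_dominance(items, capacity):
    """0/1 knapsack via prefix enumeration with a dominance test: among
    states over the same item prefix, any (profit, weight) pair whose
    weight is >= and profit is <= another's is discarded, since it can
    never lead to a strictly better completion. items: [(profit, weight)]."""
    states = [(0, 0)]                     # (profit, weight) for empty prefix
    for profit, weight in items:
        merged = states + [(p + profit, w + weight)
                           for p, w in states if w + weight <= capacity]
        # Dominance filter: sort by weight, keep states with rising profit,
        # leaving exactly the Pareto frontier of (profit, weight).
        merged.sort(key=lambda s: (s[1], -s[0]))
        states, best_profit = [], -1
        for p, w in merged:
            if p > best_profit:           # not dominated by a lighter state
                states.append((p, w))
                best_profit = p
    return max(p for p, _ in states)
```

Each pass keeps only non-dominated states, which is what makes the enumeration tractable; the paper's contribution is making an analogous pruning sound and cheap inside a general-purpose MILP search tree.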
Efficient Algorithms for Scheduling and Mapping of Parallel Programs onto Parallel Architectures
, 1994
Scheduling Driven Partitioning of Heterogeneous Embedded Systems
 Swedish Workshop on Computer Systems Architecture
, 1998
Abstract

Cited by 2 (2 self)
In this paper we present an algorithm for system-level hardware/software partitioning of heterogeneous embedded systems. The system is represented as an abstract graph which captures both dataflow and the flow of control. Given an architecture consisting of several processors, ASICs, and shared buses, our partitioning algorithm finds the partitioning with the smallest hardware cost and is able to predict and guarantee the performance of the system in terms of worst-case delay.

1 Introduction

A great deal of research has been done on hardware/software partitioning [1]. Several research groups consider hardware/software architectures consisting of a single programmable processor and an ASIC. In this case, the behaviour is partitioned into one software partition and one hardware partition. However, for complex systems, such a restricted architecture does not allow efficient design-space exploration, and therefore we will concentrate on more general architectures. As the predictability...