| K. K. Parhi and D. G. Messerschmitt, "Static rate-optimal scheduling of iterative data flow programs via optimum unfolding," IEEE Trans. Comput., vol. 40, pp. 178--195, June 1991. |
....form is used widely in the design and implementation of digital signal processing (DSP) systems. In this paper, we assume that we are given a dataflow specification of an application, and an associated multiprocessor schedule (e.g. derived from scheduling techniques such as those discussed in [11, 14, 19]). Our objective is to reduce the overall IPC cost of the multiprocessor implementation, and the associated performance degradation, since IPC operations result in significant execution time and power consumption penalties, and are difficult to optimize thoroughly during the scheduling stage. PC ....
K. Parhi and D. G. Messerschmitt, "Static Rate Optimal Scheduling of Iterative Dataflow Programs via Optimum Folding," IEEE Transactions on Computers, Vol. 40, No. 2, pp. 178-194, Feb. 1991.
....two scheduling techniques, we introduce the terms M task and S task to denote a task that can run on multiple processors and a single processor, respectively. An M task can be either a purely data parallel task, or a mixed task data parallel routine. While pure data parallel scheduling techniques [3, 11, 12, 15, 24] could still be applied within data parallel M tasks, pure task scheduling techniques [17, 18, 19, 25, 26] are no longer applicable to schedule M tasks. As a result, new approaches have to be found that fully exploit the available parallelism. Scheduling is known to be NP complete even for the ....
K. Parhi and D. Messerschmitt. Static rate-optimal scheduling of iterative dataflow programs via optimum unfolding. IEEE Trans. Computers, 40(2):178--195, 1991.
....(b) schedule for an iterative algorithm. In an overlapped schedule, apart from the parallelism in one execution of the algorithm (intra-iteration parallelism), the parallelism of operations belonging to different iterations (inter-iteration parallelism) should be taken into account during scheduling [20]. This is illustrated in Figure 1. Intra-iteration and inter-iteration parallelism can be described by a set of inequalities whose solution gives lower and upper bounds for the time positions where an operation can be scheduled. The difference between these bounds is called the operation ....
K.K. Parhi and D.G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Transactions on Computers, 40(2):178-195, February 1991.
....Graph. We assume that the graph is executed several times corresponding to the iterative computations of the loop, involving varying data sets over time. Our loop representation is an extension of the models in [10, 13] and is similar to the signal processing data flow graph representation in [21]. Definition 2.1 A dependence graph (DG) is a directed graph denoted by a 5-tuple, $DG = (V, E, \delta, \beta)$. $V$ is the set of nodes representing the operations in the loop. $E$ is the set of directed edges corresponding to the dependencies. $V \mapsto N$ is a mapping which assigns an iteration index to ....
....the dependence graph. The Initiation Interval (II) of a loop is the time interval between consecutive executions of its steady state. It is clear that CPL poses a bound on the II of the graph. The resource constraints and the recurrences present in the DG also restrict the II of the steady state [21, 14, 10, 13]. Mathematically, $MII_{res} = \max_{R_i \in R} \frac{\sum_{\forall u : \beta(u) = R_i} l(u)}{n_i}$ and $MII_{rec} = \begin{cases} 0 & \text{if there is no recurrence} \\ \max_{r \in G} \frac{l(r)}{\delta(r)} & \text{otherwise} \end{cases}$ where $MII_{res}$ and $MII_{rec}$ are minimum initiation intervals due to resource and recurrence constraints respectively. $R_i$ is a specific ....
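The resource-constrained bound in the excerpt above can be illustrated with a minimal sketch. It assumes that the bound sums the latencies l(u) of all operations bound to a resource type and divides by the number of units n_i of that type. The function name `res_mii`, the input layout, and the rounding up to whole control steps are assumptions made for illustration; they are not taken from the cited paper.

```python
from collections import defaultdict
from math import ceil


def res_mii(ops, units):
    """Resource-constrained lower bound on the initiation interval.

    ops   -- list of (resource_type, latency) pairs, one per operation
    units -- dict mapping resource_type to the number of available units
    Rounding up to an integer number of control steps is an assumption.
    """
    busy = defaultdict(int)
    for resource, latency in ops:
        # Total time that this resource type is occupied per iteration.
        busy[resource] += latency
    return max(ceil(busy[r] / units[r]) for r in busy)


# Hypothetical loop body: five unit-latency ALU operations, one multiply.
example_ops = [("alu", 1)] * 5 + [("mul", 2)]
example_units = {"alu": 2, "mul": 1}
bound = res_mii(example_ops, example_units)
```

In this hypothetical example the two ALUs are the bottleneck: five cycles of ALU work shared between two units cannot repeat faster than every three control steps.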
K. Parhi, D. Messerschmitt. "Static rate optimal scheduling of iterative data-flow programs via optimum unfolding". In IEEE Trans. on Computers, volume 40 n2, pages 178--195, February 1991.
....[7] All computations are repeated every $T_0$ control steps, where a control step corresponds to a single period of a system clock. This does not mean that all operations belonging to a single execution of an algorithm need to be performed within $T_0$ control steps. Assuming an overlapped scheduling [15] model, the schedule span can be larger than $T_0$. One iteration period covers the control steps $T = \{0, 1, \ldots, T_0 - 1\}$. The result of a scheduling algorithm, containing the mapping in time of a single iteration, will, however, have references to control steps in $Z$ (the set of all integers). The ....
K.K. Parhi and D.G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Transactions on Computers, 40(2):178--195, February 1991.
....we conclude that if the targeted implementation platform is a single CMOS processor, reduction in the number of operations is the key to power minimization. 3 Related Work. Power minimization efforts across all levels of the design abstraction process are surveyed in [10]. Parhi and Messerschmitt [6] presented optimal unfolding of linear DSP computations. Potkonjak and Rabaey [7] addressed the minimization of the number of multiplications and additions in linear computations in their maximally fast form so that the throughput is preserved. Sheliga and Sha [9] presented an approach to ....
K.K. Parhi and D.G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Trans. on Computers, 40(2):178--195, 1991.
....two scheduling techniques, we shall use the terms M task and S task to denote a task that can run on multiple processors and a single processor, respectively. An M task can be either a purely data parallel task, or a mixed task data parallel routine. While pure data parallel scheduling techniques [1, 5, 13, 14, 16, 19, 29] could still be applied within data parallel M tasks, pure task scheduling techniques [12, 15, 17, 22, 23, 24, 30, 31] are no longer applicable to schedule M tasks. As a result, new approaches have to be found that fully exploit the available parallelism. Scheduling is known to be NP complete ....
K. Parhi and D. Messerschmitt. Static rate-optimal scheduling of iterative dataflow programs via optimum unfolding. IEEE Transactions on Computers, 40(2):178--195, 1991.
....two scheduling techniques, we introduce the terms M task and S task to denote a task that can run on multiple processors and a single processor, respectively. An M task can be either a purely data parallel task, or a mixed task data parallel routine. While pure data parallel scheduling techniques [3, 11, 12, 15, 24] could still be applied within data parallel M tasks, pure task scheduling techniques [17, 18, 19, 25, 26] are no longer applicable to schedule M tasks. As a result, new approaches have to be found that fully exploit the available parallelism. Scheduling is known to be NP complete even for the ....
K. Parhi and D. Messerschmitt. Static rate-optimal scheduling of iterative dataflow programs via optimum unfolding. IEEE Trans. Computers, 40(2):178--195, 1991.
.... optimization techniques have been developed in the contexts of data memory minimization [1], joint minimization of code and data [8], [24], [29], high throughput block processing [25], multiprocessor scheduling (there have been numerous efforts in this category; for example, see [2], [13], [17], [21], [27]), synchronization optimization [23], as well as a variety of other objectives. 2410 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 49, NO. 10, OCTOBER 2001. Fig. 3. Cyclo-static dataflow model compared with synchronous dataflow. Actor B is a distributor actor. (a) SDF specification. (b) CSDF ....
K. K. Parhi and D. G. Messerschmitt, "Static rate-optimal scheduling of iterative dataflow programs via optimum unfolding," IEEE Trans. Comput., vol. 40, pp. 178--195, Feb. 1991.
.... The main technical innovation of the research presented in this paper is the first approach for the minimization of the number of operations in arbitrary computations. The approach not only optimizes a significantly wider set of computations than the other previously published techniques [Parhi and Messerschmitt 1991; Srivastava and Potkonjak 1996] but also outperforms or performs at least as well as other techniques on all examples. The novel divide and conquer compilation procedure combines and coordinates power and enabling effects of several transformations (using a well organized ordering of ....
....Those algorithms do not guarantee that all constants will be detected, but that each data declared constant is indeed constant over all possible executions of the program. A comprehensive survey of the most popular constant propagation algorithms can be found in [Wegman and Zadeck 1991]. Parhi and Messerschmitt [1991] presented optimal unfolding of linear computations in DSP systems. Unfolding results in simultaneous processing of consecutive iterations of a computation. Potkonjak and Rabaey [1992] addressed the minimization of the number of multiplications and additions in linear computations in ....
Parhi, K. and Messerschmitt, D. 1991. Static rate-optimal scheduling of iterative dataflow programs via optimum unfolding. IEEE Transactions on Computers 40, 2, 178--195.
....two scheduling techniques, we introduce the terms M task and S task to denote a task that can run on multiple processors and a single processor, respectively. An M task can be either a purely data parallel task, or a mixed task data parallel routine. While pure data parallel scheduling techniques [1, 5, 14, 15, 16, 19, 29] could still be applied within data parallel M tasks, pure task scheduling techniques [13, 21, 22, 23, 24, 30, 31] are no longer applicable to schedule M tasks. As a result, new approaches have to be found that fully exploit the available parallelism. Scheduling is known to be NP complete even for ....
K. Parhi and D. Messerschmitt. Static rate-optimal scheduling of iterative dataflow programs via optimum unfolding. IEEE Transactions on Computers, 40(2):178--195, 1991.
....analyse the structure of the DAG for scheduling. New approaches exist that take genetic algorithms into account [24, 25]. Apart from the DAG algorithms, algorithms based on the ITG are being implemented. For this graph model unfolding, retiming and software pipelining are popular techniques [26, 27, 11]. Some of these algorithms utilise again DAG scheduling algorithms for partially unfolded ITGs. To benefit from regular structures of graphs, especially from graphs derived from equations, techniques known from the VLSI processor design [20] are employed. These techniques use the regular structure ....
Keshab K. Parhi and David G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Transactions on Computers, 40(2):178--195, December 1991.
....desired number of times (with loop indices adjusted accordingly). This is depicted in Figures 2.9 and 2.10 for an unrolling factor of 2, i.e. the unrolled loop body contains 2 of the original loop bodies. A simple set of rules for transforming the original DDG to the unrolled DDG is given in [36]. Once the unrolling is done, the resulting loop may be software pipelined (using linear periodic schedules, for example). The second method of representing an unrolled loop maintains the original DDG for scheduling purposes, while extending the notion of linear periodic schedules (Equation 1.1) ....
K.K. Parhi and D.G. Messerschmitt. Static Rate-Optimal Scheduling of Iterative Data-flow Programs via Optimum Unfolding. IEEE Trans. on Computers, pages 178--195, Feb. 1991.
....are next presented: • UNRET performs an exhaustive analysis of the unrolling degrees of the loop that can derive optimal solutions for the available resources. Therefore, more than one instance of each operation can appear in the final schedule. Unlike other methods that perform loop unrolling [19, 27, 33], this paper will present a new approach that guarantees an optimal unrolling degree. • Similarly to [13, 8, 7], the number of folds for the steady state is automatically obtained by solving an ILP model for loop pipelining. The model has been extended to the minimization of resources under ....
....$R$ is a set of edges that form a cycle. Let us define $\delta_R$ as $\delta_R = \sum_{(u,v) \in R} \delta(u, v)$ (2). For any operation $u$ such that $\exists (u, v) \in R$, $u_i$ must be scheduled at least $|R|$ cycles before $u_{i+\delta_R}$. Therefore, $R$ imposes a minimum initiation interval, $RecMII_R$, on the execution of the loop [32, 5, 27]: $RecMII_R = \frac{|R|}{\delta_R}$. A loop can have several recurrences. Therefore: $RecMII = \max_R RecMII_R$. In the worst case, the number of cycles of a DG grows exponentially with $|E|$. However, finding RecMII can be done in polynomial time by using Karp's algorithm to calculate the weight of the ....
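The recurrence bound in the excerpt above can be checked on small graphs with a brute-force sketch. It enumerates every simple cycle R and takes the largest ratio |R| / delta_R, so it has the exponential worst case that the excerpt mentions; it is not the polynomial-time method based on Karp's algorithm. The function name `rec_mii` and the edge-list layout are assumptions made for illustration, and node names must be mutually comparable (for example, all strings).

```python
from fractions import Fraction


def rec_mii(edges):
    """Largest |R| / delta_R over all simple cycles R (0 if there is no cycle).

    edges -- list of (u, v, distance) triples; distance is the iteration
             distance delta(u, v) carried by the dependence edge
    """
    adj = {}
    for u, v, d in edges:
        adj.setdefault(u, []).append((v, d))
    nodes = sorted(set(adj) | {v for _, v, _ in edges})
    best = Fraction(0)

    def dfs(start, node, length, dist, visited):
        nonlocal best
        for nxt, d in adj.get(node, []):
            if nxt == start:
                total = dist + d
                if total == 0:
                    raise ValueError("cycle with zero total distance")
                best = max(best, Fraction(length + 1, total))
            elif nxt not in visited and nxt > start:
                # Only visit nodes larger than the start node, so each
                # cycle is counted once, rooted at its smallest node.
                dfs(start, nxt, length + 1, dist + d, visited | {nxt})

    for s in nodes:
        dfs(s, s, 0, 0, {s})
    return best


# Hypothetical graph with two recurrences sharing the edge a -> b.
example_edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2), ("b", "a", 1)]
bound = rec_mii(example_edges)
```

Here the cycle a, b has two edges and a total distance of 1, while the cycle a, b, c has three edges and a total distance of 2, so the shorter cycle determines the bound.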
[Article contains additional citation context not shown here]
K.K. Parhi and D.G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Trans. Computers, 40(2):178--195, February 1991.
....schedule length and lengthens the memory part schedule, until a well balanced schedule is achieved, as shown in Figure 1. If we start off with the memory part schedule being longer than the ALU part schedule, PBS cannot improve this kind of initial schedule. In this case, we first unfold (unroll) [5] the loop by a certain unfolding factor, which provides us a desired initial schedule (the ALU part schedule is longer than that of the memory part). We present the upper bound of this unfolding factor in Section 3. Then PBS can be operated on the new initial schedules. The unfolding idea is portrayed ....
K. K. Parhi and D. G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Transactions on Computer, Vol. 40, No. 2, pages 178--195, Feb. 1991.
....there is at least an optimal schedule which is K periodic). Approximation algorithms producing K periodic schedules have been developed for CSIP and related problems. In [AN88] a list scheduling algorithm is performed on the expanded graph g of G until a K periodic structure is detected. In [PM91] it is shown that for every $p > 0$, one can build an unfolded uniform graph $G_p$ that expresses the same precedence constraints as G by duplicating p times every generic task. The authors proved that if p is the least common multiple of the heights of the circuits of G, then the heights of the arcs of ....
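The unfolding step described in the excerpt above can be sketched with the usual construction from the unfolding literature: an arc from u to v with height h is replaced by p arcs, from copy i of u to copy (i + h) mod p of v, each with height floor((i + h) / p). The function name `unfold` and the tuple layout are assumptions made for illustration, and the sketch is not claimed to match the notation of [PM91].

```python
def unfold(edges, p):
    """Unfold a uniform graph p times.

    edges -- list of (u, v, height) triples of the original graph G
    p     -- unfolding factor, a positive integer
    Returns arcs ((u, i), (v, j), new_height) between task copies.
    """
    if p <= 0:
        raise ValueError("unfolding factor must be positive")
    unfolded = []
    for u, v, h in edges:
        for i in range(p):
            j = (i + h) % p        # which copy of v consumes the result
            new_h = (i + h) // p   # height of the arc in the unfolded graph
            unfolded.append(((u, i), (v, j), new_h))
    return unfolded


# Hypothetical two-task circuit of total height 3, unfolded twice.
example = unfold([("A", "B", 0), ("B", "A", 3)], 2)
```

Each original arc yields exactly p arcs whose heights add up to the original height, so the total height in the example stays 3 after unfolding.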
....periodic scheduling on parallel processors. In this section, we present the main algorithms that have up to now been designed for the periodic job shop as well as for the periodic scheduling on identical processors problem. Many efficient heuristics for code generation have been proposed [Lam87] [PM91] [EW92] which cannot be detailed here. We have chosen to mention the most promising approaches that consist either in designing algorithms based on the results presented in the previous section, or in applying the powerful algorithms known in classical makespan minimization problems to cyclic ....
K.K. Parhi and D.G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. IEEE Transactions on computers, 40(2):178--195, February 1991.
....them efficiently by exploiting the parallelism available in the problems so that the iteration period (the time to execute all tasks of the data flow program once) is minimized. As the data flow program (DFP) is iterative with feedback, it has a fundamental lower bound on the iteration period [PaMe 91]. Thus, the tasks of DFP have to be scheduled optimally so as to obtain the iteration period asymptotically close to the lower bound. Critical path schedulers produce a schedule where the minimum possible iteration is equal to the critical path length of DFG [LuPa 93]. Loop schedulers [LuPa 93] ....
....(data) between task $T_i$ and task $T_j$, imposing the partial order that task $T_j$ can be executed only after the execution of task $T_i$ as in Fig. 1. The main difference between general DFGs and the signal processing DFGs is the associated delay elements (registers) in the directed edges [PaMe 91]. An edge without a register represents precedence between tasks within an iteration. If an edge has n registers, it describes the precedence between tasks of different (i, n+i) iterations which differ by n iterations. A simple example of a nonterminating, iterative, data flow program with feedback is ....
[Article contains additional citation context not shown here]
K.K. Parhi and D.G. Messerschmitt, "Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding", IEEE Transactions on Computers, vol. 40, no. 2, pp. 178-194, Feb. 1991.
No context found.
K. K. Parhi and D. G. Messerschmitt, "Static rate-optimal scheduling of iterative data flow programs via optimum unfolding," IEEE Trans. Comput., vol. 40, pp. 178--195, June 1991.
No context found.
K. K. Parhi and D. G. Messerschmitt, "Static rate-optimal scheduling of iterative data flow programs via optimum unfolding," IEEE Trans. Comput., vol. 40, pp. 178--195, June 1991.
....by Wright Patterson AFB under contract number F33615 93 C 1309. 1 to as the iteration period. For all recursive signal processing algorithms, there exists an inherent fundamental lower bound on the iteration period referred to as the iteration period bound or simply the iteration bound [3, 4, 5]. This bound is fundamental to an algorithm and is independent of the implementation architecture. In other words, it is impossible to achieve an iteration period less than the bound even when infinite processors are available to execute the recursive algorithm. Determination of the iteration ....
....SRDFGs. An MRDFG can be expanded into the equivalent SRDFG [1]. The equivalence means that the MRDFG and its expanded SRDFG express the identical signal processing algorithm. In this section we describe a method to expand an MRDFG into its equivalent SRDFG which is similar to unfolding an SRDFG [5]. 5.1 The number of invocations of node. In SRDFGs, it is assumed that one invocation of a node consumes one data from every incoming edge and produces one data on every outgoing edge. Therefore, the number of invocations of each node is one in an iteration. In MRDFGs, on the other hand, one ....
K. K. Parhi and D. G. Messerschmitt, "Static Rate-Optimal Scheduling of Iterative Data-Flow Programs via Optimum Unfolding," IEEE Trans. Computers, vol. C-40, pp. 178--195, Feb. 1991.
No context found.
K. K. Parhi and D. G. Messerschmitt, "Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding," IEEE Trans. Comput., vol. 40, pp. 178--95, Feb. 1991.
No context found.
K.Parhi, D.Messerschmitt, "Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding", IEEE Trans. on Computers, Vol.40, No.2, pp.178-195, Feb. 1991.
No context found.
K.Parhi, D.Messerschmitt, "Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding", IEEE Trans. on Computers, Vol.40, No.2, pp.178-195, Feb. 1991.
No context found.
K. K. Parhi and D. G. Messerschmitt, "Static Rate-optimal Scheduling of Iterative Data-flow Programs via Optimum Unfolding," IEEE Transactions on Computers, vol. 40, pp. 178--194, 1991.
No context found.
K.K. Parhi and D.G. Messershmitt, "Static Rate-Optimal Scheduling of Iterative Data-Flow Programs Via Optimum Unfolding", IEEE Transactions on Computer, pp. 178-195, Feb. 1991.
CiteSeer.IST - Copyright Penn State and NEC