Abstract: We introduce a transformation, named rephasing, that manipulates the timing parameters in control-data-flow graphs (CDFGs) during the high-level synthesis of data-path-intensive applications. Timing parameters in such CDFGs include the sample period, the latencies between input–output pairs, the relative times at which corresponding samples become available on different inputs, and the relative times at which corresponding samples become available at the delay nodes. While some of the timing parameters may be constrained by performance requirements or by the interface to the external world, others remain free to be chosen during high-level synthesis. Traditionally, high-level synthesis systems for data-path-intensive applications have taken one of two approaches to the relative times, called phases, at which corresponding samples are available at input and delay nodes. Older systems assume that all phases are zero (i.e., all input and delay node samples enter at the initial cycle of the schedule). Newer schedulers, which use techniques like overlapped scheduling to generate complex time shapes, assign values to the phases automatically as part of the data-path allocation/scheduling step. Rephasing, by contrast, manipulates the values of these phases as an algorithm transformation before the scheduling/allocation stage. The advantage of this approach is that phase values can be chosen to transform and optimize the algorithm for explicit metrics such as area, throughput, latency, and power. Moreover, the rephasing transformation can be combined with other transformations, such as algebraic transformations. We have developed techniques for using rephasing to optimize a variety of design metrics, and our results show significant improvements in several of them. We have also investigated the relationship and interaction of rephasing with other high-level synthesis tasks.

Index Terms: Behavioral synthesis, transformations.