A. Jantsch, P. Ellervee, J. Oeberg et al., Hardware/Software Partitioning and Minimizing Memory Interface Traffic, IEEE/ACM Proc. of The European Conference on Design Automation (EuroDAC) 1994, pp. 220--225, 1994.
....and in COSYMA [4]. Both approaches use a fine-grain partitioning (basic block). Coarse-grain partitioning on the other side means that whole functions or processes are moved from software to hardware or vice versa in order to find the best hardware/software tradeoff. The approaches described in [5, 6, 7, 8, 10, 11, 12] also use fine-grain partitioning, whereas [13, 14, 16, 17, 18] use coarse-grain partitioning. Some other approaches do not perform automatic partitioning but concentrate on interface synthesis [19, 20] or on co-simulation [21]. All these approaches have in common that their granularity is ....
A. Jantsch, P. Ellervee, J. Oeberg et al., Hardware/Software Partitioning and Minimizing Memory Interface Traffic, IEEE/ACM Proc. of The European Conference on Design Automation (EuroDAC) 1994, pp. 220--225, 1994.
....VI, we describe the allocation and binding algorithm to be used after the scheduling step and provide some discussions about the cost model used in this paper. Experimental results and conclusion are provided in Sections VII and VIII, respectively. II. Related work. There are two published works [7] [8] on fine-grain hardware/software partitioning which use dynamic programming. In both of these works, the target architecture contains a single microprocessor and a single hardware chip (ASIC, FPGA, etc.). The authors then try to find the best combination of non-overlapping sequences of ....
A. Jantsch, P. Ellervee, J. Oberg, A. Hemani, and H. Tenhunen, "Hardware/Software Partitioning and Minimizing Memory Interface Traffic," in Proceedings European Design Automation Conference, 1994.
....high-level synthesis. These coprocessors are, as in PRISM and PRISM II, called by the remaining software part of the program. The partitioning goal is to meet timing constraints with the least amount of additional hardware. Opposed to this, the goal of the system proposed by Jantsch et al. [17, 18] is to reach the highest speedup for a C program with the given hardware resources (of the FCCM at hand). Here the partitioning search space is limited by defining hardware candidates (inner loops and functions) first and then selecting those which provide the highest estimated speedup. The ....
....with the given hardware resources (of the FCCM at hand). Here the partitioning search space is limited by defining hardware candidates (inner loops and functions) first and then selecting those which provide the highest estimated speedup. The partitioning method proposed by the author [19] follows [17, 18], but considers host/coprocessor communication costs in more detail and uses integer programming for optimization. 3 Pipeline synthesis using parallel FOR loops. As mentioned in section 2.2, due to the restrictions imposed by sequential input languages only few programs can be sped up significantly ....
A. Jantsch, P. Ellervee, J. Oberg, A. Hemani, and H. Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In Proc. of European Design Automation Conf. '94. IEEE Computer Society Press, September 1994.
....by a simulated annealing algorithm, moving a BSB to hardware would only be attempted if no mutually exclusive BSBs had already been placed in hardware. Such BSBs would then perhaps at a later iteration be moved back to software, thus enabling the initial BSB to be moved to hardware anyway. [17] presents an optimal (non-heuristic) algorithm based on dynamic programming which handles mutually exclusive BSBs correctly (on a simple partitioning model). Unfortunately the presented algorithm has exponential memory requirements, thus making it impractical to use. Also, only full loops are ....
....which is capable of compiling to a number of target architectures. The GNU C compiler is capable of doing just that, and target architectures can be specified in great detail. So this compiler would be suitable for a partitioning system, and has been used by a Swedish research group, see [16][17][18]. But still, the compiled code would have to be simulated, ideally with a very advanced simulator which could take pipelining, internal and external caches, etc. into consideration. And the simulator should either call a hardware simulator whenever control should be transferred to the hardware ....
Axel Jantsch, Peeter Ellervee, Johnny Oberg, Ahmed Hemani, and Hannu Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In EURO-DAC '94, 1994.
....Ernst et al. [2] treat every C statement as a candidate. So even code regions which are not likely to yield a speedup are considered. Because this results in a large number of possible solutions they use a heuristic method (simulated annealing) to determine the partitioning. Jantsch et al. [3, 4] consider only preselected candidates which can be implemented in hardware and for which a speedup is expected. This allows finding an optimal solution of the partitioning problem using a dynamic programming technique. We follow Jantsch's definition of candidates [3, p. 97]: A region of code is a ....
A. Jantsch, P. Ellervee, J. Oberg, A. Hemani, and H. Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In Proc. of European Design Automation Conf. '94. IEEE Computer Society Press, September 1994.
....and prototyping [8]. Further, it is the most commonly used target architecture for automatic hardware/software partitioning approaches. The hardware/software partitioning of a system specification onto a single-CPU, single-ASIC architecture has been investigated by a number of research groups [1, 2, 3, 5, 6, 9] which have employed widely different input languages (C, C x, VHDL, etc.) and system models, and have had different optimization goals and constraints in mind. This makes it very difficult to compare the approaches. The different approaches will be described and discussed in sections 3 and 4. ....
....partitioning problems: 1. Optimize for speed with an area constraint. 2. Optimize for area with a speedup constraint (which equals the total software execution time minus an execution speed constraint). Approaches Assuming Instantaneous Communication. The system model presented by Jantsch et al. [5] corresponds to partition model 1 as communication is ignored and BSBs cannot execute in parallel. The presented results are model domain results based on a realization model equal to model 1, i.e. no attempts to evaluate according to a realistic realization model have been made. Thus, the ....
Axel Jantsch, Peeter Ellervee, Johnny Oberg, Ahmed Hemani, and Hannu Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In EURO-DAC '94, 1994.
....lack support for complex types and their data (storage and transfers). Work on system synthesis of protocol applications has been dealing with very particular aspects, e.g. solving timing constraints [9] and memory allocation for minimizing memory interface traffic during HW/SW partitioning [8]. No global approach exists. In the SW community, distributed programming languages are used for programming general-purpose multiprocessor systems or distributed networks of workstations [1, 10, 14]. They rely on elaborate run-time environments and are intended for SW implementation on large ....
A. Jantsch et al. Hardware/software partitioning and minimizing memory interface traffic. In Proc. of the EuroDAC, 1994.
....our dynamic programming algorithm for solving Problem 1.1. In Section 5, we describe the allocation and binding algorithm to be used after the scheduling step. Experimental results and conclusions are provided in Sections 6 and 7, respectively. 2 Related work. There are two published works [4] [5] on fine-grain hardware/software partitioning which use dynamic programming. In both of these works, the target architecture contains a single microprocessor and a single hardware chip. The authors try to find the best combination of non-overlapping sequences of fine-grain basic scheduling ....
A. Jantsch, P. Ellervee, J. Oberg, A. Hemani, and H. Tenhunen. Hardware/Software Partitioning and Minimizing Memory Interface Traffic. In Proceedings European Design Automation Conference, 1994.
....first step a VHDL system specification is partitioned into two sets of candidates for hardware and software using profiling and user interaction. In the second step a process graph is constructed and partitioned into hardware and software parts using a simulated annealing algorithm [15]. Jantsch [10] presents a partitioning approach where hardware candidates are preselected using profiling. All of these selected hardware candidates realize a system speedup of greater than 1. The goal is to speed up a system by incorporating hardware. A key feature is a memory allocation method which minimizes ....
A. Jantsch, P. Ellervee, J. Öberg, A. Hemani, and H. Tenhunen. Hardware/Software Partitioning and Minimizing Memory Interface Traffic. European Design Automation Conference (EURO-DAC), pages 226--231, 1994.
....chip and n the number of code fragments which may be placed in either hardware or software. 1 Introduction. The hardware/software partitioning of a system specification onto a target architecture consisting of a single CPU and a single ASIC has been investigated by a number of research groups [2, 5, 1, 7, 8, 11]. This target architecture is relevant in many areas where the performance requirements cannot be met by general-purpose microprocessors, and where a complete ASIC solution is too costly. Such areas may be found in DSP design, construction of embedded systems, software execution acceleration and ....
....De Micheli [5] present a partitioning approach which starts from an all hardware solution. Their algorithm takes communication into account and is able to reduce communication when neighboring vertices are placed together in either software or hardware. The system model presented by Jantsch et al. [7] ignores communication. They present a dynamic programming algorithm based on the Knapsack algorithm which solves the partitioning problem for the case where some blocks include other blocks and are therefore mutually exclusive. The algorithm has exponential memory requirements which makes it ....
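The knapsack-style formulation mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes a deliberately simplified model: each candidate block has an integer hardware area and an estimated time saving, communication is ignored, and blocks are independent. It does not handle nested (mutually exclusive) blocks, which the cited algorithm does, so it should not be read as that algorithm. The function name `partition` and all block names and numbers are hypothetical.

```python
from typing import List, Tuple

Block = Tuple[str, int, int]  # (name, hardware area, estimated time saved); illustrative only


def partition(blocks: List[Block], area_budget: int) -> Tuple[int, Tuple[str, ...]]:
    """Classic 0/1 knapsack over hardware area.

    Returns (best total time saved, names of blocks moved to hardware).
    Assumption: blocks are independent, so savings and areas simply add up.
    """
    # best[a] = (saving, chosen block names) using at most `a` area units
    best: List[Tuple[int, Tuple[str, ...]]] = [(0, ())] * (area_budget + 1)
    for name, area, saved in blocks:
        # Walk the area axis downward so each block is placed at most once.
        for a in range(area_budget, area - 1, -1):
            candidate = best[a - area][0] + saved
            if candidate > best[a][0]:
                best[a] = (candidate, best[a - area][1] + (name,))
    return best[area_budget]


if __name__ == "__main__":
    candidates = [("loop_a", 4, 40), ("loop_b", 3, 50), ("func_c", 2, 30)]
    print(partition(candidates, area_budget=5))
```

The table here grows only linearly with the area budget because the blocks are treated as independent; the exponential memory requirement noted in the excerpt belongs to the cited algorithm's handling of mutually exclusive blocks, which this sketch omits.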
Axel Jantsch, Peeter Ellervee, Johnny Oberg, Ahmed Hemani, and Hannu Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In EURO-DAC '94, 1994.
....implementation of the interface between client/server modules, termed interface synthesis. The main motivation is to adapt the interface during system implementation, rather than having a fixed communication architecture as is the case in most hardware/software codesign approaches, e.g. [2, 5, 6], which are using memory-mapped I/O. The simplest system consists of a single client invoking one operation from a server, i.e. a point-to-point communication. This corresponds to the traditional view of hardware/software codesign where we initially have an application which cannot fulfill some ....
A. Jantsch, P. Ellervee, J. Öberg, A. Hemani, and H. Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In Proceedings of EURO-DAC, pages 226--231, 1994.
....vice versa. Using this technique, systems may be repartitioned without the complex and time-consuming resynthesis of the functionality to be moved or the whole design. 1. Introduction. In hardware/software co-design, partitioning is often done at a very early stage of the design process [4], [5], [6], [7], [12], [15]. This means that most design decisions are still pending and therefore much information that would be of interest for the partitioning is not yet available. By allowing repartitioning (the modification of the existing partition), such information can be made use of when it ....
Axel Jantsch, Peeter Ellervee, Johnny Öberg, Ahmed Hemani, and Hannu Tenhunen. Hardware/Software Partitioning and Minimizing Memory Interface Traffic. In Proc. of the European Design Automation Conference, pages 226--231, September 1994.
....which tries to cluster FSMs based on some closeness criteria. The target architecture for TOSCA is a single standard processor and one or more coprocessors embedded on a single chip. A number of researchers have focused on algorithm aspects rather than complete systems. Jantsch et al. [30], [31], [32] present a dynamic programming algorithm to solve the partitioning problem of optimizing an existing C program for speed given a hardware area constraint. The algorithm is derived from the Knapsack Stuffing algorithm [12] and solves (with exponential memory requirements) the partitioning ....
....Using the ACE models of the components in the hardware part of a given target architecture, the total hardware area for a given implementation can be estimated. A common way of estimating the hardware area of a BSB is to estimate how much area a full hardware implementation of the BSB will occupy [31], [35]. This includes hardware to do the calculations of the BSB and hardware to control the sequencing of these calculations. If the total chip area is divided into a datapath area and a controller area, each BSB moved to hardware may be viewed as occupying a part of the datapath and a part of ....
Axel Jantsch, Peeter Ellervee, Johnny Oberg, Ahmed Hemani, and Hannu Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In EURO-DAC '94, 1994.
....pipelines than our compiler. Ideas used in this work have occurred in several areas of computer science: vectorization and parallelization techniques in supercomputer compilers [2], pipelining methods for data flow computers in [8], and static partitioning algorithms in hardware/software codesign [9, 10]. Finally, our pipeline synthesis uses methods from high-level synthesis [11] and compiler optimization [1]. 7 Results. Our compiler prototype synthesizes pipelines from MODULA-2 programs for our target architecture. We implemented the example program with L = 8 and N = 200,000. The upper part of ....
A. Jantsch, P. Ellervee, J. Oberg, A. Hemani, and H. Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. In Proc. of European Design Automation Conf. '94. IEEE Computer Society Press, September 1994.
....complexity of designs together with the number of tradeoffs possible during synthesis makes it impossible to completely search the design space. Besides their use as an exploratory tool, estimators could also serve as a potent guide in the synthesis phase by providing look-ahead functionality [6], [7], [8], [13]. The designer would like to have an estimate of area, power and performance of a given design for a particular target technology and a set of constraints. This should be possible because area, speed and power are functions of the synthesis tools involved, the target technology and ....
A. Jantsch, P. Ellervee, J. Öberg, A. Hemani, H. Tenhunen. Hardware/Software Partitioning and Minimizing Memory Interface Traffic. Proc. EURO-DAC '94, IEEE, September 1994, pp. 226-231.
Axel Jantsch, Peeter Ellervee, Johnny Oberg, Ahmed Hemani, and Hannu Tenhunen. Hardware/software partitioning and minimizing memory interface traffic. European Design Automation Conference (EURO-DAC), pages 226--231, 1994.