50 citations found.
F. Catthoor et al. Global communication and memory optimizing transformations for low power signal processing systems. In Proc. of Int. Workshop on Low Power Design, pages 51--56, 1994.

This paper is cited in the following contexts:
High-Level Synthesis of Distributed Logic-Memory.. - Huang, Ravi, Raghunathan, .. (2002)   (2 citations)  (Correct)

....Reducing the memory size to exactly fit application requirements results in area, access time, and power benefits. Hence, techniques to estimate storage requirements from a behavior were proposed in [3], [4]. Behavioral transformation techniques to reduce memory requirements have been described in [5]. The problem of mapping (or binding) arrays in behavioral descriptions to one or more memories in RTL implementations has been addressed extensively in previous work. Some HLS systems map arrays in the behavioral description into a single monolithic memory [6], [7], [8], [9], while others take ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global communication and memory optimizing transformations for low power signal processing systems," in Proc. Int. Wkshp. Low Power Design, 1994, pp. 51--56.


Memory Design and Exploration for Low Power, Embedded Systems - Wen-Tsong Shiue And (2001)   (3 citations)  (Correct)

....space significantly. 1. Introduction In systems that involve multidimensional streams of signals such as images or video sequences, it has been shown that the majority of the area and power cost is not due to the datapath or controllers but due to the global communication and memory interactions [1][2]. In fact, in embedded applications for real-time signal processing, 50-80% of the power cost is due to memory traffic caused by transfers between the ASIC and the off-chip memories. Even general-purpose processors such as the 21164 DEC Alpha chip or the StrongArm SA-110 processor dissipate ....

....in data cache instruction cache exploration. Section 8 concludes the paper. 2. Loop Transformations Loop transformation procedures can be used to significantly reduce the number of accesses as well as reduce the size of the on-chip/off-chip memory. This has been illustrated in several papers [1], [15], [16]. We demonstrate the power of loop transformations with the example in Figure 1. Here, loop reordering allows array c[] and array w[] to share memory space, thereby reducing the size of the off-chip memory. Loop interchange helps to reduce the number of memory reads. Loop fusion reduces ....

[Article contains additional citation context not shown here]

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global Communication and Memory Optimizing Transformations for Low Power Signal Processing Systems", Workshop on VLSI Signal Processing, La Jolla, CA, Oct. 1994.


Influence of Compiler Optimizations on System Power - Kandemir, Vijaykrishnan.. (2000)   (37 citations)  (Correct)

....fusion have been proven to be very useful in optimizing performance of loop nests, e.g. enhancing cache performance and/or improving parallelism. While such techniques have been thoroughly evaluated from the performance point of view, there has been little effort to analyze their energy impact [2]. In this paper, we present a quantitative evaluation of the impact of different state-of-the-art high-level compilation techniques on energy consumption. We measure energy consumption using a transition-sensitive, cycle-accurate, RT-level energy simulator [9] for a set of representative codes ....

....are broken up into smaller pieces (to fit in the cache). When we consider power, potential benefits from tiling depend on the changes in power dissipation induced by the optimization on different system components. We can expect a decrease in power consumed in memory, due to better data reuse [2]. On the other hand, in the tiled code, we traverse the same iteration space of the original code using twice as many loops (in the most general case); this entails extra branch control operations and macro calls. These extra computations might increase the power dissipation in the core. Loop ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. DeMan. Global communication and memory optimizing transformations for low power signal processing systems. In Proc. the IEEE Workshop on VLSI Signal Processing, pages 178-187, 1994.


C-HEAP: A Heterogeneous Multi-processor Architecture.. - Nieuwland, Kang.. (2002)   (Correct)

....this memory via a shared interconnection network, then the network quickly becomes a bottleneck. Therefore, we choose to distribute memory over the different processing devices. This provides higher bandwidth with lower latency, which results in a higher performance at a lower power consumption [41, 16, 15, 34, 35]. By making the memories part of a global memory map, they become accessible to other processing devices as well (i.e. Distributed Shared Memory). This makes these memories suitable for mapping communication buffers to. Communication buffers are needed to decouple the different tasks to achieve a ....

Catthoor, F., F. Franssen, S. Wuytack, L. Nachtergaele, and H. de Man: 1994, `Global communication and memory optimizing transformations for low-power signal processing systems'. In: Proceedings of the IEEE Workshop on Signal Processing, La Jolla, CA.


Synthesis of Pipelined Memory Access Controllers for Streamed.. - Park, Diniz (2001)   (4 citations)  (Correct)

....Panda et al. [8,9] refined this approach by defining a time constrained based specification of a centralized scheduler for handling external memory operations. Catthoor, Balasa et al. developed and evaluated memory optimizations for embedded systems for a particular application set [1,2,6]. This research focuses on optimizations to minimize memory area and power consumption. Catthoor also proposed a data packing scheme to reduce memory bandwidth requirements for dynamic data structure. Wuytack et al. suggested minimizing memory bandwidth requirements [14] by mapping highly accessed ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. DeMan "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla, Calif., Oct. 1994.


Memory Design And Exploration For Low Power, Embedded Systems - Wen-Tsong Shiue Electrical (2001)   (3 citations)  (Correct)

....time significantly. INTRODUCTION In systems that involve multidimensional streams of signals such as images or video sequences, it has been shown that the majority of the area and power cost is not due to the datapath or the controllers, but due to the global communication and memory interaction [1]. In fact, 50-80% of the power cost in application specific circuits (ASIC) for real-time signal processing is due to memory traffic caused by transfers between the ASIC and the off-chip memories. This implies that with proper design, reduction in the memory related power budget can far exceed the ....

....we focus on the ones that reduce the number of memory accesses and the size of the storage. Examples of such transformations are loop fusion, loop fission, loop interchange etc. Pioneering work on applying loop transformations to reduce power in data dominated applications has been done at IMEC [1]. The next step in our procedure is determining which variables should be assigned to register files and which variables should be assigned (cache size and line size) that best satisfies the system requirements of area, energy and number of cycles. Our memory exploration procedure is an extension ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, H. De Man, "Global Communication and Memory Optimizing Transformations for Low Power Signal Processing Systems", Workshop on VLSI Signal Processing, La Jolla CA, Oct 1994.


Memory Design and Exploration for Low Power Embedded Systems - Shiue, Chakrabarti (2001)   (3 citations)  (Correct)

....misses is reduced. In [Dutta et al. 1998], Dutta, Wolf, and Wolfe have studied memory system design for video processors. The performance metrics of their system are area, cycle time and utilization. Pioneering work in the area of memory management for low power has been done at IMEC [Catthoor et al. 1994; Catthoor et al. 1998]. The procedure is comprehensive and consists of global transformations to increase the locality and regularity of data accesses, systematic method for data reuse, and memory allocation and assignment that meets the timing constraints with as cheap as possible memory ....

CATTHOOR, F., FRANSSEN, F., WUYTACK, S., NACHTERGAELE, L., MAN, H. DE. 1994. Global communication and memory optimizing transformations for low power signal processing systems. Workshop on VLSI Signal Processing (La Jolla, CA, Oct).


Coupling-Driven Signal Encoding Scheme for Low-Power.. - Kim, Baek, Shanbhag.. (2000)   (4 citations)  (Correct)

....achievable transition activity have been derived for noiseless buses in [12] and for noisy buses in [5]. In [17], a segmentation method was introduced to reduce power consumption. Specification transformation approaches were used to reduce the number of memory accesses at the behavioral level [2]. The effectiveness of various encoding schemes was compared at the system level in [4]. Most of the previous bus encoding schemes were designed to minimize transition activities on each signal line as if each line were isolated from neighboring lines, hence ignoring coupling effects. Such an ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. D. Man. Global communication and memory optimizing transformations for low power signal processing systems. In VLSI Signal Processing VII, pages 178--187, 1994.


Memory Binding for Performance Optimization of.. - Khouri.. (1999)   (1 citation)  (Correct)

....towards reducing the area of memory have been proposed. Techniques to estimate the memory required to implement a functional specification have been proposed in [2, 3]. Transformations to reorganize the loops and conditionals in the behavior to reduce the required memory have been presented in [4]. Past work in storage binding has considered the mapping of scalar variables to registers, register files and small memories [5, 6]. Some high level synthesis systems map arrays in the behavior into a single, monolithic memory [7, 8, 9, 10]. This approach has the advantage of being similar in ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global communication and memory optimizing transformations for low power signal processing systems," in Proc. Int. Wkshp. Low Power Design, pp. 51--56, 1994.


Memory Design And Exploration For Low Power, Embedded Systems - Shiue, Chakrabarti (2001)   (3 citations)  (Correct)

....time significantly. INTRODUCTION In systems that involve multidimensional streams of signals such as images or video sequences, it has been shown that the majority of the area and power cost is not due to the datapath or the controllers, but due to the global communication and memory interaction [1]. In fact, 50-80% of the power cost in application specific circuits (ASIC) for real-time signal processing is due to memory traffic caused by transfers between the ASIC and the off-chip memories. This implies that with proper design, reduction in the memory related power budget can far exceed the ....

....we focus on the ones that reduce the number of memory accesses and the size of the storage. Examples of such transformations are loop fusion, loop fission, loop interchange etc. Pioneering work on applying loop transformations to reduce power in data dominated applications has been done at IMEC [1]. The next step in our procedure is determining which variables should be assigned to register files and which variables should be assigned to on chip cache. This is done by an LP based procedure which tries to maximize the number of accesses to the register file (which is equivalent to minimizing ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, H. De Man, "Global Communication and Memory Optimizing Transformations for Low Power Signal Processing Systems", Workshop on VLSI Signal Processing, La Jolla CA, Oct 1994.


Software Design For Low Power - Roy, Johnson (1996)   (10 citations)  (Correct)

....of a system. Unfavorable access patterns (or a cache that is too small) will lead to cache misses and a lot of costly memory accesses. In multi-dimensional signal processing algorithms, the order and nesting of loops can alter memory size and bandwidth requirements by orders of magnitude [2]. Compact machine code decreases memory activity by reducing the number of instructions to be fetched and reducing the probability of cache misses [11]. Cache accesses are more energy efficient than main memory accesses. The cache is closer to the CPU than is main memory, resulting in shorter and ....

....loads instead of single word loads as much as possible. For multi dimensional signal processing applications, the nesting of loops controlling array operations and the order of operations can substantially influence the number of memory transfers and the total storage requirements. Catthoor et al. [2] presented three simple examples, representative of some typical signal processing loop structures, that illustrate the impact of loop nesting and operation ordering. These examples are reproduced below along with brief discussions. In each case, M and N are large integer values. Example 1: ....

[Article contains additional citation context not shown here]

Catthoor, F., Franssen, F., Wuytack, S., Nachtergaele, L., and DeMan, H. (1994). Global communication and memory optimizing transformations for low power signal processing systems. In Proceedings, IEEE Workshop on VLSI Signal Processing, 178--187.


Reconfiguration For Power Saving In Real-Time Motion Estimation - Park And (1997)   (3 citations)  (Correct)

....[Figure 6: Power vs. search area (p)] 4.2. Resource Reuse In principle, a search area is larger than a current block. Therefore overlapped areas exist among the adjacent search areas. These overlapped areas increase the memory traffic for a previous frame as a search area increases [7]. In reconfigurable motion estimation, unused hardware resources or returning resources by reducing the search area can be used as local memory, resulting in further power saving. Fig. 7 shows the overlapped areas in adjacent search areas. Unused resources or returning resources from reducing a ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global communication and memory optimizing transformations for low power signal processing systems," Proc. IEEE workshop on VLSI signal processing, La Jolla, CA, Oct. 1994.


Low-Power Encodings for Global Communication in CMOS VLSI - Stan, Burleson (1997)   (27 citations)  (Correct)

....this paper is founded, represent the first time a comprehensive methodology for low power I/O at several levels of abstraction is proposed, of a larger scale than any of the isolated results previously published. Encoding address buses for low power was studied by Su et al. [37], Wuytack et al. [41], and Stan and Burleson [33]. Generally, an address bus tends to have a sequential behavior and a Gray code is optimal in such a case. With only one transition per cycle, or 1/K (0.125 for an 8-bit bus) transitions per bus line, it represents a big improvement (45% for an 8-bit bus) over the ....

S. Wuytack, F. Catthoor, F. Franssen, L. Nachtergaele, and H. D. Man. Global communication and memory optimizing transformations for low power systems. In International Workshop on Low Power Design, pages 203--208, Napa, CA, Apr. 1994.
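
The Gray-code remark in the Stan and Burleson excerpt above can be checked with a small sketch. It assumes a purely sequential address stream on an 8-bit bus and the standard reflected binary Gray code; the function names are illustrative and do not come from the cited papers.

```python
def to_gray(n: int) -> int:
    """Convert a binary number to its reflected Gray code."""
    return n ^ (n >> 1)


def transitions(a: int, b: int) -> int:
    """Number of bus lines that toggle when the bus value changes from a to b."""
    return bin(a ^ b).count("1")


def average_transitions(codes):
    """Average number of toggling lines per cycle over a sequence of bus values."""
    steps = list(zip(codes, codes[1:]))
    return sum(transitions(a, b) for a, b in steps) / len(steps)


WIDTH = 8  # assumed bus width, matching the 8-bit example in the excerpt
addresses = list(range(2 ** WIDTH))  # assumed purely sequential address stream

binary_avg = average_transitions(addresses)
gray_avg = average_transitions([to_gray(a) for a in addresses])

print("binary: transitions per cycle =", binary_avg)
print("gray:   transitions per cycle =", gray_avg)
print("gray:   transitions per bus line =", gray_avg / WIDTH)
```

Under these assumptions every Gray-coded step toggles exactly one line, which is the "one transition per cycle, or 1/K per bus line" figure quoted in the excerpt. Real address streams contain jumps, so measured savings would differ.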


High-Level Synthesis Techniques for Reducing the Activity of.. - Musoll (1995)   (20 citations)  (Correct)

....architectural level to obtain lower power designs. In [6] the power consumption of additions and constant multiplications as a function of the operand activity is studied. From this study, a data flow graph transformation is described for a typical operation in signal processing applications. In [26] some memory transformations for low power systems are hinted. The aim of these transformations is to reduce both the activity of the address lines and the number of off chip references. In [4] the traditional transformations for faster and smaller circuits are applied in order to evaluate the ....

....constants of the algorithm in the scheduling and register binding steps. 4 Loop Interchange The loop interchange technique has been traditionally implemented in compilers to obtain dependency graphs with a higher degree of parallelism or to increase data locality and, thus, reduce memory traffic [26]. We apply loop interchange with the goal of minimizing the number of operand changes on the functional unit inputs. This technique will be applied to the motion estimation algorithm for image compression [17] (Figure 2(a)) to illustrate its efficiency. 4.1 Application of loop interchange In the ....

S. Wuytack, F. Catthoor, F. Franssen, L. Nachtergaele, and H. D. Man. Global communications and memory optimizing transformations for low power. In Proc. Int. Workshop on Low Power Design, pages 203--208, Apr. 1994.


Low-Power Architectural Synthesis and the Impact of.. - Mehra, Guerra, Rabaey (1996)   (12 citations)  (Correct)

....of registers. Since memory management decides whether variables should be stored in registers or background memory, it also influences the size of the register files and their physical capacitance. Previous works have explored the use of algorithm transformations for memory size reduction [13]. Consider the loop shown in Figure 3a. Arrays A and C are already available in memory; when A is consumed another array B is generated; when C is consumed a scalar value, D, is produced. Memory size can be reduced by executing the j loop before the i loop (Figure 3b) so that C is consumed before ....

....the output; array B stores intermediate values. Since only one value of B needs to be alive at a given time the array can be stored in a register eliminating the related memory accesses. Another way to reduce memory accesses is to use loop based transformations as was proposed by Catthoor et al. [13]. For register files, the accesses depend on the architecture model being used. For example, in a single centralized register file scenario, writes are determined by the algorithm (exactly equal to the number of variables) whereas for distributed register files, a single variable may need to be ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global Communication and Memory Optimizing Transformations for Low-Power Signal Processing Systems," VLSI Signal Processing Workshop, Oct. 1994, pp. 178-187.
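
The second Mehra, Guerra, Rabaey excerpt above notes that an intermediate array can live in a register when only one of its values is alive at a time. The sketch below illustrates that idea with a hypothetical two-stage computation (the arithmetic and all names are invented for illustration; the paper's Figure 3 is not reproduced here). `CountingArray` stands in for background memory and counts accesses.

```python
class CountingArray:
    """List wrapper that counts element reads and writes (a stand-in for memory)."""

    def __init__(self, values):
        self.data = list(values)
        self.reads = 0
        self.writes = 0

    def __getitem__(self, i):
        self.reads += 1
        return self.data[i]

    def __setitem__(self, i, value):
        self.writes += 1
        self.data[i] = value


def two_loops(a):
    """Produce all of B in one loop, then consume it in a second loop."""
    n = len(a.data)
    b = CountingArray([0] * n)    # intermediate array held in memory
    out = CountingArray([0] * n)
    for i in range(n):
        b[i] = a[i] * 2           # hypothetical first stage
    for i in range(n):
        out[i] = b[i] + 1         # hypothetical second stage
    return out, b.reads + b.writes


def fused_loop(a):
    """Fuse both stages so each B value is consumed right after it is produced."""
    n = len(a.data)
    out = CountingArray([0] * n)
    for i in range(n):
        b_val = a[i] * 2          # scalar temporary; plays the role of a register
        out[i] = b_val + 1
    return out, 0                 # no memory accesses to B remain


N = 16  # assumed problem size
out_two, b_accesses_two = two_loops(CountingArray(range(N)))
out_fused, b_accesses_fused = fused_loop(CountingArray(range(N)))

print("accesses to B, two loops: ", b_accesses_two)
print("accesses to B, fused loop:", b_accesses_fused)
```

Both versions compute the same output. The fused one simply never touches memory for B, which is the saving the excerpt describes.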


Energy-Delay Efficient Data Storage and Transfer.. - Circuit Technology Versus   Self-citation (Catthoor)   (Correct)

No context found.

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla CA, Oct. 1994.


Data and Memory Optimization Techniques for Embedded.. - Panda, Catthoor, Dutt, .. (2001)   (14 citations)  Self-citation (Catthoor)   (Correct)

No context found.

CATTHOOR, F., FRANSSEN, F., WUYTACK, S., NACHTERGAELE, L., AND DE MAN, H. 1994. Global communication and memory optimizing transformations for low power systems. In Proceedings of the International Workshop on Low Power Design. 203--208.


Code Transformations for Reduced Data Transfer and .. - Brockmeyer.. (1999)   (1 citation)  Self-citation (Catthoor)   (Correct)

No context found.

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


Low Power Memory Storage and Transfer Organization .. - Brockmeyer.. (1999)   (3 citations)  Self-citation (Catthoor Nachtergaele De man)   (Correct)

No context found.

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


System-Level Transformations for Low Power Data.. - Catthoor, Wuytack, .. (1998)   (5 citations)  Self-citation (Catthoor Franssen Wuytack Nachtergaele De man)   (Correct)

No context found.

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", in VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


Global Multimedia System Design Exploration using.. - Vandecappelle.. (1999)   (3 citations)  Self-citation (Catthoor)   (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global communication and memory optimizing transformations for low power signal processing systems. In J. Rabaey, P. Chau, and J. Eldon, editors, VLSI Signal Processing VII, pages 178--187. IEEE Press, New York, 1994.


Formalized Three-Layer System-Level Reuse Model.. - Vermeulen.. (2000)   (1 citation)  Self-citation (Catthoor De man)   (Correct)

....coding, medical image archival, multi media terminals, artificial vision, speech and audio coding, xDSL modems, and wireless LAN modems. In these application domains power management and reduction is becoming a major issue [1, 3, 5, 8]. As demonstrated by recent work at Princeton [9], at IMEC [2], and in the IRAM project at Berkeley, the most important power contribution in such data dominated applications is due to the data storage and transfers, both in custom hardware and programmable processors. Also area and (in the programmable case) performance are heavily impacted by data accesses ....

....[Figure 3. Reuse design flow, identifying potential points of design transfer.] At reuse time, the loop and indexing layer descriptions are used as input for a system level DTSE approach [2, 16], delivering an optimized data transfer and storage solution in the actual implementation context. The transformed behavioral description of the indexing and process control layer is mapped to an RT description using existing behavioral synthesis tools, instantiating the scalar layer entities. ....

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems," VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


Low power data transfer and storage exploration.. - Nachtergaele.. (1997)   Self-citation (Catthoor Nachtergaele)   (Correct)

.... by now that any future complex chip realisation has to take power reduction into account [1]. Our previous research has clearly shown that the dominant power contribution in data dominated designs lies in the data transfer and storage of multi dimensional array signals and other complex data types [2], [3]. In this paper we exploit this feature to achieve large savings in the system power without having to worry about the detailed data path, foreground registers, and controller architecture. The main contributions in this paper will be the evaluation of the applicability and effectiveness of ....

....from the Corporate R&D labs of Texas Instruments Incorporated, Dallas, Texas. S. Janssens was a student from Erasmus Hogeschool and is now with IMEC. D. Moolenaar was a student from Delft Univ. of Technology and is now with IMEC. algorithm. In addition, we have substantiated our earlier claims [2] that the cost of the background storage and related transfers is dominant during the system exploration. This will be shown in section VI by investigating the power in a representative data path in H.263, including its corresponding local memories. In the rest of this paper, we have concentrated ....

[Article contains additional citation context not shown here]

F.Catthoor, F.Franssen, S.Wuytack, L. Nachtergaele, and H. De Man, "Global Communication and Memory Optimizing Transformations for Low Power Signal Processing Systems," in VLSI Signal Processing VII, Jan Rabaey, Paul M. Chau, and John Eldon, Eds., New York, October 1994, IEEE workshop on VLSI signal processing, pp. 178--187, IEEE Press.


Optimization of Memory Organization and Hierarchy.. - Nachtergaele.. (1995)   (4 citations)  Self-citation (Catthoor Franssen Nachtergaele)   (Correct)

.... applied [6]. Our activities have been mostly aimed at application specific architecture styles, but recently also predefined processors (e.g. DSP cores) are envisioned [7]. The cost functions which we currently incorporate for the storage communication resources are both area and power oriented [5]. Due to the real-time nature of the targeted applications, the throughput is normally a constraint. The input of the ATOMIUM environment is a specification of the design written in a Data Flow oriented Language (called DFL) [8]. The output is a netlist of memories and address generators, combined ....

F.Catthoor, F.Franssen, S.Wuytack, L. Nachtergaele, and H. De Man. Global Communication and Memory Optimizing Transformations for Low Power Signal Processing Systems. In Jan Rabaey, Paul M. Chau, and John Eldon, editors, VLSI Signal Processing VII, pages 178--187, New York, October 1994. IEEE workshop on VLSI signal processing, IEEE Press.


Memory Organization for Video Algorithms on.. - De Greef, Catthoor, De .. (1995)   (2 citations)  Self-citation (Catthoor De man)   (Correct)

....literature related to storage, it should be taken into account that data sheets of conventional 1Mbit SRAMs show power budgets between 0.5 and 1.5 W when fully utilized. The main influence is due to the transfer count but also size plays an important role because of increased capacitive loading [4]. Even future low power oriented SRAMs [19] still require about 0.25 W at the required clock rates. This is considerably more than what the data path and controller logic consume for these submicron technologies (on the order of 10-100 mW). Another important source of consumption are the global ....

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla CA, Oct. 1994. Also in VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


Transforming Set Data Types to Power Optimal Data Structures - Sven Wuytack (1996)   (5 citations)  Self-citation (Wuytack Catthoor De man)   (Correct)

....target application domain because power for example is limiting the reliability and the MTBF of advanced network components. For many of these applications the major area and power cost is not involved in the data paths or the controllers but in the global communication and the memory organisation [20, 4]. An experiment with a Segment Protocol Processor demonstrated that the maximal power consumption for the 9 off-chip memories was 6 W (90% dynamic, 10% static power) while the ASIC containing all data paths, controllers, interfaces and local memories consumed about 2 W. So, it is important to ....

....taken into account by an empirical model based on actual current measurements. There are also some memory related power studies [3, 8, 14, 16] but these are oriented to caches in microprocessors and not for custom network components. Our own previous work was situated at the architectural level [20, 4]. 3 The Set Data Structure Model A set of records which are accessed with one or more keys can be represented by many different data structures. All these data structures have different characteristics in terms of memory occupation, number of memory accesses to locate a certain record, power ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", 1994 International Workshop on Low Power Design, Napa Valley CA, pp.203-208, Apr. 1994.


Transforming Set Data Types to Power Optimal Data Structures - Sven Wuytack (1996)   (5 citations)  Self-citation (Catthoor Wuytack De man)   (Correct)

....target application domain because power for example is limiting the reliability and the MTBF of advanced network components. For many of these applications the major area and power cost is not involved in the data paths or the controllers but in the global communication and the memory organisation [20, 4]. An experiment with a Segment Protocol Processor demonstrated that the maximal power consumption for the 9 off-chip memories was 6 W (90% dynamic, 10% static power) while the ASIC containing all data paths, controllers, interfaces and local memories consumed about 2 W. So, it is important to ....

....taken into account by an empirical model based on actual current measurements. There are also some memory related power studies [3, 8, 14, 16] but these are oriented to caches in microprocessors and not for custom network components. Our own previous work was situated at the architectural level [20, 4]. 3 The Set Data Structure Model A set of records which are accessed with one or more keys can be represented by many different data structures. All these data structures have different characteristics in terms of memory occupation, number of memory accesses to locate a certain record, power ....

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla CA, Oct. 1994. Also in VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


System-Level Data-Flow Transformations For Power.. - Catthoor.. (1996)   (1 citation)  Self-citation (Catthoor Nachtergaele De man)   (Correct)

.... by now that any future complex chip realization has to take power reduction into account [13]. Our previous research has clearly shown that the dominant power contribution in data dominated designs lies in the data transfer and storage of multidimensional array signals and other complex data types [3, 14]. In this paper we have exploited this feature to propose a formalized data flow transformation methodology, which can achieve large savings in the system power without having to worry about the detailed data path, foreground registers, and controller architecture. The approach is part of a ....

....P and PB mode, including the overlapped block motion compensation (OBMC) mode which is discussed later. The data flow for the continuous P mode is shown in figure 1. For data intensive applications, such as video decoding, data access to large storage units for MD arrays dominates the power budget [3, 9, 14]. Therefore the primary design goal is to reduce memory transfers between large frame memories and datapath units. The power cost of a data transfer is a function of the memory size, memory type, [page footer: To appear in Proc. ICECS 96, October 13-16, Rhodos, Greece, (c) IEEE] ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", IEEE Int. Wsh. on Low Power Design, Napa CA, pp.203-208, April 1994.


Global Communication and Memory Optimizing.. - Catthoor.. (1994)   (26 citations)  Self-citation (Wuytack Catthoor Franssen Nachtergaele De man)   (Correct)

.... the system power) can become an order of magnitude, even without incorporating the effect of reducing the supply voltage in newer technologies. The same applies for table-based communication subsystems where one experiment demonstrated that the power consumption for the 9 off-chip memories was 6 W [20]. This includes the contributions from the 8-bit data communication and the 19-bit address transfers which are both important. This memory power budget should be compared to the Sigma2 W for the ASIC containing all the data paths, controllers, interfaces and local memories. It has to be stressed ....

....and memory optimizing system architecture transformation methods and automatable techniques to support these. First, a brief overview is provided with the state of the art on these issues. Next, specific activities on memory power reduction are highlighted, summarizing our previous findings [20]. Finally, our main contribution in this paper is discussed, namely the application of such techniques to an industrial test vehicle. 2. Survey on memory and power oriented transformations Steering of data processing oriented transformations at the algorithmic or system level is in general not ....

[Article contains additional citation context not shown here]

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", accepted for IEEE Intnl. Workshop on Low Power Design, Napa CA, April 1994.


Hierarchy Exploration in High Level Memory Management - Diguet, Wuytack, Catthoor.. (1997)   (1 citation)  Self-citation (Wuytack Catthoor De man)   (Correct)

....Power savings can be obtained by accessing heavily used data from smaller memories instead of from large background memories. Such an optimization requires architectural transformations that consist of adding layers of smaller and smaller memories to which frequently used data will be copied [20]. So, memory hierarchy optimization has to introduce copies of data from larger to smaller memories in the Data Flow Graph (DFG). This means that there is a trade-off involved here: on the one hand, power consumption is decreased because data is now read mostly from smaller memories, while on the ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", IEEE Intnl. Workshop on Low Power Design, Napa CA, pp.203-208, April 1994.


Transforming Set Data Types to Power Optimal Data Structures - Sven Wuytack (1996)   (5 citations)  Self-citation (Wuytack Catthoor De man)   (Correct)

....on the algorithm architecture level [5] are very useful for our research, though. There are also some memory related power studies [3, 8] but these are oriented to caches in microprocessors and not for custom network components. Our own previous work was situated at the architectural level [11]. 3 Set Data Structure Model A set of records which are accessed with one or more keys can be represented by many different data structures. All these data structures have different characteristics in terms of memory occupation, number of memory accesses to locate a certain record, power ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", 1994 Int. Wkshp on Low Power Design, Napa Valley CA, pp.203-208, Apr. 1994.


Design of heterogeneous ICs for mobile and personal.. - Goossens, Bolsens.. (1994)   (5 citations)  Self-citation (Catthoor)   (Correct)

....audio, voice or simple data processing, or employ more complex data structures like video frames or speech frames. In the scalar case, simple buffering by means of registers or small FIFOs is usually sufficient [44, 31]. For more complex cases, communication may require large memory-based buffers [12]. The different components in the heterogeneous IC architecture may use different clocking schemes with respect to each other and with respect to the external environment. Also, the programmable processor core(s) used and the external environment may already have a predefined protocol scheme. ....

....or FIFOs [44, 31]. In more complex algorithms, including video applications, data exchange requires large memory-based storage. In this case the memory size and number of memory transfers can often be reduced by transforming the original specification, which reduces the area and power dissipation [55, 12, 4]. In addition to synthesising the storage architecture, the mapping of the data transfers on to the actual storage locations and busses must be performed. In the Cathedral environment, memory synthesis techniques have been developed. These techniques have been applied to the mobile terminal ....

F. Catthoor et al., "Global communication and memory optimizing transformations for low power signal processing systems", IEEE Workshop VLSI Signal Proc., La Jolla, Oct. 1994. To be published in: Proc. IEEE Int. Conf. on Comp.-Aided Design, San Jose, November 1994. Copyright ACM 1994.


Power Exploration for Dynamic Data Types through.. - Silva, Jr.. (1998)   (7 citations)  Self-citation (Catthoor)   (Correct)

....software. In the embedded processor design for our target domain, a large part of the area is due to memory units [21]. Also the power for such data dominated applications is heavily dominated by the storage and transfers (as demonstrated by recent work in the IRAM project at Berkeley, at IMEC [4] and at Princeton [20]). Given the data storage and transfers importance, we propose a systematic design methodology in which the dynamic storage related issues are globally optimized as a first step, before doing the software hardware or processor partitioning and the detailed compilation on an ....

F.Catthoor, et al., "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla CA, Oct. 1994. Also in VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


Co-Design Of DSP Systems - De Man, Bolsens, Lin, Van Rompaey.. (1996)   (5 citations)  Self-citation (De man)   (Correct)

....internally over shared memory. Interprocess communication is over unidirectional channels using an RPC protocol. 2. Box 3 in Figure 3 contains four consecutive steps. First, for array intensive DSP code optimisation takes place to optimise memory and power dissipation in memory accesses [1, 6, 15, 25, 26, 42]. This is followed by an interactive coarse partitioning of the specifications over a user allocated architecture. This leads to a merger of a number of compiler consistent processes to be mapped on the same component. The implication is that hardware and software (run time kernels, software ....

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global communication and memory optimizing transformations for low power signal processing systems. In J. Rabaey, P. Chau, and J. Eldon, editors, VLSI Signal Processing, VII, pages 178--188. IEEE Press, New York, NY, 1994.


Power Exploration for Data Dominated Video Applications - Wuytack, Catthoor.. (1996)   (7 citations)  Self-citation (Wuytack Catthoor Nachtergaele De man)   (Correct)

....and required memory storage as much as possible. Our power exploration methodology is based on the observation that in this type of data dominated applications, the system power consumption is dominated by the power consumed in the transfers and storage related to the main memory organisation [21]. So, the first stage in our power exploration methodology, is to come up with an optimized memory architecture. The derivation of an optimal memory architecture is done in a number of steps. The first step is the optimization of the control flow to increase the regularity and locality in the ....

....should be stored in a more centralized way and not fully distributed over a huge amount of local registers. This storage organisation then becomes the bottleneck. As we have shown earlier, in principle much power can be gained by reducing the number of accesses to large frames or buffers [21]. Also other groups have made similar observations [13] for video applications. Up to now however no systematic approach has been published to target this important field. Indeed, most effort up to now has been spent, either on datapath oriented work (e.g. [3]), on control dominated logic or on ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", IEEE Intnl. Workshop on Low Power Design, Napa CA, pp.203-208, April 1994.


Transforming Set Data Types to Power Optimal Data Structures - Sven Wuytack (1996)   (5 citations)  Self-citation (Wuytack Catthoor De man)   (Correct)

....on the algorithm architecture level [7, 8] are very useful for our research, though. There are also some memory related power studies [3, 11] but these are oriented to caches in microprocessors and not for custom network components. Our own previous work was situated at the architectural level [16]. 3 The Set Data Structure Model A set of records which are accessed with one or more keys can be represented by many different data structures. All these data structures have different characteristics in terms of memory occupation, number of memory accesses to locate a certain record, power ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", 1994 International Workshop on Low Power Design, Napa Valley CA, pp.203-208, Apr. 1994.


Low power storage exploration for H.263 video decoder - Nachtergaele, Catthoor..   Self-citation (Catthoor Nachtergaele)   (Correct)

.... by now that any future complex chip realization has to take power reduction into account [15]. Our previous research has clearly shown the dominant power contribution in data dominated designs lies in the data transfer and storage of multi dimensional array signals and other complex data types [3, 18]. In this paper we have exploited this feature to achieve large savings in the system power without having to worry about the detailed data path, foreground registers, and controller architecture. This work was supported in part by Texas Instruments Incorporated, Dallas, Texas. † Professor at ....

....and effectiveness of our methodology for data dominated applications [13] a study of the effect of the possible optimizations, and the application of the most promising alternatives in the correct sequence on the H263 decoder algorithm. In addition, we have substantiated our earlier claims [18] that the cost of the background storage and related transfers is dominant during the system exploration. This will be shown in Section 6 by investigating the power in a representative data path in H.263, including its corresponding local memories. In the rest of this paper, we have concentrated ....

S.Wuytack, F.Catthoor, F.Franssen, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power systems", IEEE Intnl. Workshop on Low Power Design, Napa CA, pp.203-208, April 1994.


Low power storage exploration for H.263 video decoder - Nachtergaele, Catthoor..   Self-citation (Catthoor Nachtergaele)   (Correct)

.... by now that any future complex chip realization has to take power reduction into account [15]. Our previous research has clearly shown the dominant power contribution in data dominated designs lies in the data transfer and storage of multi dimensional array signals and other complex data types [3, 18]. In this paper we have exploited this feature to achieve large savings in the system power without having to worry about the detailed data path, foreground registers, and controller architecture. This work was supported in part by Texas Instruments Incorporated, Dallas, Texas. † Professor at ....

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla CA, Oct. 1994. Also in VLSI Signal Processing VII, J.Rabaey, P.Chau, J.Eldon (eds.), IEEE Press, New York, pp.178-187, 1994.


Automatic Synthesis of Customized Local Memories for - Multicluster Application..   (Correct)

No context found.

F. Catthoor et al. Global communication and memory optimizing transformations for low power signal processing systems. In Proc. of Int. Workshop on Low Power Design, pages 51--56, 1994.


Behavioral Level Guidance Using Property-Based Design.. - Lisa Marie Guerra (1996)   (1 citation)  (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global communication and memory optimizing transformations for low power signal processing systems," IEEE Workshop on VLSI Signal Processing, VII, pp. 178-187, 1994.


Power Optimization and Management in Embedded Systems - Massoud Pedram University (2001)   (2 citations)  (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man, "Global communication and memory optimizing transformations for low power signal processing systems," Proc. Int. Wkshp. on Low Power Design, pages 203-208, Apr. 1994.


Energy Estimation for Piecewise Regular Processor Arrays - Hannig, Teich (2002)   (Correct)

No context found.

Francky Catthoor, Frank Franssen, Sven Wuytack, Lode Nachtergaele, and Hugo De Man. Global Communication and Memory Optimizing Transformations for Low Power Systems. In VLSI Signal Processing Workshop, pages 178--187, October 1994.


Energy Estimation of Nested Loop Programs - Frank Hannig Hannig (2002)   (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global Communication and Memory Optimizing Transformations for Low Power Systems. In VLSI Signal Processing Workshop, pages 178--187, Oct. 1994.


Automatic Synthesis of Customized Local Memories for.. - Kudlur, Fan, Chu, Mahlke   (Correct)

No context found.

F. Catthoor et al. Global communication and memory optimizing transformations for low power signal processing systems. In Proc. of Int. Workshop on Low Power Design, pages 51--56, 1994.


Reducing Conflict Misses in Caches - de Langen, Juurlink (2003)   (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global Communication and Memory Optimizing Transformations for Low-Power Signal Processing Systems. In VLSI Signal Processing Workshop, 1994.


A Flexible Simulator for Exploring Hardware Rasterizers - Antochi, Juurlink.. (2002)   (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global Communication and Memory Optimizing Transformations for Low-Power Signal Processing Systems. In VLSI Signal Processing Workshop, 1994.


Off-Chip Memory Traffic Measurements of Low-Power Embedded.. - de Langen, Juurlink (2002)   (Correct)

No context found.

F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global Communication and Memory Optimizing Transformations for Low-Power Signal Processing Systems. In VLSI Signal Processing Workshop, 1994.


System Specification and Storage Architecture Exploration for.. - Moolenaar (1996)   (1 citation)  (Correct)

No context found.

F.Catthoor, F.Franssen, S.Wuytack, L.Nachtergaele, H.De Man, "Global communication and memory optimizing transformations for low power signal processing systems", IEEE workshop on VLSI signal processing, La Jolla CA, Oct. 1994.


Low-Power Design of Page-Based Intelligent Memory - Oskin, Chong, Farooqui..   (Correct)

No context found.

S. Wuytack, F. Catthoor, F. Franssen, L. Nachtergaele, and H. D. Man. Global communication and memory optimizing transformations for low power systems. In Proc. Int. Workshop Low Power Design, Napa Valley, California, April 1994.