S.Y. Liao, "Code Generation and Optimization for Embedded Digital Signal Processors," PhD thesis, Dept. of Electrical Eng. and Computer Science, Massachusetts Inst. of Technology, Cambridge, Mass., June 1996.
....representation must consist of tree structures. Some authors therefore proposed pattern matching algorithms that directly support DAG structures. In [51] a code selection algorithm was presented that can generate optimal vertical code for DAGs, on a processor with only a single register. In [61] this algorithm has been further refined to support commutative operations and multiregister architectures similar to the TMS320C25 processor. 4) Bundling: The code selection techniques described hitherto rely on the availability of a template pattern base, possibly in the form of a regular tree ....
S. Liao, "Code generation and optimization for embedded digital signal processors," Ph.D. dissertation, MIT, June 1996.
....that do not interfere can be coalesced during CSOA. We show that variable coalescing can lead to a large improvement in code quality (66.5% fewer update instructions) when comparing to the best algorithm in OffsetStone [11]. This result dismisses the first assumptions to this problem, as in Liao [15], that seemed to indicate the opposite. Moreover, as a side effect of CSOA, we also show that the proposed algorithm considerably reduces the final stack size to 28.7%, when comparing to other approaches that do not perform coalescing. The remainder of this paper is organized as follows. Section 2 ....
S. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, Massachusetts Institute of Technology, 1996.
....energy consumption of different parts of a computing system, however, remains largely unstudied. This study is important because these optimizations are becoming popular in power aware systems, keeping pace with the increased use of high level languages and compilation techniques on these systems [29]. Through a detailed analysis of the energy variations brought by these techniques, architects can see which components are energy hotspots and develop suitable architectural solutions to account for the influence of these optimizations. Our expectation is that most compiler optimizations (in ....
S. Y. Liao, Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, Dept. of EECS, MIT, Cambridge, Massachusetts, June 1996.
....on energy consumption of different parts of a computing system, however, remains largely unstudied. This study is important because these optimizations are becoming popular in embedded systems, keeping pace with the increased use of high level languages and compilation techniques on these systems [18]. Through a detailed analysis of the energy variations brought about by these techniques, architects can see which components are energy hotspots and develop suitable architectural solutions to account for the influence of these optimizations. Our expectation is that most compiler optimizations ....
S. Y. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, Dept. of EECS, MIT, Cambridge, Massachusetts, June 1996.
....and performance improved between 3 and 9%. However, in some instances, they found that performance could decrease 5 to 27%, which suggests that the effect of procedure abstraction on cache performance needs more study. Mini subroutines: Liao et al. propose a software method for supporting compressed code [Liao95, Liao96]. They find mini subroutines which are common sequences of instructions in the program. Each instance of a mini subroutine is removed from the program and replaced ....
....perl 2132 vortex 2878. Table 3.1: Maximum number of codewords used in baseline compression. Maximum dictionary entry size is 4 instructions. .... compression savings. The short entries contribute to a larger portion of the savings as the size of the dictionary increases. The compression method in [Liao96] cannot take advantage of this since the codewords are the size of single instructions, so single instructions are not compressed. Figure 3.6: Composition of dictionary for ijpeg. Longest dictionary entry is 8 instructions. [x-axis: length of dictionary entry (number of instructions)] ....
S. Liao, Code Generation and Optimization for Embedded Digital Signal Processors, Ph.D. Dissertation, Massachusetts Institute of Technology, June 1996.
....decompression costs. Faster levels typically require faster decompression or, in the limit, a compressed form that can be interpreted directly, without the time or memory costs of a separate decompression step. Though recently there has been an increase in research on program compression e.g. [1, 3, 7, 8, 11, 13, 14, 16, 17, 20, 21, 22, 24, 27], very little of it has focused on methods, sometimes called code compaction methods, that avoid any decompression before execution. These methods produce representations that can be directly executed [7, 8] or interpreted [11, 16, 17, 21, 24] and, as a result, are more limited than schemes that ....
....compression e.g. [1, 3, 7, 8, 11, 13, 14, 16, 17, 20, 21, 22, 24, 27], very little of it has focused on methods, sometimes called code compaction methods, that avoid any decompression before execution. These methods produce representations that can be directly executed [7, 8] or interpreted [11, 16, 17, 21, 24] and, as a result, are more limited than schemes that have the flexibility to decompress before execution. Certain embedded systems supply one of the clearest examples of the need for zero overhead decompression. These systems typically store much of their code in ROM. Competition drives ....
S. Y. Liao. Code generation and optimization for embedded digital signal processors. Ph.D. thesis, MIT (1996).
....the arrays with similar access patterns and cluster them together. To achieve this, we use a graph-based approach which is sketched in Figures 3 and 4. Our approach operates in two steps. The first step (given in Figure 3), which is similar in spirit to the formulation of the offset assignment problem [8] for scalar variable placement, builds a graph (called array relation graph or ARG for short) in which the nodes represent the arrays declared in the program and the weight of an edge (resp. hyper edge) represents the number of times (in cycles) two (resp. multiple) arrays that are incident on the ....
S. Y. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. Ph.D. Thesis, Dept. of EECS, MIT, Cambridge, Massachusetts, June 1996.
....as a post increment or post decrement operation. Experimental results show the effectiveness of our solution. 1 Introduction With the falling cost of microprocessors and the advent of very large scale integration, more and more processing power is being placed in portable electronic devices [5, 8, 9, 12]. Such processors (in particular, fixed point DSPs and micro controllers) can be found, for example in audio, video, and telecommunications equipment and have severely limited amounts of memory for storing code and data, since the area available for ROM and RAM is limited. This renders the ....
....for ROM and RAM is limited. This renders the efficient use of memory area very critical. Since the program code resides in the on chip ROM, the size of the code directly translates into silicon area and hence the cost. The minimization of code size is, therefore, of considerable importance [1, 2, 4, 5, 6, 7, 8, 13, 14, 15, 16], while simultaneously preserving high levels of performance. However, current compilers for fixed point DSPs generate code that is quite inefficient with respect to code size and performance. As a result, most application software is hand written or at least hand optimized, which is a very time ....
S. Y. Liao, Code Generation and Optimization for Embedded Digital Signal Processors, Ph.D. Thesis. MIT, June 1996.
....the generated code is inefficient as far as code size is concerned. An unfortunate consequence of this is that programmers are forced to hand optimize their programs. Compiler optimizations specifically aimed at improving code size will therefore have a significant impact on programmer productivity [4, 5]. DSP processors such as the TI TMS320C5 and embedded micro controllers provide addressing modes with auto increment and auto decrement. This feature allows address arithmetic instructions to be part of other instructions. Thus, it eliminates the need for ....
S. Y. Liao, Code Generation and Optimization for Embedded Digital Signal Processors, Ph.D. Thesis. MIT, June 1996.
....on energy consumption of different parts of a computing system, however, remains largely unstudied. This study is important because these optimizations are becoming popular in embedded systems, keeping pace with the increased use of high level languages and compilation techniques on these systems [18]. Through a detailed analysis of the energy variations brought about by these techniques, architects can see which components are energy hotspots and develop suitable architectural solutions to account for the influence of these optimizations. Our expectation is that most compiler optimizations ....
S. Y. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, Dept. of EECS, MIT, Cambridge, Massachusetts, June 1996.
....is increasingly important as system on a chip designs become popular in the embedded world. Code compression is one technique to reduce program size by applying compression algorithms to native instruction sets. There are many recent publications suggesting new compressed code representations [Araujo98, Benes97, Benes98, Bunda92, Ernst97, Fraser95, Kozuch94, Lefurgy97, Lekatsas98, Liao96, Wolfe92]. However, the increased instruction density has an accompanying performance cost because the instructions must be decompressed before execution. Although some work has addressed the issue of performance for decompression, on the whole, it remains much less studied than size optimizations for the ....
S. Liao, Code Generation and Optimization for Embedded Digital Signal Processors, Ph.D. Dissertation, Massachusetts Institute of Technology, June 1996.
....path covering (MWPC) problem and proved that it is NP-complete. They also showed that the SOA solution could be used to solve the general offset assignment problem (GOA) which handles a fixed number (k) of address registers and proposed efficient heuristic algorithms to solve the two problems. Liao [10] also demonstrated how the offset assignment solution could be extended across basic blocks and applied to an entire procedure. Leupers and Marwedel [9] extended the work done by Liao et al. by proposing a tie breaking heuristic and a variable partitioning strategy in an attempt to reduce the SOA ....
....Each node N i of this extended access sequence contains the access sequence of basic block i. An edge is directed from node Nx to node Ny if and only if an edge is directed from basic block x to basic block y in the CFG. The extended access sequence is analyzed with the framework developed in [10]. The variable access sequence is generated prior to instruction selection and instruction scheduling using a pattern matching framework similar to [2]. In addition to our main goal of decreasing static code size, we also attempt to reduce the dynamic instruction count. We used the above ....
S. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, MIT Department of EECS, Jan. 1996.
....and Wolf [10] describe two methods. First, a system like the system described by Wolf et al. [7, 8] where binary arithmetic encoding is used instead of Huffman encoding. The second system they describe is based on dictionary encoding, where the dictionary can be application specific. Liao [11] detects common code sequences and places them in mini subroutines so that the common code sequences can be replaced by a mini subroutine call. The mini subroutine call has the length of the mini subroutine as parameter so that the mini subroutine does not require a return instruction. This ....
....evaluation stack do not need operand specifiers. 2. An instruction set with both base instructions of the TriMedia architecture as well as superinstructions for frequently occurring patterns of base instructions. This dictionary type of compression is the basis of many code compression systems [11, 13-15]. 3. A constant pool for frequently used large constants (32 bits) and addresses. 4. Short (8 bit offset) and long (16 bit offset) distance relative jumps. 5. The absence of save and restore code of caller and callee saved registers because registers are saved automatically by the interpreter. ....
S. Y. Liao, `Code generation and optimization for embedded digital signal processors', PhD Thesis, Massachusetts Institute of Technology, June 1996.
....code. In addition to code size and performance, the other important constraint on embedded systems is power dissipation. Code that executes more quickly consumes less energy, so that if clock frequency can be lowered while maintaining throughput requirements, power consumption diminishes as well [23]. Hence, we focus our efforts on generating compact and efficient code. 1.3 Compilation for Embedded Processors: In order to guarantee that code density and real time performance constraints are satisfied, system designers typically hand code embedded software in assembly language. Although, ....
....registers. However, they did not demonstrate how their techniques perform on realistic programs. Sudarsanam et al. [29] studied the offset assignment problem in the presence of an auto increment / auto decrement feature that varies from -l to +l, and allowing access to k address registers. Liao [23] demonstrated how the offset assignment solution could be applied to an entire procedure by extending the framework across basic blocks with a control flow formulation. Since the SOA heuristic is used as a core procedure for GOA, this thesis focuses on reducing the cost of the SOA solution. The ....
S. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, MIT Department of EECS, January 1996.
....path covering (MWPC) problem and proved that it is NPcomplete. They also showed that the SOA solution could be used to solve the general offset assignment problem (GOA) which handles a fixed number (k) of address registers and proposed efficient heuristic algorithms to solve the two problems. Liao [6] also demonstrated how the offset assignment solution could be extended across basic blocks and applied to an entire procedure. Leupers and Marwedel [5] extended the work done by Liao et al. by proposing a tie breaking heuristic and a variable partitioning strategy in an attempt to reduce the ....
....Each node N i of this extended access sequence contains the access sequence of basic block i. An edge is directed from node N x to node N y if and only if an edge is directed from basic block x to basic block y in the CFG. The extended access sequence is analyzed with the framework developed in [6]. The variable access sequence is generated prior to instruction selection and instruction scheduling using a pattern matching framework similar to [1]. In addition to our main goal of decreasing static code size, we also attempt to reduce the dynamic instruction count. We used the above ....
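The cost that the simple offset assignment (SOA) problem tries to minimize can be illustrated with a minimal sketch. It assumes one address register, an auto increment / auto decrement range of exactly 1, and that the initial register load is not counted. The function name `soa_cost` and the example variables are hypothetical, and this is not the heuristic from the cited thesis, only the cost measure such heuristics aim to reduce.

```python
def soa_cost(access_sequence, layout):
    """Count explicit address-register updates for one address register.

    Assumption: moving between adjacent memory slots (offset difference
    of 0 or 1) is free thanks to auto increment / auto decrement; any
    larger jump needs one extra update instruction.
    """
    offset = {var: i for i, var in enumerate(layout)}
    cost = 0
    for prev, cur in zip(access_sequence, access_sequence[1:]):
        if abs(offset[cur] - offset[prev]) > 1:
            cost += 1
    return cost


# Hypothetical access sequence of scalar variables in one basic block.
seq = ["a", "b", "c", "a", "d", "b"]
naive = soa_cost(seq, ["a", "b", "c", "d"])     # declaration order
improved = soa_cost(seq, ["c", "a", "d", "b"])  # reordered stack frame
print(naive, improved)
```

Reordering the variables in memory changes how many consecutive accesses land on adjacent slots, which is why the layout is treated as an optimization problem.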
S. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, MIT Department of EECS, Jan. 1996.
....or at the level of native instructions. Sequences of code that are identical, except for the values used, can be bound to the same abstracted function and supplied with arguments for the appropriate values. 2.4.2 Mini subroutines: Liao et al. propose a software method for supporting compressed code [Liao95, Liao96]. They find mini subroutines which are common sequences of instructions in the program. Each instance of a mini subroutine is removed from the program and replaced with a call instruction. The mini subroutine is placed once in the text of the program and ends with a return instruction. ....
....3.7 shows which dictionary entries contribute the most to compression. Dictionary entries with 1 instruction achieve between 46 and 60% of the compression savings. The short entries contribute to a larger portion of the savings as the size of the dictionary increases. The compression method in [Liao96] cannot take advantage of this since the codewords are the size of single instructions, so single instructions are not compressed. Figure 3.6: Composition of dictionary for ijpeg. Longest dictionary entry is 8 instructions. [x-axis: length of dictionary entry (number of instructions)] ....
S. Liao, Code Generation and Optimization for Embedded Digital Signal Processors, Ph.D. Dissertation, Massachusetts Institute of Technology, June 1996.
....representation must consist of tree structures. Some authors therefore proposed pattern matching algorithms that directly support DAG structures. In [51] a code selection algorithm was presented that can generate optimal vertical code for DAGs, on a processor with only a single register. In [61] this algorithm has been further refined to support commutative operations and multi register architectures similar to the TMS320C25 processor. Bundling The code selection techniques described hitherto rely on the availability of a template pattern base, possibly in the form of a regular tree ....
S. Liao, Code generation and optimization for embedded digital signal processors, Ph.D thesis, MIT, June 1996.
....front end, it selects a set of patterns that covers the tree with a minimum total cost; second, it produces assembly code by executing the semantic actions of the selected patterns. In the case of embedded special purpose processors, in contrast, code generation remains a most difficult problem [29, 25, 26, 27]. With many digital signal processors (DSPs) [24] used in real time telecommunication systems, for instance, high level language compilation is currently ruled out by efficiency requirements. In general, especially because of the irregularity of the specialized processor architecture, the case ....
S. Y.-H. Liao. Code generation and optimization for embedded digital signal processors. Ph. D. thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. Cambridge (Massachusetts, USA), 1996.
....codeword can encode an entire group of instructions. In addition, our compression method does not need a LAT mechanism since we patch all branches to use the new instruction addresses in the compressed program. 2.4 Liao et al.: A purely software method of supporting compressed code is proposed in [Liao96]. The author finds mini subroutines which are common sequences of instructions in the program. Each instance of a mini subroutine is removed from the program and replaced with a call instruction. The mini subroutine is placed once in the text of the program and ends with a return instruction. ....
....a return instruction. Mini subroutines are not constrained to basic blocks and may contain branch instructions under restricted conditions. The prime advantage of this compression method is that it requires no hardware support. However, the subroutine call overhead will slow program execution. [Liao96] suggests a hardware modification to support code compression consisting primarily of a call dictionary instruction. This instruction takes two arguments: location and length. Common instruction sequences in the program are saved in a dictionary, and the sequence is replaced in the program with ....
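The dictionary idea described in this context can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the method of [Liao96]: it handles one fixed sequence length, ignores branches, and uses a hypothetical `CALL_DICT location length` mnemonic and hypothetical function names `compress_once` and `expand`.

```python
from collections import Counter


def compress_once(program, length):
    """Replace the most frequent `length`-instruction sequence with dictionary calls.

    Returns (new_program, dictionary_entry); the entry is None when no
    sequence of that length occurs at least twice.
    """
    windows = Counter(
        tuple(program[i:i + length]) for i in range(len(program) - length + 1)
    )
    if not windows:
        return list(program), None
    best, count = windows.most_common(1)[0]
    if count < 2:
        return list(program), None
    out, i = [], 0
    while i < len(program):
        if tuple(program[i:i + length]) == best:
            # Hypothetical call-dictionary instruction: location 0, given length.
            out.append("CALL_DICT 0 %d" % length)
            i += length
        else:
            out.append(program[i])
            i += 1
    return out, list(best)


def expand(compressed, entry):
    """Undo compress_once by inlining the dictionary entry at each call."""
    out = []
    for ins in compressed:
        if ins.startswith("CALL_DICT"):
            out.extend(entry)
        else:
            out.append(ins)
    return out


program = ["ld r1", "add r1,r2", "st r1", "mul r3", "ld r1", "add r1,r2", "st r1"]
compressed, entry = compress_once(program, 3)
print(compressed, entry)
```

Because the call carries the sequence length, the stored sequence needs no return instruction, which matches the location and length arguments mentioned in the context.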
S. Liao, Code Generation and Optimization for Embedded Digital Signal Processors, Dissertation, Massachusetts Institute of Technology, June 1996.
No context found.
S.Y. Liao, "Code Generation and Optimization for Embedded Digital Signal Processors," PhD thesis, Dept. of Electrical Eng. and Computer Science, Massachusetts Inst. of Technology, Cambridge, Mass., June 1996.
No context found.
S. Y. Liao. Code generation and optimization for embedded digital signal processors. Ph.D. Thesis, Dept. of EECS, MIT, Cambridge, Massachusetts, June 1996.
No context found.
S. Liao. Code Generation and Optimization for Embedded Digital Signal Processors. PhD thesis, MIT Department of EECS, January 1996.
No context found.
S. Y.-H. Liao. Code generation and optimization for embedded digital signal processors. Ph. D. thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. Cambridge (Massachusetts, USA), 1996.