Enhanced Code Compression for Embedded RISC Processors
, 1999
Cited by 89 (2 self)
This paper explores compiler techniques for reducing the memory needed to load and run program executables. In embedded systems, where economic incentives to reduce both RAM and ROM are strong, the size of compiled code is increasingly important. Similarly, in mobile and network computing, the need to transmit an executable before running it places a premium on code size. Our work focuses on reducing the size of a program's code segment, using pattern-matching techniques to identify and coalesce repeated instruction sequences. In contrast to other methods, our framework preserves the ability to run program executables directly, without an intervening decompression stage. Our compression framework is integrated into an industrial-strength optimizing compiler, which allows us to explore the interaction between code compression and classical code optimization techniques, and requires that we contend with the difficulties of compressing previously optimized code. The specific contributions in this paper include a comprehensive experimental evaluation of code compression for a RISC-like architecture, a more powerful pattern-matching scheme for improved identification of repeated code fragments, and a new form of profile-driven code compression that reduces the speed penalty arising from compression.
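As a rough illustration of the idea of identifying repeated instruction sequences, the sketch below indexes every fixed-length window of an instruction list and keeps the windows that occur more than once. It assumes exact textual matching on hypothetical mnemonic strings and is not the paper's pattern-matching scheme; the function name and the sample instructions are made up for the example.

```python
from collections import defaultdict

def repeated_sequences(instructions, length):
    """Map each instruction window of the given length to its start indices,
    keeping only windows that occur at more than one position."""
    positions = defaultdict(list)
    for i in range(len(instructions) - length + 1):
        positions[tuple(instructions[i:i + length])].append(i)
    # Note: occurrences may overlap; a real compressor would have to resolve that.
    return {seq: idx for seq, idx in positions.items() if len(idx) > 1}

# Hypothetical instruction stream with one repeated three-instruction fragment.
code = ["ld", "add", "st", "ld", "add", "st", "br"]
print(repeated_sequences(code, 3))
```

Each repeated window is a candidate for being replaced by a call to a single shared copy, which is where the size saving would come from.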
Memory Bank and Register Allocation in Software Synthesis for ASIPs
- Intern. Conf. on Computer-Aided Design (ICCAD)
, 1995
Cited by 41 (2 self)
An architectural feature commonly found in digital signal processors (DSPs) is multiple data-memory banks. This feature increases memory bandwidth by permitting multiple memory accesses to occur in parallel when the referenced variables belong to different memory banks and the registers involved are allocated according to a strict set of conditions. Unfortunately, current compiler technology is unable to take advantage of the potential increase in parallelism offered by such architectures. Consequently, most application software for DSP systems is hand-written, a very time-consuming task. We present an algorithm which attempts to maximize the benefit of this architectural feature. While previous approaches have decoupled the phases of register allocation and memory bank assignment, our algorithm performs these two phases simultaneously. Experimental results demonstrate that our algorithm substantially improves the code quality of many compiler-generated and even hand-written programs.
Retargetable code generation based on structural processor descriptions
- In Design Automation for Embedded Systems
, 1998
Cited by 41 (4 self)
Design automation for embedded systems comprising both hardware and software components demands code generators integrated into electronic CAD systems. These code generators provide the necessary link between software synthesis tools in HW/SW codesign systems and embedded processors. General-purpose compilers for standard processors are often insufficient, because they do not provide flexibility with respect to different target processors and also suffer from inferior code quality. While recent research on code generation for embedded processors has primarily focussed on code quality issues, in this contribution we emphasize the importance of retargetability, and we describe an approach to achieve retargetability. We propose the use of uniform, external target processor models in code generation, which describe embedded processors by means of RT-level netlists. Such structural models incorporate more hardware details than purely behavioral models, thereby permitting a close link to hardware design tools and fast adaptation to different target processors. The MSSQ compiler, which is part of the MIMOLA hardware design system, operates on structural models. We describe input formats, central data structures, and code generation techniques in MSSQ. The compiler has been successfully retargeted to a number of real-life processors, which proves the feasibility of our approach with respect to retargetability. We discuss capabilities and limitations of MSSQ, and identify possible areas of improvement.
Memory Data Organization for Improved Cache Performance in Embedded Processor Applications
- ACM Transactions on Design Automation of Electronic Systems
, 1996
Cited by 39 (3 self)
Introduction. Embedded microprocessors are a common feature in modern electronic systems due to the advantages they offer in terms of flexibility, reduction in design time, and full-custom layout quality [Marwedel and Goossens 1995]. Embedded processors commonly used in the market today can be classified into two categories: application-specific processors, such as those in the DSP domain (e.g., TMS320 series from Texas Instruments), and general-purpose processors (e.g., the CW4000 series from LSI Logic and the ARM series from Advanced RISC Machines). In this article, we concentrate on ...
This work was partially supported by grants from ARPA (MDA904-96-C-1472), NSF (CDA9422095), and ONR (N00014-93-1-1348). Authors' address: Department of Information and Computer Science, University of California, Irvine, CA 92697; email: ppanda@ics.uci.edu.
Analysis and Evaluation of Address Arithmetic Capabilities in Custom DSP Architectures
- Design Automation Conference (DAC)
, 1997
Cited by 34 (0 self)
Many application-specific architectures provide indirect addressing modes with auto-increment/decrement arithmetic. Since these architectures generally do not feature an indexed addressing mode, stack-allocated variables must be accessed by allocating address registers and performing address arithmetic. Subsuming address arithmetic into auto-increment/decrement arithmetic improves both the performance and size of the generated code. Our objective in this paper is to provide a method for comprehensively analyzing the performance benefits and hardware cost due to an auto-increment/decrement feature that varies from -l to +l, and allowing access to k address registers in an address generator. We provide this method via a parameterizable optimization algorithm that operates on a procedure-wise basis. Hence, the optimization techniques in a compiler can be used not only to generate efficient or compact code, but also to help the designer of a custom DSP architecture make decisions on ad...
A Uniform Optimization Technique for Offset Assignment Problems
- 11th Int. Symp. on System Synthesis (ISSS)
, 1998
Cited by 28 (7 self)
A number of different algorithms for optimized offset assignment in DSP code generation have been developed recently. These algorithms aim at constructing a layout of local variables in memory, such that the addresses of variables can be computed efficiently in most cases. This is achieved by maximizing the use of auto-increment operations on address registers. However, the algorithms published in previous work only consider special cases of offset assignment problems, characterized by fixed parameters such as register file sizes and auto-increment ranges. In contrast, this paper presents a genetic optimization technique capable of simultaneously handling arbitrary register file sizes and auto-increment ranges. Moreover, this technique is the first that integrates the allocation of modify registers into offset assignment. Experimental evaluation indicates a significant improvement in the quality of constructed offset assignments, as compared to previous work.
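To make the offset assignment objective concrete, here is a minimal sketch of a cost function for the simplest case: a single address register and a symmetric auto-increment range. It counts the accesses that need an explicit address update because the next variable lies outside the auto-increment range. The names `address_cost`, `best_layout`, and `inc_range` are illustrative assumptions, and the exhaustive search stands in for the genetic optimization the paper describes; it is only usable for a handful of variables.

```python
from itertools import permutations

def address_cost(layout, accesses, inc_range=1):
    """Count explicit address-register updates for one address register.

    layout: variables in memory order; accesses: the variable access sequence.
    A transition is free when the two offsets differ by at most inc_range.
    """
    offset = {var: i for i, var in enumerate(layout)}
    return sum(
        1
        for prev, cur in zip(accesses, accesses[1:])
        if abs(offset[cur] - offset[prev]) > inc_range
    )

def best_layout(accesses, inc_range=1):
    """Exhaustive search over all layouts (assumed tiny problem sizes only)."""
    variables = sorted(set(accesses))
    return min(
        permutations(variables),
        key=lambda layout: address_cost(layout, accesses, inc_range),
    )

# Hypothetical access sequence in which variable "a" is touched between all others.
sequence = ["a", "b", "a", "c", "a", "d"]
layout = best_layout(sequence)
print(layout, address_cost(layout, sequence))
```

Widening `inc_range` or adding address registers enlarges the set of free transitions, which is the kind of parameter variation the abstract says earlier algorithms fixed in advance.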
Software Synthesis and Code Generation for Signal Processing Systems
- PHILOSOPHY OF SCIENCE
, 1999
Cited by 19 (4 self)
The role of software is becoming increasingly important in the implementation of DSP applications. As this trend intensifies, and the complexity of applications escalates, we are seeing an increased need for automated tools to aid in the development of DSP software. This paper reviews the state of the art in programming language and compiler technology for DSP software implementation. In particular, we review techniques for high level, block-diagram-based modeling of DSP applications; the translation of block diagram specifications into efficient C programs using global, target-independent optimization techniques; and the compilation of C programs into streamlined machine code for programmable DSP processors, using architecture-specific and retargetable back-end optimizations. In our review, we also point out some important directions for further investigation.
Function Inlining under Code Size Constraints for Embedded Processors
- In ICCAD ’99: Proceedings of the 1999 IEEE/ACM international conference on Computer-aided design
, 1999
Cited by 17 (2 self)
Function inlining is a compiler optimization that generally increases performance at the expense of larger code size. However, current inlining techniques do not meet the special demands in the design of embedded systems, since they are based on simple heuristics, and they generate code of unpredictable size. This paper presents a novel approach to function inlining in C compilers for embedded processors, which aims at maximum program speedup under a global limit on code size. The core of this approach is a branch-and-bound algorithm which allows the large search space to be explored quickly. In an application study we show how this algorithm can be applied to maximize the execution speed of an application under a given code size constraint.
1 Introduction. For embedded systems based on programmable processors, C compilers play an important role in the system design process. While assembly-level programming of embedded processors has been common for quite some time, using C compilers for ...
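The flavor of a branch-and-bound search under a code size limit can be sketched with a deliberately simplified model. The sketch assumes each inlining candidate has an independent, positive code size growth and an independent speed gain, which turns the problem into a 0/1 knapsack; the paper's own cost model is not given in the abstract and is likely richer, since inlining one function can change the size of others. All names and numbers below are hypothetical.

```python
def select_inlines(candidates, size_budget):
    """candidates: list of (name, size_growth, speed_gain) with size_growth > 0.
    Returns (best_total_gain, sorted list of chosen names)."""
    # Sort by gain per unit of growth so the fractional bound below is valid.
    items = sorted(candidates, key=lambda c: c[2] / c[1], reverse=True)
    best = [0, []]

    def bound(i, room, gain):
        # Optimistic estimate: fill the remaining room greedily,
        # allowing a fraction of the first candidate that does not fit.
        for _, size, g in items[i:]:
            if size <= room:
                room -= size
                gain += g
            else:
                return gain + g * room / size
        return gain

    def search(i, room, gain, chosen):
        if gain > best[0]:
            best[0], best[1] = gain, list(chosen)
        # Prune when no completion of this branch can beat the incumbent.
        if i == len(items) or bound(i, room, gain) <= best[0]:
            return
        name, size, g = items[i]
        if size <= room:
            chosen.append(name)
            search(i + 1, room - size, gain + g, chosen)
            chosen.pop()
        search(i + 1, room, gain, chosen)

    search(0, size_budget, 0, [])
    return best[0], sorted(best[1])

# Hypothetical candidates: (function, code size growth, estimated cycles saved).
print(select_inlines([("f", 10, 60), ("g", 20, 100), ("h", 30, 120)], 50))
```

The bound prunes any branch whose optimistic gain cannot exceed the best selection found so far, which is what keeps the search from enumerating every subset.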
C Compiler Design for an Industrial Network Processor
- ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES)
, 2001
Cited by 16 (0 self)
One important problem in code generation for embedded processors is the design of efficient compilers for ASIPs with application-specific architectures. This paper outlines the design of a C compiler for an industrial ASIP for telecom applications. The target ASIP is a network processor with special instructions for bit-level access to data registers, which is required for packet-oriented communication protocol processing. From a practical viewpoint, we describe the main challenges in exploiting these application-specific features in a C compiler, and we show how a compiler backend has been designed that accommodates these features by means of compiler intrinsics and a dedicated register allocator. The compiler is fully operational, and first experimental results indicate that C-level programming of the ASIP leads to good code quality without the need for time-consuming assembly programming.
Optimized array index computation in DSP programs
- In Proceedings of the ASP-DAC. IEEE
, 1998
Cited by 13 (2 self)
An increasing number of components in embedded systems are implemented by software running on embedded processors. This trend creates a need for compilers for embedded processors capable of generating high quality machine code. Particularly for DSPs, such compilers are hardly available, and novel DSP-specific code optimization techniques are required. In this paper we focus on efficient address computation for array accesses in loops. Based on previous work, we present a new and optimal algorithm for address register allocation and provide an experimental evaluation of different algorithms. Furthermore, an efficient and close-to-optimum heuristic is proposed for large problems.

