A Uniform Optimization Technique for Offset Assignment Problems
- 11th Int. Symp. on System Synthesis (ISSS)
, 1998
Cited by 28 (7 self)
A number of different algorithms for optimized offset assignment in DSP code generation have been developed recently. These algorithms aim at constructing a layout of local variables in memory, such that the addresses of variables can be computed efficiently in most cases. This is achieved by maximizing the use of auto-increment operations on address registers. However, the algorithms published in previous work only consider special cases of offset assignment problems, characterized by fixed parameters such as register file sizes and auto-increment ranges. In contrast, this paper presents a genetic optimization technique capable of simultaneously handling arbitrary register file sizes and auto-increment ranges. Moreover, this technique is the first that integrates the allocation of modify registers into offset assignment. Experimental evaluation indicates a significant improvement in the quality of constructed offset assignments, as compared to previous work.
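The objective named in this abstract, maximizing the use of auto-increment operations, can be illustrated with a minimal sketch. It assumes a single address register and counts every access that the previous access cannot reach within the auto-increment range. The function name `addressing_cost`, the variable names, and the access sequence are hypothetical and are not taken from the paper, and the paper's genetic optimization technique is not reproduced here.

```python
def addressing_cost(layout, access_sequence, auto_range=1):
    """Count accesses that need an explicit address computation.

    layout          -- variables in memory order (index = offset); assumed input
    access_sequence -- order in which the variables are accessed
    auto_range      -- largest offset step the address register covers for free
    """
    offset = {var: i for i, var in enumerate(layout)}
    cost = 0
    for prev, curr in zip(access_sequence, access_sequence[1:]):
        # A step larger than the auto-increment range costs one instruction.
        if abs(offset[curr] - offset[prev]) > auto_range:
            cost += 1
    return cost


# Hypothetical access sequence over four local variables.
accesses = ["a", "b", "c", "a", "d", "b"]

declaration_order = addressing_cost(["a", "b", "c", "d"], accesses)
reordered = addressing_cost(["c", "a", "d", "b"], accesses)
wider_range = addressing_cost(["a", "b", "c", "d"], accesses, auto_range=2)

print(declaration_order, reordered, wider_range)
```

Reordering the variables or widening the auto-increment range changes the count, which is why a layout tuned for one set of parameters need not be good for another.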
Offset Assignment Showdown: Evaluation of DSP Address Code Optimization Algorithms
- In Proceedings of the 12th International Conference on Compiler Construction
, 2003
Cited by 12 (2 self)
Offset assignment is a highly effective DSP address code optimization technique that has been implemented in a number of ANSI C compilers. In this paper we concentrate on a special class of offset assignment problems called "simple offset assignment" (SOA). A number of SOA algorithms have been proposed recently, but experimental results and direct comparisons are still sparse. This makes
An Empirical Comparison of Algorithmic, Instruction, and Architectural Power Prediction Models for High Performance Embedded DSP Processors
- Proc. of the International Symposium on Low Power Electronics and Design
Cited by 12 (0 self)
This paper presents a comparison of statistically derived power prediction models at the algorithmic, instruction, and architectural levels for embedded high performance DSP processors. The approach is general enough to be applied to any embedded DSP processor. Results from 168 power measurements of DSP code show that power can be predicted at instruction and architecture levels with less than 2% error. This result is important for developing a general methodology for power characterization of embedded DSP software since low power is critical to complex DSP applications in many cost sensitive markets.
Address Assignment Combined with Scheduling in DSP Code Generation
- in Proc. 39th Design Automation Conference
, 2002
Cited by 11 (0 self)
One of the important issues in embedded system design is to optimize program code for the microprocessor to be stored in ROM. In this paper, we propose an integrated approach to the DSP address code generation problem for minimizing the number of addressing instructions. Unlike previous work, in which code scheduling and offset assignment are performed sequentially without any interaction between them, our work tightly couples the offset assignment problem with code scheduling so that scheduling can be exploited to minimize addressing instructions more effectively. We accomplish this by developing a fast but accurate two-phase procedure which, for a sequence of code schedules, finds a sequence of memory layouts with minimum addressing instructions. Experimental results with benchmark DSP programs show improvements of 13%-33% in the address code size over Solve-SOA/GOA [7].
Optimal Live Range Merge for Address Register Allocation in Embedded Programs
- In Proceedings of the 10th International Conference on Compiler Construction, CC2001, LNCS 2027
, 2001
Cited by 10 (5 self)
The increasing demand for wireless devices running mobile applications has renewed interest in research on high performance, low power processors that can be programmed using very compact code. One way to achieve this goal is to design specialized processors with short instruction formats and shallow pipelines. Given that it enables such architectural features, indirect addressing is the most used addressing mode in embedded programs. This paper analyzes the problem of allocating address registers to array references in loops using auto-increment addressing mode. It builds on previous work, which is based on a heuristic that merges address register live ranges. We prove, for the first time, that the merge operation is NP-hard in general, and show the existence of an optimal linear-time algorithm, based on dynamic programming, for a special case of the problem.
Array Index Allocation under Register Constraints in DSP Programs
- In 12th Int. Conf. on VLSI Design
, 1999
Cited by 6 (1 self)
Code optimization for digital signal processors (DSPs) has been identified as an important new topic in system-level design of embedded systems. Both DSP processors and algorithms show special characteristics usually not found in general-purpose computing. Since real-time constraints imposed on DSP algorithms demand very high quality machine code, high-level language compilers for DSPs should take these characteristics into account. One important characteristic of DSP algorithms is the iterative pattern of references to array elements within loops. DSPs support efficient address computations for such array accesses by means of dedicated address generation units (AGUs). In this paper, we present a heuristic code optimization technique which, given an AGU with a fixed number of address registers, minimizes the number of instructions needed for address computations in loops.
Address register allocation for arrays in loops of embedded programs
- Microelectronics Journal
, 1009
Cited by 1 (1 self)
Efficient address register allocation has been shown to be a central problem in code generation for processors with restricted addressing modes. This paper extends previous work on Global Array Reference Allocation (GARA), the problem of allocating address registers to array references in loops. It describes two heuristics for the problem, presenting experimental data to support them. In addition, it proposes an approach to solve GARA optimally which, albeit computationally exponential, is useful to measure the efficiency of other methods. Experimental results, using the MediaBench benchmark and profiling information, reveal that the proposed heuristics can solve the majority of the benchmark loops near optimality in polynomial time. A substantial execution time speedup is reported for the benchmark programs, after being compiled with the original and the optimized versions of GCC.
Optimizing Address Assignment and Scheduling for DSPs with Multiple Functional Units
Cited by 1 (1 self)
DSP processors provide dedicated address generation units (AGUs) that are capable of performing address arithmetic in parallel to the main data path. Address assignment, the optimization of the memory layout of program variables to reduce address arithmetic instructions by taking advantage of the capabilities of AGUs, has been studied extensively for single functional unit (FU) processors. In this paper, we exploit address assignment and scheduling for multiple-FU processors. We propose an efficient address assignment and scheduling algorithm for multiple-FU processors. Experimental results show that our algorithm can greatly reduce schedule length and address operations on multiple-FU processors compared with the previous work. Index Terms: address assignment, scheduling, multiple functional units, AGU, DSP.
Register Allocation for Indirect Addressing in Loops
, 1998
Syntax tree corresponding to an array reference is not decomposed into its atomic operations. In other words, the references to array elements are maintained until the final schedule is performed in the program. Assume also that Common Subexpression Elimination (CSE) is not allowed for array indices, and that induction variable elimination is used to optimize the loop. Induction variable elimination is an important loop optimization based on strength reduction and code motion [Aho et al. 1988]. Consider for example array reference vector[i*a + k]. After induction variable elimination is performed, the array element address can be computed simply by adding a to register AR, which is initialized to &vector[0] + k and hoisted outside the loop. In the case of auto-increment (decrement), i.e. a = 1, the AGU automatically increments AR. When multidimensional array references are present within nested loops, references can usually be reduced to the simple unidimensional case (through inducti...
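The induction variable elimination described in this excerpt can be sketched as follows, using a Python list index to stand in for the address register AR. The function names and the summation loop body are hypothetical illustrations and are not taken from the paper. Only the reference pattern vector[i*a + k] and the initialization of AR to &vector[0] + k come from the text.

```python
def sum_with_index_arithmetic(vector, a, k, n):
    """Original loop: the index i*a + k is recomputed on every iteration."""
    total = 0
    for i in range(n):
        total += vector[i * a + k]  # one multiply and one add per access
    return total


def sum_with_address_register(vector, a, k, n):
    """Loop after induction variable elimination.

    `ar` models the address register: it is initialized once outside the
    loop (standing in for &vector[0] + k) and then advanced by the stride a.
    """
    ar = k  # hoisted initialization
    total = 0
    for _ in range(n):
        total += vector[ar]
        ar += a  # when a == 1 this is the AGU's auto-increment
    return total


# Hypothetical data: stride 3, starting offset 2, five iterations.
data = list(range(20))
print(sum_with_index_arithmetic(data, 3, 2, 5))
print(sum_with_address_register(data, 3, 2, 5))
```

Both functions visit the same elements. The second replaces the per-iteration multiplication with a single addition to the register, which is the strength reduction the excerpt refers to.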
Algorithms for Array Reference Allocation in Loops of Embedded Programs
, 2002
Efficient address register allocation has been shown to be a central problem in code generation for processors with restricted addressing modes. This paper extends previous work on Global Array Reference Allocation (GARA), the problem of allocating address registers to array references in loops. It describes two heuristics for the problem, based on the SSA Form, presenting experimental data to support them. In addition, it proposes an approach to solve GARA optimally which, albeit computationally exponential, is useful to measure the efficiency of other methods. Experimental results, using the MediaBench benchmark, reveal that the proposed heuristics can solve the majority of the benchmark loops near optimality in polynomial time. A substantial execution time speedup is reported for the benchmark programs, after being compiled with the original and the optimized versions of GCC.

