Results 1 -
6 of
6
SSA Elimination after Register Allocation
, 2009
"... Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after register a ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after register allocation. SSA elimination before register allocation is straightforward and standard, while previously described approaches to SSA elimination after register allocation have shortcomings; in particular, they have problems with implementing copies between memory locations. We present spill-free SSA elimination, a simple and efficient algorithm for SSA elimination after register allocation that avoids increasing the number of spilled variables. We also present three optimizations of the core algorithm. Our experiments show that spillfree SSA elimination takes less than five percent of the total compilation time of a JIT compiler. Our optimizations reduce the number of memory accesses by more than 9 % and improve the program execution time by more than 1.8%.
Aliased Register Allocation for Straight-line Programs is NP-Complete
"... Register allocation is NP-complete in general but can be solved in linear time for straight-line programs where each variable has at most one definition point if the bank of registers is homogeneous. In this paper we study registers which may alias: an aliased register can be used both independentl ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Register allocation is NP-complete in general but can be solved in linear time for straight-line programs where each variable has at most one definition point if the bank of registers is homogeneous. In this paper we study registers which may alias: an aliased register can be used both independently or in combination with an adjacent register. Such registers are found in commonly-used architectures such as x86, the HP PA-RISC, the Sun SPARC processor, and MIPS floating point. In 2004, Smith, Ramsey, and Holloway presented the best algorithm for aliased register allocation so far; their algorithm is based on a heuristic for coloring of general graphs. Most architectures with register aliasing allow only aligned registers to be combined: for example, the low-address register must have an even number. Open until now is the question of whether working with restricted classes of programs can improve the complexity of aliased register allocation with alignment restrictions. In this paper we show that aliased register allocation with alignment restrictions for straight-line programs is NP-complete. We also present a proof of a related result by Stockmeyer: the shipbuilding problem is NP-complete
Decoupled (SSA-based) Register Allocators: from Theory to Practice, Coping with Just-In-Time Compilation and Embedded Processors Constraints.
, 2013
"... In compilation, register allocation is the optimization that chooses which variables of the source program, in unlimited number, are mapped to the actual registers, in limited number. Parts of the live-ranges of the variables that cannot be mapped to registers are placed in memory. This eviction is ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
In compilation, register allocation is the optimization that chooses which variables of the source program, in unlimited number, are mapped to the actual registers, in limited number. Parts of the live-ranges of the variables that cannot be mapped to registers are placed in memory. This eviction is called spilling. Until recently, compilers mainly addressed register allocation via graph coloring using an idea developed by Chaitin et al. [33] in 1981. This approach addresses the spilling and the mapping of the variables to registers in one phase. In 2001, Appel and George [3] proposed to split the register allocation in two separate phases. This idea yields better and independent solutions for both problems, but requires a very aggressive form of live-range splitting, split everywhere, which renames all variables between all instructions of the program. However, in 2005, several groups [27, 84, 56, 16] observed that the static single assignment (SSA) form provides sufficient split points to decouple the register allocation as Appel and George suggested, unless register aliasing or precoloring constraints are involved.
Punctual coalescing
- In Proceedings of the International Conference on Compiler Construction, CC'10
, 2010
"... Abstract. Compilers use register coalescing to avoid generating code for copy instructions. For architectures with register aliasing such as x86, Smith, Ramsey, and Holloway (2004) presented a polynomial-time ap-proach, while Scholz and Eckstein (2002) presented an optimal, exponen-tial-time approac ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Abstract. Compilers use register coalescing to avoid generating code for copy instructions. For architectures with register aliasing such as x86, Smith, Ramsey, and Holloway (2004) presented a polynomial-time ap-proach, while Scholz and Eckstein (2002) presented an optimal, exponen-tial-time approach together with a near-optimal, quadratic-time heuris-tic. Both methods scale poorly after aggressive live range splitting, espe-cially for programs in elementary form where live ranges are split at ev-ery program point. In contrast, we mentioned in a previous paper (2008), without giving details, that we have a scalable, linear-time heuristic for programs in elementary form. In an effort to formalize that heuristic, we discovered an even better algorithm, called Punctual Coalescing, which we present here. Punctual Coalescing is scalable, linear time, locally op-timal in general, close to globally optimal for straight-line code, and proven correct with the Twelf theorem prover. We define global optimal-ity with an ILP-formulation and we show via experiments that Punctual Coalescing compares well to this and two other approaches. 1
Detecting Bugs in Register Allocation
"... Although register allocation is critical for performance, the implementation of register allocation algorithms is difficult, due to the complexity of the algorithms and target machine architectures. It is particularly difficult to detect register allocation errors if the output code runs to completi ..."
Abstract
- Add to MetaCart
Although register allocation is critical for performance, the implementation of register allocation algorithms is difficult, due to the complexity of the algorithms and target machine architectures. It is particularly difficult to detect register allocation errors if the output code runs to completion, as bugs in the register allocator can cause the compiler to produce incorrect output code. The output code may even execute properly on some test data, but errors can remain. In this article, we propose novel data flow analyses to statically check that the value flow of the output code from the register allocator is the same as the value flow of its input code. The approach is accurate, fast, and can identify and report error locations and types. It is independent of the register allocator and uses only the input and output code of the register allocator. It can be used with different register allocators, including those that perform coalescing and rematerialization. The article describes our approach, called SARAC, and a tool that statically checks a register allocation and reports the errors and their types that it finds. The tool has an average compile-time overhead of only 8 % and a modest average memory overhead of 85KB. Our techniques can be used by compiler developers during regression testing and as a command-line-enabled debugging pass for mysterious compiler behavior. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors—Code generation,