Results 1 - 10
of
32
Design of the Java HotSpot TM client compiler for Java 6
- ACM Transactions on Architecture and Code Optimization
, 2008
"... Version 6 of Sun Microsystems ’ Java HotSpotTM VM ships with a redesigned version of the client just-in-time compiler that includes several research results of the last years. The client compiler is at the heart of the VM configuration used by default for interactive desktop applications. For such a ..."
Abstract
-
Cited by 29 (12 self)
- Add to MetaCart
Version 6 of Sun Microsystems ’ Java HotSpotTM VM ships with a redesigned version of the client just-in-time compiler that includes several research results of the last years. The client compiler is at the heart of the VM configuration used by default for interactive desktop applications. For such applications, low startup and pause times are more important than peak performance. This paper outlines the new architecture of the client compiler and shows how it interacts with the VM. It presents the intermediate representation that now uses static single-assignment (SSA) form and the linear scan algorithm for global register allocation. Efficient support for exception handling and deoptimization fulfills the demands that are imposed by the dynamic features of the Java programming language. The evaluation shows that the new client compiler generates better code in less time. The popular SPECjvm98 benchmark suite is executed 45 % faster, while the compilation speed is also up to 40 % better. This indicates that a carefully selected set of global optimizations can also be integrated in just-in-time compilers that focus on compilation speed and not on peak performance. In addition, the paper presents the impact of several optimizations on execution and compilation speed. As the source code is freely available, the Java HotSpotTM VM and the client compiler are the ideal basis for experiments with new feedback-directed optimizations
Array bounds check elimination for the Java HotSpot client compiler
- IN PPPJ ’07: PROCEEDINGS OF THE 5TH INTERNATIONAL SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PROGRAMMING IN JAVA
, 2007
"... Whenever an array element is accessed, Java virtual machines execute a compare instruction to ensure that the index value is within the valid bounds. This reduces the execution speed of Java programs. Array bounds check elimination identifies situations in which such checks are redundant and can be ..."
Abstract
-
Cited by 26 (3 self)
- Add to MetaCart
(Show Context)
Whenever an array element is accessed, Java virtual machines execute a compare instruction to ensure that the index value is within the valid bounds. This reduces the execution speed of Java programs. Array bounds check elimination identifies situations in which such checks are redundant and can be removed. We present an array bounds check elimination algorithm for the Java HotSpot TM VM based on static analysis in the just-in-time compiler. The algorithm works on an intermediate representation in static single assignment form and maintains conditions for index expressions. It fully removes bounds checks if it can be proven that they never fail. Whenever possible, it moves bounds checks out of loops. The static number of checks remains the same, but a check inside a loop is likely to be executed more often. If such a check fails, the executing program falls back to interpreted mode, avoiding the problem that an exception is thrown at the wrong place. The evaluation shows a speedup near to the theoretical maximum for the scientific SciMark benchmark suite (40% on average). The algorithm also improves the execution speed for the SPECjvm98 benchmark suite (2 % on average, 12 % maximum).
Escape analysis in the context of dynamic compilation and deoptimization
- In Proceedings of the ACM/USENIX International Conference on Virtual Execution Environments
, 2005
"... In object-oriented programming languages, an object is said to escape the method or thread in which it was created if it can also be accessed by other methods or threads. Knowing which objects do not escape allows a compiler to perform aggressive optimizations. This paper presents a new intraprocedu ..."
Abstract
-
Cited by 21 (6 self)
- Add to MetaCart
(Show Context)
In object-oriented programming languages, an object is said to escape the method or thread in which it was created if it can also be accessed by other methods or threads. Knowing which objects do not escape allows a compiler to perform aggressive optimizations. This paper presents a new intraprocedural and interprocedural algorithm for escape analysis in the context of dynamic compilation where the compiler has to cope with dynamic class loading and deoptimization. It was implemented for Sun Microsystems ’ Java HotSpot TM client compiler and operates on an intermediate representation in SSA form. We introduce equi-escape sets for the efficient propagation of escape information between related objects. The analysis is used for scalar replacement of fields and synchronization removal, aswellasforstack allocation of objects and fixedsized arrays. The results of the interprocedural analysis support the compiler in inlining decisions and allow actual parameters to be allocated on the caller stack. Under certain circumstances, the Java HotSpot TM VM is forced to stop executing a method’s machine code and transfer control to the interpreter. This is called deoptimization. Since the interpreter does not know about the scalar replacement and synchronization removal performed by the compiler, the deoptimization framework was extended to reallocate and relock objects on demand.
Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency
"... Static single assignment (SSA) form is an intermediate program representation in which many code optimizations can be performed with fast and easy-to-implement algorithms. However, some of these optimizations create situations where the SSA variables arising from the same original variable now have ..."
Abstract
-
Cited by 14 (4 self)
- Add to MetaCart
(Show Context)
Static single assignment (SSA) form is an intermediate program representation in which many code optimizations can be performed with fast and easy-to-implement algorithms. However, some of these optimizations create situations where the SSA variables arising from the same original variable now have overlapping live ranges. This complicates the translation out of SSA code into standard code. There are three issues to consider: correctness, code quality (elimination of copies), and algorithm efficiency (speed and memory footprint). Briggs et al. proposed patches to correct the initial approach of Cytron et al. A cleaner and more general approach was proposed by Sreedhar et al., along with techniques to reduce the number of generated copies. We propose a new approach based on coalescing and a precise view of interferences, in which correctness and optimizations are separated. Our approach is provably correct and simpler to implement, with no patches or particular cases as in previous solutions, while reducing the number of generated copies. Also, experiments with SPEC CINT2000 show that it is 2x faster and 10x less memory-consuming than the Method III of Sreedhar et al., which makes it suitable for just-in-time compilation.
Register spilling and live-range splitting for SSA-form programs
- IN: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION. LECTURE NOTES IN COMPUTER SCIENCE
, 2009
"... Register allocation decides which parts of a variable’s live range are held in registers and which in memory. The compiler inserts spill code to move the values of variables between registers and memory. Since fetching data from memory is much slower than reading directly from a register, careful s ..."
Abstract
-
Cited by 11 (2 self)
- Add to MetaCart
(Show Context)
Register allocation decides which parts of a variable’s live range are held in registers and which in memory. The compiler inserts spill code to move the values of variables between registers and memory. Since fetching data from memory is much slower than reading directly from a register, careful spill code insertion is critical for the performance of the compiled program. In this paper, we present a spilling algorithm for programs in SSA form. Our algorithm generalizes the well-known furthest-first algorithm, which is known to work well on straight-line code, to control-flow graphs. We evaluate our technique by counting the executed spilling instructions in the CINT2000 benchmark on an x86 machine. The number of executed load (store) instructions was reduced by 54.5 % (61.5%) compared to a state-of-the-art linear scan allocator and reduced by 58.2 % (41.9%) compared to a standard graph-coloring allocator. The runtime of our algorithm is competitive with standard linear-scan allocators.
Preference-Guided Register Assignment
- In Compiler Construction 2010
, 2010
"... Abstract. This paper deals with coalescing in SSA-based register allo-cation. Current coalescing techniques all require the interference graph to be built. This is generally considered to be too compile-time intensive for just-in-time compilation. In this paper, we present a biased coloring approach ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
Abstract. This paper deals with coalescing in SSA-based register allo-cation. Current coalescing techniques all require the interference graph to be built. This is generally considered to be too compile-time intensive for just-in-time compilation. In this paper, we present a biased coloring approach that gives results similar to standalone coalescers while signif-icantly reducing compile time. 1
Linear scan register allocation on ssa form
- In Proceedings of the International Symposium on Code Generation and Optimization
, 2010
"... The linear scan algorithm for register allocation provides a good register assignment with a low compilation overhead and is thus frequently used for just-in-time compilers. Although most of these compilers use static single assignment (SSA) form, the algorithm has not yet been applied on SSA form, ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
The linear scan algorithm for register allocation provides a good register assignment with a low compilation overhead and is thus frequently used for just-in-time compilers. Although most of these compilers use static single assignment (SSA) form, the algorithm has not yet been applied on SSA form, i.e., SSA form is usually deconstructed before register allocation. However, the structural properties of SSA form can be used to simplify the algorithm. With only one definition per variable, lifetime intervals (the main data structure) can be constructed without data flow analysis. During allocation, some tests of interval intersection can be skipped because SSA form guarantees non-intersection. Finally, deconstruction of SSA form after register allocation can be integrated into the resolution phase of the register allocator without much additional code. We modified the linear scan register allocator of the Java HotSpot TM client compiler so that it operates on SSA form. The evaluation shows that our simpler and faster version generates equally good or slightly better machine code.
Automatic object colocation based on read barriers
- In Proceedings of the Joint Modular Languages Conference
, 2006
"... Abstract. Object colocation is an optimization that reduces memory access costs by grouping together heap objects so that their order in memory matches their access order in the program. We implemented this optimization for Sun Microsystems ’ Java HotSpot TM VM. The garbage collector, which moves ob ..."
Abstract
-
Cited by 4 (3 self)
- Add to MetaCart
(Show Context)
Abstract. Object colocation is an optimization that reduces memory access costs by grouping together heap objects so that their order in memory matches their access order in the program. We implemented this optimization for Sun Microsystems ’ Java HotSpot TM VM. The garbage collector, which moves objects during collection, assigns consecutive addresses to connected objects and handles them as atomic units. We use read barriers inserted by the just-in-time compiler to detect the most frequently accessed fields per class. These “hot fields ” are added to so-called hot-field tables, which are then used by the garbage collector for colocation decisions. Read barriers that are no longer needed are removed in order to reduce the overhead. Our analysis is performed automatically at run time and requires no actions on the side of the programmer. We measured the impact of object colocation on the young and the old generation of the garbage collector, as well as the difference between dynamic colocation using read barriers and a static colocation strategy where colocation decisions are done at compile time. Our measurements show that object colocation works best for the young generation using a read-barrier-based approach. 1
Split Register Allocation: Linear Complexity Without the Performance Penalty
- in "International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC’10)", Lecture Notes in Computer Science
, 2010
"... Traditional bytecode language tool chains distribute the roles among offline and online compilers. Verification and code compaction are typically assigned to ofinria-00551513, ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
(Show Context)
Traditional bytecode language tool chains distribute the roles among offline and online compilers. Verification and code compaction are typically assigned to ofinria-00551513,