Bala, V., Duesterwald, E., Banerjia, S.: Transparent dynamic optimization: The design and implementation of Dynamo. HP Laboratories Technical Report HPL-1999-78 (1999)
....prior work in the area of improving the performance of dynamic optimizers. Several researchers have proposed lightweight optimizations that are tailored for runtime execution [8][12][14]. Another major interest area has been in techniques to reduce the cost of monitoring application behavior [10][3] and then applying optimizations only to the hottest portions of the executable [1]. There is certainly a large body of work that discusses caching and cache management. As stated earlier, we are restricted in the kinds of cache management approaches we can use because we cache variable length ....
Vasanth Bala, Evelyn Duesterwald and Sanjeev Banerjia, "Transparent Dynamic Optimization: The Design and Implementation of Dynamo." HP Labs Technical Report HPL-1999-78.
....used either as a technique in its own right, or in combination with binary translation techniques. Dynamic optimization includes techniques to perform code layout for improved memory behavior, optimize frequently executed program paths, speculatively execute instructions or use value prediction [7, 6, 16, 5, 4, 3, 10]. A number of other optimization techniques are also highly effective in conjunction with dynamic optimization by exploiting runtime program profile data, such as dead code elimination, code sinking, unspeculation or partial redundancy elimination [9]. These techniques are even more useful for ....
....bases any actions on the values stored in register r4, the program may fail. Thus, many dynamic optimizers have either severely restricted the amount of dead program state computation which can be eliminated [7]. Some dynamic optimizers have included a safe mode which disables such optimizations [4, 3], but this is undesirable since this approach (1) requires to identify which program rely on extensive program state analysis in their exception handler, and (2) such programs are over their entire execution, even if no exception ever occurs. In this work, we present a solution to allowing dead ....
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report 99-78, HP Laboratories, Cambridge, MA, June 1999.
....emulating architecture the host. Many emulation systems are self-hosting, that is the host and target architectures are the same. Such systems are generally created for purposes of optimization or instrumentation. A well known recent dynamic optimization system is Dynamo from HP Labs (Bala et al. [3,4]) and its successor DELI [10]. Dynamo's high level architecture is similar to that of CMS, but it can fall back on efficient native execution, so there is no need to attempt translation for code that is problematic, or just cannot be improved. For this reason, the tradeoffs of self-hosting systems ....
Vasanth Bala, Evelyn Duesterwald, and Sanjeev Banerjia, "Transparent Dynamic Optimization: The Design and Implementation of Dynamo," Tech. Report HPL-1999-78, HP Laboratories Cambridge, June 1999.
....of trace start addresses results in relative offset for the replacement conditional branch. 2.3.2 Register indirect jump chaining. To save table lookup overhead for each and every indirect jump, most dynamic optimizers/translators implement a form of software based jump target prediction. In [3,15,32], a sequence of instructions compares the indirect target address held in a register against an embedded translation time target address. A match indicates a correct prediction and the inlined target instruction can be executed; if not, the code branches to the stub code at the end of the trace, ....
....VM. Another simulation constraint is that the benchmarks we run have relatively short executions times compared with real applications. Consequently, the interpretation and translation overheads, although small, are still disproportionately large for some of the benchmarks. In other systems [3,7], interpretation and translation optimizations overhead have been found to be reasonable. With respect to interpretation, we do nothing special. The same techniques as used by others [3,15] will suffice, and the interpretation overhead should be about the same. Direct threaded code [30,35] is one ....
Vasanth Bala et al., "Transparent dynamic optimization: the design and implementation of Dynamo," Hewlett Packard Laboratories Technical Report HPL-1999-78, Jun 1999.
....(if it exists). In a simple implementation, the fragment jumps to the DBT system to determine if the next fragment is translated, and if so, to find its location in the translation cache via a hash table. To limit the number of these expensive translation cache lookups, a good chaining mechanism [3,7] is essential. Furthermore, to supplement conventional chaining, using a co-designed VM provides us the ability to implement special instructions in the I-ISA that further reduce the fragment transition overhead. Direct branches, either conditional or unconditional, are relatively easy to handle ....
....exit translated superblocks. Register indirect jumps (JMP, JSR and RET in Alpha ISA) pose a challenge because their target addresses can change during a program's execution. To save lookup overhead, most dynamic optimizers/translators implement a form of software based jump target prediction. In [3], a short optimized sequence of instructions that accesses a hash table of fragment start addresses. If the target is not found in the hash table, control is transferred to the DBT system. 10 23 a sequence of instructions compares the indirect target address held in a register against an ....
Vasanth Bala et al., "Transparent dynamic optimization: the design and implementation of Dynamo," Hewlett Packard Laboratories Technical Report HPL-1999-78, Jun 1999.
....about execution values, paths, etc. and the original, unmodified code. The original code executes as a separate thread to verify that the distilled code is operating correctly. Moving upwards in abstraction layers, there have been multiple approaches to software based dynamic optimization [1, 3, 7, 10]. For many schemes, such as Dynamo [1] and Transmeta's Code Morphing System [10], the original program runs under control of a software interpreter. The interpreter gathers information about the program's run time behavior and builds optimized regions. When a PC is encountered for which an ....
....the original, unmodified code. The original code executes as a separate thread to verify that the distilled code is operating correctly. Moving upwards in abstraction layers, there have been multiple approaches to software based dynamic optimization [1, 3, 7, 10]. For many schemes, such as Dynamo [1] and Transmeta's Code Morphing System [10], the original program runs under control of a software interpreter. The interpreter gathers information about the program's run time behavior and builds optimized regions. When a PC is encountered for which an optimized region exists, the optimized code ....
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-1999-78, Hewlett-Packard Laboratories, June 1999.
....average benchmark, these constructed frames provide 76% of all dynamic instructions. 2.3 Optimization Engine. The optimization engine can perform classical compiler optimizations, extended basic block optimizations, and various other optimizations performed by other dynamic optimization systems [1]. The rePLay optimizer can also schedule code, for instance if the underlying execution architecture is statically scheduled. Moreover, the coupling of dynamic optimizations, execution rollback mechanisms, and rePLay's assertion instruction architecture allows for implementation of speculative ....
....of the limited scope of a trace. Because frames are atomic and because rePLay is able to construct long frames, the potential of compiler optimizations increases with rePLay. The rePLay framework differs from software based optimizers such as the Transmeta Code Morphing System [13], HP Dynamo [1], and DyC [9] primarily in the use of the hardware support for optimization functions. The hardware support helps to reduce overhead in two ways: (1) the optimizer does not use the same execution hardware as the application, and (2) the hardware recovery mechanism allows for speculative ....
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-1999-78, Hewlett-Packard Laboratories, June 1999.
.... profile with about 16% overhead [BL94]. With a hardware based sampling approach, edge profiles cost only 1-3% overhead [ABD+97]. Recently, a software based sampling approach was developed, via transient (removable) instrumentation [TS99] or transient interpretation of native instructions [BDB99]. Their cost is comparable to that of the hardware based approach (a few per cent). Path profiles. Path profiles can be collected relatively efficiently, even when compared to the low cost of edge profiling. A path profile can be obtained with about 30% overhead [BL96a]. Whole program path ....
Vasanth Bala, Evelyn Duesterwald, and Sanjeev Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-99-78, Hewlett-Packard Laboratories, 1999.
....previous work. Strict backward binary compatibility means that the architecture cannot be arbitrarily changed to accommodate new microarchitectural techniques. With the advent of binary to binary translation and optimization technology such as FX!32, Dynamo, and Transmeta's code morphing software [Hook97, Bala99, Klai00], this constraint can be removed so the compiler, architecture, and system designer is free to select a better point in the design space than previously allowed. Still, many of the techniques proposed herein can be applied directly to existing architectures with little modification. Second, we ....
Vasanth Bala, Evelyn Duesterwald, and Sanjeev Banerjia. Transparent Dynamic Optimization: The Design and Implementation of Dynamo. HP Laboratories, Cambridge, MA. Technical Report HPL-
....achieves highly optimized execution times while masking almost all compilation overhead. RELATED WORK Our work reduces the compilation overhead associated with dynamic compilation. Much research has gone into dynamic compilation systems for both object oriented [1,6,15] and non object oriented [8,9,12] languages. Our approach is applicable to other dynamic compilation systems, and can be used to reduce their compilation overhead. Lazy compilation, as mentioned previously, is used in most JIT compilers [1,3,11,13,19] to reduce the overhead of dynamic compilation. However, a quantitative ....
Bala V, Duesterwald E, Banerjia S. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-1999-78, HP Laboratories, 1999. http://www.hpl.hp.com/techreports/1999/HPL-1999-78.html.
....execution time. Dynamic compilation also offers the potential for further performance improvements over static compilation since runtime information can be exploited for optimization and specialization. Several dynamic, optimizing compiler systems have been built in industry and academia [7, 8, 9, 1, 10, 11, 12, 13]. Dynamic compilation is performed while the application is running and, therefore, introduces compilation overhead in the form of intermittent execution delay. The primary challenge in using dynamic compilation is to enable high performance execution with minimal compilation overhead. ....
....highly optimized execution times while masking almost all compilation overhead. Related Work Our work reduces the compilation overhead associated with dynamic compilation. Much research has gone into dynamic compilation systems for both object oriented [15, 1, 6] and non object oriented [8, 9, 12] languages. Our approach is applicable to other dynamic compilation systems, and can be used to reduce their compilation overhead. Lazy compilation, as mentioned previously, is used in most Just In Time compilers [1, 11, 19, 13, 3] to reduce the overhead of dynamic compilation. However, a ....
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HP Laboratories HPL-1999-78, 1999. http://www.hpl.hp.com/techreports/1999/HPL-1999-78.html.
....C program. Dynamic compilation offers the potential for better performance than can be achieved by static compilation since runtime information can be exploited for optimization and specialization. Several dynamic, optimizing compiler systems have been built in industry and academia [3, 8, 29, 34, 44, 45, 56, 79]. Despite its potential benefits, optimization increases compilation delay since it is performed while the program executes. Most systems attempt to reduce compilation delay introduced by the optimization in one of two ways: they incorporate multiple compilers [12, 16, 84] or they use an ....
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-1999-78, HP Laboratories, 1999.
....execution time. Dynamic compilation also offers the potential for further performance improvements over static compilation since runtime information can be exploited for optimization and specialization. Several dynamic, optimizing compiler systems have been built in industry and academia [2, 15, 24, 16, 17, 13, 6, 20]. Dynamic compilation is performed while the application is running and, therefore, introduces compilation overhead in the form of intermittent execution delay. The primary challenge in using dynamic compilation is to enable high performance execution with minimal compilation overhead. ....
....compiled benchmarks). Absolute total time in seconds appears above each bar. 5 Related Work. Our work reduces the compilation overhead associated with dynamic compilation. Much research has gone into dynamic compilation systems for both object oriented [8, 15, 27] and non object oriented [13, 6, 20] languages. Our approach is applicable to other dynamic compilation systems, and can be used to reduce their compilation overhead. Lazy compilation, as mentioned previously, is used in most Just In Time compilers [24, 19, 26, 15, 17] to reduce the overhead of dynamic compilation. However, a ....
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-1999-78, HP Laboratories, 1999. http://www.hpl.hp.com/techreports/1999/HPL-1999-78.html.
....nature and do not have the benefit of alias analysis or register liveness analysis; in particular, optimizations that need scratch registers are not carried out. The Dynamo system takes a very different approach to global optimization: it optimizes native executables dynamically, as they execute [3]. This system is able to carry out optimizations across procedure and module boundaries, and has the advantage of being able to handle either statically or dynamically linked libraries. The main disadvantage is that dynamic optimization necessarily incurs some runtime overhead, and in some cases ....
V. Bala, E. Duesterwald, and S. Banerjia, "Transparent Dynamic Optimization: The Design and Implementation of Dynamo", Technical Report HPL-1999-78, Hewlett-Packard Laboratories, Cambridge, Mass., June 1999.
....to show that value profile based specialization can yield significant speed improvements. By contrast, our work describes value profile based specialization that is fully automatic and that has been integrated into a link time optimizer. Systems for dynamic code generation and optimization [4, 8, 12] are also confronted with tradeoffs between the cost of generating specialized code and the savings obtained from the execution of this code. The problem, while qualitatively similar to ours, is considerably more complicated in practice because the runtime costs include the cost of generating the ....
....generally require users to annotate the program fragments that should be subjected to runtime code generation and specialization, effectively moving the burden of analyzing the cost benefit tradeoff to them. Systems for dynamic optimization of conventionally optimized programs, such as Dynamo [4], rely on simple heuristics to determine whether a code fragment is worth optimizing: programs where these heuristics are inadequate can suffer noticeable performance degradation. The work that is conceptually closest to that described here is some recent work towards automating the cost benefit ....
V. Bala, E. Duesterwald, and S. Banerjia, "Transparent Dynamic Optimization: The Design and Implementation of Dynamo", Technical Report HPL-1999-78, Hewlett-Packard Laboratories, Cambridge, Mass., June 1999.
Bala, V., Duesterwald, E., Banerjia, S.: Transparent dynamic optimization: The design and implementation of Dynamo. HP Laboratories Technical Report HPL-1999-78 (1999)
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia, Transparent Dynamic Optimization: The Design and Implementation of Dynamo, HP Labs Report 1999-78, June 1999.
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia, Transparent Dynamic Optimization: The Design and Implementation of Dynamo, HP Labs Report 1999.
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report 99-78, HP Laboratories, Cambridge, MA, June 1999.
V. Bala, E. Duesterwald, and S. Banerjia, "Transparent dynamic optimization: The design and implementation of Dynamo," HP Laboratories, Cambridge, MA, Tech. Rep. 99-78, June 1999.
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia, "Transparent dynamic optimization: the design and implementation of Dynamo," Hewlett Packard Laboratories Technical Report HPL-1999-78, Jun. 1999.
V. Bala, E. Duesterwald, and S. Banerjia. Transparent dynamic optimization: The design and implementation of Dynamo. Technical Report HPL-1999-78, HP Laboratories, 1999.