Knoop, J., Ruthing, O. & Steffen, B. (1994), `Optimal code motion: Theory and practice', ACM Transactions on Programming Languages and Systems 16(4), 1117--1155.
....computations For a more obvious example of coinductive program improvement we turn to the problem of removing redundant computations. This is an old and well-studied improvement, known in various incarnations as common subexpression elimination [CS70], value numbering [BCS97], lazy code motion [KRS94], and partial redundancy elimination [KCL 99] (see e.g. [Muc00] for a survey). The inductive approach is straightforward: given some straight-line code such as w (b; c) x (a; w) y (b; c) z (a; y) one wishes to eliminate redundant computations; in this example, y and z are ....
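The straight-line example in this excerpt can be illustrated with a minimal local value-numbering sketch. The operator in the excerpt was lost in extraction, so the sketch assumes every statement applies the same hypothetical binary operator "f", and that each variable is assigned only once. The function name eliminate_redundant and the tuple layout are illustrative assumptions, not taken from the cited papers.

```python
def eliminate_redundant(code):
    """Local value numbering over straight-line code (illustrative sketch).

    code: list of (target, op, arg1, arg2) tuples in execution order.
    Assumes each target is assigned exactly once.
    Returns (new_code, redundant_targets).
    """
    value_of = {}    # variable name -> value number
    expr_table = {}  # (op, vn1, vn2) -> (value number, variable holding it)
    counter = [0]    # next unused value number

    def fresh():
        counter[0] += 1
        return counter[0] - 1

    def vn(name):
        # Variables never defined in the block get a fresh number on first use.
        if name not in value_of:
            value_of[name] = fresh()
        return value_of[name]

    new_code, redundant = [], []
    for target, op, a, b in code:
        key = (op, vn(a), vn(b))
        if key in expr_table:
            # Same operator on the same values: reuse the earlier result.
            number, holder = expr_table[key]
            new_code.append((target, "copy", holder, None))
            redundant.append(target)
        else:
            number = fresh()
            expr_table[key] = (number, target)
            new_code.append((target, op, a, b))
        value_of[target] = number
    return new_code, redundant


# The four statements from the excerpt, with the assumed operator "f".
example = [("w", "f", "b", "c"), ("x", "f", "a", "w"),
           ("y", "f", "b", "c"), ("z", "f", "a", "y")]
new_code, redundant = eliminate_redundant(example)
```

Under these assumptions the sketch flags y (a copy of w) and z (a copy of x) as redundant; z is only caught because y has already been given the same value number as w.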
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117-1155, July 1994.
....For example, after invariant code has been moved out of loops, loop invariant values are easily identified by looking at the location of their definition. We assume that: 1. Lazy code motion (lcm) has been applied to accomplish both loop invariant code motion and common subexpression elimination [33, 35, 23]. Our compiler performs global reassociation and global renaming prior to running lcm [5]. 2. Sparse conditional constant propagation (sccp) has been applied to identify and fold compile time constants [47]. This discovers a large class of constant values and makes them textually obvious. We ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Trans. Prog. Lang. Syst., 16(4):1117--1155, July 1994.
....choose function returns all instances: choose all (#, p) #. This profitability heuristic is the default if none is specified explicitly. Below we give an example of an optimization with a nontrivial choose function. Example 3 Consider the implementation of partial redundancy elimination (PRE) [14, 10] in Cobalt. One way to perform PRE is to first insert copies of statements in well chosen places in order to convert partial redundancies into full redundancies, and then to eliminate the full redundancies by running a standard common subexpression elimination (CSE) optimization expressible in ....
....points in the program, such as various sorts of code motion, can in fact be decomposed into several simpler transformations, each of which fits Cobalt's transformation pattern syntax. The PRE example in section 2.3 illustrates all three of these points. PRE is a complex code motion optimization [14, 10], and yet it can be expressed in Cobalt using simple forward and backward passes with appropriate profitability heuristics. Our way of factoring complicated optimizations into smaller pieces, and separating the part that affects soundness from the part that doesn't, allows users to write ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....( f) trace values through memory ( m) remove some redundant computations ( a) perform algebraic simplification ( s) and perform value driven code motion ( c) 5. Lazy Code Motion (lazy) This pass performs code motion using techniques described by Drechsler and Stadel [11] and Knoop, et al. [16]. 6. Partial Redundancy Elimination (partial) This pass implements Morel and Renvoise's technique for partial redundancy elimination [18], following Drechsler and Stadel [10]. 7. Peephole Optimization (combine) This pass is an ssa-based peephole optimizer that combines iloc operations ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....to reduce the total number of remaining such computations in the transformed code. Global common subexpressions and loop invariant computations are special cases of partial redundancies. As a result, PRE has become an important component in global optimizers [10, 12, 13]. Classic PRE methods [21, 22] guarantee computationally optimal results, i.e. results where the number of computations cannot be reduced any further by safe code motion [20]. Under such a safety constraint, they insert an expression at a point p in a flow graph only if all paths emanating from p must evaluate before any ....
....given. Figure 1(d) shows the transformed flow graph, requiring 300 computations of a b. Note that the speculative execution of a b on the edge (3; 4) has made the three computations of a b at nodes 7, 8 and 10 in the original flow graph fully redundant. For this same example, the classic PRE [22], known as lazy code motion (LCM) would have produced the code in Figure 2, requiring 400 computations for the same expression. We achieve computational optimality based on edge profiles. This implies immediately that the more expensive path profiling is not necessary for this optimization ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....computations For a more obvious example of coinductive program improvement we turn to the problem of removing redundant computations. This is an old and well-studied improvement, known in various incarnations as common subexpression elimination [CS70], value numbering [BCS97], lazy code motion [KRS94], and partial redundancy elimination [KCL 99] (see e.g. [Muc00] for a survey). The inductive approach is straightforward: given some straight-line code such as (a, w) a, y) one wishes to eliminate redundant computations; in this example, y and z are redundant since y = w and ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....used in optimizing compilers, the proofs have become more tractable and have led to stronger optimization algorithms [4]. Other works have investigated temporal logic as a means to express program analyses and to derive analysis algorithms [12,13]. However, declarative methods for specifying compiler optimizations have received relatively little attention. One approach to reason about program transformations for ....
Knoop, J., O. Rüthing and B. Steffen, Optimal Code Motion: Theory and Practice, ACM Transactions on Programming Languages and Systems (TOPLAS), 16(4):1117-1155, 1994.
....approaches to program analysis [4, 27, 29, 30, 32, 28]. Work by Steffen and Schmidt [29, 30] showed that temporal logic is well suited to describing data dependencies and other program properties exploited in classical compiler optimizations. In particular, work by Knoop, Steffen and Ruthing [14] showed that new insights could be gained from using temporal logic, enabling new and stronger code motion algorithms, now part of several commercial compilers. More relevant to this paper: The code motion transformations could be proven correct. 1.5 Model checking and program transformation ....
J. Knoop and O. Ruthing and B. Steffen. Optimal Code Motion: Theory and Practice. ACM Transactions on Programming Languages and Systems (TOPLAS), 16(4):1117--1155, 1994.
....compiler optimizations have been discussed in detail by many authors (e.g. see [11]) and so are not discussed further here. The next section describes the code factoring techniques used within squeeze. 2 Code Factoring 2.1 Local Factoring Transformations Inspired by an idea of Knoop et al. [10], we try to merge identical code fragments by moving them to a point that pre- or post-dominates all the occurrences of the fragments. We have implemented a local variant of this scheme which we describe using the example depicted in Figure 1. The left hand side of the figure shows an assembly code ....
J. Knoop, O. Ruthing, and B. Steffen, "Optimal Code Motion: Theory and Practice", ACM Transactions on Programming Languages and Systems vol. 16 no. 4, July 1994, pp. 1117--1155.
....Optimization A profile-guided optimizer identifies data flow facts that are observed to hold for hot regions of the code and exploits them to generate highly optimized code. This approach has been shown to be effective for a variety of optimization tasks [7, 3, 11, 23, 13, 14, 15]. Both non-speculative [22, 16] and speculative [14, 15, 7] transformations have been developed for specialization of code along hot program paths. In this section we illustrate the use of profile-limited analysis in profile-guided optimization. Consider a load instruction which is executed frequently and often causes cache ....
....the other hand if we make use of profile-limited analysis which exploits the timestamps labeling the nodes, we can easily determine that 4 Load is always redundant for the given path trace. This information can be used by the optimizer to transform the program using code motion and/or restructuring [22, 16, 14, 7]. The query propagation that identifies that the redundancy count for 4 Load is 60, that is, the degree of redundancy is 100%, is also shown in Figure 9. As we can see, the degree of redundancy has been computed using a single backward pass through the loop body and only 6 queries were generated in ....
J. Knoop, O. Ruthing, and B. Steffen, "Optimal Code Motion: Theory and Practice," ACM Transactions on Programming Languages and Systems, Vol. 16, No. 4, pages 1117-1155, 1994.
....produce (while A produces only 100 units, C and E together consume ## # ### units, see Figure 5.5). Overbooking can be removed by dividing the producer's contribution among paths leading from the producer to its different consumers. In the CMP estimators, the contribution is divided by delaying [KRS94b] the producer. Delaying moves the producer in forward direction along all paths as far as it remains a producer according to Definition 5.3, i.e. as far as each path through it generates the value. Figure 5.6(a) shows how the producer A is delayed into the edges #f;C# and #f;h#, which become the ....
....shown in Section 1.5. Therefore, practical PRE algorithms are based on code motion, an economical transformation that reorders instructions but does not change the shape of the control flow graph, prohibiting the expensive isolation of individual paths [BC94, CCK # 97, DRZ92, Dha91, DS88, DS93, KRS94b, MR79]. The price of the restriction to code motion, however, is the failure to remove all detected redundancies. In theory, even the optimal code motion algorithm [KRS94b] breaks down on loop invariants in while loops, unless preceded by do-until conversion (which is based on path separation). In ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Trans. on Progr. Languages and Systems, 16(4):1117--1155, 1994.
.... redundancy elimination (PRE) based on code motion of redundant statements is formulated as a bidirectional data flow problem [MR79]. Modern PRE formulations decompose the bidirectional problem into two unidirectional problems: availability and anticipability (also called very busy expressions) [KRS94a]. To determine which redundant statements can be removed, the approach in [RWZ88] uses the notion of dominators: if a computation is dominated by a value-equivalent computation, then it can be removed because its value has been computed on all incoming execution paths. Data flow analysis is used ....
....code motion is the simplest form of such motion transformation. Morel and Renvoise generalized it to arbitrary control flow graphs by formulating the code motion problem as a bidirectional data flow analysis problem [MR79]. Knoop et al. found a unidirectional formulation for this problem [KRS94a]. The necessary code motion may be blocked when it would change program semantics or impair the program for certain inputs. When code motion fails to eliminate all partial redundancies, control flow restructuring can be applied. In value flow optimization, restructuring is based on separating ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....approaches to program analysis [4, 20, 22, 23, 25, 21]. Work by Steffen and Schmidt [22, 23] showed that temporal logic is well suited to describing data dependencies and other program properties exploited in classical compiler optimizations. In particular, work by Knoop, Steffen and Rüthing [12] showed that new insights could be gained from using temporal logic, enabling new and stronger code motion algorithms, now part of several commercial compilers. More relevant to this paper: The code motion transformations could be proven correct. 1.4 Model checking and program transformation In ....
J. Knoop and O. Ruthing and B. Steffen. Optimal Code Motion: Theory and Practice. ACM Transactions on Programming Languages and Systems (TOPLAS), 16(4):1117-1155, 1994.
.... and Renvoise's Algorithm There has been much continuing interest in the global optimization method of Morel and Renvoise [1] as extended by Joshi and Dhamdhere [2] and Chow [3]. For example, papers by Drechsler and Stadel [4], Sorkin [5], Dhamdhere [6], and Knoop, Rüthing and Steffen [7] deal with perceived problems with the original algorithm. Unfortunately, the original formulation of Morel and Renvoise and those of Joshi and Dhamdhere [2] and Chow [3] do not agree as to the form of the algorithm. The precise form of the algorithm's boolean equations is critical to the ....
Knoop, J., Rüthing, O., and Steffen, B. Optimal Code Motion: Theory and Practice, Universität Passau, MIP-9310, December 1993 (to appear in TOPLAS).
....any elimination of redundant loads carried out by these systems is limited to fairly simple load removal. Load redundancy elimination can be seen as a particular case of Partial Redundancy Elimination (PRE) where the expressions to be considered for removing are only load operations. PRE [16, 8] is a well known scalar optimization that subsumes various ad hoc code motion optimizations (as common subexpression elimination and loop invariant code motion) by attempting to remove redundancies that occur only on some control flow paths. Horspool and Ho [14] described a new formulation of PRE ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....[Fig. 1. Local code factoring. Assembly listing not recoverable.] 3.1 Local Factoring Transformations Inspired by an idea of Knoop et al. [1994], we try to merge identical code fragments by moving them to a point that pre- or post-dominates all the occurrences of the fragments. We have implemented a local variant of this scheme which we describe using the example depicted in Figure 1. The left hand side of the figure shows an assembly code ....
Knoop, J., Rüthing, O., and Steffen, B. 1994. Optimal code motion: Theory and practice. ACM Trans. Program. Lang. Syst. 16, 4 (July), 1117-1155.
....Node i dominates node j in the CFG (written as j ∈ dom(i)) if every path from s to j goes through i. We assume that prior to communication analysis, any edge that goes directly from a node with more than one successor to a node with more than one predecessor is split by introducing a dummy node [37]. Our technique for minimizing the communication volume and the number of messages is based on interval analysis [4]. Interval analysis consists of a contraction phase and an expansion phase. For programs written in a structured language, an interval corresponds to a loop. The contraction phase ....
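The dominance definition quoted in this excerpt can be made concrete with a small iterative computation. This is a sketch under stated assumptions: the CFG is given as a successor dictionary, every node is reachable from the entry node s, and the function name dominators is hypothetical. The sketch returns, for each node j, the set of nodes that dominate j, which is the inverse view of the excerpt's dom(i) notation.

```python
def dominators(succ, entry):
    """Iterative dominator computation (illustrative sketch).

    succ:  dict mapping each node to a list of its successors.
    entry: the start node s; all nodes are assumed reachable from it.
    Returns a dict mapping each node j to the set of nodes that dominate j.
    """
    nodes = set(succ) | {t for targets in succ.values() for t in targets}
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)

    # Start from the most optimistic assumption and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            incoming = [dom[p] for p in preds[n]]
            # A node is dominated by itself plus whatever dominates
            # every one of its predecessors.
            new = set.intersection(*incoming) if incoming else set()
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Diamond-shaped CFG: s branches to a and b, which both reach j.
diamond = {"s": ["a", "b"], "a": ["j"], "b": ["j"], "j": []}
dom = dominators(diamond, "s")
```

In the diamond, only s and j itself dominate j, since neither a nor b lies on every path from s to j.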
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. In ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....of suppressing partial redundancies. This paper also showed that loop invariant motion and global common sub expression elimination are special cases of suppression of partial redundancies. The particular algorithm described here is a more efficient version due to Knoop, Ruthing, and Steffen [4]. One small difference in the version presented here is that the published version avoids inserting assignments that are only used in their own node. Not only would that have cluttered our presentation, but it also is of questionable utility, since the register allocation phase of the compiler ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....in the program to insert the computation. These insertions in turn cause partially redundant computations to become fully redundant, and therefore safe to delete. Knoop et al. came up with a different PRE algorithm called lazy code motion that improves on Morel and Renvoise's results [KRS92, DS93, KRS94a]. The result of lazy code motion is optimal: the number of computations cannot be further reduced by safe code motion, and the lifetimes of the pseudo registers introduced are minimized. Our team at Silicon Graphics has recently developed a new algorithm to perform PRE based on SSA form, called ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Trans. on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....partial redundancies, they are subsumed by PRE. As a result, PRE has become the most important component in many global optimizers [Chow 1983; Chow et al. 1986; Schwarz et al. 1988; Briggs and Cooper 1994; Simpson 1996]. An alternative placement strategy called lazy code motion [Knoop et al. 1992; Knoop et al. 1994] improves on Morel and Renvoise's results by avoiding unnecessary code movements, and by removing the bidirectional nature of the original PRE data flow equations. The result of lazy code motion is optimal: the number of computations cannot be further reduced by safe code motion [Kennedy 1972] ....
....Each of the above approaches to PRE is based on a bit vector formulation of the problem, and on the iterative solution of data flow equations. This paper presents a new approach called SSAPRE [Chow et al. 1997] that shares the optimality properties of the best prior work [Knoop et al. 1992; Knoop et al. 1994; Drechsler and Stadel 1993] and that is based on static single assignment form. Static single assignment form (SSA) is a popular program representation in modern optimizing compilers. Its versatility stems from the fact that, in addition to representing the program, it provides accurate ....
Knoop, J., Rüthing, O., and Steffen, B. 1994. Optimal code motion: Theory and practice. ACM Trans. on Programming Languages and Systems 16, 4 (July), 1117--1155.
....Joshi and Dhamdhere [JD82, Dha89] and Chow [Cho83] independently describe techniques that allow a PRE implementation to simultaneously perform strength reduction. In this framework, strength reduction does not depend on identifying loop induction variables, and is not restricted to loops. In [KRS92, KRS94], Knoop et al. give an alternative PRE algorithm called lazy code motion that improves on Morel and Renvoise's results by avoiding unnecessary code movements, and by removing the bidirectional nature of the original PRE data flow equations. They also presented the lazy strength reduction ....
Knoop, J., Ruthing, O. and Steffen, B., Optimal Code Motion: Theory and Practice. ACM Transactions on Programming Languages and Systems, July 1994, pp. 1117-1155.
....Lo Peng Tu fchow sgi.com Silicon Graphics Computer Systems 2011 N. Shoreline Blvd. Mountain View, CA 94043 Abstract A new algorithm, SSAPRE, for performing partial redundancy elimination based entirely on SSA form is presented. It achieves optimal code motion similar to lazy code motion [KRS94a, DS93], but is formulated independently and does not involve iterative data flow analysis and bit vectors in its solution. It not only exhibits the characteristics common to other sparse approaches, but also inherits the advantages shared by other SSA-based optimization techniques. SSAPRE also ....
....[MR79]. By targeting partially redundant computations in the program, it automatically removes global common subexpressions and moves invariant computations out of loops. It has since become the most important component in many global optimizers [Cho83, CHKW86, SKL88, BC94, CS95b]. In [KRS92, KRS94a], Knoop et al. formulated an alternative placement strategy called lazy code motion that improves on Morel and Renvoise's results by avoiding unnecessary code movements, and by removing the bidirectional nature of the original PRE data flow equations. The result of lazy code motion is optimal: ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Trans. on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....This last transformation, shown in Figure 3, eliminates all critical edges [38]. [Figure 3: An example application of the edge split transformation to eliminate critical edges, i.e. the edges going from a node with more than one successor to a node with more than one predecessor.] 2.2 Interval Analysis We assume that prior to our analysis, the compiler has performed all loop level transformations [9, 48, 52] to enhance parallelism (e.g. loop permutation, loop distribution) and optimize communication. Our technique is based on interval analysis performed on the CFG. As ....
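The edge split transformation described in this excerpt can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function name split_critical_edges, the successor-dictionary representation, and the generated node names ("split0", "split1", ...) are assumptions, and the generated names are assumed not to clash with existing nodes.

```python
def split_critical_edges(succ):
    """Split every critical edge by inserting a new dummy node (sketch).

    succ: dict mapping each node to a list of its successors.
    An edge is critical when its source has more than one successor
    and its target has more than one predecessor.
    Returns a new successor dict; the input is left unmodified.
    """
    nodes = list(succ)
    for targets in succ.values():
        for t in targets:
            if t not in succ and t not in nodes:
                nodes.append(t)

    pred_count = {n: 0 for n in nodes}
    for targets in succ.values():
        for t in targets:
            pred_count[t] += 1

    result = {n: list(succ.get(n, [])) for n in nodes}
    fresh = 0
    for src in nodes:
        targets = result[src]
        if len(targets) <= 1:
            continue  # edges leaving a single-successor node are never critical
        for i, dst in enumerate(targets):
            if pred_count[dst] > 1:
                dummy = f"split{fresh}"  # hypothetical name for the new node
                fresh += 1
                targets[i] = dummy       # src -> dummy
                result[dummy] = [dst]    # dummy -> dst
    return result


# a branches to b and c, and b also reaches c, so only a -> c is critical.
cfg = {"a": ["b", "c"], "b": ["c"], "c": []}
split = split_critical_edges(cfg)
```

After the call, a's successors are b and the new node split0, whose only successor is c. This gives code motion a place to insert a computation on that edge alone.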
J. KNOOP, O. RUTHING, and B. STEFFEN. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....in the program as in [9] and [6]. In practice, code movement to the earliest program points can create pressure on the target architecture resources, e.g. because of register spills. A more practical approach also involves performing temporary lifetime minimization, as in the work of Knoop [7]. Knoop's approach is the best-in-class approach for code motion since it involves unidirectional analysis techniques in the program flow where reducible programs can be dealt with in O(n log(n)) bit vector steps (see [1]), where n is the number of statements in the program, in contrast to O(n^2 ....
....analysis and information gathering steps within the flow: reachable variable definitions, and reached uses of variables. So, the framework's complexity of O(n^2) is what dominates the overall complexity. Of course, for this increase in complexity we can get much better optimization results than [7] since operation motion is applied to all candidate operations at once and is tempered by other data flow and control analysis and optimization steps. As will be described in the sequel, the approach is therefore much simpler (conceptually and in practice) than other approaches as it tackles ....
Knoop, J.; Rüthing, O.; Steffen, B., "Optimal Code Motion: Theory and Practice", ACM Transactions on Programming Languages and Systems, Vol. 16, No. 4, pp. 1117-1155, July 1994.
....deleted. A practical method for generating optimizer components will be judged on three criteria. First, it must be able to express a broad range of program analysis and transformation problems. We demonstrate this by specifying several parts of the lazy code motion analysis and transformation [KRS94]. Second, the generated algorithms must run efficiently so that the components can be used in practical optimizers. Thus we have developed a uniform evaluation algorithm for the given rewrite systems, which works efficiently on directed sparse graphs and often results in linear or quadratic ....
....busy code motion [MR79] pays especially in array intensive computations because address expressions often can be moved out of loops. However, busy code motion introduces longer lifetimes for expression values, which may deteriorate register allocation. Therefore lazy code motion was developed [KRS94]. It tries to reduce register occupation (register lifetimes) as long as the number of computations is minimal. This is achieved by computing two kinds of information. First the same information as in busy code motion is computed: for each expression the earliest place in the code is determined ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....code size can be obtained without having to resort to extraneous structures such as suffix trees, by using information already available, e.g. the control flow graph and dominator/postdominator trees. 6.1 Local Factoring The local factoring transformation was inspired by an idea of Knoop et al. [45]. It tries to merge identical code fragments by moving them to a point that pre- or post-dominates all the occurrences of the fragments. We have implemented a local variant of this scheme which we describe using the example depicted in Figure 6.2. 2 . 2 The meaning of the Alpha machine ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....assignment hoisting algorithm in [3] moves code aggressively to encourage secondary effects. However, when expressions are hoisted there are no secondary effects. Consequently, the expression hoisting algorithm in [2] uses lazy code hoisting. 2.4 Lazy Expression Hoisting In the papers [2] and [4], Knoop, Ruthing, and Steffen describe an algorithm to perform lazy expression hoisting. Since expression motion does not involve the second order effects that can occur during assignment motion, the analysis and transformations are only performed once. Unfortunately, the data flow analysis ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
....a call or jump) or by moving the representative of several occurrences to a point that dominates every occurrence. We first exploit the latter form of code factoring since it involves no added control transfer instructions. 3.1 Local Factoring Transformations Inspired by an idea of Knoop et al. [13], we try to merge identical code fragments by moving them to a point that pre- or post-dominates all the occurrences of the fragments. We have implemented a local variant of this scheme which we describe using the example depicted in Figure 1. The left hand side of the figure shows an assembly ....
J. Knoop, O. Ruthing, and B. Steffen, "Optimal Code Motion: Theory and Practice", ACM Transactions on Programming Languages and Systems vol. 16 no. 4, July 1994, pp. 1117-1155.
....and releases locks. These transformations are similar in spirit to the redundant communication elimination optimization. The elimination of redundant code has been examined quite rigorously for scalar computations. As an example, Knoop et al. optimize the computation by optimal code motion [14]. They issue the computation as late as possible to avoid unnecessary register pressure while maintaining computational optimality. They focus on scalar variables, while we are interested in pointer dereferences. Further, their algorithms treat each computation independently, while we consider the ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Trans. on Programming Languages and Systems, 16(4):1117--1155, Jul. 1994.
....and releases locks. These transformations are similar in spirit to the redundant communication elimination optimization. The elimination of redundant code has been examined quite rigorously for scalar computations. As an example, Knoop et al. optimize the computation by optimal code motion [KRS94]. They issue the computation as late as possible to avoid unnecessary register pressure while maintaining computational optimality. They focus on scalar variables, while we are interested in pointer dereferences. Further, their algorithms treat each computation independently, while we consider the ....
Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994.
No context found.
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: theory and practice. ACM Transactions on Programming Languages and Systems, vol. 16, 4, pages 1117--1155, July 1994.
No context found.
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
....them close to useless. Since predicated code will be more and more common with the advent of the IA-64 architecture, we present here a family of PRE algorithms tailored for predicated code. Conceptually, the basic element of this family can be considered the counterpart of busy code motion of [20]. It can easily be tuned by two orthogonal means. First, by adjusting the power of a preprocess feeding it with information on predication. Second, by relaxing or strengthening the constraints on synthesizing predicates controlling the movability of computations. Together with extensions towards ....
....[Figure 4: Benefits for PRE from predication.] In this article, we develop a new approach for PRE, which is tailored for predicated code. Conceptually, the basic algorithm we present can be considered the counterpart of busy code motion of [19, 20]. Like busy code motion, the new approach relies on two unidirectional data flow analyses. First, a hoistability analysis moving computations to their earliest down-safe computation points, i.e. to the earliest points such that the computation will be used on every program continuation reaching ....
J. Knoop, O. Ruthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, TOPLAS, 16:1117--1155, 1994.
.... consequence of the fact that eliminating dead assignment occurrences does not reanimate other dead assignment occurrences, and whose second part is a consequence of the admissibility of g and a simple program transformation supposed in [KRS94b] which is typical for code motion transformations (cf. [DRZ92, KRS92, KRS94a, RWZ88]) namely to insert in every edge leading from a node with more than one successor to a node with more than one predecessor a new synthetic node. Lemma 13. Let G 1 ; G 2 ; G 3 2 G, and g; h 2 T with G 1 h se G 3 . 1. If g 2 E ff , and occ an ff occurrence occurring both in G 1 and G 2 , ....
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
....code size as a third optimization goal to partial redundancy elimination in addition to the two more classical goals of computation costs and register pressure. We arrive at a family of sparse code motion algorithms coming as modular extensions of the algorithms for busy and lazy code motion (cf. [3, 4]). Each algorithm of this family optimally captures a predefined choice of priority between these three optimization goals, e.g. the construction of code-size optimal programs of at least the same efficiency as the original program, or of computationally optimal programs of minimal code size, each ....
....if it is semantics and performance preserving. It is well known that under this admissibility constraint, computationally optimal results can be obtained by placing computations as early as possible in a program. This is known as the earliestness principle realized by busy code motion (cf. [3, 4]). Computationally and lifetime optimal results can be achieved by placing computations as early as necessary (in order to achieve computational optimality) but as late as possible (in order to keep lifetimes of temporaries as small as possible). This is known as the latestness principle, which ....
[Article contains additional citation context not shown here]
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
.... [Figure 1: The DaCapo system] .... both finite and infinite state programs (see [Stef91, Stef93, KnRS92, KnRS94a, KnRS94b, Knoo93]). The paper is organized as follows: Section 2 introduces our UNIX application, Section 3 defines the metalevel specification language SLTL, Section 4 illustrates the full DaCapo lifecycle by means of a typical user session, Section 5 discusses the relations with other approaches, and Section 6 ....
J. Knoop, O. Rüthing, and B. Steffen. Optimal Code Motion: Theory and Practice. ACM Transactions on Programming Languages and Systems, Vol. 16, N.4, July 1994, pp.1117-1155.
....impact of this approach, however, goes far beyond our type analysis. As illustrated in [KG1], it links classical program analysis and optimization to the OO setting: Along the lines of [KRS1], all the powerful optimizations developed for conventional procedural languages like code motion (cf. [KRS2]), partial dead code elimination (cf. [KRS3]), assignment mo....
Knoop, J., Rüthing, O., and Steffen, B. Optimal code motion: Theory and practice. Transactions on Programming Languages and Systems 16, 4 (1994), 1117-1155.
....[Figure 2: Back to Reality: Syntactic Code Motion vs. Semantic Code Motion] History and Current Situation: Syntactic CM (cf. [2, 6, 7, 8, 9, 10, 11, 14, 15, 18, 19, 21, 25]) is well understood and integrated in many commercial compilers. (Footnote 3: e.g. based on [18, 19] in the Sun SPARCompiler language systems; SPARCompiler is a registered trademark of SPARC International, Inc. and is licensed exclusively to Sun Microsystems, ....) In contrast, the much more powerful and ....
....[Figure 2: Back to Reality: Syntactic Code Motion vs. Semantic Code Motion] History and Current Situation: Syntactic CM (cf. [2, 6, 7, 8, 9, 10, 11, 14, 15, 18, 19, 21, 25]) is well understood and integrated in many commercial compilers. (Footnote 3: e.g. based on [18, 19] in the Sun SPARCompiler language systems; SPARCompiler is a registered trademark of SPARC International, Inc. and is licensed exclusively to Sun Microsystems, ....) In contrast, the much more powerful and aggressive semantic CM (see Figure 3, which illustrates its power on an irreducible program ....
[Article contains additional citation context not shown here]
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
....the way for the successful transfer of the classical bitvector based optimizations to the parallel setting at almost no costs on the implementation and computation side. In [9] and [13] this has been demonstrated for partial dead code elimination (cf. [12]) and partial redundancy elimination (cf. [18, 11]). Constant propagation (CP) (cf. [7, 8, 19]), however, is beyond bitvector based optimization. It is a powerful and widely used optimization of sequential programs, which improves performance by replacing terms, which can be determined at compile time to evaluate to a unique constant value at ....
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
....under consideration to the complete program. Sometimes pre and postprocesses can be avoided. However, this usually relies on tricky formulations often injuring conceptual clarity and transparency of the transformation. A representative example is the busy code motion (BCM) transformation of [20] for the elimination of partially redundant computations in a program. The BCM transformation does not require a postprocess such as dead code elimination. This, however, comes at the price of a more complicated reasoning about the correctness of the transformation as the meet over all paths (MOP ....
....we will use as an identifier for edges. A finite path in $G$ is a sequence $(n_1, \ldots, n_q)$ of nodes such that $(n_j, n_{j+1}) \in E$ for $j \in \{1, \ldots, q-1\}$. $P_G[m, n]$ denotes the set of all finite paths from $m$ to $n$, and $P_G[m, n[$ the set of all finite paths from $m$ to .... (Footnote 2: See e.g. [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 16, 17, 20, 21, 23, 24, 25, 26, 27, 28, 32, 33, 34]. One of the few exceptions is [22] considering edge labeled SI graphs.) [Fig. 1: A taxonomy of ....; labels: flow graph variants, node labeled vs. edge labeled, basic blocks vs. single instructions, "most widely used", "recommended by this article"]
[Article contains additional citation context not shown here]
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
....that have been dead before. This observation directly implies the first part of Lemma 3.8. The second part of this lemma is a consequence of the admissibility of g and a simple program transformation supposed in [KRS94b] that is typical for code motion transformations (cf. [DRZ92, KRS92, KRS94a, RWZ88]), namely to insert in every edge leading from a node with more than one successor to a node with more than one predecessor a new synthetic node. (Footnote 5: A node n is called a join node if it has more than one predecessor.) [Figure 3: ....]
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, 1994.
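The edge-splitting step quoted in the excerpt above (insert a new synthetic node in every edge leading from a node with more than one successor to a node with more than one predecessor) can be sketched as follows. This is only an illustration, not code from any of the cited papers: the edge-list representation, the function name `split_critical_edges`, and the naming of synthetic nodes are assumptions, and the input is assumed to contain no duplicate edges.

```python
from collections import defaultdict


def split_critical_edges(edges):
    """Return a new edge list in which every edge (m, n) whose source m has
    more than one successor and whose target n has more than one predecessor
    is replaced by (m, s) and (s, n) for a fresh synthetic node s.

    Assumption (hypothetical representation): the flow graph is given as a
    list of distinct (source, target) pairs.
    """
    succ_count = defaultdict(int)
    pred_count = defaultdict(int)
    for m, n in edges:
        succ_count[m] += 1
        pred_count[n] += 1

    result = []
    for m, n in edges:
        if succ_count[m] > 1 and pred_count[n] > 1:
            # Hypothetical naming scheme for the synthetic node.
            s = ("synthetic", m, n)
            result.append((m, s))
            result.append((s, n))
        else:
            result.append((m, n))
    return result


# Node 1 branches to 2 and 3; node 3 joins the paths from 1 and 2.
# Only the edge (1, 3) runs from a branch node to a join node.
example = split_critical_edges([(1, 2), (1, 3), (2, 3), (3, 4)])
```

In the example, only the edge from node 1 to node 3 is split; all other edges are returned unchanged, so code can later be placed on the synthetic node without affecting the path through node 2.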
No context found.
Knoop, J., Rüthing, O. & Steffen, B. (1994), `Optimal code motion: Theory and practice', ACM Transactions on Programming Languages and Systems 16(4), 1117--1155.
No context found.
Knoop, J., Rüthing, O., and Steffen, B. Optimal Code Motion: Theory and Practice, Universität Passau, MIP-9310, December 1993 (to appear in TOPLAS).
No context found.
Jens Knoop, Oliver Rüthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM 16(4):1117-1155, July 1994.
No context found.
Jens Knoop, Oliver Rüthing, and Bernhard Steffen, Optimal code motion: Theory and practice, ACM Transactions on Programming Languages and Systems 16 (1994), no. 4, 1117--1155.
No context found.
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. July 1994.
No context found.
Jens Knoop, Oliver Rüthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117--1155, July 1994. URL citeseer.nj.nec.com/knoop93optimal.html.
No context found.
Knoop, J., Rüthing, O., and Steffen, B. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems 16, 4 (July 1994), 1117--1155.
No context found.
J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Trans. on Prog. Lang. Syst., 16(4):1117--1155, July 1994.