N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994.
....J 1 =1,N DO J 2 =N (2 J 1 ) N S1: IF (J 1 J 2 N) THEN A(J 1 ; J 2 ) ENDIF . ENDDO ENDDO Detecting zero-trip loops [14] is a similar problem, which tries to determine whether the loop body of a given loop nest is ever executed. Counting the number of loop iterations has been shown [11, 12, 14, 21] to be crucial for many performance analyses, such as modeling work distribution [10], data locality [11], and communication overhead [9]. All of these problems can be formulated as queries based on a set of linear and nonlinear constraints I defined over loop variables and parameters (loop ....
....serialize loop L5 as they fail to evaluate nonlinear array subscript expressions. 4 Count Solutions to a System of Constraints. Counting the number of integer solutions to a set of constraints has been shown to be a key issue in performance analysis of parallel programs. Numerous applications [8, 21, 14] include: estimating statement execution counts, branching probabilities, work distribution, number of data transfers and cache misses. Even compiler analysis can be supported, for instance, by detecting and eliminating dead code such as loops that never iterate (zero-trip loops). In what ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994.
....convex polytopes or unions of convex polytopes, defined by linear constraints. Finally, it is shown in section 3 how these results can apply to determine useful values in analysis and transformation of scientific programs: they are illustrated with many examples and compared with related works [9, 22, 11, 12, 21, 16]. 2 The polytope model. 2.1 Definitions and assumptions. We first recall some basic notions dealing with geometry of numbers [10, 17] and enumerative combinatorics [19, 20]. Then some more specific concepts, closely dedicated to the scope of the paper, are introduced. In the following, Q denotes ....
....to any other P n , $j \neq i$, unless it is also a vertex of P n , $j \neq i$. Compute all the vertices of each intersection P n n , $j \neq i$. only if it is not a vertex of any P n . 3.2 Examples and related work. All the examples considered in this section come from several related works [11, 12, 22, 21, 9, 16] and were also handled by W. Pugh in [16], except example 7, whose aim is to show the generality of our method. Example 1: M. Haghighat and C. Polychronopoulos present in [11, 12] a method for volume computation. Their first example is k=j 1. This sum defines a set of linear constraints $\{1 \le i$ ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. Proc. of the 1994.
....it consists in some cases in adding auxiliary variables, and requires the use of standard formulas for sums of powers (the author expects it will be sufficient to hard code the formulas for powers up to 10). All these facts clearly show some limits in the general use of these techniques. Tawbi in [21, 20] first uses a polyhedral splitting technique to characterize the validity domains, and an elimination of the variables in a predetermined order. However, her method does not systematically compute exact answers like ours. We have illustrated this paper with one example from a possible ....
N. Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. Proc. of the 1994.
....single line. 8 Cost Analysis. This step aims at automatically evaluating the cost of L 6 programs in order to select the most efficient distribution. The complexity analysis is based on polytope volume computations [Taw94, Pug94, Cla96] and yields accurate symbolic costs. This approach is made possible by the restrictions of L 1 and the fixed set of data distributions considered. Together, they guarantee that the cost of all transformed source programs can be expressed as a sum of polytope volumes. The goal ....
....is unknown at compile time. In this case, the coefficients of affine expressions in the inequalities can contain an unknown. The volume of this kind of polytope cannot be represented by a pseudo-polynomial. When we want to keep the number of processors as a parameter, the technique described in [Taw94] can be used. It consists in cutting up the polytope by breaking the inequalities into several subsets such that each of them contains only two inequalities for each variable and that the lower limit is lower than the upper limit (to rule out null polytopes). After this step, traditional ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loops execution time by integer arithmetic in convex polyhedra. In Proc. of International Symposium on Parallel Processing, pages 217-223, 1994.
....problems arising in the analysis of parallel loops by compilers. In this context, previous work attempted to provide solutions using rather expensive frameworks based on Presburger formulas [7] or Ehrhart polynomials [3]. Other related work has concentrated on an algorithm for polytope splitting [10], while, in [6], an algebra of conditional values is defined, which, however, returns complex expressions. ....
N. Tawbi, Estimation of Nested Loops execution time by Integer Arithmetic in Convex Polyhedra, Proceedings of the 8th International Parallel Processing Symposium, IEEE Computer Society Press, pp. 217--221 (1994).
....$1 \le i \le q$. In the following we show how to approximate all terms in (A.2) by an upper or lower bound or an approximate expression (average between upper and lower bound). For the case where $i \ge 0$ we can use standard formulas for sums of powers of integers. They are described in [39] and reviewed in [73]. For $i < 0$ there are no closed forms known. However, for $i = -1$ and $2 \le n$ it has been shown [39] that ($\ln$ is the natural logarithm) $\ln(n) \le \sum_{v=1}^{n} \frac{1}{v} \le \ln(n) + 1$ and $-\ln(n) - 1 \le \sum_{v=-1}^{-n} \frac{1}{v} \le -\ln(n)$ (A.3), and also [9] that $\sum_{v=1}^{\infty} \frac{1}{v^2} = \frac{\pi^2}{6}$ ....
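The bounds quoted in this context for the harmonic sum, ln(n) <= 1/1 + 1/2 + ... + 1/n <= ln(n) + 1 for n >= 2, can be checked numerically. The following is a minimal Python sketch, not code from the citing paper; the function names `harmonic`, `harmonic_bounds`, and `harmonic_estimate` are illustrative assumptions. The estimate is the average of the two bounds, which is the "approximate expression" option mentioned in the context.

```python
import math
from typing import Tuple


def harmonic(n: int) -> float:
    """Partial sum 1/1 + 1/2 + ... + 1/n, computed directly."""
    return sum(1.0 / v for v in range(1, n + 1))


def harmonic_bounds(n: int) -> Tuple[float, float]:
    """Lower bound ln(n) and upper bound ln(n) + 1, quoted for n >= 2."""
    if n < 2:
        raise ValueError("the bounds are quoted for n >= 2")
    return math.log(n), math.log(n) + 1.0


def harmonic_estimate(n: int) -> float:
    """Average of the lower and upper bound (assumed approximation rule)."""
    lower, upper = harmonic_bounds(n)
    return (lower + upper) / 2.0


if __name__ == "__main__":
    for n in (2, 10, 1000):
        lower, upper = harmonic_bounds(n)
        print(n, lower, harmonic(n), upper, harmonic_estimate(n))
```

The direct sum costs O(n) additions, whereas the bounds are evaluated in constant time, which is why a symbolic cost model might prefer them when n is a parameter.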
N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994 International Parallel Processing Symposium, April 1994.
....exist to solve this problem. Unfortunately, their results may be inadequate. For instance, when asked to compute [...], Maple and Mathematica give $n(2m - n + 1)/2$, which is true only if [...]. Otherwise, the correct answer is $m(m + 1)/2$. Other methods, which correctly handle symbolic constants, were proposed in [Taw94, Pug94, Cla96]. Readers are referred to the references for the details of these methods. Our implementation of the probability calculator is built on the software developed by Tawbi [Taw94]. Computing the denominator is straightforward because all the inequalities defining the domain are given. However, ....
....correct answer is $m(m + 1)/2$. Other methods, which correctly handle symbolic constants, were proposed in [Taw94, Pug94, Cla96]. Readers are referred to the references for the details of these methods. Our implementation of the probability calculator is built on the software developed by Tawbi [Taw94]. Computing the denominator is straightforward because all the inequalities defining the domain are given. However, getting the numerator requires a little more work because this domain has to be found. More precisely, the ranges of sum variables have to be precomputed. For instance, in Example ....
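The pitfall described in this context, a closed form that holds only on part of the parameter space, can be illustrated with a small Python sketch. The sum itself is missing from the extracted text, so the sketch assumes, purely for illustration, the double sum over 1 <= i <= n and i <= j <= m of 1; this assumption is consistent with the two quoted answers n(2m - n + 1)/2 and m(m + 1)/2 but is not confirmed by the source. The function names are hypothetical.

```python
def count_brute(n: int, m: int) -> int:
    """Enumerate the assumed sum: i from 1 to n, j from i to m."""
    return sum(1 for i in range(1, n + 1) for j in range(i, m + 1))


def naive_closed_form(n: int, m: int) -> int:
    """Single closed form n(2m - n + 1)/2, valid only on part of the domain."""
    return n * (2 * m - n + 1) // 2


def count_piecewise(n: int, m: int) -> int:
    """Closed form split by validity domain (assumed split: n <= m or not)."""
    if n < 1 or m < 1:
        return 0  # at least one loop never iterates
    if n <= m:
        return n * (2 * m - n + 1) // 2
    return m * (m + 1) // 2  # inner loop is empty for every i > m


if __name__ == "__main__":
    for n, m in ((3, 5), (5, 3)):
        print(n, m, count_brute(n, m), naive_closed_form(n, m), count_piecewise(n, m))
```

Splitting the answer by validity domain is the general idea behind the polytope-splitting methods cited here; the two-way split above is only the simplest instance.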
N. Tawbi, "Estimation of Nested Loop Execution Time by Integer Arithmetics in Convex Polyhedra," In Proc. of the 1994 Int'l Parallel
.... analysis: Estimation of nested loop execution time. Let us consider the following loop nest: for i := 0 to n do for j := 0 to i + m/2 do for k := 0 to i - n + p do Statement(s). We want to compute the number of flops in order to evaluate the execution time of this code segment, as in [23]. This loop nest is modeled by the convex polytope $P_N = \{ z \in \mathbb{Z}^3 \mid A \cdot z \le B N + C \}$ where $A = \begin{pmatrix} -1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & -1 & 0 \\ -2 & 2 & 0 \\ 0 & 0 & -1 \\ -1 & 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ -1 & 0 & 1 \end{pmatrix}$, N = n m ....
....in some cases in adding auxiliary variables, and requires the use of standard formulae for sums of powers (the author expects it will be sufficient to hard code the formulae for powers up to 10). All these facts clearly show some limits in the general use of these techniques. Tawbi in [4, 23] first uses a polyhedral splitting technique to characterize the validity domains, and an elimination of the variables in a predetermined order. However, her method does not systematically compute exact answers like ours. We have illustrated this paper with three examples from possible ....
N. Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. Proc. of the 1994 International Parallel Processing Symposium, April 1994.
....of all processors for a parallel program and compare it to measurements taken on an iPSC/860 hypercube system. 1. Introduction. Counting the number of integer solutions to a set of inequalities has been shown to be a key issue in performance analysis of parallel programs. Numerous applications [2, 7] include: estimating statement execution counts, branching probabilities, number of data transfers and cache misses. Even conventional compiler analysis can be supported, for instance, by detecting loops that never iterate (zero-trip loops) and consequently can be eliminated. In ....
....by constant integers. In order to overcome this disadvantage, M. Haghighat and C. Polychronopoulos [5] described an algebra of conditional values and a set of rules for transforming symbolic expressions. However, they did not present an algorithm that decides which rule to apply when. N. Tawbi [7] developed a symbolic sum algorithm which handles loop nests with parameters. Her approach is restricted to sums based on linear inequalities implied by loop header statements only. W. Pugh [6] improved Tawbi's algorithm by extending it to techniques that count the number of integer solutions to ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994 International Parallel Processing Symposium, April 1994.
....After transformation of the program according to different distributions, this step aims at automatically evaluating the complexity of each L 4 program obtained in order to choose the most efficient. Our approach to get an accurate symbolic cost is to reuse results on polytope volume computations [Taw94, Cla96]. It is possible because the restrictions of the source language and the fixed set of data distributions guarantee that the abstracted cost of all transformed source programs can be translated into a polytope volume description. First, an abstraction function CA takes the program and ....
....[Figure 9: Inequality systems associated to C bloc (a) and C cyc (b)] [Taw94, Cla96] describe algorithms for computing symbolically the volume of parametrized polytopes. The result is a polynomial whose variables are the system parameters. Further, Clauss [Cla96] presents an extension for systems of nonlinear inequalities with floor ($\lfloor \cdot \rfloor$) and ceiling ( ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loops execution time by integer arithmetic in convex polyhedra. In Int. Symp. on Par. Proc., pages 217-223, 1994.
....to TRUE. If not then we can simply eliminate the entire conditional statement (dead code elimination). Detecting zero-trip loops [15] is a similar problem, which tries to determine whether the loop body of a given loop nest is ever executed. Counting the number of loop iterations has been shown [11, 7, 15, 22] to be crucial for many performance analyses, such as modeling work distribution [10], data locality [11], and communication overhead [9]. All of these problems can be formulated as queries based on a set of linear and nonlinear constraints I. The constraints of I are defined over loop variables ....
....serialize loop L5 as they fail to evaluate nonlinear array subscript expressions. 4. Count Solutions to a System of Constraints. Counting the number of integer solutions to a set of constraints has been shown to be a key issue in performance analysis of parallel programs. Numerous applications [8, 22, 15, 21, 19] include: estimating statement execution counts, branching probabilities, work distribution, number of data transfers and cache misses. Even compiler analysis can be supported, for instance, by detecting and eliminating dead code such as loops that never iterate (zero-trip loops). Consider the ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994 International Parallel Processing Symposium, April 1994.
....convex polytopes or unions of convex polytopes, defined by linear constraints. Finally, it is shown in section 3 how these results can apply to determine useful values in analysis and transformation of scientific programs: they are illustrated with many examples and compared with related works [9, 22, 11, 12, 21, 16]. 2 The polytope model. 2.1 Definitions and assumptions. We first recall some basic notions dealing with geometry of numbers [10, 17] and enumerative combinatorics [19, 20]. Then some more specific concepts, closely dedicated to the scope of the paper, are introduced. In the following, Q denotes ....
....j n , $j \neq i$. Compute all the vertices of each intersection $P^i_n \cap P^j_n$, $j \neq i$. For each found vertex $v$, $v$ is a vertex of $P_n$ if and only if it is not a vertex of any $P^i_n$. 3.2 Examples and related work. All the examples considered in this section come from several related works [11, 12, 22, 21, 9, 16] and were also handled by W. Pugh in [16], except example 7, whose aim is to show the generality of our method. Example 1: M. Haghighat and C. Polychronopoulos present in [11, 12] a method for volume computation. Their first example is $\sum_{i=1}^{n} \sum_{j=3}^{i} \sum_{k=j}^{5} 1$. This sum defines a set of linear ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. Proc. of the 1994 Int. Parallel Processing Symp., Apr. 1994.
....it consists in some cases in adding auxiliary variables, and requires the use of standard formulas for sums of powers (the author expects it will be sufficient to hard code the formulas for powers up to 10). All these facts clearly show some limits in the general use of these techniques. Tawbi in [21, 20] first uses a polyhedral splitting technique to characterize the validity domains, and an elimination of the variables in a predetermined order. However, her method does not systematically compute exact answers like ours. We have illustrated this paper with one example from a possible application. ....
N. Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. Proc. of the 1994 International Parallel Processing Symposium, April 1994.
.... j 2 u 33 X j 3 =l 31 i l 32 j 2 l 33 W J 3 : The evaluation of sums such as the above depends on the number of symbolic variables involved in the loop bounds; detailed methodologies for the symbolic evaluation of sums in the context of parallelising compilers are described in [3], [11], [12], [14]. 1 This implies that the j2 and/or j3 loops may be surrounded by DO : ENDDO loops which perform the same number of iterations regardless of the value of i; such loops may also exist in any of the five mentioned sets of statements. 3 Methodology. A subclass of the loop nests shown in ....
N. Tawbi, Estimation of Nested Loops execution time by Integer Arithmetic in Convex Polyhedra, Proceedings of the 8th International Parallel Processing Symposium, IEEE Computer Society Press, 1994, pp. 217--221.
....of a methodology for evaluating summations, such as those in Equations (3.1) and (3.2) corresponding to loop nests with non constant or unknown, at compile time, bounds. The evaluation of summations to compute the number of iterations in a loop nest has also been considered by Nadia Tawbi [209, 210, 211]. Her approach concentrates on the description of an algorithm for splitting the computation into sub parts which can be manipulated; issues, such as those arising from the presence of conditionals or complicated loop bounds expressions, are not addressed. A more in depth study has been carried ....
N. Tawbi, "Estimation of Nested Loops execution time by Integer Arithmetic in Convex Polyhedra ", in Proceedings of the 8th International Parallel Processing Symposium, IEEE Computer Society Press, 1994, pp. 217--221.
....denote a small impact of load imbalance on performance. In order to estimate the values of $W_i$ in (1) we consider the number of times that each part of the loop body is executed. Techniques to compute this number (which corresponds to the number of integer points in a polytope) are described in [5, 6, 7]; they are based on the evaluation of nested sums with each sum corresponding to a loop. Based on the above, the iterations of a single loop with lower bound $l$ and upper bound $u$ can be partitioned across $p$ processors, with processor $k$, $0 \le k < p$, executing a loop whose bounds, $l_k$, $u_k$, can be ....
N. Tawbi. Estimation of Nested Loops execution time by Integer Arithmetic in Convex Polyhedra. In Proceedings of the 8th International Parallel Processing Symposium, pp. 217--221. IEEE Computer Society Press, 1994.
....cost of recovery code, a compiler can then build a cost model to determine whether it is profitable to perform a data speculation or not. In this work, we propose a general probabilistic memory disambiguation (PMD) framework based on an integer linear programming technique to perform a convex sum [TF92, Taw94, Pug94, Cla96]. We demonstrate the application of this PMD framework to data speculation. For a practical purpose, we develop a set of heuristics to quickly approximate aliasing probabilities in many common cases. Although not done in this paper yet, the PMD framework can also take advantage of array data ....
....insufficient for an application like data speculation, which demands the knowledge of aliasing probabilities. In order to derive an aliasing probability, our approach expresses the solution set and the domain set in terms of sets of constraints and uses the integer linear programming methods in [Pug94, Taw94, Cla96] to derive the number of elements within these constraints. In Example 3 the constraints of the solution set are , e.g. i = 3 and j = 1. The constraints of the domain set are , which is the entire iteration space. The aliasing probability is the ratio of the cardinality of solution set and the ....
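The ratio described in this context, the number of points in the solution set divided by the number of points in the domain set, can be sketched by brute-force enumeration in Python. The constraints of Example 3 are missing from the extracted text, so the rectangular domain 1 <= i <= n, 1 <= j <= m and the aliasing condition i = j + 2 used below are hypothetical stand-ins chosen only so that (i, j) = (3, 1) is one solution. The cited methods count these points symbolically; this sketch enumerates them instead.

```python
from fractions import Fraction
from itertools import product
from typing import Callable


def aliasing_probability(n: int, m: int, aliases: Callable[[int, int], bool]) -> Fraction:
    """Return |solution set| / |domain set| over the assumed domain
    1 <= i <= n, 1 <= j <= m, where `aliases(i, j)` is a hypothetical
    predicate telling whether two references touch the same location."""
    domain = list(product(range(1, n + 1), range(1, m + 1)))
    if not domain:
        raise ValueError("empty iteration space: probability is undefined")
    solutions = [point for point in domain if aliases(*point)]
    return Fraction(len(solutions), len(domain))


if __name__ == "__main__":
    # Hypothetical aliasing condition: the references collide when i = j + 2.
    print(aliasing_probability(4, 4, lambda i, j: i == j + 2))
```

Using `Fraction` keeps the ratio exact, which matters when the result feeds a cost model that compares probabilities against a threshold.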
[Article contains additional citation context not shown here]
N. Tawbi, "Estimation of nested loop execution time by integer arithmetics in convex polyhedra," In Proc. of the 1994 Int'l Parallel Processing Symp., April 1994.
....and show how they are used to formulate the number of lattice points inside a union of convex polytopes. In section 4, we describe the algorithm used to derive these formulas. In section 5, the method presented here is illustrated with many examples and compared with related works [11, 27, 13, 14, 26, 21]. Many applications can make good use of a symbolic formula for the number of integral solutions to a system of parameterized constraints. In section 6, we will show example applications in which the results in this paper can be effectively used for the analysis and transformation of scientific ....
.... $[\frac{3}{4}]_n n^2 + [\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}]_n n + [0, -\frac{1}{4}, 0, -\frac{1}{4}, 0, -\frac{1}{4}]_n$ is reduced to $\frac{3}{4} n^2 + \frac{1}{2} n + [0, -\frac{1}{4}]_n$. 5 Examples and related work. All the examples considered in this section come from several related works [13, 14, 27, 26, 11, 21] and were also handled by W. Pugh in [21], except example 13, whose aim is to show the generality of our method. Example 8: M. Haghighat and C. Polychronopoulos present a method for volume computation in [13, 14]. Their first example is $\sum_{i=1}^{n} \sum_{j=3}^{i} \sum_{k=j}^{5} 1$. This sum defines a set of ....
[Article contains additional citation context not shown here]
N. Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. Proc. of the 1994 International Parallel Processing Symposium, April 1994.
....for the values of $W_i$ in (1) can be derived by considering the number of times that each part of the loop body is executed. This corresponds to the (complex) problem of enumerating the integer points of a polytope [1]. In the context of loop nests, some techniques to compute this are described in [5, 14, 17]. In general, the number of times, $n$, that a single statement surrounded by $m$ loops is executed is given by: $n = \sum_{i_1=l_1}^{u_1} \sum_{i_2=l_2}^{u_2} \cdots \sum_{i_m=l_m}^{u_m} 1$, (3) where $l_j$, $u_j$ are the lower and upper bounds, respectively, of the $j$th loop, $1 \le j \le m$. If, for every ....
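The nested sum in this context, one summation per loop with a body of 1, can be evaluated by direct enumeration when all bounds are known. The following Python sketch is an illustration under stated assumptions, not the method of the cited papers, which evaluate such sums symbolically. The helper name `count_iterations` and the convention that each bound is a function of the tuple of enclosing loop indices are assumptions made here.

```python
from typing import Callable, Sequence, Tuple

# A bound receives the indices of the enclosing loops, outermost first.
Bound = Callable[[Tuple[int, ...]], int]


def count_iterations(loops: Sequence[Tuple[Bound, Bound]],
                     outer: Tuple[int, ...] = ()) -> int:
    """Number of times the innermost statement executes.

    `loops` lists (lower, upper) bound functions from outermost to
    innermost; both bounds are inclusive. A loop whose lower bound
    exceeds its upper bound contributes zero iterations.
    """
    if not loops:
        return 1  # the statement itself: the innermost summand "1"
    (lower, upper), inner = loops[0], loops[1:]
    return sum(count_iterations(inner, outer + (i,))
               for i in range(lower(outer), upper(outer) + 1))


if __name__ == "__main__":
    # Triangular nest: for i = 1..4, for j = 1..i
    triangular = [(lambda o: 1, lambda o: 4),
                  (lambda o: 1, lambda o: o[0])]
    print(count_iterations(triangular))  # 1 + 2 + 3 + 4 = 10
```

Enumeration takes time proportional to the iteration count itself, which is the cost that closed-form or symbolic evaluation is meant to avoid.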
N. Tawbi. Estimation of Nested Loops Execution Time by Integer Arithmetic in Convex Polyhedra. In Proceedings of the 8th International Parallel Processing Symposium, IEEE Computer Society Press, 1994, pp. 217--221.
....we have produced simplified constraints in (overlapping) disjunctive normal form. The need to do this is explained in Section 4.5.1 and techniques to do it are described in Section 5. • Show the application of our techniques to a number of examples and compare our techniques with related work [FST91, TF92, HP93a, Taw94] (Section 6). 2 The Omega test. The Omega test [Pug92] was originally developed to check if a set of linear constraints had an integer solution, and was initially used in array data dependence testing. Since then, its capabilities and uses have grown substantially. In this section, we describe ....
....and lower bound on the sum. Only if these values are far apart may it be worthwhile to compute the exact answer. 4.1 Simple sums. There are fairly standard formulas for sums of powers of integers. These formulas are described in the CRC Standard Mathematical Tables [Bey81] and are reviewed in [TF92, Taw94]. For example, $(\Sigma\, i : 1 \le i \le n : i^2) = (\Sigma : 1 \le n : n(n+1)(2n+1)/6)$. Within our implementation, we expect it will be sufficient to hard code the formulas for $p$ up to 10. In each of these sums, the guard produced is $1 \le n$. 4.2 Basic sums. In this section, we concern ourselves with ....
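The guarded sum-of-squares formula quoted in this context, the sum of i^2 over 1 <= i <= n equal to n(n+1)(2n+1)/6 under the guard 1 <= n, can be checked against direct summation. The Python sketch below is illustrative only; the function names are assumptions, and treating a failed guard as the value 0 (an empty sum) is a modelling choice made here.

```python
def sum_of_squares_closed_form(n: int) -> int:
    """Closed form n(n+1)(2n+1)/6, applied only when the guard 1 <= n holds."""
    if n < 1:
        return 0  # guard fails: the sum ranges over no integers
    return n * (n + 1) * (2 * n + 1) // 6


def sum_of_squares_brute(n: int) -> int:
    """Direct summation of i*i for 1 <= i <= n."""
    return sum(i * i for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (0, 1, 5, 10):
        print(n, sum_of_squares_brute(n), sum_of_squares_closed_form(n))
```

The guard matters because the polynomial alone gives nonzero values for some n below 1 (for example n = -2), while the sum over an empty range is zero.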
[Article contains additional citation context not shown here]
Nadia Tawbi. Estimation of nested loop execution time by integer arithmetics in convex polyhedra. In Proc. of the 1994 International Parallel Processing Symposium, April 1994.
No context found.
N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994.
No context found.
N. Tawbi. Estimation of nested loop execution time by integer arithmetic in convex polyhedra. In Proc. of the 1994 International Parallel Processing Symposium, April 1994.
CiteSeer.IST - Copyright Penn State and NEC