A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.

This paper is cited in the following contexts:
Parallelization via Constrained Storage Mapping Optimization - Cohen (1999)   (7 citations)

....One needs an expression of the partial order that does not grow with problem size. Additional constraints on the expression of partial orders are: Have a high expressive power; Be easily found and manipulated; Allow optimized code generation. A suitable solution is to use scheduling functions [8, 9] from operations to integers (or vectors of integers in the case of multidimensional schedules). It is straightforward to compute the parallel execution order associated to a given scheduling function (the contrary is not true): $u \prec' v \iff \theta(u) < \theta(v)$. Another solution is tiling [11, 3] ....

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


Reuse-Driven Tiling for Improving Data Locality - Xue, Huang (1998)   (1 citation)

....Darte and Vivien's algorithm for finding canonical transformations is implemented. The reuse framework based on vector spaces is due to Wolf and Lam [12]. The relevance of cones to solving compiler problems is being increasingly recognised. Successful applications are dependence abstraction [4, 18], loop scheduling [3] and tiling for parallelism [1, 3, 10, 16]. One more application considered in this paper is tiling for data locality. The concept of fully permutable loop nests introduced in [13] has emerged to be useful. Initially in [13] the concept is related to the maximal degree of ....

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. In Proc. of the 1996 International Conference on Parallel Architectures and Compilation Techniques, pages 281--291, Boston, MA, 1996.


Maximal Static Expansion - Barthou, Cohen, Collard (1998)   (9 citations)

....[Figure 5: Maximal static expansion for the second example.] One may apply classical scheduling algorithms [15, 13] (possibly combined with some tiling of the iteration space [17, 7]) to the expanded program. One possible solution would be to execute in a single parallel front all instances $\langle S, i, j \rangle$ and $\langle R, i, j \rangle$ such that i j is equal to some constant t. The resulting program has the same degree of ....

.... on classical algorithms to compute parallel order $\prec'$ from the dependence graph associated with $\delta(\prec, f'_e)$ (cf. Theorem 2). In particular, when relation $\delta(\prec, f'_e)$ is affine, i.e. involves only affine inequalities over loop counters and symbolic constants, scheduling [15, 13] algorithms can be applied. With some additional hypotheses on the original program (such as being a perfect nest of loops), tiling [17, 7] algorithms also apply. Any parallel order $\prec'$ (over operations) must satisfy dependence relation $\delta(\prec, f'_e)$ (over accesses) $\forall e, \forall (o_1, r_1$ ....

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


Code Generation in the Polytope Model - Griebl, Lengauer, Wetzel (1998)   (8 citations)

....of this paper. 1. Introduction. In recent years, methods for automatic parallelization of nested loops based on a mathematical model, the polytope model [9, 12], have been improved significantly. The focus has been on identifying good schedules, i.e. distributions of computations in time, e.g. [6, 8], and allocations, i.e. distributions of computations in space, e.g. [5, 14]. Thus, the space-time mapping, i.e. the combination of schedule and allocation, derived by state-of-the-art techniques often describes a very efficient parallel execution of the source loop nest. In contrast, code ....

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. J. Parallel Programming, 25(6):447--496, Dec. 1997.


The Interplay of Expansion and Scheduling in PAF - Feautrier, Collard.. (1998)   (6 citations)

....future work. In the rest of this paper, schedules are supposed to be non speculative. 5.4 Related Work The scheduling problem has been widely studied since the first Kennedy and Allen algorithm. It is not the purpose of this paper to compare these algorithms, the interested reader may refer to [20, 29, 25] for details. 6 The Interplay of Array Expansion and Scheduling The previous section presented a method to express the parallelism in a dependence graph. It computes a schedule satisfying the partial order given by the dependence graph. Moreover, memory expansion aims at deleting dependences due ....

....SUIF is, in that respect, more advanced, and features an array data flow analysis and affine mappings. Several automatic parallelizing compilers rely on the polytope model. The Loopo parallelizer [37], developed at the University of Passau, implements the scheduling algorithms proposed in [28] and [20]. Moreover, Loopo is an excellent platform to compare scheduling techniques. The Omega project [43] is very similar in spirit to PAF. The findings of Pugh and Wonnacott on data flow analysis of general programs [57] are very close to ours, even though both results were achieved independently. ....

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


Optimization of Storage Mappings for Parallel Programs - Cohen, Lefebvre (1998)   (6 citations)

.... Parallel execution order. We can rely on classical algorithms to compute parallel order $\prec'$ from the dependence graph associated with $\sigma$ (cf. Theorem 1). In particular, when relation $\sigma$ is affine, i.e. involves only affine inequalities over loop counters and symbolic constants, scheduling [8, 11] algorithms can be applied. With some additional hypotheses on the original program (such as being a perfect nest of loops), tiling [14, 4] algorithms also apply. This issue is studied in more detail in Section 5.1. Second step: Reducing memory usage. Take single assignment as storage mapping, ....

....of the partial order that does not grow with problem size, i.e. a closed form. Additional constraints on the expression of partial orders are: Have a high expressive power; Be easily found and manipulated; Allow optimized code generation. A suitable solution is to use scheduling functions [8, 10] from operations to integers (or vectors of integers in the case of multidimensional schedules [11]). It is straightforward to compute the parallel execution order associated to a given scheduling function (the contrary is not true): $u \prec' v \iff \theta(u) < \theta(v)$. Another solution is tiling which ....

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


On the optimality of Feautrier's scheduling algorithm - Vivien (2002)   (1 citation)  Self-citation (Vivien)

No context found.

A. Darte and F. Vivien. Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs. Int. J. of Parallel Programming, 1997.


On the Optimality of Feautrier's Scheduling Algorithm - Vivien (2002)   (1 citation)  Self-citation (Vivien)

....the use of powerful scheduling algorithms. In the field of dense matrix code parallelization, lots of algorithms have been proposed along the years. Among the main ones, we have the algorithms proposed by Lamport [10], Allen and Kennedy [2], Wolf and Lam [14], Feautrier [7, 8], and Darte and Vivien [5]. This collection of algorithms spans a large domain of techniques (loop distribution, unimodular transformations, linear programming, etc.) and a large domain of dependence representations (dependence levels, direction vectors, affine dependences, dependence polyhedra). One may wonder which ....

.... the representation of the dependences it handles; 3) that is contained in the program to be parallelized (not taking into account the dependence representation used nor the transformations allowed). For example, Allen, Callahan, and Kennedy use the first definition [1], Darte and Vivien the second [5], and Feautrier the third [8]. We now recall that Feautrier is not optimal under any of the last two definitions. This distribution is rather esthetic as the exact same result can be achieved without using it. This distribution is intuitive and eases the computations. 5 The Classical ....

[Article contains additional citation context not shown here]

A. Darte and F. Vivien. Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs. Int. J. of Parallel Programming, 1997.


Optimal Fine and Medium Grain Parallelism Detection in.. - Darte, Vivien (1997)   (12 citations)  Self-citation (Darte Vivien)

....lemma and corollary: Lemma 1. A basic path $\Pi_u$ of $G_u$ corresponds to a unique polyhedral edge $e$ of $G_o$ and the total weight of $\Pi_u$ is a vector that belongs to the polyhedron $P(e)$ associated to $e$. Proof: The proof is straightforward, by construction of $G_u$. A detailed proof is given in [17]. □ Corollary 1. A path $\Pi_u$ of $G_u$, from an actual node to an actual node, defines an equivalent dependence path $\Pi_o$ in $G_o$: each basic sub-path of $\Pi_u$ corresponds exactly to a polyhedral edge $e$ of $G_o$ whose dependence polyhedron contains the weight of the basic sub-path. 5.1.2 From $G_o$ to ....

....just to decompose each polyhedral edge on the vertices, rays and lines of the corresponding polyhedron. If some components are not integral, we multiply (i.e. we use several times the cycle) all the components by a suitably large integer so that they all become integral. Details can be found in [17]. □ 5.1.3 Computability condition. In the case of bounded iteration domains, a reduced dependence graph is computable if and only if the apparent dependence graph it describes is acyclic, i.e. if there is no dependence path from an instance of a statement to the same instance of the same ....

[Article contains additional citation context not shown here]

Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Technical Report 96-06, LIP, ENS-Lyon, France, April 1996.


Parallelizing Nested Loops With Approximations Of Distance.. - Darte, Vivien (1997)   (3 citations)  Self-citation (Darte Vivien)

....have the following optimality result: Property 3. Algorithm Darte-Vivien is optimal among all parallelism detection algorithms whose input is a graph whose edges are labeled by a polyhedral representation of distance vectors. Proof. We just give the scheme of the proof. All details are provided in [10]. As Darte-Vivien is recursive, one can associate to each statement $S$ the number $d_S$ of recursive steps needed to satisfy all dependences concerning $S$. In the parallelized code, statement $S$ will be surrounded by $d_S$ sequential loops. Furthermore, for a loop nest whose iteration domain contains ....

Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Technical Report 96-06, LIP, ENS-Lyon, France, April 1996.


Parallelizing Nested Loops With Approximations Of Distance.. - Darte, Vivien (1997)   (3 citations)  Self-citation (Darte Vivien)

....parallelization, but also, among others, for general task scheduling and software pipelining. This paper studies the three main parallelism detection algorithms that work with a description of distance vectors, the algorithms of Allen and Kennedy [1], Wolf and Lam [20], and Darte and Vivien [9]. These algorithms seem very different not only by the techniques they use (computations of strongly connected components, computations of unimodular matrices, and resolution of linear programs, respectively) but also by the description of dependences they work with (approximation of distance ....

....and not with an approximation. In this sense, it is more powerful. However, discussing its optimality is much more complicated. See Section 3.4 for some hints. .... correspond to a sequential loop, the missing $(n - d)$ dimensions will correspond to DOALL loops (Feautrier [13], Darte and Vivien [9]). Here, we do not discuss the rewriting process needed to obtain some loop nests from these three transformation schemes (see [20, 22, 6, 5]), but we discuss the link between the loop transformations (the output) and the dependence representation (the input). We want to characterize, for a given ....

[Article contains additional citation context not shown here]

Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. In Proceedings of PACT'96, Boston, MA, October 1996. IEEE Computer Society Press.


Loop Parallelization Algorithms - Darte, Robert, Vivien (2001)   (1 citation)  Self-citation (Darte Vivien)

....decompose $G'$ into its strongly connected components $G_i$ and call DARTE-VIVIEN($G_i$, k 1) for each subgraph $G_i$ that has at least one actual node. Remarks: Step (2) is necessary only for general PRDGs: for example, it could be removed for RDGs labeled by direction vectors (for details see [16]). In this case, the resolution of a single linear program can simultaneously solve Step (1) and Step (3). In Step (3), we do not specify, on purpose, how the vector $X$ and the constants $\rho$ are selected, so as to allow various selection criteria. For example, a maximal set of linearly ....

....of the linear programs 6.1 and 6.2. We have outlined the main ideas of algorithm Darte-Vivien [15]. Some technical modifications are needed to distinguish between virtual and actual nodes, and to take into account the nature of the edges (vertices, rays or lines of a dependence polyhedron); see [16] for full details. 6.7 Power and limitations. Now that we have a multi-dimensional schedule $T$, we can prove its optimality in terms of degree of parallelism. We can show [14, 16] that for each statement $S$ (i.e. for each node of $G_o$) the number of instances of $S$ that have been sequentialized by ....

[Article contains additional citation context not shown here]

Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Technical Report 96-06, LIP, ENS-Lyon, France, April 1996.


Loop Parallelization Algorithms - Darte, Robert, Vivien (2001)   (1 citation)  Self-citation (Darte Vivien)

....loop transformations, either ad hoc transformations such as Banerjee's algorithm [3], or generated automatically such as Wolf and Lam's algorithm [31]. 3. Schedules, either mono-dimensional schedules [10, 12, 19] (a particular case being the hyperplane method [26]) or multi-dimensional schedules [15, 20]. These loop parallelization algorithms are very different for a number of reasons. First, they make use of various mathematical techniques: graph algorithms for (1), matrix computations for (2), and linear programming for (3). Second, they take ....

....generate the lineality space of $\Gamma$ and $X$ is a vector of the relative interior of $\Gamma$. However, there is no need to build $\Gamma$ effectively to build $G'$. This is one of the interests of the linear programs 6.1 and 6.2. We have outlined the main ideas of algorithm Darte-Vivien [15]. Some technical modifications are needed to distinguish between virtual and actual nodes, and to take into account the nature of the edges (vertices, rays or lines of a dependence polyhedron); see [16] for full details. 6.7 Power and limitations. Now that we have a multi-dimensional schedule $T$, ....

Alain Darte and Frédéric Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. In Proceedings of PACT'96, Boston, MA, October 1996. IEEE Computer Society Press.


Storage Mapping Optimization for Parallel Programs - Albert Cohen

No context found.

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


Maximal Static Expansion - Barthou, Cohen, Collard (2000)   (9 citations)

No context found.

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


Parallelization via Constrained Storage Mapping Optimization - Albert Cohen Prism (1999)   (7 citations)

No context found.

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. Journal of Parallel Programming, 25(6):447--496, December 1997.


Multi-dimensional Incremental Loop Fusion for Data.. - Verdoolaege.. (2003)   (2 citations)

No context found.

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. International Journal of Parallel Programming, 25(6):447--496, 1997.


On Tiling as a Loop Transformation - Xue (1997)   (19 citations)

No context found.

A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. In Proc. of the 1996 International Conference on Parallel Architectures and Compilation Techniques, pages 281--291, Boston, MA, 1996.
