M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In Euro-Par95, Stockholm, Sweden, August 1995.

This paper is cited in the following contexts:
Parallelization via Constrained Storage Mapping Optimization - Cohen (1999)   (7 citations)

....We are assured that with these new data, the original program semantics will be preserved in the parallel version. Eventually, there are several methods to compute φ functions at run time [4, 5]. They all rely on an array Last of operations, unifying arrays from [12] and Last functions from [10]; see [5] for details. 6 Conclusion and Perspectives. Expanding data structures is a classical optimization to cut memory-based dependences. The questions are (1) What is the good expansion for my favorite program and architecture? (2) What is the good parallel loop reordering algorithm? We ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EURO-PAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


A Structured Synchronization and Communication Model.. - Melin, Raffin.. (1997)

....transformations [7]. They are based on static analysis of the iteration domain of loop nests, the set of the operations and of their data flow dependences. When programs involve unpredictable control structures, Griebl and Collard propose to generate code able to scan such domains at run time [9]. Our data driven parallelization allows us to express the parallel program independently from mapping and scheduling. 6 Example: Cholesky factorization. In this section we present an example of parallelization of the classical Cholesky factorization algorithm using our translation function. In ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EURO-PAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


An Automatic Distribution of Sequential Code Fitting.. - Melin, Raffin.. (1997)

.... 12] realize a fine dataflow analysis of the sequential program to identify the sequences of operations that can be executed simultaneously: the waves [11]. When programs involve unpredictable control structures, Griebl and Collard propose to generate code able to scan such domains at run time [8]. Then, they generate programs in a data parallel language like HPF thanks to a synchronous sequence of forall loops. This involves loop nest transformations using basis change and convex polyhedron machinery [6]. Unfortunately, despite a fine analysis, some useless synchronizations between ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EURO-PAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


Exploiting Monotone Convergence Functions in Parallel Programs - Pugh, Rosser, Shpeisman (1996)

....and barrier, which can impose substantial performance penalties on some systems. In a few cases, such as when a natural ordering is used in a relaxation algorithm, the inner loops carry dependencies and cannot be run in parallel. To exploit parallelism in these loops, a number of researchers [1,4,5,2,7,3] have proposed speculative execution: a wavefront technique is used to execute the program in parallel, despite the fact that all loops carry dependences. Since this ignores the termination condition of the while loop, iterations of the while loop are executed speculatively until each iteration ....

....is necessary to determine if the computation has converged. Our technique also allows us to provide efficient doacross pipelined parallelism when the body of a while loop contains cross processor dependencies. We believe the technique we propose is more practical than speculative execution [1,4,5,2,7,3]. In a language like HPF, the transformation we describe has to be performed by the compiler; there is no way for the user to express a reduction over local data and make a decision based on that. In the experiments we performed, for most of the computation, local data alone was sufficient to ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In N.N., editor, EuroPar 95, Lecture Notes in Computer Science. Springer-Verlag, 1995.


Optimization of Storage Mappings for Parallel Programs - Cohen, Lefebvre (1998)   (6 citations)

....as follows: For every read access u such that σ(u) ≠ and for all statement S in Stmt(σ(u) insert statement Last[f(v) max_≺ (Last[f(v) v) immediately after S, where v is an instance of S. This concept unifies arrays from array static single assignment (SSA) [15] and Last functions from [13]; see [5] for comparison between SA and array SSA. Then, it is easy to see that Last[f(u) holds the last executed source, and that the φ function which computes the value produced by this source is: φ(σ(u) D Stmt(Last[f(u) Index(Last[f(u) mod E Last[f(u) Clearly, this ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EURO-PAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


Maximal Static Expansion - Denis Barthou Albert (1998)   (9 citations)  Self-citation (Collard)

No context found.

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In Euro-Par95, Stockholm, Sweden, August 1995.


The Advantages of Instance-wise Reaching Definition Analyses in.. - Collard (1998)   Self-citation (Collard)

....are not bounded at compile time, the data structures D S we allocate are not bounded either; we thus have to allocate them dynamically, or to tile the iteration space. 6 Related Work. Other Work on Array (S)SA. Our method to convert programs with arrays to single assignment form, adapted from [16, 17], uses the result of a reaching definition analysis on arrays to get the SA intermediate representation. Recent work by Knobe and Sarkar [18], on the other hand, does not make this separation. So, is this an important issue? We believe the answer is yes. Cutting the conversion to (S)SA into two ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EUROPAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


Classifying Loops for Space-Time Mapping - Griebl, Lengauer (1996)   Self-citation (Griebl)

....target loop bounds can be determined which are functions in the outer target loop indices (if any) and structure parameters only. This is feasible for loops of Class 4 with Fourier Motzkin and for while loops (Classes 0 and 1) since their while dependences allow termination detection at run time [10, 12]. Thus, these classes would not change. Classes 2 and 3, however, would in the alternative classification be divided orthogonally: in both classes there are loop nests for which target loop bounds can be found at compile time (e.g. any source loop nest under scannable transformations) and loop ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EUROPAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


Classifying Loops for Space-Time Mapping - Griebl, Lengauer (1996)   Self-citation (Griebl)

.... on the outer, sequential loop without knowledge of the maximal extent of some inner loop in general, there need not be a scannable transformation in the synchronous case [16]. Target code enhancements which deal with this typical problem in both shared and distributed memory systems are given in [4, 12]. One can also employ a speculative approach as described in [3]. However, these complex schemes are not necessary for all classes of loops. For Class 4, the computation space is known at compile time to be a polytope. Therefore, code enumerating the target space can be generated easily with ....

J.-F. Collard and M. Griebl. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EURO-PAR '95 Parallel Processing, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, August 1995.


Maximal Static Expansion - Barthou, Cohen, Collard (1998)   (9 citations)  Self-citation (Collard)

....of (2) yields the program in Fig. 2. While the right hand side of S only depends on w, the right hand side of R depends on the control flow, thus needing a function similar to a φ function in the SSA framework (even if, on this introductory example, the φ function would be very simple) [8]. for i = 1 to N do T x [i] while . do S x [i,w] if w 0 then x [i,w 1] else x [i] end while R . φ(hi; T i; fhi; w; Si : w 0g) end for Figure 2: First example, continued. The aim of this paper is to expand x as much as possible in this program but without ....

....array data flow analyses are approximate [7, 10, 11]. Several writes may be the unique definition of a given value, but the analysis cannot tell. From the result of such an analysis, how to obtain a single assignment program at the price of dynamic restoration of data flows had been described in [8]. 3 Static Expansion. Let Ω be the set of all operations in the program, f the function mapping operations to memory cells they write into, and W ⊆ Ω be the set of all writes. Let f′ be the expansion, that is the new function, after program transformation, mapping operations to the ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In Euro-Par95, Stockholm, Sweden, August 1995.


Maximal Static Expansion - Barthou, Cohen, Collard (1998)   (9 citations)  Self-citation (Collard)

....of (2) yields the program in Figure 2. While the right hand side of S only depends on w, the right hand side of R depends on the control flow, thus needing a function similar to a φ function in the SSA framework (even if, on this introductory example, the φ function would be very simple) [9]. 1 For instance, Horn and Schunck's algorithm to perform 3D Gaussian smoothing by separable convolution. for i = 1 to N do T x [i] while . do S x [i,w] if w 0 then x [i,w 1] else x [i] end while R . φ(hi; T i; fhi; w;Si : w 0g) end for Figure 2: First ....

....coined the term static control programs for this class of programs. In the case of programs with general control and unrestricted arrays subscripts, array data flow analyses are approximate [3, 2, 16, 17]. Several writes may be the unique definition of a given value, but the analysis cannot tell. [9] describes how to obtain a single assignment program at the price of dynamic restoration of data flow. 3 real A[1. 4 N 1,1. 2] for i = 1 to 2 N do for j = 1 to 2 N do fexpansion of statement Sg if 2 N 1 = i j = N then if . then S1 A[i j 2 N,1] A[i j 2 N,1] end if elsif N 1 ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In Euro-Par95, Stockholm, Sweden, August 1995.


The Interplay of Expansion and Scheduling in PAF - Feautrier, Collard.. (1998)   (6 citations)  Self-citation (Collard)

....the array cell under study is not modified by S. A coherent way of thinking about ⊥ is to consider it as the name of an operation which is executed once before all other operations of the program: ∀u ∈ Ω, ⊥ ≺ u. In the following, ⊥ will be used to denote, also, an undefined vector. [Figure 1: diagram of the parallelization phases (conversion to SA, scheduling, folding, code generation) with section and implementation-status labels] ....

....denote, also, an undefined vector. [Figure 1: Parallelization framework for unrestricted loop nests over arrays. Diagram labels: ADA or FADA, conversion to SA, scheduling, folding, code generation.] Figure 1 describes the parallelization process for this program model. The first phase consists in an element wise array data flow analysis ....

[Article contains additional citation context not shown here]

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, EURO-PAR '95, Lecture Notes in Computer Science 966, pages 315--326. Springer-Verlag, 1995.


Fuzzy Array Dataflow Analysis - Barthou, Collard, Feautrier (1997)   (22 citations)  Self-citation (Collard)

....definitions. Even though their method does not handle multi dimensional arrays and gives only maximal distances, a fuzzy array dataflow analysis along their lines may be an interesting alternative to this paper. Applications of FADA to automatic parallelization include static scheduling [9], array privatization and register allocation [5]. As a concluding remark, note that a ⊥ in a source set points to a possible programming error. Beyond automatic parallelization, a fuzzy array dataflow analysis may therefore be a general tool for translators, compilers and program checkers, as ....

M. Griebl and J.-F. Collard. Generation of synchronous code for automatic parallelization of while loops. In S. Haridi, K. Ali, and P. Magnusson, editors, Euro-Par95, volume 966 of LNCS, pages 315--326, Stockholm, Sweden, August 1995. Springer Verlag.
