17 citations found.
Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A 277, CRI, École des mines de Paris, October 1995.

This paper is cited in the following contexts:
Optimization of Data Remapping in Data-Parallel Languages - Mehofer (1998)

....and DFA based Communication Optimizations M.W. Hall et al. [36] present techniques for hoisting remappings out of loops and eliminating dead remappings. But they do not consider the more general problem of eliminating partially dead and partially redundant remappings. F. Coelho and C. Ancourt [15] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Optimal in their sense means that for a given remapping, a minimal number of messages is sent over the network. The problem of ....

....The communication sets of the basic cycle are used to efficiently perform the remapping of the whole array. E.T. Kalns and L.M. Ni [40] proposed a processor mapping technique to minimize the total amount of data that must be communicated during remapping. This issue has also been addressed in [15] (cf. Section 7.3). A. Wakatani and M. Wolfe [81, 82] proposed a strip mining approach for remapping instead of transferring the entire array at one time in order to reduce the communication overhead by latency hiding. The idea of message strip mining is to ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.


Optimal Distribution Assignment Placement - Jens Knoop And (1997)

....analysis has been addressed by several researchers. Hall et al. [5] presented techniques for hoisting remappings out of loops and eliminating dead remappings. The more general problem of eliminating partially dead and partially redundant remappings has not been considered. Coelho et al. [2] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Optimal in their sense means that for a given remapping, a minimal number of messages is sent over the network. The problem of ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.


A Powerful and Flexible Approach for Distribution Assignment.. - Knoop, Mehofer

....for hoisting remappings out of loops and eliminating dead remappings. Their approach is incomparable to ours as it takes interprocedural information into account, but does not consider the more general problem of eliminating partially dead and partially redundant remappings. Coelho et al. [2] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Reducing the overall number of remappings by employing code motion is not addressed. Similarly, this holds for Ramaswamy et al. ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.


Interprocedural Distribution Assignment Placement: Analogies.. - Knoop, Mehofer (1997)   (1 citation)

....varying power and efficiency allowing user customized solutions by trading efficiency against power and vice versa. Related work. Analyses aiming at reducing the number of executed remappings have been investigated previously. However, most of these approaches do not apply code motion at all (cf. [3, 15]) or only in quite a restricted manner preventing many beneficial optimizations (cf. [7]). This also holds for the partially redundant expression elimination (PREE) based approaches for communication optimization of e.g. [1], [6] and [8]. None of these approaches combines code hoisting (PRAE) ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR CRI A-277, Ecole des Mines de Paris, Centre de Recherche en Informatique, October 1995.


Interprocedural Distribution Assignment Placement: More than.. - Knoop, Mehofer (1997)   (1 citation)

....stressed the importance of interprocedural compilation and presented techniques for hoisting remappings out of loops and eliminating dead remappings. Contrary to our approach, the more general problem of eliminating partially dead and partially redundant remappings is not considered. Coelho et al. [3] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. The elimination of useless remappings is based on a remapping graph which is presented at an intraprocedural level. Optimal in ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR CRI A-277, Ecole des Mines de Paris, Centre de Recherche en Informatique, October 1995.


Fast Runtime Block Cyclic Data Redistribution on.. - Prylli, Tourancheau (1997)   (6 citations)

....of processors in a row, P1; the number of processors in a column, P2, plus other parameters to determine, when a sub matrix is used, where it is located in the global matrix. [Figure 1: The block cyclic data distribution of a 2D array on a 2 × 3 grid of processors. Panel labels: Block Matrix; Grid of Processors; Blocks owned by the processor [0,0].] Our dynamic approach implies that ....

Corinne Ancourt and Fabien Coelho. Optimal compilation of HPF remappings. Research Report A-277, Ecole des Mines de Paris, October 1995.
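
Illustration (not part of the cited paper): the block cyclic layout named in the Figure 1 caption above can be sketched as follows. This is a minimal sketch assuming the usual rule that block (i, j) is assigned to processor (i mod rows, j mod cols) of the processor grid. The function names and the 4 × 6 block matrix size are hypothetical choices for the example.

```python
def block_owner(block_row, block_col, grid_rows, grid_cols):
    """Return the grid coordinates of the processor owning a block.

    Assumes the standard block cyclic rule: blocks are dealt out
    round-robin along each dimension of the processor grid.
    """
    return (block_row % grid_rows, block_col % grid_cols)


def blocks_owned(proc, n_block_rows, n_block_cols, grid_rows, grid_cols):
    """List every block (i, j) of the block matrix owned by processor `proc`."""
    return [
        (i, j)
        for i in range(n_block_rows)
        for j in range(n_block_cols)
        if block_owner(i, j, grid_rows, grid_cols) == proc
    ]


if __name__ == "__main__":
    # Hypothetical 4 x 6 block matrix on a 2 x 3 processor grid.
    print(blocks_owned((0, 0), 4, 6, 2, 3))
```

Under these assumptions, processor [0,0] owns every block whose row index is a multiple of 2 and whose column index is a multiple of 3.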


Distribution Assignment Placement: A New Aggressive Approach.. - Knoop, Mehofer (1997)

....for hoisting remappings out of loops and eliminating dead remappings. Their approach is incomparable to ours as it takes interprocedural information into account, but does not consider the more general problem of eliminating partially dead and partially redundant remappings. Coelho et al. [3] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Optimal in their sense means that for a given remapping, a minimal number of messages is sent over the network. The problem of ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.


An Interprocedural Framework for Determining Efficient.. - Gupta, Krishnamurthy (1996)   (1 citation)

....that the performance after merging the nodes is better. This greedy approach ensures that reorganization, if any, occurs only in the less frequently executed parts of the program. Optimization of remappings in an HPF-like language is treated using a data flow approach in the prototype HPFC compiler [6]. In this approach, a remapping graph is built from the control flow graph. The HPF data remapping directives, realign and redistribute, that appear in the program, form the vertices of the remapping graph. An edge in the remapping graph denotes a possible path in the control flow graph where ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR A-277, Centre de Recherche en Informatique, Ecole des mines de Paris, 1995.


An Interprocedural Framework for Determining Efficient.. - Gupta, Krishnamurthy (1996)   (1 citation)

....to solve the dynamic data decomposition problem [1]. This greedy approach ensures that reorganization, if any, occurs only in the less frequently executed parts of the program. Optimization of remappings in an HPF-like language is treated using a data flow approach in the prototype HPFC compiler [4]. Data flow analysis is used to compute the reaching mapping for each node of the remapping graph. The work also mentions several potential remapping optimizations. Chatterjee et al. use a divide and conquer strategy to solve the problem of dynamic distribution [3]. Palermo and Banerjee have ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR A-277, Centre de Recherche en Informatique, Ecole des mines de Paris, 1995.


Efficient Index Generation for Compiling Two-Level Mappings.. - Shih, Sheu, Huang

.... array statements have been extensively discussed recently in the literature [7, 9, 18, 19, 23, 25, 28, 29]. On the other hand, researchers in considering the generation of communication sets for compiling array redistribution seldom take arbitrary affine alignment into consideration, such as [5, 10, 11, 20, 21, 24]. However, affine alignment will waste a lot of memory space if the alignment stride is non unit. Such a wastage of memory usage is unacceptable for limited local spaces of processors on distributed memory multicomputers. Allocating spaces only for useful template cells is, therefore, of critical ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.


Table-Lookup Approach for Compiling Two-Level Data-Processor.. - Kuei-Ping Shih (1997)

.... compiling array statements have been extensively discussed recently in the literature [7, 9, 18, 19, 22, 24, 27, 28]. On the other hand, researchers in considering generating communication sets for compiling array redistribution seldom take arbitrary affine alignment into consideration, such as [5, 10, 11, 20, 21, 23]. However, affine alignment will waste a lot of memory space if the alignment stride is non unit. Such a wastage of memory usage is unacceptable for limited local spaces of processors on distributed memory multicomputers. Allocating spaces only for useful template cells is, therefore, of critical ....

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.


HPF Design Issues - Report Ensmp Cri   Self-citation (Coelho)

No context found.

Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A 277, CRI, École des mines de Paris, October 1995.


A Linear Algebra Framework for Static HPF Code.. - Ancourt, Coelho.. (1995)   (63 citations)  Self-citation (Coelho Ancourt)

....send data to each viewer. An additional constraint should be added to restrict the sets of communications to needed ones. Different techniques can be used to address this issue: 1) replication allows broadcasts and/or load balance, what is simply translated into linear constraints as described in [29]. 2) The affectation of owners to viewers can also be optimized in order to reduce the distance between communicating processors. For instance, the cost function could be the minimal Manhattan distance between p and p′ or the lexicographically minimal vector p′ − p if the ....

....less precise but faster program analysis [21, 13, 45] can also be used in place of the region analysis. Polyhedron based techniques are already implemented in hpfc, our prototype HPF compiler [27], to deal with I/O communications in a host nodes model [28] and also to deal with dynamic remappings [29] (realign and redistribute directives). For instance, the code generation times for arbitrary remappings are in the 0.1-2 s range. Future work includes the implementation of our scheme in hpfc, experiments, extensions to optimize sequential loops, to overlap communication and computation, and to ....

Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A 277, CRI, École des mines de Paris, October 1995.
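
Illustration (not part of the cited paper): the two owner-selection cost functions mentioned in the first context above can be sketched as follows. This is only a sketch under stated assumptions: processors are modeled as integer coordinate tuples, a viewer chooses among several owners that hold a replica, and the difference vector is taken as owner minus viewer. The function names and that sign convention are hypothetical, since the quoted context does not say which of p and p′ is the owner.

```python
def manhattan(p, q):
    """Manhattan (L1) distance between two processor coordinate tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def pick_owner_by_distance(viewer, owners):
    """Choose the owner closest to the viewer in Manhattan distance.

    Ties are broken by the order of `owners` (first minimum wins).
    """
    return min(owners, key=lambda owner: manhattan(owner, viewer))


def pick_owner_by_lex_vector(viewer, owners):
    """Choose the owner whose difference vector (owner - viewer) is
    lexicographically smallest. The sign convention is an assumption."""
    return min(
        owners,
        key=lambda owner: tuple(o - v for o, v in zip(owner, viewer)),
    )


if __name__ == "__main__":
    replicas = [(1, 2), (0, 1), (2, 0)]
    print(pick_owner_by_distance((0, 0), replicas))
    print(pick_owner_by_lex_vector((1, 1), replicas))
```

The two criteria can select different owners for the same viewer, which is why the quoted context presents them as alternatives.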


A Linear Algebra Framework for Static HPF Code.. - Ancourt, Coelho.. (1995)   (63 citations)  Self-citation (Coelho Ancourt)

....send data to each viewer. An additional constraint should be added to restrict the sets of communications to needed ones. Different techniques can be used to address this issue: 1) replication allows broadcasts and/or load balance, what is simply translated into linear constraints as described in [29]. 2) The affectation of owners to viewers can also be optimized in order to reduce the distance between communicating processors. For instance, the cost function could be the minimal Manhattan distance between p and p′ or the lexicographically minimal vector p′ − p if the ....

....less precise but faster program analysis [21, 13, 45] can also be used in place of the region analysis. Polyhedron based techniques are already implemented in hpfc, our prototype HPF compiler [27], to deal with I/O communications in a host nodes model [28] and also to deal with dynamic remappings [29] (realign and redistribute directives). For instance, the code generation times for arbitrary remappings are in the 0.1-2 s range. Future work includes the implementation of our scheme in hpfc, experiments, extensions to optimize sequential loops, to overlap communication and computation, and to ....

Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A 277, CRI, École des mines de Paris, October 1995.


Interprocedural Array Redistribution Data-Flow Analysis - Palermo, IV, Banerjee (1996)   (11 citations)

No context found.

F. Coelho and C. Ancourt. Optimal Compilation of HPF Remappings (Extended Abstract). Tech. Report CRI A-277, Centre de Recherche en Informatique, Ecole des mines de Paris, Fontainebleau, France, Nov. 1995.


Simultaneous Exploitation of Task and Data Parallelism in.. - Ramaswamy (1996)   (16 citations)

No context found.

F. Coelho and C. Ancourt, "Optimal compilation of hpf remappings," Tech. Rep. A-277CRI, Centre de Recherche en Informatique, Fontainebleau, France, Oct. 1995.


On Compiling Block-Cyclic Data Redistribution - Kuei-Ping Shih (1997)

No context found.

F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38:229--236, November 1996.