Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A-277, CRI, École des mines de Paris, October 1995.
....and DFA based Communication Optimizations M.W. Hall et al. [36] present techniques for hoisting remappings out of loops and eliminating dead remappings. But they do not consider the more general problem of eliminating partially dead and partially redundant remappings. F. Coelho and C. Ancourt [15] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Optimal in their sense means that for a given remapping, a minimal number of messages is sent over the network. The problem of ....
....The communication sets of the basic cycle are used to efficiently perform the remapping of the whole array. E.T. Kalns and L.M. Ni [40] proposed a processor mapping technique to minimize the total amount of data that must be communicated during remapping. This issue has also been addressed in [15] (cf. Section 7.3). A. Wakatani and M. Wolfe [81, 82] proposed a strip mining approach for remapping instead of transferring the entire array at one time in order to reduce the communication overhead by latency hiding. The idea of message strip mining is to ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.
....analysis has been addressed by several researchers. Hall et al. [5] presented techniques for hoisting remappings out of loops and eliminating dead remappings. The more general problem of eliminating partially dead and partially redundant remappings has not been considered. Coelho et al. [2] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Optimal in their sense means that for a given remapping, a minimal number of messages is sent over the network. The problem of ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.
....for hoisting remappings out of loops and eliminating dead remappings. Their approach is incomparable to ours as it takes interprocedural information into account, but does not consider the more general problem of eliminating partially dead and partially redundant remappings. Coelho et al. [2] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Reducing the overall number of remappings by employing code motion is not addressed. Similarly, this holds for Ramaswamy et al. ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.
....varying power and efficiency allowing user customized solutions by trading efficiency against power and vice versa. Related work. Analyses aiming at reducing the number of executed remappings have been investigated previously. However, most of these approaches do not apply code motion at all (cf. [3, 15]) or only in quite a restricted manner preventing many beneficial optimizations (cf. [7]). This also holds for the partially redundant expression elimination (PREE) based approaches for communication optimization of e.g. [1], [6] and [8]. None of these approaches combines code hoisting (PRAE) ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR CRI A-277, Ecole des Mines de Paris, Centre de Recherche en Informatique, October 1995.
....stressed the importance of interprocedural compilation and presented techniques for hoisting remappings out of loops and eliminating dead remappings. Contrary to our approach, the more general problem of eliminating partially dead and partially redundant remappings is not considered. Coelho et al. [3] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. The elimination of useless remappings is based on a remapping graph which is presented at an intraprocedural level. Optimal in ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR CRI A-277, Ecole des Mines de Paris, Centre de Recherche en Informatique, October 1995.
....of processors in a row, P1; the number of processors in a column, P2, plus other parameters to determine, when a submatrix is used, where it is located in the global matrix. [Figure 1: The block cyclic data distribution of a 2D array on a 2 x 3 grid of processors. Panels: blocks owned by the processor [0,0]; grid of processors; block matrix.] Our dynamic approach implies that ....
Corinne Ancourt and Fabien Coelho. Optimal compilation of HPF remappings. Research Report A-277, Ecole des Mines de Paris, October 1995.
....for hoisting remappings out of loops and eliminating dead remappings. Their approach is incomparable to ours as it takes interprocedural information into account, but does not consider the more general problem of eliminating partially dead and partially redundant remappings. Coelho et al. [3] describe an optimization which reduces the communication amount by removing useless remappings and taking advantage of replications to shorten individual remappings. Optimal in their sense means that for a given remapping, a minimal number of messages is sent over the network. The problem of ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.
....that the performance after merging the nodes is better. This greedy approach ensures that reorganization, if any, occurs only in the less frequently executed parts of the program. Optimization of remappings in an HPF-like language is treated using a data flow approach in the prototype HPFC compiler [6]. In this approach, a remapping graph is built from the control flow graph. The HPF data remapping directives, realign and redistribute, that appear in the program, form the vertices of the remapping graph. An edge in the remapping graph denotes a possible path in the control flow graph where ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR A-277, Centre de Recherche en Informatique, Ecole des mines de Paris, 1995.
....to solve the dynamic data decomposition problem [1]. This greedy approach ensures that reorganization, if any, occurs only in the less frequently executed parts of the program. Optimization of remappings in an HPF-like language is treated using a data flow approach in the prototype HPFC compiler [4]. Data flow analysis is used to compute the reaching mapping for each node of the remapping graph. The work also mentions several potential remapping optimizations. Chatterjee et al. use a divide and conquer strategy to solve the problem of dynamic distribution [3]. Palermo and Banerjee have ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Technical Report TR A-277, Centre de Recherche en Informatique, Ecole des mines de Paris, 1995.
.... array statements have been extensively discussed recently in the literature [7, 9, 18, 19, 23, 25, 28, 29]. On the other hand, researchers considering the generation of communication sets for compiling array redistribution seldom take arbitrary affine alignment into consideration, such as [5, 10, 11, 20, 21, 24]. However, affine alignment will waste a lot of memory space if the alignment stride is non-unit. Such a wastage of memory usage is unacceptable for limited local spaces of processors on distributed memory multicomputers. Allocating spaces only for useful template cells is, therefore, of critical ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.
.... compiling array statements have been extensively discussed recently in the literature [7, 9, 18, 19, 22, 24, 27, 28]. On the other hand, researchers considering generating communication sets for compiling array redistribution seldom take arbitrary affine alignment into consideration, such as [5, 10, 11, 20, 21, 23]. However, affine alignment will waste a lot of memory space if the alignment stride is non-unit. Such a wastage of memory usage is unacceptable for limited local spaces of processors on distributed memory multicomputers. Allocating spaces only for useful template cells is, therefore, of critical ....
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38(2):229--236, November 1996.
No context found.
Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A-277, CRI, École des mines de Paris, October 1995.
....send data to each viewer. An additional constraint should be added to restrict the sets of communications to needed ones. Different techniques can be used to address this issue: 1) replication allows broadcasts and/or load balance, what is simply translated into linear constraints as described in [29]. 2) The affectation of owners to viewers can also be optimized in order to reduce the distance between communicating processors. For instance, the cost function could be the minimal Manhattan distance between p and p' or the lexicographically minimal vector p' - p if the ....
....less precise but faster program analysis [21, 13, 45] can also be used in place of the region analysis. Polyhedron based techniques are already implemented in hpfc, our prototype Hpf compiler [27], to deal with I/O communications in a host nodes model [28] and also to deal with dynamic remappings [29] (realign and redistribute directives). For instance, the code generation times for arbitrary remappings are in the 0.1-2s range. Future work includes the implementation of our scheme in hpfc, experiments, extensions to optimize sequential loops, to overlap communication and computation, and to ....
Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A-277, CRI, École des mines de Paris, October 1995.
....send data to each viewer. An additional constraint should be added to restrict the sets of communications to needed ones. Different techniques can be used to address this issue: 1) replication allows broadcasts and/or load balance, what is simply translated into linear constraints as described in [29]. 2) The affectation of owners to viewers can also be optimized in order to reduce the distance between communicating processors. For instance, the cost function could be the minimal Manhattan distance between p and p' or the lexicographically minimal vector p' - p if the ....
....less precise but faster program analysis [21, 13, 45] can also be used in place of the region analysis. Polyhedron based techniques are already implemented in hpfc, our prototype Hpf compiler [27], to deal with I/O communications in a host nodes model [28] and also to deal with dynamic remappings [29] (realign and redistribute directives). For instance, the code generation times for arbitrary remappings are in the 0.1-2s range. Future work includes the implementation of our scheme in hpfc, experiments, extensions to optimize sequential loops, to overlap communication and computation, and to ....
Fabien Coelho and Corinne Ancourt. Optimal Compilation of HPF Remappings. Technical Report A-277, CRI, École des mines de Paris, October 1995.
No context found.
F. Coelho and C. Ancourt. Optimal Compilation of HPF Remappings (Extended Abstract). Tech. Report CRI A-277, Centre de Recherche en Informatique, Ecole des mines de Paris, Fontainebleau, France, Nov. 1995.
No context found.
F. Coelho and C. Ancourt, "Optimal compilation of hpf remappings," Tech. Rep. A-277CRI, Centre de Recherche en Informatique, Fontainebleau, France, Oct. 1995.
No context found.
F. Coelho and C. Ancourt. Optimal compilation of HPF remappings. Journal of Parallel and Distributed Computing, 38:229--236, November 1996.