68 citations found.
C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static hpf code distribution. In Fourth Workshop on Compilers for Parallel Computers, CPC'93, Delft, Pays-Bas, 1993.

This paper is cited in the following contexts:

Generating Local Addresses and Communication Sets.. - Chatterjee.. (1995)   (76 citations)  (Correct)

....schemes can be extended to two level mappings, but neither paper gives explicit formulas for handling this case. Stichnoth's communication set generation algorithm trades memory locality for buffer space: it reuses a single buffer but makes multiple passes over the array data. Ancourt et al. [1] use a linear algebraic approach based on the Hermite Normal Form to generate code for such programs in the general case. They provide a scheme that requires a small amount of additional storage but still results in constant stride loops. Once again, their scheme uses a doubly nested loop for each ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Workshop on Compilers for Parallel Computers, Delft, Dec. 1993.


Algorithms for Automatic Alignment of Arrays - Chatterjee, Gilbert, Oliker.. (1996)   (3 citations)  (Correct)

....Our theory makes these forms of communication residual rather than intrinsic, and thus encompasses such optimizations [7]. 1.1 A formal model of data layout. Alignment maps an array to a template with an affine mapping. The array coordinate $a \in \mathbb{Z}^d$ maps thus to the template coordinate $t \in \mathbb{Z}^t$ [1]: $Rt = La + f$ (1). The matrix L encodes the orientation and spacing of array elements in the template, the column vector f encodes the offset of the array in the template, and the projection matrix R encodes the template axes along which the array is replicated. We constrain the matrix L to have ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Workshop on Compilers for Parallel Computers, Delft, The Netherlands, Dec. 1993.
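
The affine alignment quoted in the context above can be illustrated with a small sketch. It assumes the relation reads Rt = La + f and that no axis is replicated, so R is the identity and t = La + f directly. The function name `align` and the sample matrices are hypothetical and are not taken from either paper.

```python
def align(L, a, f):
    """Return the template coordinate t = L*a + f.

    L is an integer matrix given as a list of rows, a is the array
    coordinate, and f is the offset vector. Replication (the matrix R
    in the quoted model) is assumed absent, i.e. R is the identity.
    """
    return [sum(l_ij * a_j for l_ij, a_j in zip(row, a)) + f_i
            for row, f_i in zip(L, f)]


# Stride 2 along the first template axis, offset 1: element (3, 4) -> (7, 4).
stride_example = align([[2, 0], [0, 1]], [3, 4], [1, 0])

# A transposing alignment swaps the two axes: element (3, 4) -> (4, 3).
transpose_example = align([[0, 1], [1, 0]], [3, 4], [0, 0])
```

In this reading, the diagonal entries of L carry the spacing, off-diagonal entries carry axis permutations, and f carries the offset.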


A Semantic Framework To Address Data Locality in Data Parallel.. - Violard (2001)   (Correct)

....of elements of matrices A and B as in Cannon's algorithm. Beyond syntactical differences in languages such as C and HPF, this example again shows a different balance between the programmer and the compiler workload: non linear alignments, if allowed in HPF, should involve new compiling techniques [1], whereas the difficulty is in writing the program in C . In the rest of the paper we introduce a theory which defines features such that the programmer and the compiler can fairly collaborate to reach program correctness and efficiency. 3 An introduction for the theory 3.1 Objects The theory ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static hpf code distribution, 1993.


On deriving HPF code from Pei programs - Genaud (1998)   (2 citations)  (Correct)

....the message passing paradigm. Some projects have also designed functional languages for that model (Nesl [Ble95] and Sisal [MSA 85] for example) to get rid of classical drawbacks of imperative languages. For all these languages, much attention is paid to the performance of the compilers (e.g. [AICK93]), which is the key to make this style of programming popular. However, when it comes to program correctness, there still is a need for software tools to be able to formally state about programs meaning. Our contribution is a language called Pei, that can model most of the data parallel ....

Corinne Ancourt, Francois Irigoin, Fabien Coelho, and Ronan Keryell. A linear algebra framework for static hpf code distribution. In CPC'93, November 1993.


Design and Evaluation of a Compiler-directed Collective.. - Memik, Kandemir.. (2000)   (3 citations)  (Correct)

....by each code block can be determined by considering individual loop nests that make up the code block. The crucial step in this process is taking into account the parallelization information [2]. Individual nests can either be parallelized explicitly by programmers using compiler directives [1, 4], or can be parallelized automatically (without user intervention) as a result of intra procedural and inter procedural compiler analyses [2, 10, 11]. In either case, after the parallelization step, our approach determines the data regions (for a given dataset) accessed by each processor involved. ....

A. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. Scientific Prog., 6(1):3-- 28, Spring 1997.


Finding performance bugs with the TNO HPF benchmark suite - Denissen, Sips (2002)   (Correct)

....for the iteration sets of the local loops. This transformation is called mask absorption. For block and cyclic distributions this transformation is relatively simple, but for block cyclic distributions this transformation is more complicated and a number of different solutions have been proposed [2, 12, 13, 14, 15]. Less attention has been paid to the equally important efficient absorption of multiple masks as needed in the derivation of communication sets and dependencies between loop iterators [2, 16]. The three compilers differ in their optimization techniques in two ways: i) the derivation of the local ....

C. Ancourt, F. Irigoin, F. Coelho, and R. Keryell, "A Linear Algebra Framework for Static HPF Code Distribution," Technical Report A-278-CRI, Ecole des Mines, Paris, Nov. 1995.


I/O Optimizations for Hierarchical Storage Systems - Memik (2000)   (Correct)

.... T INITIALIZE( Open the file for read. The exinfo will be filled by the library. For creating the file (i.e. if the file is opened for the first time) information about the file should be supplied to T OPEN via exinfo. exfile = T OPEN ( file 1 , r , exinfo) start[0] = 0; start[1] = 0; end[0] = 24000; end[1] = 80; Perform the operation T READ SECTION ( exfile, buf, starts, ends) Close the file T CLOSE ( exfile) T FINALIZE( g Figure 3.2: An example code for reading from a two dimensional file. size of each dimension of the global file. Array ....

....the file for read. The exinfo will be filled by the library. For creating the file (i.e. if the file is opened for the first time) information about the file should be supplied to T OPEN via exinfo. exfile = T OPEN ( file 1 , r , exinfo) start[0] = 0; start[1] = 0; end[0] = 24000; end[1] = 80; Perform the operation T READ SECTION ( exfile, buf, starts, ends) Close the file T CLOSE ( exfile) T FINALIZE( g Figure 3.2: An example code for reading from a two dimensional file. size of each dimension of the global file. Array Access Routines: These routines ....

[Article contains additional citation context not shown here]

A. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. Scientific Prog., 6(1):3--28, Spring 1997.


An implementation framework for HPF distributed.. - van Reeuwijk.. (1996)   (4 citations)  (Correct)

....has been explored for some time in the context of various data parallel languages [7, 8, 9, 10, 11, 12]. The recent definition of HPF [1] has added some new data alignment and data distribution features for which no efficient solutions existed. As a consequence, new results have been reported in [13, 6, 14, 15, 16, 17, 18, 19, 20, 21] and, more recently and concurrent with this paper, [22, 23, 20, 24, 25, 26]. Early optimization techniques only consider non aligned arrays. The first optimizations were reported by Callahan and Kennedy [7] and Gerndt [8]. They considered non aligned block(m) distributions with linear array ....

....9, 10, 11, 12]. The recent definition of HPF [1] has added some new data alignment and data distribution features for which no efficient solutions existed. As a consequence, new results have been reported in [13, 6, 14, 15, 16, 17, 18, 19, 20, 21] and, more recently and concurrent with this paper, [22, 23, 20, 24, 25, 26]. Early optimization techniques only consider non aligned arrays. The first optimizations were reported by Callahan and Kennedy [7] and Gerndt [8]. They considered non aligned block(m) distributions with linear array access functions. Gerndt also showed how overlap can be handled. In Paalvast et ....

[Article contains additional citation context not shown here]

C. Ancourt, F. Irigoin, F. Coelho, and R. Keryell, "A linear algebra framework for static HPF code distribution", Tech. Rep. A-278-CRI, Ecole des Mines, Paris, November


Algorithms for Automatic Alignment of Arrays - Chatterjee, Gilbert, Oliker.. (1996)   (3 citations)  (Correct)

....these forms of communication residual rather than intrinsic, and thus encompasses such optimizations [12]. 1.1 A formal model of data layout. Alignment maps an array to a template with an affine mapping. The array coordinate $a \in \mathbb{Z}^d$ maps to the template coordinate $t \in \mathbb{Z}^t$ in the following way [2]: $Rt = La + f$ (1). The matrix L encodes the orientation and spacing of array elements in the template, the column vector f encodes the offset of the array in the template, and the projection matrix R encodes the template axes along which the array is replicated. We constrain the matrix L to have ....

....[Figure 1: Collective communication costs on the CM 5. (a) All to all communication. (b) Shift communication.] A similar linear algebraic formulation is available for distribution [2, 7]. We do not explicitly mention it here as distribution is beyond the scope of this paper. 1.2 A formal statement of the data layout problem. Given an array parallel program and a target number of processors, our goal is to determine the quantities R, L, and f for each array and template at each ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Workshop on Compilers for Parallel Computers, Delft, The Netherlands, Dec. 1993.


Load balancing HPF programs by migrating virtual processors - Perez (1996)   (7 citations)  (Correct)

....in the distributions is the foundation of the HPF language. Almost all previous compilation techniques are based on this concept. By keeping it valid, they also remain valid. The techniques we are referring to are relative to linear algebra that is useful for regular computation compilation [1]. Irregularity We need to handle irregularity because it is a key for high performances. Real codes have a part of irregularity: Irregularity may come from an unbalanced distribution of computations or from a repeated irregular communication pattern. In these two cases, an irregular ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. Technical Report A-278-CRI, CRI, ENSM Paris, November 1995.


A Global Communication Optimization Technique.. - Kandemir.. (1998)   (9 citations)  (Correct)

....approaches, primarily for hoisting communication and minimizing the number of messages, respectively, that are aimed at reducing communication overhead and show the tradeoff between these two. Both these approaches are accurate; using the linear algebra framework proposed by Ancourt et al. [7], they are able to handle the optimization problem at the granularity of individual array elements. 3) We show that the global communication sets resulting from our analysis can be enumerated by our use of the Omega library [42, 31] from the University of Maryland. Although the Omega library ....

....be found elsewhere [48, 52]. 2.4 Linear Algebra Framework. HPF like languages provide compiler directives that allow the user to perform data allocation onto local memories. The compiler then uses these distribution directives to partition computation across processors. It has been shown in [7] that linear algebra provides a powerful framework to generate code for distributed memory message passing machines, taking into account compiler directives. Most of the compilers for distributed memory message passing machines use the owner computes rule, which simply assigns each computation to ....

[Article contains additional citation context not shown here]

A. ANCOURT, F. COELHO, F. IRIGOIN, and R. KERYELL. A linear algebra framework for static HPF code distribution. Scientific Programming, 6(1):3--28, Spring 1997.


Automatic Vectorization of Communications for Data-Parallel.. - Germain, Delaplace (1995)   (Correct)

....of basis. Analysis of the properties of communications leads to a tiling of the local memory addresses that provides maximal message vectorization. 1 Introduction. Static analysis of data parallel programs, for the generation of distributed code, has been proposed by many authors, for instance [7] [4] [9] [5] [13]. Static analysis aims to improve performance over run time resolution [2], which includes a lot of pure overhead in form of guards and tests. Many static compilation schemes have been considered; they differ in important points such as interleaving computation and communication as ....

....resolution [2], which includes a lot of pure overhead in form of guards and tests. Many static compilation schemes have been considered; they differ in important points such as interleaving computation and communication as in [5] or having identical management of local and nonlocal data such as in [7]. However, they all use three basic sets: Compute(s) is the part of the index set which is local to processor s; Send(s) (resp. Received(s)) is the part of a distributed array that has to be sent (resp. received) by processor s when owner computes rule is applied. The central problem of static ....

[Article contains additional citation context not shown here]

F. Irigoin, C. Ancourt, F. Coelho, and R. Keryell. A linear algebra framework for static HPF code distribution. In 4th Int. Workshop on Compilers for Parallel Computers, pages 117--132, 93.


Scheduling Block-Cyclic Array Redistribution - Desprez, Dongarra, Petitet.. (1997)   (8 citations)  (Correct)

.... Sophisticated techniques involve finite state machines (see Chatterjee et al. [3]), set theoretic methods (see Gupta et al. [8]), Diophantine equations (see Kennedy et al. [11, 12]), Hermite forms and lattices (see Thirumalai and Ramanujam [18]) or linear programming (see Ancourt et al. [1]). A comparative survey of these algorithms can be found in Wang et al. [22], where it is reported that the most powerful algorithms can handle block cyclic distributions as efficiently as the simpler case of pure cyclic or full block mapping. At the end of the message generation phase, each ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. Scientific programming, to appear. Available as CRI--Ecole des Mines Technical Report A-278-CRI, and at http://www.cri.ensmp.fr.


A Generalized Framework for Global Communication.. - Kandemir Choudhary.. (1998)   (Correct)

....allows the compiler to easily apply traditional loop based optimization techniques such as message vectorization, message coalescing as well as global optimizations such as redundant communication elimination and communication hoisting. Using the linear algebra framework proposed by Ancourt et al. [2], our technique is able to handle the optimization problem at the granularity of individual array elements. The task of code generation is made easier by our use of the Omega library [9] from the University of Maryland. The global communication sets resulted from our dataflow analysis can be ....

....tu and $0 \le l \le C-1\}$, where $P = p_u - p_l + 1$. In this formulation, $t = \alpha d + \beta$ represents alignment information and $t = CPc + Cq + l$ denotes the distribution information. Simple BLOCK and CYCLIC(1) distributions can easily be handled within this framework by setting c=0 and l=0, respectively. See [2] for the details. A communication descriptor can be defined as a pair $\langle R, S \rangle$, where R is an array identifier (name) and S is the communication set associated with R. The exact definition of a communication set depends on the context in which it is used. Throughout our analysis, a set is defined as ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Proc. 4th International Workshop on Compilers for Parallel Computers, Delft, the Netherlands, 1993.


Code Generation and Optimization for High Performance Fortran - Thirumalai (1995)   (1 citation)  (Correct)

....and the virtual cyclic schemes. The virtual block (cyclic) scheme views the global array as a union of several cyclically (block) distributed arrays. The virtual cyclic scheme does not preserve the access order in the case of DO loops (this is not a problem for array assignments). Ancourt et al. [1] use a linear algebra framework to generate code for fully parallel loops in HPF, which does not exploit repeating sequence of accesses. Recently, building on the finite state machine approach of Chatterjee et al., Kennedy et al. [10] derived a runtime solution that determines the basis vectors by ....

A. Ancourt, F. Coelho, F. Irigoin and R. Keryell. A linear algebra framework for static HPF code distribution. In Proc. of the 4th Workshop on Compilers for Parallel Computers, Delft, The Netherlands, December 1993.


A Generalized Framework for Global Communication.. - Kandemir Banerjee Choudhary (1998)   (Correct)

....allows the compiler to easily apply traditional loop based optimization techniques such as message vectorization, message coalescing as well as global optimizations such as redundant communication elimination and communication hoisting. Using the linear algebra framework proposed by Ancourt et al. [2], our technique is able to handle the optimization problem at the granularity of individual array elements. The task of code generation is made easier by our use of the Omega library [9] from the University of Maryland. The global communication sets resulted from our dataflow analysis can be ....

.... $- 1\}$, where $P = p_u - p_l + 1$. In this formulation, $t = \alpha d + \beta$ represents alignment information and $t = CPc + Cq + l$ denotes the distribution information. Simple BLOCK and CYCLIC(1) distributions can easily be handled within this framework by setting c=0 and l=0, respectively. See [2] for the details. A communication descriptor can be defined as a pair $\langle R, S \rangle$, where R is an array identifier (name) and S is the communication set associated with R. The exact definition of a communication set depends on the context in which it is used. Throughout our analysis, a set is defined ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Proc. 4th International Workshop on Compilers for Parallel Computers, Delft, the Netherlands, 1993.
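
The distribution formula in the context above can be sketched in a few lines. This sketch assumes the garbled formula reads t = C·P·c + C·q + l with 0 ≤ l ≤ C−1 and 0 ≤ q ≤ P−1, where C is the block size, P the number of processors, c the cycle number, q the processor, and l the offset within a block. The function names `decompose` and `recompose` are hypothetical and do not come from the cited papers.

```python
def decompose(t, C, P):
    """Split template cell t into (cycle c, processor q, local offset l).

    Assumes a CYCLIC(C) distribution over P processors, so that
    t = C*P*c + C*q + l with 0 <= l < C and 0 <= q < P.
    """
    c, rem = divmod(t, C * P)   # which full sweep over all processors
    q, l = divmod(rem, C)       # which processor, and offset in its block
    return c, q, l


def recompose(c, q, l, C, P):
    """Inverse of decompose: rebuild the template cell from (c, q, l)."""
    return C * P * c + C * q + l


# CYCLIC(2) over 3 processors: cell 10 lies in cycle 1, on processor 2, offset 0.
example = decompose(10, C=2, P=3)
```

Under this reading, the two special cases named in the quoted text fall out directly: with C = 1 (CYCLIC(1)) the offset l is always 0, and with a block size large enough that C·P covers the whole template (BLOCK) the cycle number c is always 0.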


Efficient Address and Communication Generation for.. - Venkatachar (1996)   (Correct)

....for storing nonlocal references. Several optimizations such as offset communication, message coalescing and aggregation are also presented. 2.2 RELATED WORK ON COMMUNICATION GENERATION. Several solutions have been proposed to address the code generation problem for distributed memory machines [1, 2, 4, 8, 12, 14, 15, 21, 22, 23, 24, 26, 27, 29, 31]. Many of these are good schemes for address generation but issues in communication optimization have not received much attention. Chatterjee et al. [4] describe the set of accesses as a finite state machine (FSM) and use the FSM to generate communication. Their model of communication is based ....

....memory space. One other drawback of their technique when compared to our technique is the fact that their one level pattern tables contain local memory gaps and not actual addresses. However their execution preserves lexicographic ordering and they do not incur any memory wastage. Ancourt et al. [1] presented a linear algebraic framework to solve the problem of address generation for distributed memory machines. Their framework to solve two level mapping involves a change of basis which leads to compression of holes but still incur some memory wastage. The node code generated by this ....

A. Ancourt, F. Coelho, F. Irigoin and R. Keryell. A linear algebra framework for static HPF code distribution. In Proc. of the 4th Workshop on Compilers for Parallel Computers, Delft, The Netherlands, December 1993.


On The Implementation And Effectiveness Of Autoscheduling For.. - Moreira (1995)   (16 citations)  (Correct)

....a mechanism for satisfying control and data dependences. It supports both loop and functional parallelisms, in the form of PARALLEL LOOP and PARALLEL CASE constructs, respectively. Techniques for the compilation of Fortran dialects with data distribution features (Fortran D, HPF) are discussed in [21, 23, 74, 75]. Compile based, automatic (as opposed to user specified) data distribution is discussed in [76, 77, 78, 79, 80]. Decomposition and scheduling algorithms that attempt to co locate process and data in shared memory multiprocessors, in order to reduce the costs of remote memory accesses, are ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell, "A linear algebra framework for static HPF code distribution," tech. rep., Centre de Recherche en Informatique, École Nationale Supérieure des Mines de Paris, December 1993.


Scheduling Block-Cyclic Array Redistribution - Desprez, Dongarra, Petitet.. (1997)   (8 citations)  (Correct)

.... Sophisticated techniques involve finite state machines (see Chatterjee et al. [3]), set theoretic methods (see Gupta et al. [8]), Diophantine equations (see Kennedy et al. [10, 11]), Hermite forms and lattices (see Thirumalai and Ramanujam [17]) or linear programming (see Ancourt et al. [1]). A comparative survey of these algorithms can be found in Wang et al. [21], where it is reported that the most powerful algorithms can handle block cyclic distributions as efficiently as the simpler case of pure cyclic or full block mapping. At the end of the message generation phase, each ....

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. Scientific programming, to appear. Available as CRI--Ecole des Mines Technical Report A-278-CRI, http://www.cri.ensmp.fr.


Opus: A Coordination Language for Multidisciplinary.. - Chapman, Zima.. (1997)   (21 citations)  (Correct)

....can be accessed locally by the leader. Determining the communication schedule, i.e. what elements of an array are to be sent or received from which thread, is a complex task. Several groups have been studying algorithms and heuristics to determine the most efficient schedule [2, 11, 26, 16, 22, 27, 31]. We have adopted (and augmented) the finite state machine (FSM) method for local address set calculation developed by Chatterjee et al. [11] in our current prototype. The FSM method exploits the repeating patterns of local array indices to determine the elements of a distributed array that each ....

A. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Proc. of the 4th Workshop Compilers for Parallel Computers, Delft, The Netherlands, December 1993.


Parallelization Of A Wave Propagation Application.. - Andre, Le Fur.. (1994)   (1 citation)  (Correct)

....execution and the data distribution is enforced by the owner writes rule. To achieve good performance when following this approach, sophisticated compilation techniques and run time systems have been studied and integrated into environments [11, 4, 2]. Other related techniques have been proposed [1, 6, 13] but have not been fully integrated yet in complete environments. Based on the data parallel approach, the Pandore environment allows compiling both HPF and Pandore programs into spmd machine independent code. A series of experiments on classical kernels have already led to satisfactory results. ....

F. Irigoin, C. Ancourt, F. Coelho, and R. Keryell. A Linear Algebra Framework for Static HPF Code Distribution. In International Workshop on Compilers for Parallel Computers, December 1993.


A Case-Study of Design Space Exploration for.. - Hurbain, Ancourt, ..   Self-citation (Ancourt Irigoin)   (Correct)

No context found.

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static hpf code distribution. In Fourth Workshop on Compilers for Parallel Computers, CPC'93, Delft, Pays-Bas, 1993.


Compiling Parallel Sparse Code for User-Defined Data.. - Kotlyar, Pingali.. (1997)   (5 citations)  (Correct)

No context found.

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell, A linear algebra framework for static hpf code distribution, in CPC'93, November 1993. Also available at http://cri.ensmp.fr/doc/A-250.ps.Z.


A Semantic Framework to Address Data Locality - In Data Parallel (2004)   (Correct)

No context found.

C. Ancourt, F. Coelho, F. Irigoin, R. Keryell, A linear algebra framework for static hpf code distribution, 1993.


Application of Ant Colony Optimization to Data Distribution in.. - Rodrigues   (Correct)

No context found.

C. Ancourt, F. Coelho, F. Irigoin, and R. Keryell. A linear algebra framework for static HPF code distribution. In Proceedings of the Fourth Workshop on Compilers for Parallel Computers, Delft, The Netherlands, 1993.

CiteSeer.IST - Copyright Penn State and NEC