15 citations found.
Kees van Reeuwijk, Will Denissen, Henk J. Sips, and Edwin M. R. M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


This paper is cited in the following contexts:
Optimizing Array Reference Checking in Java Programs - Midkiff, Moreira, Snir (1998)   (8 citations)  (Correct)

....be approximated. Replacing true safe bounds L and U by approximated safe bounds L and U does not introduce any hazards as long as L L and U U. Techniques for approximating the iteration subspace of a loop that accesses some range of an affinely subscripted array axis are described in [23, 24]. A.3 Constant subscripts For an array reference A[f(i)] where f(i) = k (a constant), f(i) is neither monotonically increasing nor monotonically decreasing. Nevertheless, we can treat this special case by defining U = u if lo(A) k up(A) U = l − 1 if k up(A) L = u 1 ....

K. van Reeuwijk, W. Denissen, H. J. Sips, and E. M. R. M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, Sep 1996.


Lowering HPF Procedure Interface to a Canonical Representation - Jan Borowiec Arthur   (Correct)

....this paper (sections 3 and 4) focuses on the most complex part of this, the handling of mapped arrays at procedure call boundaries. The mapping information represented in the PIR is used by later components of the compiler to produce efficient SPMD code. More about that can be found in [2], [5], [8]. 2 PIR Constructs This section introduces simplified versions of the PIR constructs that are later used in the formulation of the procedure interface lowering algorithm. To represent PIR constructs we use the following notation: class name (component list) where class is a single letter ....

C. van Reeuwijk, H.J. Sips, W. Denissen, E.M. Paalvast, An implementation framework for HPF distributed arrays on message passing computer systems. IEEE Transactions on Parallel and Distributed Systems, Vol.7, no.9, September 1996, pp. 897--914.


Transforming Data-parallel Fortran90/HPF Constructs Into a.. - Jan Borowiec Thilo   (Correct)

....statements which were developed in this context. 1 Background This paper focuses on one specific component of the PREPARE HPF compiler. The context (other compiler components, data structures, the CoSy compiler generation environment) cannot be described here; see [1], [2], [3], [4], [5], [6], [7] for more information. To understand the methods presented here it is most important to know that the PREPARE compiler, unlike many other HPF systems, is not a source to source translator but a real compiler: it translates the source program to an internal representation which is then ....

....loop nests) but retains parallelism right until the code generation phase. As a consequence, generation of very efficient code for superscalar and other vector handling architectures becomes possible. More information about the methods applied in these steps can be found in [1], [2], [3], [4], [5]. Subsequent sections demonstrate how the PIR parallel array assignment is employed in the implementation of data parallel Fortran90/HPF constructs in the PREPARE compiler. 4 Compiler introduced array buffers During translation of data parallel Fortran90/HPF statements, we sometimes employ ....

C. van Reeuwijk, W. Denissen, H.J. Sips, E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, to appear.


PCRC-based HPF Compilation - Zhang, Carpenter, Fox, Li, Li, Wen (1997)   (4 citations)  (Correct)

....are relatively weak; the compiler needs to generate send and receive primitives to accomplish communication. Though this may have more efficient code generation after extensive program analysis, the compiler may become too complicated to be operational. The most recent paper on HPF compiling was [15], in which a local set enumeration method was used to generate local part of a loop iteration and derive the communication set. Comparatively speaking, we believe our run time support method to get values is more straightforward and efficient, especially for regular access to the array data. ....

Kees van Reeuwijk, Will Denissen, Henk J. Sips, and Edwin M.R.M. Paalvast, "An implementation Framework for HPF Distributed Arrays on Message-Passing Parallel Computer Systems", IEEE Trans. on parallel and distributed system, vol.7 Sep. 1996


Parallel and Distributed Systems Report Series - Code Generation Techniques (1998)   Self-citation (Van reeuwijk Sips)   (Correct)

No context found.

Kees van Reeuwijk, Will Denissen, Henk J. Sips, and Edwin M. R. M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


Spar 1.2 Language Specification - van Reeuwijk (2000)   Self-citation (Van reeuwijk)   (Correct)

No context found.

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


Parallel and Distributed Systems Report Series - Code Generation Techniques   Self-citation (Van reeuwijk Sips)   (Correct)

No context found.

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


A Unified Compiler Framework for Work and Data - Sips (2002)   Self-citation (Van reeuwijk Denissen Sips)   (Correct)

....foreach (i : 0:N) We now have reached a form where the first loop allows us to perform aggregate communication. This means there will be at most one data packet being sent from a processor to any other processor. After communication aggregation other optimizations such as owner absorption [8] will be done. This means that owner tests using regular distributions like cyclic or block will be replaced by recomputed B[i,j,k] A[i2,j2,k2] A[i2,j2,k2 1] A[i2,j2,k2 1] A[i2,j2 1,k2] A[i2,j2 1,k2] A[i2,j2 1,k2 1] A[i2,j2 1,k2 1] A[i2,j2 1,k2 1] A[i2,j2 1,k2 1] A[i2 1,j2,k2] ....

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


Spar language specification - Containing a description of the.. - van Reeuwijk (2001)   Self-citation (Van reeuwijk)   (Correct)

....iteration range, m = N P ext is assumed. The (cyclic i m) placement function places index i onto processor p = i m) mod P ext . If no m is specified, the value 1 is assumed. For data that is distributed with the block and cyclic functions, the compiler is able to apply specific optimizations, see [17] for details. With these functions, all HPF data mappings can be specified, even alignments, although templates are not explicitly visible as in HPF. Annotating declarations By annotating a member function, the user can specify the group of processors allowed to execute the member function. ....

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


Finding performance bugs with the TNO HPF benchmark suite - Denissen, Sips (2002)   Self-citation (Denissen Sips)   (Correct)

....for the iteration sets of the local loops. This transformation is called mask absorption. For block and cyclic distributions this transformation is relatively simple, but for block cyclic distributions this transformation is more complicated and a number of different solutions have been proposed [2, 12, 13, 14, 15]. Less attention has been paid to the equally important efficient absorption of multiple masks as needed in the derivation of communication sets and dependencies between loop iterators [2, 16]. The three compilers differ in their optimization techniques in two ways: (i) the derivation of the local ....

.... this transformation is more complicated and a number of different solutions have been proposed [2, 12, 13, 14, 15]. Less attention has been paid to the equally important efficient absorption of multiple masks as needed in the derivation of communication sets and dependencies between loop iterators [2, 16]. The three compilers differ in their optimization techniques in two ways: (i) the derivation of the local iteration spaces for distributed computations and (ii) the compression techniques for local storage of distributed arrays. The PGI HPF compiler uses pattern matching techniques to recognize ....

[Article contains additional citation context not shown here]

C. van Reeuwijk, W.J.A. Denissen, H.J. Sips, E.M.R.M. Paalvast, "An implementation framework for HPF distributed arrays on message passing parallel computer systems," IEEE Transactions on Parallel and Distributed Systems,Vol.7, No. 9, September 1996, pp 897-914.


ENSEMBLE: A Communication Layer for Embedded.. - Cadot, Kuijlman..   Self-citation (Van reeuwijk Sips)   (Correct)

....entirely at processor 0. The Spar Java compiler generates C code with explicit send and receive primitives. Figure 2 shows the generated inproduct C code (edited for readability) The Spar Java compiler performs a sophisticated analysis to identify opportunities for message aggregation [12]. With the inproduct code, the compiler infers that B and C MESSAGE = MSGBUF; all processors: pack outgoing message for(int i=procno;i A.length;i =P) MESSAGE = B[i] C[i] all processors: send message to owner emb send(owner(A) MSGBUF,MESSAGE MSGBUF) owner: receive and unpack ....

....of a distributed array, and allows the cache to be used more efficiently. This will raise packing speeds considerably and increase the relative impact of pipelining. A potential disadvantage is that every array access requires an additional global to local offset translation, but as we describe in [12] this calculation can usually be lifted out of loops. 5. Related work Overlapping computation with communication is a well known concept; many message passing systems, including MPI, provide asynchronous send (and receive) primitives. Making effective use of these primitives is often the task of ....

C. van Reeuwijk, W. Denissen, H. Sips, and E. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, Sept. 1996.


Spar Language Specification - Containing a description of the.. - van Reeuwijk (2001)   Self-citation (Van reeuwijk)   (Correct)

....range, m = N P ext is assumed. The (cyclic i m) placement function places index i onto processor p = i m) mod P ext . If no m is specified, the value 1 is assumed. For data that is distributed with the block and cyclic functions, the compiler is able to apply specific optimizations, see [17] for details. With these functions, all HPF data mappings can be specified, even alignments, although templates are not explicitly visible as in HPF. Annotating declarations By annotating a member function, the user can specify the group of processors allowed to execute the member function. For ....

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


Annotating Spar/Java for data-parallel programming - van Reeuwijk, Kuijlman.. (2000)   Self-citation (Van reeuwijk Sips)   (Correct)

....p = a i b) m) mod P ext . If no m is specified, the value 1 is assumed. For data that is distributed with the block and cyclic functions, the compiler is able to generate highly efficient code for the enumeration of local elements and the translation of global to local array indices; see [15] for details. With these functions, all the data mappings of High Performance Fortran (HPF) 2.0 [12] can be specified. See Section 6 for more details. 4.1 Annotating declarations By annotating a member function, the user can specify the group of processors allowed to execute the member function. ....

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897-914, September 1996.


Spar 1.3 Language Specification - van Reeuwijk (2000)   Self-citation (Van reeuwijk)   (Correct)

....range, m = N P ext is assumed. The (cyclic i m) placement function places index i onto processor p = i m) mod P ext . If no m is specified, the value 1 is assumed. For data that is distributed with the block and cyclic functions, the compiler is able to apply specific optimizations, see [15] for details. With these functions, all HPF data mappings can be specified, even alignments, although templates are not explicitly visible as in HPF. Annotating declarations By annotating a member function, the user can specify the group of processors allowed to execute the member function. For ....

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


Code generation techniques for the task-parallel.. - Kuijlman, van..   Self-citation (Van reeuwijk Sips)   (Correct)

....mapping and/or alignment information of tasks and data can be specified by the programmer and passed to the compiler through annotations. The Spar compiler generates SPMD target code. Although much work has previously been done on the efficient generation of SPMD data parallel programs [17], much less is the case for the efficient and systematic generation of SPMD programs with task parallel constructs. This paper shows how to systematically generate such programs. The paper is organized as follows. Section 2 gives an introduction to Spar and describes the language constructs we ....

.... By using this, we obtain: Example 8 Generated optimized code SPMD Vnus: for [i=0:n, on processor(p[i] A[i] compute(B[i] synchronized(m1) C = reduce(C,A[i] If the on processor(p[i] expression is regular, we can use standard iteration squeezing techniques as described in [17]. 9 Conclusion In this paper we have described a compilation scheme to translate implicitly parallel programs in the programming language Spar to code for a distributed-memory parallel computer system. The compilation scheme has been formulated as a set of transformation rules. Work is currently ....

C. van Reeuwijk, W. Denissen, H.J. Sips, and E.M. Paalvast. An implementation framework for HPF distributed arrays on message-passing parallel computer systems. IEEE Transactions on Parallel and Distributed Systems, 7(9):897--914, September 1996.


CiteSeer.IST - Copyright Penn State and NEC