52 citations found.
G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6(7):747--754, July 1995.

This paper is cited in the following contexts:

Platforms for HPJava: Runtime Support for Scalable Programming in.. - Lim (2003)

....and general operation, it is actually one of the more simple collectives to implement in the HPJava framework. General algorithms for this primitive have been described by other authors in the past. For example it is essentially equivalent to the operation called Regular Section Copy Sched in [6]. In this section we want to illustrate how this kind of operation can be implemented in term of the particular Range and Group classes of HPJava, complemented by suitable set of messaging primitives. All collective operations in the library are based on communication schedule objects. Each kind ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


Benchmarking HPJava: Prospects for Performance - Lee, Carpenter, Fox, Lim

....that some computations are localized to some processors, and for writing a distributed form of the parallel loop. Crucially, it also supports binding from the extended languages to various communication and arithmetic libraries. These might involve simply new interfaces to some subset of PARTI [1], Global Arrays [10] Adlib [7] MPI [9] and so on. Providing the libraries for irregular communication may well be important. Evaluating HPspmd programming language model on large scale applications is also an important issue. 1 2 HPSpmd Programming Language Model 2.1 HPspmd Language ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


High Level Support for Distributed High Performance Computing - Laure (2001)

....communication at a high level. KeLP encapsulates communication activity using a model known as communication orchestration. Under this model, all communication is expressed in terms of atomic array section moves. The implementation of communication follows the inspector executor model [1]. The KeLP programmer expresses and optimizes data motion and decomposition using high level geometric set operations and may specify elaborate interpolation or coupling functions to handle complicated boundary conditions arising in multidisciplinary applications. KeLP supports a task parallel ....

G. Agrawal, A. Sussman, and J. Saltz. An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications. IEEE Trans. on Parallel and Distributed Systems, 6(7):747--754, July 1995.


The Measured Network Traffic of Compiler-Parallelized Programs - Dinda, Garcia, Leung (2001)

....[12] and MPI [19] and parallel languages such as High Performance Fortran [10] HPF) have been, greatly enhancing the portability of parallel programs to workstation clusters. Further, the parallel computing community has developed extremely efficient implementations of these APIs and languages [17, 1, 4]. As implementations continue to become more efficient, the performance of the network will be increasingly important. In addition to significantly increased connection and aggregate bandwidths, next generation networks will supply quality of service (QoS) guarantees for connections [2, 3] ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6(7):747--754, July 1995.


A Framework for Partitioning Composite Grids - Rantakokko (1998)

....It is then not appropriate to partition the arrays over all processors disregarding the coupling between the arrays. Also, partitioning small arrays is not very efficient as it yields relatively much communication points compared to computation points. The HPF 2 standard and the approach in [2] use compiler directives to define processor subspaces and to map different arrays to different subsets of the processors. If the arrays have varying lengths a severe load imbalance can arise between the subsets. The definition of the sizes of the processors subspaces, i.e. the load balancing, is ....

G. Agrawal, A. Sussman, J. Saltz, An integrated runtime and compile-time approach for parallelizing structured and block structured applications, ICASE report no. 93-77, NASA Langley Research Center, Hampton, Virginia, 1993.


HPspmd: Data Parallel SPMD Programming Models from Fortran to.. - Carpenter, Fox (1998)

....for general parallel programs with regular distributed arrays. They emphasize high level communication primitives for particular styles of programming, rather than specific numerical algorithms. These libraries include runtime libraries for HPF like languages, such as Adlib and Multiblock Parti [1], and the Global Array toolkit [32]. Adlib is a runtime library that was initially designed to support HPF translation. It provides communication primitives similar to Multiblock PARTI, plus all Fortran 90 transformational intrinsics for arithmetic on distributed arrays. It also provides some ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


Experiments with "HPJava" - Carpenter, Chang, Fox, Leskiw, Li (1997)   (7 citations)

.... Shift this block s upper edge into next neighbour s lower ghost edge HPJava.Output(next) write(block[blockSize] HPJava.Output(next) flush( HPJava.Input(prev) read(block[0] Shift this block s lower edge into prev neighbour s upper ghost edge HPJava.Output(prev) write(block[1]) HPJava.Output(prev) flush( HPJava.Input(next) read(block[blockSize 1] Calculate a block of neighbour sums. Update block of board values. Figure 1: Skeleton of socket based Life program. 8 int sums[ new int[blockSize] N] Calculate block of neighbour ....

.... iter NITER ; iter ) Shift this block s upper edge into next neighbour s lower ghost edge MPI.WORLD.Send(block[blockSize] N, MPI.BYTE, next, 0) MPI.WORLD.Recv(block[0] N, MPI.BYTE, prev, 0) Shift this block s lower edge into prev neighbour s upper ghost edge MPI.WORLD.Send(block[1], N, MPI.BYTE, prev, 0) MPI.WORLD.Recv(block[blockSize 1] N, MPI.BYTE, next, 0) Calculate block of neighbour sums. Update block of board values. MPI.Finalize( Figure 3: Simple MPI Life program. 11 direct. In the next section we illustrate some of the added value that ....

[Article contains additional citation context not shown here]

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


An HPspmd Programming Model (Extended Abstract) - Carpenter, al. (2000)

....primarily as underlying support for general parallel programs with regular distributed arrays. They emphasize high level communication primitives for particular styles of programming, rather than specific numerical algorithms. These libraries include compiler runtime libraries like Multiblock Parti [1] and Adlib [21] and application level libraries like the Global Array toolkit [17]. Adlib is a runtime library that was designed to support HPF translation. It provides communication primitives similar to Multiblock PARTI, plus the Fortran 90 transformational intrinsics for arithmetic on ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


Language Bindings for a Data-Parallel Runtime - Carpenter, Fox, Leskiw, Li.. (1998)   (3 citations)

....expressivity comparable to full HPF. 2 Background: runtime kernel The kernel of NPAC library is a C class library. It is most directly descended from the run time library of an earlier research implementation of HPF [7] with influences from the Fortran 90D run time and the CHAOS PARTI libraries [1, 11, 5]. The kernel is currently implemented on top of MPI. The library design is solidly object oriented, but efficiency is maintained as a primary goal. The overall architecture of the library is illustrated in figure 1. At the top level there are several compiler-specific interfaces to a common run time ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


Java as a Language for Scientific Parallel Programming - Carpenter, Chang, Fox, Li (1997)   (1 citation)

.... Shift this block s upper edge into next neighbour s lower ghost edge HPJava.Output(next) write(block[blockSize] HPJava.Output(next) flush( HPJava.Input(prev) read(block[0] Shift this block s lower edge into prev neighbour s upper ghost edge HPJava.Output(prev) write(block[1]) HPJava.Output(prev) flush( HPJava.Input(next) read(block[blockSize 1] Calculate a block of neighbour sums. Update block of board values. Fig. 1. Skeleton of socket based Life program. 5 void Send(Object buf, int offset, int count, Datatype datatype, int dest, int tag) ....

....directives and language extensions in HPF as well as the HPF library. We will loosely distinguish two different levels at which a library implementation of the HPF semantics (or, at least, the HPF distributed data model) can operate. The first is the level of the so called run time libraries [1, 7, 8, 5]. This kind of library provides functions for scheduling and executing specific patterns of collective communication already identified by a compiler (in the HPF case) or else by an application programmer using the library directly. Such a library may also provide functions for translating between ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


Themis: Component Dependence Metadata In Adaptive.. - Kelly, Beckmann.. (2001)

....of the Standard Template Library [27] POOMA [25] is a C library designed to represent common abstractions in computational science applications. PETSc [4, 5] extends array objects with communication methods. OPlus [11] manages communication in unstructured meshes in Fortran. KeLP [17] and CHAOS [1, 20, 32] introduced the idea of inspecting the irregular data structure to plan the communication required. Although these approaches help manage the communications involved, none of them provides any automated support for resource management in applications with several parallel components. ....

.... From the skeletons community, we have taken the idea of optimising compositions of parallel software components. From the restructuring compilers community we have taken the mathematical formulation of dependence and transformation of a component s iteration space. From KeLP [17] and Chaos [1,20,32] we have taken the idea of metadata to describe data shape and dependence, the idea of planning parallel execution by processing this metadata, and the idea that metadata can be globally replicated even if data is not. The main challenge for future work is to provide flexible, powerful, explicit ....

Gagan Agrawal, Alan Sussman, and Joel Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6(7):747--754, July 1995.


A Parallel Software Infrastructure for Dynamic Block-Irregular.. - Kohn (1995)   (12 citations)

....inadequate for dynamic problems. Such applications will also require sophisticated run time support to manage changing data distributions and communication patterns. A number of run time support systems have already been developed, including CHAOS (formerly called PARTI) [60], multiblock PARTI [3], and Multipol [40]. Both CHAOS and multiblock PARTI have been used as run time support for data parallel Fortran compilers. CHAOS has been very successful in addressing unstructured problems such as sparse linear algebra and finite elements [58]. Multiblock PARTI has been employed in the ....

.... new data type (an unstructured Region) to address other classes of irregular scientific applications, such as unstructured finite element 38 problems and irregularly coupled regular meshes [45] The goal of SA is to unify several previous domain specific systems, including LPARX, multiblock PARTI [3], and CHAOS [60] 2.4.2 Parallel Languages The parallel programming literature describes numerous languages, each of which provides facilities specialized for its own intended class of applications. In the following survey, we evaluate various parallel languages on their ability to solve the ....

[Article contains additional citation context not shown here]

G. Agrawal, A. Sussman, and J. Saltz, An integrated runtime and compile-time approach for parallelizing structured and block structured applications, IEEE Transactions on Parallel and Distributed Systems, (to appear).


Flexible Communication Mechanisms for Dynamic Structured.. - Fink, Baden, Kohn (1996)   (19 citations)

....to implement due to elaborate, dynamic data structures. Since these structures give rise to unpredictable communication patterns, parallelization is difficult. To ease the programmer s burden, programming languages and libraries can hide many low level details of a parallel implementation [1, 2, 3, 4, 5, 6]. We present Kernel Lattice Parallelism (KeLP) a C class library that provides high level abstractions to manage data layout and data motion for dynamic irregular block structured applications 1 . KeLP supports data orchestration, a model which enables the programmer to express dependence ....

....computational structure from the data itself [2, 7] KeLP utilizes structural abstraction to provide intuitive geometric operations for manipulating a high level description of data dependence patterns. KeLP relies on a generalization of the inspector executor model employed in Multiblock PARTI [4]. KeLP encodes data dependence patterns into an object called a MotionPlan, and interprets the corresponding data motion using a handler called a Mover. The programmer may customize the data motion pattern and interpretation according to the needs of the application. KeLP s abstractions offer ....

[Article contains additional citation context not shown here]

G. Agrawal, A. Sussman, and J. Saltz, "An integrated runtime and compile-time approach for parallelizing structured and block structured applications," IEEE Transactions on Parallel and Distributed Systems, to appear.


An HPspmd Programming Model (Extended Abstract) - Carpenter, al. (1999)

....as underlying support for general parallel programs with regular distributed arrays. They emphasize high level communication primitives for particular styles of programming, rather than specific numerical algorithms. These libraries include compiler runtime libraries like Multiblock Parti [1] and Adlib [21] and application level libraries like the Global Array toolkit [17]. Adlib is a runtime library that was designed to support HPF translation. It provides communication primitives similar to Multiblock PARTI, plus the Fortran 90 transformational intrinsics for arithmetic on ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


A Programming Methodology for Dual-tier Multicomputers - Baden, Fink (1999)   (8 citations)

....be known at compile time. They can depend on the input to the problem, to conditions evolving at run time, or both. We may describe these block structured communication patterns using a table of meta data, containing descriptions of the regular sections to be moved, i.e. a communication schedule [19]. This model is sufficiently general to treat a wide range of applications, including uniform finite difference methods (Fig. 2) blocked numerical linear algebra (Fig. 10) and irregular adaptive and multilevel methods [6] We have just seen how a collective model captures the communication ....

....that represents a potentially irregular block data decomposition. Alternatively, a FloorPlan can represent distribution of work among processors within a single node, as it is derived from Map. The MotionPlan implements a dependence descriptor, which is also known as a communication schedule [19]. The programmer builds and manipulates MotionPlans using geometric Region calculus operations, a process which will be described shortly. 4.4 Storage Model The Point, Region, Map, FloorPlan, and MotionPlan meta data may live at any of the three levels of control flow. Meta data can pass through ....

G.Agrawal, A.Sussman, and J.Saltz, "An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications," IEEE Transactions on Parallel and Distributed Systems, Vol. 6, No. 7, Jul. 1995, pp. 747--754.


Efficient Run-time Support for Irregular Block-Structured.. - Fink, Baden, Kohn (1998)   (8 citations)

....non uniform parallel memory hierarchies [2] the programmer must judiciously exploit parallelism and locality in the application to match the hardware capabilities. To ease the programmer s burden, programming languages and libraries can hide many low level details of a parallel implementation [20, 24, 16, 1, 12, 10, 35, 15, 11, 23, 28, 4]. We present Kernel Lattice Parallelism (KeLP) a C class library that provides high level abstractions to manage data layout and data motion for dynamic block structured applications. Block structures arise in many scientific applications ranging from finite difference methods for partial ....

....For example, in Figure 1b, the irregularly shaped fine level communicates with the irregularly shaped coarse level in the shadow cast by the fine level. KeLP represents a data motion pattern between XArrays using the MotionPlan abstraction. The MotionPlan is a first class communication schedule [16, 1] object that encodes a set of array section copy operations between XArrays (Fig. 4) The programmer builds and modifies a MotionPlan using Region calculus operations described in the previous sub section. KeLP data structures communicate via block copy operations. A single block copy operation is ....

[Article contains additional citation context not shown here]

Agrawal, G., Sussman, A., and Saltz, J. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Trans. on Parallel and Distrib. Systems 6, 7 (July 1995), 747-54.


HPspmd: Data Parallel SPMD Programming Models from Fortran to.. - Carpenter, Fox (1998)

....for general parallel programs with regular distributed arrays. They emphasize high level communication primitives for particular styles of programming, rather than specific numerical algorithms. These libraries include runtime libraries for HPF like languages, such as Adlib and Multiblock Parti [1], and the Global Array toolkit [32]. Adlib is a runtime library that was initially designed to support HPF translation. It provides communication primitives similar to Multiblock PARTI, plus all Fortran 90 transformational intrinsics for arithmetic on distributed arrays. It also provides some ....

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6, 1995.


Irregular Coarse-Grain Data Parallelism Under LPARX - Scott Kohn   (16 citations)

....and recursive coordinate bisection decompositions. The pC programming language [10] implements a collection abstraction which includes a coarse grain data parallel loop over objects within the collection; pC employs a data decomposition scheme similar to that of HPF. The Multiblock PARTI [1] and CHAOS [16] libraries provide run time support for data parallel compilers such as HPF. CHAOS is targeted towards unstructured calculations such as sweeps over finite element meshes or sparse matrix calculations. Multiblock To appear in J. Scientific Programming 20 PARTI has been targeted to ....

G. Agrawal, A. Sussman, and J. Saltz, An integrated runtime and compile-time approach for parallelizing structured and block structured applications, IEEE Transactions on Parallel and Distributed Systems, (to appear).


A Programming Methodology for Dual-tier Multicomputers - Baden, Fink (1999)   (8 citations)

....communication patterns will never change in the middle of computational phases, but only between them. Thus, we may describe block structured communication patterns using a table of meta data, containing descriptions of the regular sections to be moved, i.e. a communication schedule [19]. This model is sufficiently general to treat a wide range of applications, including uniform finite difference methods (Fig. 2), customized broadcasts within processor geometries (Fig. 11) and even 3 Though the blocks of data are themselves structured, in general the sizes may be non uniform, ....

....that represents a potentially irregular block data decomposition. Alternatively, a FloorPlan can represent distribution of work among processors within a single node, as it is derived from Map. The MotionPlan implements a dependence descriptor, which is also known as a communication schedule [19]. The programmer builds and manipulates MotionPlans using geometric Region calculus operations, a process which will be described shortly. 4.4 Storage Model The Point, Region, Map, FloorPlan, and MotionPlan meta data may live at any of the three levels of control flow. Meta data can pass ....

G. Agrawal, A. Sussman, and J. Saltz, "An integrated runtime and compile-time approach for parallelizing structured and block structured applications," IEEE Transactions on Parallel and Distributed Systems, vol. 6, Jul. 1995.


Software Infrastructure for Non-Uniform Scientific Computations on .. - Baden (1996)   (3 citations)

....communication handlers for the five communication patterns arising in the application. By customizing the handlers we were able to reduce communication overheads by as much as a factor of 2 to 4 [10] The run time system that provides the closest functionality to KeLP is Multiblock PARTI [1], which employs the inspector executor model. Under Multiblock PARTI, communication schedules are opaque objects over which the programmer has limited control. By comparison, KeLP s motion abstractions are exposed to the user as first class language objects, which provide the flexibility to handle ....

Gagan Agrawal, Alan Sussman, and Joel Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Trans. Parallel Distr. Sys., to appear.


Multiple data parallelism with HPF and KeLP - Merlin, Baden, Fink, Chapman (1998)   (4 citations)

....describes a set of block copy operations. It is built incrementally at runtime from its constituent block copy operations. Mover: which performs the communication pattern encoded in a MotionPlan in a collective communication operation. Communication is based on the inspector executor paradigm [1]. The MotionPlan and Mover generalise the schedule and executor, respectively, of this model. Finally, KeLP defines a parallel looping construct 5 , for all end for all, which iterates concurrently over the Grids of an XArray. For example, if X is an XArray then: forall (i, X) code ....

G. Agrawal, A. Sussman and J. Saltz, An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications, IEEE Trans. on Parallel and Distributed Systems, 6 (7) (1995) 747--754.


Efficient Communication Between Parallel Programs with InterComm - Jae-Yong Lee And   Self-citation (Sussman)

No context found.

G. Agrawal, A. Sussman, and J. Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 6(7):747--754, July 1995.


Interprocedural Compilation of Irregular Applications for.. - Gagan Agrawal (1995)   (4 citations)  Self-citation (Agrawal Saltz)

....in which communication preprocessing calls are inserted and or collective communication routines are used. We have shown in our previous work how communication preprocessing is useful in regular applications in which data distribution, strides and or loop bounds are not known at compile time [3, 5, 4, 32] or when the number of processors available for the execution of the program varies at runtime [15] The rest of the paper is organized as follows. In Section 2, we discuss the basic IPRE framework. In Section 3, we present several new optimizations required for compiling irregular applications. ....

Gagan Agrawal, Alan Sussman, and Joel Saltz. An integrated runtime and compile-time approach for parallelizing structured and block structured applications. IEEE Transactions on Parallel and Distributed Systems, 1995. To appear. Also available as University of Maryland Technical Report CS-TR-3143 and UMIACS-TR-93-94.


High Performance Fortran: History, Status and Future - Mehrotra, Van Rosendale, Zima (1997)

No context found.

G. Agrawal, A. Sussman and J. Saltz. An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications. IEEE Transactions on Parallel and Distributed Systems, 6(7):747--754, July, 1995.


Final Report on Research in Parallel Computing.. - December Carnegie (1996)

No context found.

Agrawal, G., A. Sussman, and J. Saltz. An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications. Technical Report 93-77, ICASE, 1993.

CiteSeer.IST - Copyright Penn State and NEC