D. Callahan and K. Kennedy. Compiling programs for distributed memory multiprocessors. The Journal of Supercomputing, 2(2), October 1988.
....with the others. This is the abstract graph method used for SIGNAL programs [15] • Compiling the source program into a single object program, and then distributing this centralized program towards as many programs as locations, so that each location only has to perform its own computations [5]. Owing to the common format OC, this method can be applied to any synchronous language. The last two methods are complementary: the distribution of source programs avoids the problem of code size explosion, while the distribution of object programs allows to take advantage of all the ....
D. Callahan and K. Kennedy. Compiling programs for distributed memory multiprocessors. Journal of Supercomputing, 2(2):151--169, June 1988.
....with the development of such compilers in the industry and academia are currently involved in defining High Performance Fortran (HPF), a new Fortran standard. Given the data distribution, the basic rule of compilation used by most of the systems described above is the owner computes rule [11], according to which it is the processor owning a data item that has to perform all computations for that item. Any data values required for the computation that are not available locally have to be obtained via interprocessor communication. Therefore, with such compilers, one of the most ....
D. Callahan and K. Kennedy. Compiling programs for distributed-memory multiprocessors. The Journal of Supercomputing, 2:151--169, October 1988.
....a distributed memory system in a way that maximizes the number of each processor's accesses that are satisfied by the tiles in its local memory is called the data partitioning problem or the domain decomposition problem. This problem is non trivial and has been the focus of much recent attention [2, 3, 6, 14, 17]. Our work applies to systems that perform data partitioning either at compile time or program creation time and to systems where we know which data tiles are accessed by which processors. An important problem faced by systems that perform data partitioning irrespective of whether they perform ....
D. Callahan and K. Kennedy. Compiling Programs for Distributed-Memory Multiprocessors. Journal of Supercomputing, 2:151-169, October 1988.
....this proposal uses the terms static software DSM and dynamic software DSM to refer to members of the first and second class, respectively. Static Approaches Static software DSM systems are typified by compilers for FORTRAN style scientific codes targeting message passing multicomputers [9, 41, 51, 64, 79, 87]. These are typically data parallel systems with a single thread of control; parallelism can only be expressed in the form of a large number of similar (possibly identical) operations applied in parallel to elements of large, dense arrays according to some user specified iteration space (parallel ....
David Callahan and Ken Kennedy. Compiling Programs for Distributed-Memory Multiprocessors. Journal of Supercomputing, pages 151--169, October 1988.
....added to ensure the semantics of the output code. Nevertheless, if we can prove the dividend is positive, we can use the Fortran division. 6 Related work Techniques to generate distributed code from sequential or parallel code using a uniform memory space have been extensively studied since 1988 [22, 70, 89]. Techniques and prototypes have been developed based on Fortran [38, 39, 47, 18, 69, 88, 19, 20], C [8, 63, 6, 60, 7, 61] or other languages [74, 75, 58, 66, 57]. The most obvious, most general and safest technique is called run time resolution [22, 70, 74]. Each instruction is guarded by a ....
.... been extensively studied since 1988 [22, 70, 89]. Techniques and prototypes have been developed based on Fortran [38, 39, 47, 18, 69, 88, 19, 20], C [8, 63, 6, 60, 7, 61] or other languages [74, 75, 58, 66, 57]. The most obvious, most general and safest technique is called run time resolution [22, 70, 74]. Each instruction is guarded by a condition which is only true for processors that must execute it. Each memory address is checked before it is referenced to decide whether the address is local and the reference is executed, whether it is remote, and a receive is executed, or whether it is ....
David Callahan and Ken Kennedy. Compiling programs for distributed-memory multiprocessors. The Journal of Supercomputing, 2:151--169, 1988.
....it is difficult to program such an architecture. The programmer must explicitly distribute data and must program the transfer of data among the processors. One approach considered by researchers provides a combination of language extensions and compilation techniques for programming such systems [1] [9] [10] [15]. In this approach, the user specifies the distribution of data by using language extensions and the compiler translates the program for single program multiple data (SPMD) execution. One approach to translation yields a program in which a processor executes a statement only if a ....
....1. Avoiding sequentialization caused by communication; 2. Avoiding redundant communication; 3. Overlapping of communication and computation; and 4. Combining small messages sent to the same destination into larger messages. Significant work has been done in optimizing the communication [1] [6] [4] [11]. In this paper, we propose to perform all of the above optimizations in a unifying framework. This framework allows us to deal with the tradeoff between conflicting optimizations. We develop a data flow framework for collecting the information needed to perform the above ....
D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," The Journal of Supercomputing, Vol. 2, 1988.
....changes may require major program restructuring. An alternative approach is to automatically generate parallel programs in SPMD (Single Program Multiple Data) [Karp87] format, given a data decomposition specification. This approach has recently gained a lot of attention. It has been applied by [Callahan88, Gerndt89, Kennedy89,90] for applications to Fortran, by [Andre90] to C, by [Rogers89] to Id Nouveau, by [Koelbel90] to Kali Fortran, by [Quinn89] to C , and by [Paalvast90] to the fourth generation parallel programming language Booster. In particular application to Fortran shows some limitations, due to equivalencing, ....
....and integrated. Several optimizations have been derived for important classes of data decompositions and access functions which are generally applicable. Several optimizations discussed in this paper have also been presented by others, albeit in the context of some programming language [Callahan88, Gerndt89, Koelbel90, Paalvast90] . We have generalized these results and placed them in a general context. Advantage of the methodology lies in application of the view concept, which through its associated calculus, allows for automated reasoning about compile time optimizations and code generation for a broad range of ....
D. Callahan, K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," The Journal of Supercomputing, Vol. 2, No. 2, Oct. 1988.
....This insight has led to the approach of parallelism through program annotations, incorporating explicit data decompositions. From these data decomposition specifications, SPMD (Single Process Multiple Data) code [Karp87] can be generated automatically. This approach is followed by [Callahan88, Gerndt89, Kennedy89, Koelbel89] in FORTRAN, by [Rogers89] in Id Nouveau, and by [Quinn89] in C . This concept is also followed in Booster [Paalvast90]. 3. Booster Language concepts Booster is a high level, fourth generation, algorithm description language for sequential and parallel computers. Parallel computers may be either ....
D. Callahan, K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," The Journal of Supercomputing, Vol. 2, No. 2, October 1988, pp. 151-169.
....communication and synchronization, is generated automatically by the compiler. Furthermore, the compiler uses a model base of target architectures in order to optimize computation and communication efficiency. The approach of inducing parallelism by explicitly decomposing the data is not new. In [Callahan88, Gerndt89, Kennedy89] applications to Fortran are described, in [Rogers89] to Id Nouveau, in [Koelbel87] to BLAZE, and in [Quinn89] to C . In particular application to Fortran is limited, because of equivalencing, passing of array subsections to subroutine calls, and any form of indirect addressing cannot be ....
D. Callahan, K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," The Journal of Supercomputing, Vol. 2, No. 2, October 1988, pp. 151-169.
....to improve the efficiency. The programming model offered by Fortran D [3] and HPF [4] gives the programmer control over how to align and distribute data structures across processors. The compiler has to be able to assign work to processors (the ownership rule is the simplest way to do this [5]) and restructure loop nests with the aim of avoiding non local accesses as much as possible, and when necessary, optimize communication to transfer remote data [6]. The single program multiple data (SPMD) model [7] is used to generate code. Each processor runs the same program but accesses to ....
Callahan D. and Kennedy K., Compiling Programs for Distributed-Memory Multiprocessors, Journal of Supercomputing, vol. 2, no. 2, October 1988.
....data among processors trying to maximize the parallelism obtained with a good load balance over time while minimizing the amount of interprocessor communication required. Other approaches try to extend a programming language with directives that control the mapping of variables to local memories [CaKe88, KeZi89, Tsen89]. The compiler automatically tries to perform a task partitioning assuming the data partitioning specified by the user. It also inserts all message passing communication that is required to maintain the semantics of the original sequential program. Optimizing communication by message fusion is ....
D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors", The Journal of Supercomputing, No. 2, October 1988.
....aspect. Instead, the implementations are compared on the processor overhead of constructing the messages. 3. 4 Related work The automatic generation of message passing programs from data distribution specifications has been explored for some time in the context of various data parallel languages [7, 8, 9, 10, 11, 12]. The recent definition of HPF [1] has added some new data alignment and data distribution features for which no efficient solutions existed. As a consequence, new results have been reported in [13, 6, 14, 15, 16, 17, 18, 19, 20, 21] and, more recently and concurrent with this paper, [22, 23, 20, ....
....As a consequence, new results have been reported in [13, 6, 14, 15, 16, 17, 18, 19, 20, 21] and, more recently and concurrent with this paper, [22, 23, 20, 24, 25, 26]. Early optimization techniques only consider non aligned arrays. The first optimizations were reported by Callahan and Kennedy [7] and Gerndt [8]. They considered non aligned block(m) distributions with linear array access functions. Gerndt also showed how overlap can be handled. In Paalvast et al. [10] a solution for monotone array access functions and block(m) distributions was given. Solutions for cyclic(1) and linear ....
D. Callahan and K. Kennedy, "Compiling programs for distributed-memory multiprocessors", The Journal of Supercomputing, vol. 2, no. 2, pp. 151--169, October 1988.
....machines are partitioning the data and computation across processors, then introducing communication for nonlocal accesses where needed. The compiler partitions computation across processors using the owner computes rule where each processor only computes values of data it owns [6, 12, 25]. It performs a large number of communication and parallelism optimizations based on data dependence. Details of the compilation process are presented elsewhere [13, 16, 17, 27]. 2.2 Prototype Compiler The prototype Fortran D compiler is implemented as a source to source Fortran translator in the ....
....of the CM Fortran compiler, Fortran 90 syntax does not eliminate the need for advanced compile time analysis and optimization. 6 Related Work The Fortran D compiler is a second generation distributed-memory compiler that incorporates and extends features from previous compilation systems [6, 12, 19, 21, 25]. Compared with other contemporary systems [2, 3, 7, 8, 18, 24], the Fortran D compiler is less flexible but performs deeper compile time analysis, many more advanced optimizations, requires fewer language extensions, and relies on less runtime support. Few researchers have published experimental ....
D. Callahan and K. Kennedy. Compiling programs for distributed-memory multiprocessors. Journal of Supercomputing, 2:151-169, October 1988.
....arbitrary user specified contiguous rectangular distributions. Superb also originated the overlap concept as a means to both specify and store nonlocal data accesses. Wolfe [Wol89, Wol90] describes transformations such as loop rotation for programs with BLOCK distributions. Callahan and Kennedy [CK88] propose methods for compiling programs with user-specified data distribution functions. They also demonstrate how such programs can be optimized using loop transformations. Booster [PvGS90] also provides user specified distribution functions defined as program views, but does not generate or ....
D. Callahan and K. Kennedy. Compiling programs for distributed-memory multiprocessors. Journal of Supercomputing, 2:151--169, October 1988.
No context found.
D. Callahan and K. Kennedy. Compiling programs for distributed memory multiprocessors. The Journal of Supercomputing, 2(2), October 1988.
No context found.
D. Callahan and K. Kennedy. Compiling programs for distributed memory multiprocessors. The Journal of Supercomputing, 2(2), October 1988.
No context found.
D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors", The Journal of Supercomputing, No. 2, October 1988.
No context found.
D. Callahan and K. Kennedy. "Compiling Programs for Distributed-Memory Multiprocessors". J. Supercomputing 2, October 1988, pp. 151-169, Kluwer Academic Publishers.
No context found.
D. Callahan and K. Kennedy, "Compiling programs for distributed-memory multiprocessors," The Journal of Supercomputing, vol. 2, no. 2, pp. 151--169, Oct. 1988.
No context found.
D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," The Journal of Supercomputing, no. 2, pp. 151-169, 1988, Kluwer Academic Publishers.
No context found.
David Callahan and Ken Kennedy. Compiling programs for distributed-memory multiprocessors. The Journal of Supercomputing, 2(2):151-169, October 1988.
No context found.
David Callahan and Ken Kennedy. Compiling programs for distributed-memory multiprocessors. Journal of Supercomputing, 2:151-169, 1988.
No context found.
D. Callahan and K. Kennedy. Compiling programs for distributed-memory multiprocessors. Journal of Supercomputing, 2:151-169, October 1988.
No context found.
D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors", The Journal of Supercomputing, No. 2, October 1988.