9 citations found.
G. Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1999.

This paper is cited in the following contexts:
Flattening is an Improvement (Extended Abstract) - Riely, Prins (2000)

....Skillicorn and Cai [31] presented a cost calculus for parallel programs using the Bird-Meertens formalism. This approach has been developed further by Jay [16,15] using shape analysis. Another promising direction is that of Keller, who develops transformations that take distribution into account [17]. In this setting, flattening can profitably be combined with deforestation and related techniques [35,18,11]. Nested data parallelism may be seen as a particular form of the more general and-parallelism found in logic programs [13]. Research on the parallel ....

G. Keller. Transformation-Based Implementation of Nested Data-Parallelism for Distributed Memory Machines. PhD thesis, TU Berlin, 1999.


Skeleton Implementations Based on Generic Data Distributions - Nitsche (2000)   (1 citation)

....only offers arrays, but allow the programmer to define his own data distributions (covers) as well as his own skeletons in terms of such covers. While systems that only deal with one specific data type like arrays in HPF or nested vectors in NESL [Ble90] can generate quite efficient code for them [Kel99], we do not expect that an automatic transformation of a high-level specification of data distributions with arbitrary types will always give the optimal solution. Therefore, we allow the user to give additional information which may be hard or impossible to analyse in the general case. The ....

G. Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1999.


Flattening is an Improvement - Riely, Prins (2000)   (4 citations)

....Skillicorn and Cai [31] presented a cost calculus for parallel programs using the Bird-Meertens formalism. This approach has been developed further by Jay [16,15] using shape analysis. Another promising direction is that of Keller, who develops transformations that take distribution into account [17]. In this setting, flattening can profitably be combined with deforestation and related techniques [35,18,11]. Nested data parallelism may be seen as a particular form of the more general and-parallelism found in logic programs [13]. Research on the parallel ....

G. Keller. Transformation-Based Implementation of Nested Data-Parallelism for Distributed Memory Machines. PhD thesis, TU Berlin, 1999.


Irregular Computations in Fortran - Expression and.. - Prins, Chatterjee..

....and increased load imbalance. Recent work has resulted in run-time scheduling techniques that minimize completion time and memory use of the generated threads [11, 8, 27]. The automatic construction of threads of appropriate granularity is currently being investigated by several researchers [26, 19]. In Fig. 4(c) we show a decomposition of the total work into four parallel threads T1, ..., T4. In this decomposition the body of the inner FORALL loop has been serialized to increase the grain size of each thread. As a result the amount of work in each thread is quite different. On the other ....

....work compared with a sequential implementation. However, full parallelism and optimal load balance are easily achieved in this approach. Compile-time techniques to fuse data-parallel operations can reduce the number of barrier synchronizations, decrease space requirements, and improve reuse [14, 31, 19]. To flatten the sparse matrix-vector product smvp, we replace the nested sequence representation of A with a linearized (flattened) representation ⟨A', s⟩. Here A' is an array of r pairs, indexed by val and col, partitioned into rows of A by s, i.e. s is an array of n integers equal to ....

[Article contains additional citation context not shown here]

G. Keller. Transformation-based Implementation of Nested Data-Parallelism for Distributed Memory Machines. Dissertation, Technische Universität Berlin, 1999.
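
The flattened sparse matrix-vector product mentioned in the second excerpt above can be illustrated with a minimal Python sketch. It is not taken from the cited article: the excerpt is cut off before it says what the entries of s are, so the sketch assumes that s holds the number of stored entries in each row, and the names flatten_rows and smvp_flat are hypothetical.

```python
from itertools import accumulate


def flatten_rows(nested):
    """Turn nested rows [[(val, col), ...], ...] into a flat array plus segment lengths."""
    flat = [pair for row in nested for pair in row]
    seg = [len(row) for row in nested]  # ASSUMPTION: s = number of entries per row
    return flat, seg


def smvp_flat(flat, seg, x):
    """Sparse matrix-vector product over the flattened (flat, seg) representation."""
    # One flat elementwise step: multiply every stored value by its x entry.
    products = [val * x[col] for val, col in flat]
    # Segmented sum: each row owns one contiguous slice of the product array.
    ends = list(accumulate(seg))
    starts = [end - n for end, n in zip(ends, seg)]
    return [sum(products[a:b]) for a, b in zip(starts, ends)]
```

For example, flatten_rows([[(2.0, 0), (1.0, 2)], [], [(3.0, 1)]]) yields six fields of data in one flat array of three pairs and the segment lengths [2, 0, 1]. The multiplication then runs over all stored entries at once, independent of how unevenly they are spread across rows.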


Functional Array Fusion - Chakravarty, Keller (2001)   (4 citations)  Self-citation (Keller)

....and, in case of lazy languages, opportunities for unboxing are increased; 2) it improves load balancing and data distribution in a parallel implementation. The transformation was mainly studied for the second reason. There is plenty of literature discussing the merits of flattening for parallelism [7, 33, 11, 34, 21, 22]. We recently extended flattening to be suitable for typed functional languages such as Haskell and Standard ML [10]. In the following, we merely want to summarise the main properties of the code resulting from flattening as this will be the input to the transformations discussed in this paper. For ....

G. Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1999.


Nepal - Nested Data-Parallelism in Haskell - Chakravarty, Keller, al. (2001)   (1 citation)  Self-citation (Keller)

....arrays to be distributed across processing nodes if they occur in a program executed on a distributed memory machine. It is the responsibility of the execution mechanism to select a distribution which realises a good compromise between optimal load balance and minimal data redistribution; see [22] for the corresponding implementation techniques. The type of a parallel array containing elements of type is denoted by [ This notation is similar to the list syntax and, in fact, parallel arrays enjoy the same level of syntactic support as lists where the brackets [: and :] denote array ....

G. Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1999.


How Portable is Nested Data Parallelism? - Chakravarty, Keller (1999)   Self-citation (Keller)

....[9] and [Fig. 1. Architecture space] the second author's thesis [14] demonstrated that flattened code can indeed be successfully compiled to modern machines. In the approach discussed in this paper, we optimise flattened code using calculational fusion [19, 14] and generate C code that uses collective communication operations, which are implemented using one-sided ....

....[Fig. 1. Architecture space] the second author's thesis [14] demonstrated that flattened code can indeed be successfully compiled to modern machines. In the approach discussed in this paper, we optimise flattened code using calculational fusion [19, 14] and generate C code that uses collective communication operations, which are implemented using one-sided communication; the latter allows a uniform view on different memory models. In summary, the paper makes the following three contributions: 1) It proposes a design for a highly portable ....

[Article contains additional citation context not shown here]

Gabriele Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1999.


On the Distributed Implementation of Aggregate Data.. - Keller, Chakravarty (1999)   (3 citations)  Self-citation (Keller)

....are easily reordered and can be proved correct individually. Although we provide some technical detail, we do not have enough room for discussing all important points; the proposed technique has been applied to the implementation of nested data parallelism in the first author's PhD thesis [17], where the missing detail can be found. Our main contributions are the following three: We introduce the notion of distributed types, to statically distinguish between local and distributed data structures. We outline an intermediate language based on distributed types that allows to ....

.... following: propagate A : ⟨⟨A⟩⟩ → ⟨⟨A⟩⟩ for integers and floats; propagate maxA : ⟨⟨A⟩⟩ → ⟨⟨A⟩⟩ for integers and floats; propagate and : ⟨⟨Bool⟩⟩ → ⟨⟨Bool⟩⟩; and so on. Formally, the semantics of the split, join, and propagate operations can be characterized by a set of axioms [17]. 4 Optimizing with Distributed Types. With L_DT we can realize the three compiler phases marked by the grey area in Figure 1: unfolding of the library primitives, optimizations, and code generation. We do not have the space to discuss them in detail, but like to outline some important points of ....

[Article contains additional citation context not shown here]

Gabriele Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1998. To appear.


More Types for Nested Data Parallel Programming - Chakravarty, Keller (2000)   (7 citations)  Self-citation (Keller)

....for exploiting the memory hierarchies of modern processor architectures. We use a combination of an intermediate language that makes distribution explicit in the type system and aggressive deforestation techniques to produce code that performs well on networked, cache-based processors [17, 15]; this is handled by the transformation steps following the grey box in Figure 1 and not further discussed in the present paper. Currently, the only alternative to using the flattening transformation for the implementation of general nested parallelism are thread-based compilation techniques [2, ....

....v = e1 in e2 ) v :e2 ) e1 . Although, for an efficient implementation of a wide range of data-parallel algorithms, we need some more functions (like permutations, reductions, and scans), we omit them here as their inclusion would not lead to any additional technical insights. See, for example, [15] for a comprehensive discussion of the required functions and their parallel implementation. Regarding mapP, we assume that the restrictions outlined in Subsection 2.4 are met. We will neither formalise the type system nor the semantics of PA due to the limited space and as both follow the ....

[Article contains additional citation context not shown here]

G. Keller. Transformation-based Implementation of Nested Data Parallelism for Distributed Memory Machines. PhD thesis, Technische Universität Berlin, Fachbereich Informatik, 1999.

CiteSeer.IST - Copyright Penn State and NEC