by Christoph W. Keßler, Fachbereich Informatik
Download: http://www.cs.rice.edu/~kremer/AP95/papers/kessler.ps
Abstract:
Scalable parallel numerical libraries and automatically parallelizing compilers seem to be contrary approaches to the same goal: the user-friendly generation of efficient parallel numerical programs for shared-memory and distributed-memory multiprocessors. We propose a framework that integrates the library approach and parallelizing compiler technology. It is based on fast and powerful pattern recognition in sequential source programs and considerate local algorithm replacement. By a simplified prototype implementation, we demonstrate the functionality of this approach for a massively parallel shared-memory target machine, the SB-PRAM. We further propose constructive guidelines to adapt the method to distributed-memory multiprocessors by integrating an automatic array distribution engine, by synthetic performance prediction, and by data-distribution-independent design of library routine specification.
Citations
552 | Partial evaluation and automatic program generation | Jones, Gomard, et al. | 1993
150 | ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers | Choi, Dongarra, et al. | 1992
129 | Data optimization: Allocation of arrays to reduce communication on SIMD machines | Knobe, Lukas, et al. | 1990
120 | Compiling communication-efficient programs for massively parallel machines | Li, Chen | 1991
115 | Index Domain Alignment: Minimizing Cost of Cross-referencing between Distributed Arrays | Li, Chen | 1990
113 | Supporting Shared Data Structures on Distributed Memory Architectures. PPoPP | Koelbel, Mehrotra, et al. | 1990
88 | User's guide to the p4 parallel programming system | Butler, Lusk | 1992
66 | Automatic Data Layout Using 0-1 Integer Programming | Bixby, Kennedy, et al. | 1993
61 | NP-completeness of Dynamic Remapping | Kremer | 1993
26 | Data Optimization: Minimizing Residual Interprocessor Data Motion on SIMD machines | Knobe, Natarajan | 1990
19 | The multicomputer toolbox approach to concurrent | Falgout, Skjellum, et al. | 1992
14 | On Physical Realizations of the Theoretical PRAM Model | Abolhassan, Keller, et al. | 1991
13 | The MPI Message Passing Interface Standard | Clarke, Glendinning, et al. | 1994
11 | FORK: A High-Level Language for PRAMs. Future Generation Computer Systems | Hagerup, Schmitt, et al. | 1992
8 | Automatic parallelization for distributed memory multiprocessors | Dierstein, Hayer, et al. | 1994
7 | ScaLAPACK Reference Manual: Parallel Factorization | Choi, Dongarra, et al. | 1994
5 | Automatische Parallelisierung numerischer Programme durch Mustererkennung | Keßler | 1994
5 | Symbolic Array Data Flow Analysis and Pattern Recognition in Dense Matrix Computations | Keßler | 1994
2 | Pattern-Driven Automatic Program Transformation and Parallelization | Keßler | 1995
2 | Automatic Parallelization by Pattern Matching | Keßler, Paul | 1993
2 | Making FORK Practical | Kessler, Seidl | 1995
1 | The Integrated Library Approach to Parallel Computing | Knies, Adams | 1993