Abstract: The Charon toolkit for piecemeal development of high-efficiency parallel programs for scientific computing is described. The portable toolkit, callable from C and Fortran, provides flexible domain decompositions and high-level distributed constructs for easy translation of serial legacy code or design to distributed environments. Gradual tuning can subsequently be applied to obtain high performance, possibly by using explicit message passing. Charon also features general structured communications that support stencil-based computations with complex recurrences. Through the separation of partitioning and distribution, the toolkit can also be used for blocking of uni-processor code, and for debugging of parallel algorithms on serial machines. An elaborate review of recent parallelization aids is presented to highlight the need for a toolkit like Charon. Some performance results of parallelizing the NAS Parallel Benchmark SP program using Charon are given, showing good scalability.