Compiling Stencils in High Performance Fortran (1997)

by Gerald Roth, John Mellor-Crummey, Ken Kennedy, R. Gregg Brickner
In Proceedings of SC '97: High Performance Networking and Computing
http://www.supercomp.org/sc97/proceedings/TECH/ROTH/ROTH.PS

Abstract:

For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. The efficient handling of such stencils is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is viewed as so important that some companies have built special-purpose compilers for handling them and others have added stencil recognizers to existing compilers. In this paper we present a general compilation strategy for stencils written using Fortran90 array constructs. Our strategy is capable of optimizing single-statement or multistatement stencils and applies equally well to stencils specified with shift intrinsics or with array syntax. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes that are known to us. In addition, our strategy produces highly optimized code in situations where the others fail, producing several orders of magnitude performance improvement, and thus provides a stencil compilation strategy that is more robust than its predecessors.
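
As a rough illustration of the two stencil formulations the abstract mentions, the sketch below writes a five-point sum once with whole-array circular shifts and once with in-place neighbor reads. It is in plain Python, not Fortran90 or HPF, and it is not the compilation strategy from the paper. The helper `cshift` is a hypothetical stand-in modeled on the Fortran 90 `CSHIFT` intrinsic, using a 0-based dimension argument; the function names and the 4x4 test grid are assumptions made for this example.

```python
# Illustrative sketch only; not the compilation strategy from the paper.

def cshift(a, shift, dim):
    """Circular shift of a 2-D list, modeled on Fortran 90 CSHIFT:
    along the chosen dimension, result[i] = a[(i + shift) % n].
    dim is 0-based here (0 = rows, 1 = columns), unlike Fortran."""
    rows, cols = len(a), len(a[0])
    if dim == 0:
        return [[a[(i + shift) % rows][j] for j in range(cols)]
                for i in range(rows)]
    return [[a[i][(j + shift) % cols] for j in range(cols)]
            for i in range(rows)]


def stencil_shift(a):
    """Five-point sum built from four whole-array shifted temporaries."""
    north, south = cshift(a, -1, 0), cshift(a, 1, 0)
    west, east = cshift(a, -1, 1), cshift(a, 1, 1)
    rows, cols = len(a), len(a[0])
    return [[a[i][j] + north[i][j] + south[i][j] + west[i][j] + east[i][j]
             for j in range(cols)] for i in range(rows)]


def stencil_interior(a):
    """Same sum on interior points only, reading neighbors in place
    (the analog of array-section syntax; border entries are left at 0)."""
    rows, cols = len(a), len(a[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (a[i][j] + a[i - 1][j] + a[i + 1][j]
                         + a[i][j - 1] + a[i][j + 1])
    return out


grid = [[4 * i + j for j in range(4)] for i in range(4)]
by_shift = stencil_shift(grid)
by_section = stencil_interior(grid)

# The two formulations agree wherever no wraparound is involved.
interior_match = all(by_shift[i][j] == by_section[i][j]
                     for i in range(1, 3) for j in range(1, 3))
print(interior_match)
```

The shift version materializes four full-size temporary arrays before any arithmetic happens. On a distributed-memory machine each such shift would also imply communication between processors, which is the kind of intraprocessor and interprocessor data movement overhead the abstract refers to.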
