A General Stencil Compilation Strategy for Distributed-Memory Machines (1996)
Gerald Roth, Steve Carr, John Mellor-Crummey, Ken Kennedy



View or download:
rice.edu/pub/CRPC...PCTR96652S.ps.gz
rice.edu/public/roth/stencilTR.ps.gz
gonzaga.edu/~roth/papers...stencilTR.ps

From:  rice.edu/CRPC/softli...TRs_online
From:  rice.edu

Abstract: For many Fortran 90 programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. This paper describes a strategy for optimizing such stencil computations for execution on distributed-memory multiprocessors. The optimizations presented target the overhead of data movement that occurs between processors, within the local memory of the processors, and between the memory and registers of the processors. We focus on the...

Similar documents (at the sentence level):
47.2%:   Optimizing Fortran90D/HPF for Distributed-Memory Computers - Roth (1997)
24.3%:   Compiling Stencils in High Performance Fortran - Roth, Mellor-Crummey.. (1997)
20.3%:   Compiling Stencils in High Performance Fortran Gerald

Active bibliography (related documents):
0.7:   Loop Fusion in High Performance Fortran - Roth, Kennedy (1998)
0.3:   Optimizing Fortran 90D Programs for SIMD Execution - Roth (1993)
0.2:   Data Motion and High Performance Computing - S. Lennart Johnsson (1994)

Similar documents based on text:
0.4:   A New Parallel Matrix Multiplication Algorithm on.. - Choi (1997)
0.3:   Automatic Optimization of Communication in Compiling .. - Bordawekar.. (1996)

BibTeX entry:

@misc{ roth-general,
  author = "Gerald Roth and Steve Carr and John Mellor-Crummey and Ken Kennedy",
  title = "A General Stencil Compilation Strategy for Distributed-Memory Machines",
  year = "1996",
  url = "citeseer.ist.psu.edu/roth96general.html" }
Citations (may not include all citations):
835   High Performance Fortran language specification - Fortran - 1993
474   A data locality optimizing algorithm - Wolf, Lam - 1991
415   Efficiently computing static single assignment form and the .. - Cytron, Ferrante et al. - 1991
376   The cache performance and optimizations of blocked algorithm.. - Lam, Rothberg et al. - 1991
283   Optimizing Supercompilers for Supercomputers - Wolfe - 1989
201   Register allocation via coloring - Chaitin, Auslander et al. - 1981
158   Improving register allocation for subscripted variables - Callahan, Carr et al. - 1990
137   Compiler optimizations for improving data locality - Carr, McKinley et al. - 1994
74   Updating distributed variables in local computations - Gerndt - 1990
72   A catalogue of optimizing transformations - Allen, Cocke - 1972
69   Estimating interlock and improving balance for pipelined mac.. - Callahan, Cocke et al. - 1988
51   Improving the ratio of memory operations to floating-point o.. - Carr, Kennedy - 1994
48   Dependence Analysis for Subscripted Variables and Its Applic.. - Allen - 1983
44   Fortran at ten gigaflops: The Connection Machine convolution.. - Bromley, Heller et al. - 1991
25   Problems to test parallel and vector languages - Rice, Jing - 1990
22   for MIMD distributed-memory machines - Choudhary, Fox et al. - 1992
19   Optimization of very high level languages - Schwartz - 1975
13   Loop quantization: An analysis and algorithm - Aiken, Nicolau - 1987
12   A compiler for a massively parallel distributed memory MIMD .. - Sabot - 1992
12   Polyshift communications software for the Connection Machine.. - George, Brickner et al. - 1994
10   Tile size selection using cache organization - Coleman, McKinley - 1995
9   Optimization techniques for SIMD Fortran compilers - Knobe, Lukas et al. - 1993
8   Optimizing for parallelism and memory hierarchy - Kennedy, McKinley - 1992
7   A stencil compiler for the Connection Machine models CM - Brickner, George et al. - 1993
7   Optimized CM Fortran compiler for the Connection Machine com.. - Sabot - 1992
6   Optimizing Fortran 90 shift operations on distributed-memory.. - Kennedy, Mellor-Crummey et al. - 1995
6   Context optimization for SIMD execution - Kennedy, Roth - 1994
5   Performance analysis of four SIMD machines - Fatoohi - 1993

Documents on the same site (http://softlib.rice.edu/CRPC/softlib/TRs_online.html):
Experiences on Data-Parallel Programming - Clark, von Hanxleden, Kennedy (1994)
A Priori Estimates for Mixed Finite Element.. - Cowsar, Dupont, Wheeler
An Empirical Evaluation of Dependence Analysis in Parallel Program .. - Monk (1995)
