Abstract: In this article, we present compiler optimizations to improve data
locality based on a simple yet accurate cost model. The model computes both temporal and spatial
reuse of cache lines to find desirable loop organizations. The cost model drives the application
of compound transformations consisting of loop permutation, loop fusion, loop distribution, and
loop reversal. We demonstrate that these program transformations are useful for optimizing
many programs. To validate our optimization strategy,...
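To illustrate why loop order matters for spatial reuse of cache lines, here is a toy sketch in Python. It is not the cost model from the article. It assumes a row-major array and a cache that holds only a single line, and the names `line_fetches`, `order`, and `line_elems` are hypothetical, chosen for this illustration only.

```python
def line_fetches(n_rows, n_cols, order, line_elems=8):
    """Count cache-line fetches for one sweep over a row-major array.

    Toy model (an assumption, not the article's cost model): the cache
    holds exactly one line of `line_elems` elements, so a fetch occurs
    whenever two consecutive accesses fall in different lines.
    """
    if order == "ij":
        # Inner loop walks along a row: unit stride in memory.
        indices = ((i, j) for i in range(n_rows) for j in range(n_cols))
    elif order == "ji":
        # Permuted loops: inner loop walks down a column, stride n_cols.
        indices = ((i, j) for j in range(n_cols) for i in range(n_rows))
    else:
        raise ValueError("order must be 'ij' or 'ji'")

    fetches = 0
    current_line = None
    for i, j in indices:
        # Linear element offset in row-major layout, mapped to its line.
        line = (i * n_cols + j) // line_elems
        if line != current_line:
            fetches += 1
            current_line = line
    return fetches
```

Under these assumptions, a 64 x 64 array with 8 elements per line needs 512 fetches in "ij" order (one per line) and 4096 fetches in "ji" order (one per access). A difference of this kind is what makes one loop permutation preferable to another.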
Cited by:
Exploiting Cache Locality At Run-Time - Yan (1998)
Concurrency And Computation: Practice And Experience - Concurrency Computat Pract
Analysis and Evaluation of The Synchronized - Pipelined Parallelism Model (2006)
Similar documents (at the sentence level):
44.3%: Compiler Optimizations for Improving Data Locality - Carr, McKinley, Tseng (1994)
6.6%: An Analysis of Loop Permutation on the HP PA-RISC - Carr, Wu (1995)
Active bibliography (related documents):
0.5: Improving Data Locality with Loop Transformations - McKinley (1996)
0.2: Maximizing Loop Parallelism and Improving Data Locality via.. - Kennedy, McKinley (1994)
0.2: A Compiler Optimization Algorithm for Shared-Memory Multiprocessors - McKinley (1998)
Similar documents based on text:
0.3: Finding Your Cronies: Static Analysis for Dynamic Object.. - Guyer, McKinley (2004)
0.3: Compiling for the Impulse Memory Controller - Huang, Wang, McKinley (2001)
0.3: Memory Management for High-Performance Applications - Berger (2002)
Related documents from co-citation:
63: A Data Locality Optimizing Algorithm - Wolf, Lam - 1991
41: Strategies for cache and local memory management by global program transformation.. - Gannon, Jalby et al. - 1988
33: The Cache Performance and Optimizations of Blocked Algorithms - Lam, Rothberg et al. - 1991
BibTeX entry:
Kathryn S. McKinley, Steve Carr, and Chau-Wen Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, 18(4):424--453, July 1996. http://citeseer.ist.psu.edu/article/mckinley96improving.html
@article{ mckinley96improving,
author = "Kathryn S. McKinley and Steve Carr and Chau-Wen Tseng",
title = "Improving Data Locality with Loop Transformations",
journal = "ACM Transactions on Programming Languages and Systems",
volume = "18",
number = "4",
month = "July",
publisher = "ACM Press",
pages = "424--453",
year = "1996",
url = "citeseer.ist.psu.edu/article/mckinley96improving.html" }
Citations (may not include all citations):
474: A data locality optimizing algorithm - Wolf, Lam - 1991
376: The cache performance and optimizations of blocked algorithm.. - Lam, Rothberg et al. - 1991
258: Automatic translation of Fortran programs to vector form - Allen, Kennedy - 1987
216: Strategies for cache and local memory management by global p.. - Gannon, Jalby et al. - 1988
178: Supernode partitioning - Irigoin, Triolet - 1988
171: Dependence graphs and compiler optimizations - Kuck, Kuhn et al. - 1981
158: Improving register allocation for subscripted variables - Callahan, Carr et al. - 1990
124: Tile size selection using cache organization and data layout - Coleman, McKinley - 1995
110: Practical dependence testing - Goff, Kennedy et al. - 1991
82: On estimating and enhancing cache effectiveness - Ferrante, Sarkar et al. - 1991
73: Iteration space tiling for memory hierarchies - Wolfe - 1987
71: Improving locality and parallelism in nested loops - Wolf - 1992
69: Access normalization: Loop restructuring for NUMA compilers - Li, Pingali - 1992
69: Estimating interlock and improving balance for pipelined mac.. - Callahan, Cocke et al. - 1988
65: The ParaScope parallel programming environment - Cooper, Hall et al. - 1993
51: Improving the ratio of memory operations to floating-point o.. - Carr, Kennedy
49: A methodology for procedure cloning - Cooper, Hall et al. - 1993
49: Memory-hierarchy management - Carr - 1992
49: The Tiny loop restructuring research tool - Wolfe - 1991
47: Scalar replacement in the presence of conditional control fl.. - Carr, Kennedy
43: Automatic loop interchange - Allen, Kennedy - 1984
34: A theory of loop permutations - Banerjee - 1990
32: Interprocedural transformations for parallel code generation - Hall, Kennedy et al. - 1991
30: Improving the performance of virtual memory computers - Abu-Sufah - 1979
23: Analysis and transformation in an interactive parallel progr.. - Kennedy, McKinley et al. - 1993
19: Automatic and interactive parallelization - McKinley - 1992
16: Advanced loop interchanging - Wolfe - 1986
13: A hierarchical basis for reordering transformations - Warren - 1984
1: An analysis of loop permutation on the HP PA-RISC - Carr, Wu - 1995
Documents on the same site (http://www.cs.umd.edu/~tseng/papers.html):
Reducing Synchronization Overhead for Compiler-Parallelized .. - Han, Tseng, Keleher (1997)
Unified Compilation Techniques for Shared and.. - Tseng, Anderson.. (1995)
Eliminating Conflict Misses for High Performance Architectures - Rivera, Tseng (1998)