Abstract: Limited set-associativity in hardware caches can cause conflict misses when multiple data items map to the same cache locations. Conflict misses have been found to be a significant source of poor cache performance in scientific programs, particularly within loop nests. We present two compiler transformations to eliminate conflict misses: 1) modifying variable base addresses, 2) padding inner array dimensions. Unlike compiler transformations that restructure the computation performed by the...
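The mapping problem the abstract describes can be illustrated with a small sketch. It is not the authors' algorithm: it only models a hypothetical direct-mapped cache (16 KB capacity and 32-byte lines are assumed values) and counts how often a[i] and b[i] land in the same cache set, first with b placed directly after a and then with b's base address shifted by one cache line. The names cache_set and conflicts are illustrative helpers, not taken from the report.

```python
# Assumed cache geometry (hypothetical values chosen for illustration only).
CACHE_BYTES = 16 * 1024               # total cache capacity
LINE_BYTES = 32                       # cache line size
NUM_SETS = CACHE_BYTES // LINE_BYTES  # direct-mapped: one line per set


def cache_set(addr):
    """Return the set index a byte address maps to in the assumed cache."""
    return (addr // LINE_BYTES) % NUM_SETS


def conflicts(base_a, base_b, n_elems, elem_bytes=8):
    """Count indices i for which a[i] and b[i] map to the same cache set."""
    return sum(
        cache_set(base_a + i * elem_bytes) == cache_set(base_b + i * elem_bytes)
        for i in range(n_elems)
    )


N = 2048                                       # 2048 doubles = exactly one cache size
base_a = 0
base_b_unpadded = base_a + N * 8               # b laid out directly after a
base_b_padded = base_b_unpadded + LINE_BYTES   # base address shifted by one line

unpadded_conflicts = conflicts(base_a, base_b_unpadded, N)
padded_conflicts = conflicts(base_a, base_b_padded, N)
print(unpadded_conflicts, padded_conflicts)
```

Under these assumptions every pair a[i], b[i] collides in the unpadded layout, and none collide once the base address of b is moved by a single line. Padding an inner array dimension plays a similar role for references to different columns of the same array.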
Context of citations to this paper:
...conflict cache misses related to the suboptimal data layout remain. Array padding has been proposed earlier to reduce the latter [16, 18, 20]. These approaches are useful for reducing the (cross-)conflict misses to some extent. However, existing approaches do not eliminate...
Cited by:
Cache Conscious Data Layout Organization for.. - Kulkarni, Ghez.. (2001)
Similar documents (at the sentence level):
18.0%: Data Layout Optimizations for High-Performance Architectures - Tseng
5.1%: Compiler Optimizations for High Performance Architectures - Han, Rivera, Tseng
Active bibliography (related documents):
0.2: Data Transformations for Eliminating Conflict Misses - Rivera, Tseng (1998)
0.1: Eliminating Conflict Misses for High Performance Architectures - Rivera, Tseng (1998)
0.1: A Comparison of Compiler Tiling Algorithms - Rivera, Tseng (1999)
Similar documents based on text:
0.3: Reordering and Storage Optimizations for Scientific Programs - Pike (2002)
0.3: Near-Optimal Padding for Removing Conflict Misses - Vera, Llosa, Gonzalez (2002)
0.1: Compiler Transformations for High-Performance Computing - Bacon, Graham, Sharp (1993)
BibTeX entry:
G. Rivera, C. Tseng, "Compiler optimizations for eliminating cache conflict misses", Technical Report CS-TR-3819, Dept of Computer Science, University of Maryland, July 1997. http://citeseer.ist.psu.edu/rivera97compiler.html
@techreport{ rivera97compiler,
author = "Gabriel Rivera and Chau-Wen Tseng",
title = "Compiler Optimizations for Eliminating Cache Conflict Misses",
institution = "Dept of Computer Science, University of Maryland",
number = "CS-TR-3819",
month = jul,
year = "1997",
url = "http://citeseer.ist.psu.edu/rivera97compiler.html" }
Citations (may not include all citations):
474 A data locality optimizing algorithm - Wolf, Lam - 1991
344 Design and evaluation of a compiler algorithm for prefetchin.. - Mowry, Lam et al. - 1992
294 A loop transformation theory and an algorithm to maximize pa.. - Wolf, Lam - 1991
283 Optimizing Supercompilers for Supercomputers - Wolfe - 1989
216 Strategies for cache and local memory management by global p.. - Gannon, Jalby et al. - 1988
216 Strategies for cache and local memory management by global pr.. - Gannon, Jalby et al. - 1987
178 Supernode partitioning - Irigoin, Triolet - 1988
173 SUIF: An infrastructure for research on parallelizing and op.. - Wilson - 1994
137 Compiler optimizations for improving data locality - Carr, McKinley et al. - 1994
124 Tile size selection using cache organization and data layout - Coleman, McKinley - 1995
113 Data and computation transformation for multiprocessors - Anderson, Amarasinghe et al. - 1995
106 Unifying data and control transformations for distributed sh.. - Cierniak, Li - 1995
82 On estimating and enhancing cache effectiveness - Ferrante, Sarkar et al. - 1991
81 Reducing false sharing on shared memory multiprocessors thro.. - Jeremiassen, Eggers - 1995
77 Cache miss equations: An analytical representation of cache .. - Ghosh, Martonosi et al. - 1997
71 Improving Locality and Parallelism in Nested Loops - Wolf - 1992
70 Simple but effective techniques for NUMA memory management - Bolosky, Fitzgerald et al. - 1989
67 Detecting coarse-grain parallelism using an interprocedural .. - Hall, Amarasinghe et al. - 1995
38 the problem of optimizing data transfers for complex memory .. - Gallivan, Jalby et al. - 1988
29 Shared data placement optimizations to reduce multiprocessor.. - Torrellas, Lam et al. - 1990
21 Data cache performance of supercomputer applications - Callahan, Porterfield - 1990
13 Performance prediction of loop constructs on multiprocessor .. - Gallivan, Jalby et al. - 1989
12 A quantitative analysis of loop nest locality - McKinley, Temam - 1996
1 Access normalization: Loop restructuring for NUMA compilers - andK - 1992
Documents on the same site (http://www.cs.umd.edu/projects/cosmic/papers.html):
Eliminating Conflict Misses for High Performance Architectures - Rivera, Tseng (1998)
Eliminating Barrier Synchronization for.. - Han, Tseng, Keleher (1998)
Data Layout Optimizations for High-Performance Architectures - Tseng
CiteSeer.IST - Copyright Penn State and NEC