Evaluating the Impact of Memory System Performance on Software Prefetching and Locality Optimizations (2000)  (6 citations)
Aneesh Aggarwal, Abdel-Hameed A. Badawy, Chau-Wen Tseng, Donald Yeung



View or download:
umd.edu/projects/cosmic/p...memtr00.ps
umd.edu/pub/SoftwareLocality.ps.gz

From:  umd.edu/~tseng/papers

Abstract: Software prefetching and locality optimizations are techniques for overcoming the gap between processor and memory speeds. Using the SimpleScalar simulator, we evaluate the impact of memory bandwidth and latency on the effectiveness of software prefetching and locality optimizations on three types of applications: regular scientific codes, irregular scientific codes, and pointer-based codes. We find software prefetching hides memory costs but increases instruction count and requires greater...

Cited by:
Optimizing Compiler for a CELL Processor - Eichenberger, O'Brien, O'Brien.. (2005)
Effectiveness of Simple Memory Models for Performance.. - Irina Chihaia Thomas (2004)
Avoiding Store Misses to Fully Modified Cache Blocks - Unknown

Similar documents (at the sentence level):
31.8%:   Evaluating the Impact of Memory System Performance on.. - Badawy, Aggarwal.. (2001)
5.7%:   Software Support For Improving Locality in Advanced Scientific Codes - Tseng (2000)

Active bibliography (related documents):
0.8:   Multi-Chain Prefetching: Exploiting Natural Memory Parallelism.. - Seungryul
0.4:   Software Support For Improving Locality in Scientific Codes - Han, Rivera, Tseng (2000)
0.3:   Quantifying the Performance Potential of a Data Prefetch.. - Mehrotra, Harrison (1995)

Similar documents based on text:
0.3:   Multi-Chain Prefetching: Exploiting Memory Parallelism in.. - Kohout, Choi, Yeung (2000)
0.1:   Exploiting Application-Level Information to Reduce Memory.. - Agarwal, Yeung (2002)
0.1:   A Study of Source-Level Compiler Algorithms - For Automatic Construction

Related documents from co-citation:
3:   Cache-conscious structure layout - Chilimbi, Hill et al. - 1999
2:   Memory bandwidth bottleneck and its amelioration by a compiler - Ding, Kennedy - 1999
2:   Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Assoc.. - Jouppi - 1990

BibTeX entry:

A. Aggarwal, A.-H. Badawy, D. Yeung, and C.-W. Tseng. Evaluating the impact of memory system performance on software prefetching and locality optimizations. Technical Report CS-TR-4169, Dept. of Computer Science, University of Maryland at College Park, July 2000. http://citeseer.ist.psu.edu/article/aggarwal00evaluating.html

@techreport{ aggarwal00evaluating,
  author = "A. Aggarwal and A.-H. Badawy and D. Yeung and C.-W. Tseng",
  title = "Evaluating the impact of memory system performance on software prefetching
    and locality optimizations",
  institution = "Dept. of Computer Science, University of Maryland at College Park",
  number = "CS-TR-4169",
  month = jul,
  year = "2000",
  url = "http://citeseer.ist.psu.edu/article/aggarwal00evaluating.html" }
Citations (may not include all citations):
474   A data locality optimizing algorithm - Wolf, Lam
443   Improving Direct-Mapped Cache Performance by the Addition of.. - Jouppi - 1990
376   The cache performance and optimizations of blocked algorithm.. - Lam, Rothberg et al. - 1991
249   Tolerating Latency Through Software-Controlled Data Prefetchi.. - Mowry - 1994
249   Tolerating latency through software-controlled prefetching i.. - Mowry, Gupta - 1991
170   A partitioning strategy for nonuniform problems on multiproc.. - Berger, Bokhari - 1987
162   Improving data locality with loop transformations - McKinley, Carr et al. - 1996
161   The SimpleScalar Tool Set - Burger, Austin - 1997
149   Software prefetching - Callahan, Kennedy et al. - 1991
124   Tile size selection using cache organization and data layout - Coleman, McKinley - 1995
121   An Architecture for Software-Controlled Data Prefetching - Klaiber, Levy - 1991
115   Communication optimizations for irregular scientific computa.. - Das, Uysal et al. - 1994
112   Supporting dynamic data structures on distributed memory mac.. - Rogers, Carlisle et al. - 1995
104   Prefetching using Markov Predictors - Joseph, Grunwald - 1997
104   Compiler-based prefetching for recursive data structures - Luk, Mowry - 1996
98   Evaluating Stream Buffers as a Secondary Cache Replacement - Palacharla, Kessler - 1994
94   Stride Directed Prefetching in Scalar Processors - Fu, Patel et al. - 1992
83   Data transformations for eliminating conflict misses - Rivera, Tseng - 1998
78   Data Prefetching in Multiprocessor Vector Cache Memories - Fu, Patel - 1991
77   Cache miss equations: An analytical representation of cache .. - Ghosh, Martonosi et al. - 1997
73   Dependence Based Prefetching for Linked Data Structures - Roth, Moshovos et al. - 1998
73   Cache-conscious structure layout - Chilimbi, Hill et al. - 1999
72   Cache-conscious data placement - Calder, Krintz et al. - 1998
57   Improving cache performance of dynamic applications with com.. - Ding, Kennedy - 1999
48   New tiling techniques to improve cache temporal locality - Song, Li - 1999
38   Effective Jump-Pointer Prefetching for Linked Data Structure.. - Roth, Sohi - 1999
34   A Prefetching Technique for Irregular Accesses to Linked Dat.. - Karlsson, Dahlgren et al. - 2000
31   A comparison of compiler tiling algorithms - Rivera, Tseng - 1999
30   Early experiences with Olden - Carlisle, Rogers et al. - 1993
30   Improving memory hierarchy performance for irregular applica.. - Mellor-Crummey, Whalley et al. - 1999
29   Compiler and software distributed shared memory support for .. - Lu, Cox et al. - 1997
26   Augmenting loop tiling with data alignment for improved cach.. - Panda, Nakamura et al. - 1999
24   Examination of a Memory Access Classification Scheme for Poi.. - Mehrotra, Harrison - 1996
23   Localizing nonaffine array references - Mitchell, Carter et al. - 1999
21   Tolerating latency in multiprocessors through compiler-inser.. - Mowry - 1998
17   A comparison of locality transformations for irregular codes - Han, Tseng - 2000
16   Memory hierarchy management for iterative graph structures - Al-Furaih, Ranka - 1998
12   Effective Hardware-Based Data Prefetching for High-Performanc.. - Chen, Baer - 1995
11   Tiling optimizations for 3d scientific computations - Rivera, Tseng - 2000
11   Compiler Optimization Technique for Data Cache Prefetching U.. - Chi - 1994
9   Sunder: A Programmable Hardware Prefetch Architecture for Nu.. - Chiueh - 1994
7   An Effective Programmable Prefetch Engine for On-Chip Caches - Chen - 1995
6   An experimental evaluation of tiling and shackling for memory.. - Kodukula, Pingali et al. - 1999
4   Streaming Prefetch - Temam - 1996





Documents on the same site (http://www.cs.umd.edu/~tseng/papers.html):
Reducing Synchronization Overhead for Compiler-Parallelized .. - Han, Tseng, Keleher (1997)
Unified Compilation Techniques for Shared and.. - Tseng, Anderson.. (1995)
Eliminating Conflict Misses for High Performance Architectures - Rivera, Tseng (1998)


CiteSeer.IST - Copyright Penn State and NEC