Abstract: The memory system remains a major performance bottleneck in modern and future architectures. In this dissertation, we propose a hardware/software cooperative approach and demonstrate its effectiveness. This approach combines the global yet imperfect view of the compiler with the timely yet narrow-scope context of the hardware. It relies on a light-weight extension to the instruction set architecture to convey compile-time knowledge (hints) to the...
BibTeX entry:
@phdthesis{ wang-cooperative,
author = "Zhenlin Wang",
title = "Cooperative Hardware/Software Caching for Next-Generation Memory Systems",
url = "citeseer.ist.psu.edu/wang04nextgeneration.html" }
Citations (may not include all citations):
1575
Computer Architecture: A Quantitative Approach (context) - Hennessy, Patterson - 1995
474
A data locality optimizing algorithm (context) - Wolf, Lam - 1991
443
Improving direct-mapped cache performance by the addition of..
- Jouppi - 1990
344
Design and evaluation of a compiler algorithm for prefetchin..
- Mowry, Lam et al. - 1992
232
A study of replacement algorithms for a virtual-storage comp.. (context) - Belady - 1966
228
Points-to analysis in almost linear time
- Steensgaard - 1996
175
Evaluating associativity in CPU caches (context) - Hill, Smith - 1989
171
Dependence graphs and compiler optimizations (context) - Kuck, Kuhn et al. - 1981
164
A practical algorithm for exact array dependence analysis (context) - Pugh - 1992
162
Improving data locality with loop transformations
- McKinley, Carr et al. - 1996
149
Software prefetching (context) - Callahan, Kennedy et al. - 1991
149
An implementation of interprocedural bounded regular section..
- Havlak, Kennedy - 1991
137
Compiler optimizations for improving data locality
- Carr, McKinley et al. - 1994
136
superscalar microprocessor (context) - Yeager - 1996
132
The Alpha 21264 microprocessor (context) - Kessler - 1999
122
An effective on-chip preloading scheme to reduce data access.. (context) - Baer, Chen - 1991
121
An architecture for software-controlled data prefetching (context) - Klaiber, Levy - 1991
117
Clock rate versus IPC: The end of the road for conventional ..
- Agarwal, Hrishikesh et al. - 2000
110
Memory bandwidth limitations of future microprocessors
- Burger, Kagi et al. - 1996
110
Practical dependence testing
- Goff, Kennedy et al. - 1991
109
Cache profiling and the SPEC benchmarks: A case study
- Lebeck, Wood - 1994
104
Compiler-based prefetching for recursive data structures
- Luk, Mowry - 1996
104
Prefetching using Markov predictors
- Joseph, Grunwald - 1997
103
A case for direct-mapped caches (context) - Hill - 1988
98
Evaluating stream buffers as a secondary cache replacement (context) - Palacharla, Kessler - 1994
93
Aspects of Cache Memory and Instruction Buffer Performance (context) - Hill - 1987
90
Reducing memory latency via non-blocking and prefetching cac..
- Chen, Baer - 1992
87
Computing Surveys (context) - Smith - 1982
79
Column-associative caches: A technique for reducing the miss.. (context) - Agarwal, Pudar - 1993
73
Dependence based prefetching for linked data structures
- Roth, Moshovos et al. - 1998
73
Cache-conscious structure layout
- Chilimbi, Hill et al. - 1999
72
Cache-conscious data placement
- Calder, Krintz et al. - 1998
64
The microarchitecture of the Pentium 4 processor (context) - Hinton, Sager et al. - 2001
55
Interactive parallel programming using the ParaScope Editor
- Kennedy, McKinley et al. - 1991
50
Computer Architecture News (context) - Burger, Austin et al. - 1997
48
Speculative precomputation: Long-range prefetching of delinq..
- Collins, Wang et al. - 2001
47
Using generational garbage collection to implement cache-con.. (context) - Chilimbi, Larus - 1998
47
A performance comparison of contemporary DRAM architectures
- Cuppu, Jacob et al. - 1999
46
Precise miss analysis for program transformations with cache..
- Ghosh, Martonosi et al. - 1998
46
Basic block distribution analysis to find periodic behavior ..
- Sherwood, Perelman et al. - 2001
45
The Alpha 21264 microprocessor architecture (context) - Kessler, McLellan et al. - 1999
42
Cache-conscious structure definition
- Chilimbi, Davidson et al. - 1999
42
Lockup-free instruction fetch/prefetch cache organization (context) - Kroft - 1981
41
SPAID: Software prefetching in pointer- and call-intensive e..
- Lipasti, Schmidt et al. - 1995
38
Fixed and adaptive sequential prefetching in shared-memory m.. (context) - Dahlgren, Dubois et al. - 1993
38
Compiler-directed page coloring for multiprocessors (context) - Bugnion, Anderson et al. - 1996
38
Effective jump-pointer prefetching for linked data structure.. (context) - Roth, Sohi - 1999
38
Efficient simulation of caches under optimal replacement wit..
- Sugumar, Abraham - 1993
37
Tolerating memory latency through software-controlled preexe..
- Luk - 2001
34
A prefetching technique for irregular accesses to linked dat..
- Karlsson, Dahlgren et al. - 2000
33
Load latency tolerance in dynamically scheduled processors
- Srinivasan, Lebeck - 1999
32
Speeding up irregular applications in shared memory multipro..
- Zhang, Torrellas - 1995
29
The IA-64 architecture at work (context) - Dulong - 1998
29
Data prefetching on the HP PA (context) - Santhanam, Gronish et al. - 1997
29
Compiler techniques for data prefetching on the PowerPC (context) - Bernstein, Cohen et al. - 1996
28
Data prefetching by dependence graph precomputation
- Annavaram, Patel et al. - 2001
26
A fully associative software-managed cache design (context) - Hallnor, Reinhardt - 2000
25
Data Prefetching for High Performance Processors
- Chen - 1993
22
Effectiveness of hardware-based stride and sequential prefet.. (context) - Dahlgren, Stenstrom - 1995
21
Generalized correlation-based hardware prefetching (context) - Charney, Reeves - 1995
21
Design and evaluation of dynamic access ordering hardware
- McKee, Aluwihare et al. - 1996
21
Exploiting spatial locality in data caches using spatial foo..
- Kumar, Wilkerson - 1998
21
A matrix-based approach to the global locality optimization ..
- Kandemir, Choudhary et al. - 1998
20
Dead-block prediction and dead-block correlating prefetchers (context) - Lai, Fide et al. - 2001
19
Run-time spatial locality detection and optimization
- Johnson, Merten et al. - 1997
19
Quantifying loop nest locality using SPEC'95 and the Perfect..
- McKinley, Temam - 1999
19
Hot pages: Software caching for raw microprocessors
- Moritz, Frank et al. - 1999
18
Reducing DRAM latencies with an integrated memory hierarchy ..
- Lin, Reinhardt et al. - 2001
17
EELRU: Simple and effective adaptive page replacement
- Smaragdakis, Kaplan et al. - 1999
15
Utilizing reuse information in data cache management (context) - Rivers, Tam et al. - 1997
15
Predictor-directed stream buffers
- Sherwood, Sair et al. - 2000
15
Dynamic hot data stream prefetching for general-purpose prog.. (context) - Chilimbi, Hirzel - 2002
15
pull: Data movement for linked data structures (context) - Yang, Lebeck - 1997
14
Data flow analysis for software prefetching linked data stru..
- Cahoon, McKinley - 2001
14
non-uniform cache structure for wire-delay dominated on-chip.. (context) - Kim, Burger et al. - 2002
12
Using a user-level memory thread for correlation prefetching
- Solihin, Lee et al. - 2002
12
Exploring the design space of future CMPs
- Huh, Burger et al. - 2001
11
Cooperative prefetching: Compiler and hardware support for e..
- Luk, Mowry - 1998
11
Hardware Techniques to Improve the Performance of the Proces..
- Burger - 1998
10
Distributed predictive cache design for high performance mem.. (context) - Alexander, Kedem - 1996
10
Compiler Support for Software Prefetching
- McIntosh - 1998
10
A compiler-assisted data prefetch controller
- VanderWiel, Lilja - 1999
10
Reducing cache misses using hardware and software page place..
- Sherwood, Calder et al. - 1997
9
Modified LRU policies for improving second-level cache behav.. (context) - Wong, Baer - 2000
8
of Electrical and Computer Engineering (context) - Pai, Ranganathan et al. - 1997
8
Design and evaluation of compiler algorithms for preexecutio..
- Kim, Yeung - 2002
8
URSIM reference manual (context) - Zhang - 2000
7
Smarter memory: Improving bandwidth for streamed references
- McKee, Klenke et al. - 1998
7
Improving the Performance of Virtual Memory Computers (context) - Abu-Sufah - 1978
7
Buffer block prefetching method (context) - Gindele - 1977
6
Efficient discovery of regular stride patterns in irregular .. (context) - Wu - 2002
6
content-directed data prefetching mechanism (context) - Cooksey, Jordan et al. - 2002
6
or system overhead: Which has the largest impact on uniproce.. (context) - Cuppu, Jacob et al. - 2001
6
Effective hardware based data prefetching (context) - Chen, Baer - 1995
5
Memory-side prefetching for linked data structures
- Hughes, Adve - 2001
5
Cool-cache for hot multimedia
- Unsal, Ashok et al. - 2001
5
Splash: Stanford parallel applications for shared-memory (context) - Singh, Weber et al. - 1991
5
Cool-fetch: Compiler-enabled power-aware fetch throttling
- Unsal, Koren et al. - 2002
4
Load scheduling with profile information
- Lindenmaier, McKinley et al. - 2000
3
Hybrid compiler/hardware prefetching for multiprocessors using low.. (context) - Skeppstedt, Dubois - 1997
3
An effective programmable prefetch engine for high-performan.. (context) - Chen - 1995
3
An overview of the SPHINX speech recognition system (context) - Lee, Hon et al. - 1990
3
The minimax cache: An energy-efficient framework for media p..
- Unsal, Koren et al. - 2002
3
integrated hardware/software scheme shared memory multiproces.. (context) - Gornish, Veidenbaum - 1994
3
Simple and effective array prefetching for Java
- Cahoon, McKinley - 2002
3
An algorithm for optimally exploiting spatial and temporal l.. (context) - Temam - 1999
2
Flexcache: A framework for compiler generated data caching (context) - Moritz, Frank et al. - 2001
2
The Omega Library (context) - Maryland - 1996
2
Exact analysis of the cache behaviour of nested loops (context) - Chatterjee, Parker et al. - 2001
1
Sequential program prefetching in memory hierarchies (context) - Smith - 1978
1
Micro-30 SimpleScalar tutorial (context) - Austin, Burger - 1997
1
Energy efficient architectures: Direct addressed caches for .. (context) - Witchel, Larsen et al. - 2001
1
split spatial/non-spatial cache survey and reevaluation perfor.. (context) - Dimitrijevic, split et al. - 1999
1
Cool-mem: Combining statically speculative memory accessing .. (context) - Ashok, Chheda et al. - 2002
1
Cool-cache: A compiler-enabled energy efficient data caching.. (context) - Unsal, Ashok et al. - 2003
1
Parallelism in mainstream enterprise platforms of the future (context) - Dileep - 2002
www.cs.umass.edu/Scale