Context of citations to this paper:
...vector is an NP-complete problem, we propose a heuristic algorithm based on the following partitioning rules. Detailed proofs can be found in [21]. Theorem 1 (Ordering Rule): For a given partitioning vector $k(k_1, k_2, \cdots, k_n)$ not in decreasing order, the...
Cited by:
A Memory-layout Oriented Run-time Technique for Locality.. - Yan, Zhang, Zhang
Similar documents (at the sentence level):
8.2%: Adaptively Scheduling Parallel Loops in Distributed.. - Yan, Jin, Zhang (1997)
5.4%: An Adaptive Loop Scheduling Algorithm on Shared-Memory Systems - Canming Jin Cso
Active bibliography (related documents):
0.5: An Argument for Simple COMA - Saulsbury, Wilkinson, Carter, Landin (1995)
0.5: Exploiting Network Locality for CC-NUMA Multiprocessors - Hsiao, King
0.5: AlphaServer 4100 Performance Characterization - Cvetanovic, Donaldson (1996)
BibTeX entry:
Y. Yan. Exploiting Cache Locality at Run-time. PhD thesis, Computer Science Department, College of William & Mary, May 1998. http://citeseer.ist.psu.edu/yan98exploiting.html
@misc{ yan98exploiting,
author = "Y. Yan",
title = "Exploiting Cache Locality at Run-time",
text = "Y. Yan. Exploiting Cache Locality at Run-time. PhD thesis, Computer Science
Department, College of William & Mary, May 1998.",
year = "1998",
url = "citeseer.ist.psu.edu/yan98exploiting.html" }
Citations (may not include all citations):
1575: Computer Architecture: A Quantitative Approach - Hennessy, Patterson (1996)
531: LogP: Towards a Realistic Model of Parallel Computation - Culler, Karp et al. (1993)
474: A Data Locality Optimizing Algorithm - Wolf, Lam (1991)
443: Improving Direct-mapped Cache Performance by the Addition of.. - Jouppi (1990)
410: Principles of Artificial Intelligence - Nilsson (1980)
376: The Cache Performance and Optimizations of Blocked Algorithm.. - Lam, Rothberg et al. (1991)
294: High Performance Compilers For Parallel Computing - Wolfe (1996)
237: Global Optimizations for Parallelism and Locality on Scalabl.. - Anderson, Lam (1993)
216: Strategies for Cache and Local Memory Management by Global P.. - Gannon, Jalby et al. (1988)
183: Profile Guided Code Positioning - Pettis, Hansen (1990)
176: Shared Memory Consistency Models: A Tutorial - Adve, Gharachorloo (1996)
175: Evaluating Associativity in CPU Caches - Hill, Smith (1989)
162: Improving Data Locality with Loop Transformations - McKinley, Carr et al. (1996)
146: Unimodular Transformations of double loops - Banerjee (1990)
142: MINT: A Front End for Efficient Simulation of Shared-Memory .. - Veenstra, Fowler (1994)
131: Parallel Computer Architecture: A Hardware/Software Approac.. - Culler, Singh et al. (1997)
126: The Impact of Operating System Scheduling Policies and Synch.. - Gupta, Tucker et al. (1991)
124: Tile Size Selection Using Cache Organization and Data Layout - Coleman, McKinley (1995)
115: Program Optimization for Instruction Caches - McFarling (1989)
113: Data and Computation Transformations for Multiprocessors - Anderson, Amarasinghe et al. (1995)
109: Cache Profiling and SPEC Benchmarks: A Case Study - Lebeck, Wood (1994)
107: Achieving High Instruction Cache Performance With an Optimiz.. - Hwu, Chang (1989)
102: Scheduling Support for Concurrency and Parallelism in the Ma.. - Black (1990)
94: The DASH Prototype: Logic Overhead and Performance - Lenoski (1993)
94: Optimizing for Parallelism and Data Locality - Kennedy, McKinley (1992)
94: The Effect of Context Switches on Cache Performance - Mogul, Borg (1991)
94: Run-Time Parallelization and Scheduling of Loops - Saltz, Mirchandaney (1991)
82: On estimating and enhancing cache effectiveness - Ferrante, Sarkar et al. (1991)
81: Reducing False Sharing on Shared Memory Multiprocessors Thro.. - Jeremiassen, Eggers (1995)
80: Avoiding Conflict Misses Dynamically in Large Direct-Mapped .. - Bershad, Lee et al. (1994)
79: Column-Associative Caches: A Technique for Reducing the Miss.. - Agarwal, Pudar (1993)
78: High Performance Fortran - Loveman (1993)
74: The Implications of Cache Affinity on Processor Scheduling f.. - Vaswani, Zahorjan (1991)
69: Access Normalization: Loop Restructuring for NUMA Compilers - Li, Pingali (1992)
68: Processor self-scheduling for multiple nested parallel loops - Tang, Yew (1986)
67: Page Placement Algorithms for Large Real-indexed Caches - Kessler, Hill (1992)
65: Run-time Adaptive Cache Hierarchy Management via Reference A.. - Johnson, Hwu (1997)
60: Scheduling and Page Migration for Multiprocessor Compute Ser.. - Chandra, Devine et al. (1994)
58: Automatic Partitioning of Parallel Loops and Data Arrays for.. - Agarwal, Kranz et al. (1995)
58: MemSpy: Analyzing Memory System Bottlenecks in Programs - Gupta, Martonosi et al. (1992)
51: Optimizing Instruction Cache Performance for Operating Syste.. - Torrellas, Xia et al. (1995)
49: False Sharing and Spatial Locality in Multiprocessor Caches - Torrellas, Lam et al. (1994)
47: Efficient Procedure Mapping Using Cache Line Coloring - Hashemi, Kaeli et al. (1997)
44: NUMA policies and their relation to memory architecture - Bolosky, Scott et al. (1991)
41: A Quantitative Analysis of Loop Nest Locality - McKinley, Temam (1996)
40: Trapezoid self-scheduling: a practical scheduling scheme for.. - Tzen, Ni (1993)
38: Compiler-Directed Page Coloring for Multiprocessors - Bugnion, Anderson et al. (1996)
35: Dynamic Page Mapping Policies for Cache Conflict Resolution .. - Romer, Lee et al. (1994)
33: Factoring: a practical and robust method for scheduling para.. - Hummel, Schonberg et al. (1992)
31: Thread Scheduling for Cache Locality - Philbin, Edler et al. (1996)
30: Performance Debugging Shared Memory Multiprocessor Programs .. - Goldberg, Hennessy (1991)
28: Cache Replacement with Dynamic Exclusion - McFarling (1992)
25: A Dynamic Scheduling Method for Irregular Parallel Programs - Lucco (1996)
23: Latency Metric: An Experimental Method for Measuring and Eva.. - Zhang, Yan et al. (1994)
20: KSR-1 Technology Background - Research (1992)
16: Affinity scheduling of unbalanced workloads - Subramaniam, Eager (1994)
16: Fusion of Loops for Parallelism and Locality - Manjikian, Abdelrahman (1997)
15: Influence of Cross-interferences on Blocked Loops: A Case St.. - Fricker, Temam et al. (1995)
14: Impact of Cache Interferences on Usual Numerical Dense Loop .. - Temam, Fricker et al. (1993)
14: Adaptively Scheduling Parallel Loops in Distributed Shared-M.. - Yan, Jin et al. (1997)
13: Locality and Loop Scheduling on NUMA Multiprocessors - Li, Tandri et al. (1993)
13: Design tradeoffs for process scheduling in shared memory mul.. - Ni, Wu (1989)
12: Performance Optimizations, Implementation, and Verification .. - Galles, Williams (1994)
12: Two Fast and High-Associativity Cache Schemes - Zhang, Zhang et al. (1997)
10: Guided self-scheduling: a practical self-scheduling scheme fo.. - Polychronopoulos, Kuck (1987)
9: Limited Bandwidth to Affect Processor Design - Burger, Goodman et al. (1997)
9: Trends in Shared Memory Multiprocessing - Stenstrom, Hagersten et al. (1997)
9: Evaluation and Measurement of Multiprocessor Latency Pattern.. - Zhang, Yan et al. (1994)
9: Comparative Modeling and Evaluation of CC-NUMA and CC-COMA o.. - Zhang, Yan (1995)
8: CONVEX Exemplar Architecture - Corporation (1994)
7: Safe Self-scheduling: a parallel loop scheduling scheme for .. - Liu, Saletore et al. (1994)
7: Using Processor-Cache Affinity in Shared-Memory Multiprocesso.. - Squillante, Lazowska (1993)
6: Changing Interaction of Compiler and Architecture - Adve (1997)
6: Benefits of Cache-Affinity Scheduling in Shared-Memory Multi.. - Torrellas, Tucker et al. (1993)
5: Evaluating and Designing Software Mutual Exclusion Algorithm.. - Zhang, Yan et al. (1996)
2: Using Processor Affinity in Loop Scheduling Scheme on Shared.. - Markatos, LeBlanc (1994)
2: The AlphaServer 4100 Cached Processor Module Architecture an.. - Steinman, Harris et al. (1996)
2: Latency Analysis of CC-NUMA and CC-COMA Hierarchical Rings - Zhang, Yan (1994)
2: An Overview of the HP/Convex Exemplar Hardware - Astfalk, Brewer (1997)
1: Comparative Performance Analysis and Evaluation of Hot Spots.. - Zhang, Yan et al. (1995)
1: Modeling and Measuring Hot Spots on MIN-based and HR-based Sh.. - Zhang, Yan et al. (1993)
1: A Survey of Hardware Solutions for Maintenance of Cache Cons.. - Tomasevic, Milutinovic (1994)
1: Processor Control and Scheduling Issues for Multiprogrammed .. - Tucker, Gupta (1989)
1: Modeling Data Migration on CC-NUMA and CC-COMA Ring Architec.. - Zhang, Yan (1994)
1: Operating System Support for Improving Locality on CC-NUMA C.. - Verghese, Devine et al. (1996)
1: Classifying Software-Based Cache Coherence Solutions - Tartalja, Milutinovic (1997)
1: Software and Visualization Support to Performance Evaluation.. - Yan, Zhang et al. (1997)
1: The Cache Visualization Tool (CVT - Deijl, Temam et al. (1997)
1: A Memory-layout Oriented Run-Time Approach for Locality Opti.. - Yan, Zhang (1998)
1: Experimental Comparison of Memory Management Policies for NUM.. - LaRowe, Ellis (1991)
1: An Adaptive Loop Scheduling Algorithm on Shared-Memory System.. - Jin, Yan et al. (1996)
1: Data Locality and Load Balancing in COOL - Chandra, Gupta et al. (1993)
1: General Techniques for Combinatorial Approximation - Sahni (1977)
Documents on the same site (http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/):
An Effective and Practical Performance Prediction Model for - Parallel Computing On
LightFlood: an Efficient Flooding Scheme for File Search .. - Peer-To-Peer Systems..
Proceedings of the 21st International Conference on.. - Dynamic Load Sharing
CiteSeer.IST - Copyright Penn State and NEC