Exploiting Cache Locality At Run-Time (1998)  (1 citation)
Yong Yan



View or download:
ohiostate.edu/hpcs/WWW/...TR981.ps.Z

From:  ohiostate.edu/hpcs/WWW/HTML/p...




Context of citations to this paper:

...vector is an NP-complete problem, we propose a heuristic algorithm based on the following partitioning rules. Detailed proofs can be found in [21]. Theorem 1 (Ordering Rule): For a given partitioning vector $k(k_1, k_2, \cdots, k_n)$ not in decreasing order, the...

Cited by:
A Memory-layout Oriented Run-time Technique for Locality.. - Yan, Zhang, Zhang

Similar documents (at the sentence level):
8.2%:   Adaptively Scheduling Parallel Loops in Distributed.. - Yan, Jin, Zhang (1997)
5.4%:   An Adaptive Loop Scheduling Algorithm on Shared-Memory Systems - Canming Jin Cso

Active bibliography (related documents):
0.5:   An Argument for Simple COMA - Saulsbury, Wilkinson, Carter, Landin (1995)
0.5:   Exploiting Network Locality for CC-NUMA Multiprocessors - Hsiao, King
0.5:   AlphaServer 4100 Performance Characterization - Cvetanovic, Donaldson (1996)


BibTeX entry:

Y. Yan. Exploiting Cache Locality at Run-time. PhD thesis, Computer Science Department, College of William & Mary, May 1998. http://citeseer.ist.psu.edu/yan98exploiting.html

@misc{ yan98exploiting,
  author = "Y. Yan",
  title = "Exploiting Cache Locality at Run-time",
  text = "Y. Yan. Exploiting Cache Locality at Run-time. PhD thesis, Computer Science
    Department, College of William & Mary, May 1998.",
  year = "1998",
  url = "citeseer.ist.psu.edu/yan98exploiting.html" }
Citations (may not include all citations):
1575   Computer Architecture: A Quantitative Approach - Hennessy, Patterson - 1996
531   LogP: Towards a Realistic Model of Parallel Computation - Culler, Karp et al. - 1993
474   A Data Locality Optimizing Algorithm - Wolf, Lam - 1991
443   Improving Direct-mapped Cache Performance by the Addition of.. - Jouppi - 1990
410   Principles of Artificial Intelligence - Nilsson - 1980
376   The Cache Performance and Optimizations of Blocked Algorithm.. - Lam, Rothberg et al. - 1991
294   High Performance Compilers For Parallel Computing - Wolfe - 1996
237   Global Optimizations for Parallelism and Locality on Scalabl.. - Anderson, Lam - 1993
216   Strategies for Cache and Local Memory Management by Global P.. - Gannon, Jalby et al. - 1988
183   Profile Guided Code Positioning - Pettis, Hansen - 1990
176   Shared Memory Consistency Models: A Tutorial - Adve, Gharachorloo - 1996
175   Evaluating Associativity in CPU Caches - Hill, Smith - 1989
162   Improving Data Locality with Loop Transformations - McKinley, Carr et al. - 1996
146   Unimodular Transformations of double loops - Banerjee - 1990
142   MINT: A Front End for Efficient Simulation of Shared-Memory .. - Veenstra, Fowler - 1994
131   Parallel Computer Architecture: A Hardware/Software Approac.. - Culler, Singh et al. - 1997
126   The Impact of Operating System Scheduling Policies and Synch.. - Gupta, Tucker et al. - 1991
124   Tile Size Selection Using Cache Organization and Data Layout - Coleman, McKinley - 1995
115   Program Optimization for Instruction Caches - McFarling - 1989
113   Data and Computation Transformations for Multiprocessors - Anderson, Amarasinghe et al. - 1995
109   Cache Profiling and SPEC Benchmarks: A Case Study - Lebeck, Wood - 1994
107   Achieving High Instruction Cache Performance With an Optimiz.. - Hwu, Chang - 1989
102   Scheduling Support for Concurrency and Parallelism in the Ma.. - Black - 1990
94   The DASH Prototype: Logic Overhead and Performance - Lenoski - 1993
94   Optimizing for Parallelism and Data Locality - Kennedy, McKinley - 1992
94   The Effect of Context Switches on Cache Performance - Mogul, Borg - 1991
94   Run-Time Parallelization and Scheduling of Loops - Saltz, Mirchandaney - 1991
82   On estimating and enhancing cache effectiveness - Ferrante, Sarkar et al. - 1991
81   Reducing False Sharing on Shared Memory Multiprocessors Thro.. - Jeremiassen, Eggers - 1995
80   Avoiding Conflict Misses Dynamically in Large Direct-Mapped .. - Bershad, Lee et al. - 1994
79   Column-Associative Caches: A Technique for Reducing the Miss.. - Agarwal, Pudar - 1993
78   High Performance Fortran - Loveman - 1993
74   The Implications of Cache Affinity on Processor Scheduling f.. - Vaswani, Zahorjan - 1991
69   Access Normalization: Loop Restructuring for NUMA Compilers - Li, Pingali - 1992
68   Processor self-scheduling for multiple nested parallel loops - Tang, Yew - 1986
67   Page Placement Algorithms for Large Real-indexed Caches - Kessler, Hill - 1992
65   Run-time Adaptive Cache Hierarchy Management via Reference A.. - Johnson, Hwu - 1997
60   Scheduling and Page Migration for Multiprocessor Compute Ser.. - Chandra, Devine et al. - 1994
58   Automatic Partitioning of Parallel Loops and Data Arrays for.. - Agarwal, Kranz et al. - 1995
58   MemSpy: Analyzing Memory System Bottlenecks in Programs - Gupta, Martonosi et al. - 1992
51   Optimizing Instruction Cache Performance for Operating Syste.. - Torrellas, Xia et al. - 1995
49   False Sharing and Spatial Locality in Multiprocessor Caches - Torrellas, Lam et al. - 1994
47   Efficient Procedure Mapping Using Cache Line Coloring - Hashemi, Kaeli et al. - 1997
44   NUMA policies and their relation to memory architecture - Bolosky, Scott et al. - 1991
41   A Quantitative Analysis of Loop Nest Locality - McKinley, Temam - 1996
40   Trapezoid self-scheduling: a practical scheduling scheme for.. - Tzen, Ni - 1993
38   Compiler-Directed Page Coloring for Multiprocessors - Bugnion, Anderson et al. - 1996
35   Dynamic Page Mapping Policies for Cache Conflict Resolution .. - Romer, Lee et al. - 1994
33   Factoring: a practical and robust method for scheduling para.. - Hummel, Schonberg et al. - 1992
31   Thread Scheduling for Cache Locality - Philbin, Edler et al. - 1996
30   Performance Debugging Shared Memory Multiprocessor Programs .. - Goldberg, Hennessy - 1991
28   Cache Replacement with Dynamic Exclusion - McFarling - 1992
25   A Dynamic Scheduling Method for Irregular Parallel Programs - Lucco - 1996
23   Latency Metric: An Experimental Method for Measuring and Eva.. - Zhang, Yan et al. - 1994
20   KSR-1 Technology Background - Research - 1992
16   Affinity scheduling of unbalanced workloads - Subramaniam, Eager - 1994
16   Fusion of Loops for Parallelism and Locality - Manjikian, Abdelrahman - 1997
15   Influence of Cross-interferences on Blocked Loops: A Case St.. - Fricker, Temam et al. - 1995
14   Impact of Cache Interferences on Usual Numerical Dense Loop .. - Temam, Fricker et al. - 1993
14   Adaptively Scheduling Parallel Loops in Distributed Shared-M.. - Yan, Jin et al. - 1997
13   Locality and Loop Scheduling on NUMA Multiprocessors - Li, Tandri et al. - 1993
13   Design tradeoffs for process scheduling in shared memory mul.. - Ni, Wu - 1989
12   Performance Optimizations, Implementation, and Verification .. - Galles, Williams - 1994
12   Two Fast and High-Associativity Cache Schemes - Zhang, Zhang et al. - 1997
10   Guided self-scheduling: a practical self-scheduling scheme fo.. - Polychronopoulos, Kuck - 1987
9   Limited Bandwidth to Affect Processor Design - Burger, Goodman et al. - 1997
9   Trends in Shared Memory Multiprocessing - Stenstrom, Hagersten et al. - 1997
9   Evaluation and Measurement of Multiprocessor Latency Pattern.. - Zhang, Yan et al. - 1994
9   Comparative Modeling and Evaluation of CC-NUMA and CC-COMA o.. - Zhang, Yan - 1995
8   CONVEX Exemplar Architecture - Corporation - 1994
7   Safe Self-scheduling: a parallel loop scheduling scheme for .. - Liu, Saletore et al. - 1994
7   Using Processor-Cache Affinity in Shared-Memory Multiprocesso.. - Squillante, Lazowska - 1993
6   Changing Interaction of Compiler and Architecture - Adve - 1997
6   Benefits of Cache-Affinity Scheduling in Shared-Memory Multi.. - Torrellas, Tucker et al. - 1993
5   Evaluating and Designing Software Mutual Exclusion Algorithm.. - Zhang, Yan et al. - 1996
2   Using Processor Affinity in Loop Scheduling Scheme on Shared.. - Markatos, Leblanc - 1994
2   The AlphaServer 4100 Cached Processor Module Architecture an.. - Steinman, Harris et al. - 1996
2   Latency Analysis of CC-NUMA and CC-COMA Hierarchical Rings - Zhang, Yan - 1994
2   An Overview of the HP/Convex Exemplar Hardware - Astfalk, Brewer - 1997
1   Comparative Performance Analysis and Evaluation of Hot Spots.. - Zhang, Yan et al. - 1995
1   Modeling and Measuring Hot Spots on MIN-based and HR-based Sh.. - Zhang, Yan et al. - 1993
1   A Survey of Hardware Solutions for Maintenance of Cache Cons.. - Tomasevic, Milutinovic - 1994
1   Processor Control and Scheduling Issues for Multiprogrammed .. - Tucker, Gupta - 1989
1   Modeling Data Migration on CC-NUMA and CC-COMA Ring Architec.. - Zhang, Yan - 1994
1   Operating System Support for Improving Locality on CC-NUMA C.. - Verghese, Devine et al. - 1996
1   Classifying Software-Based Cache Coherence Solutions - Tartalja, Milutinovic - 1997
1   Software and Visualization Support to Performance Evaluation.. - Yan, Zhang et al. - 1997
1   The Cache Visualization Tool (CVT) - Deijl, Temam et al. - 1997
1   A Memory-layout Oriented Run-Time Approach for Locality Opti.. - Yan, Zhang - 1998
1   Experimental Comparison of Memory Management Policies for NUM.. - LaRowe, Ellis - 1991
1   An Adaptive Loop Scheduling Algorithm on Shared-Memory System.. - Jin, Yan et al. - 1996
1   Data Locality and Load Balancing in COOL - Chandra, Gupta et al. - 1993
1   General Techniques for Combinatorial Approximation - Sahni - 1977



CiteSeer.IST - Copyright Penn State and NEC