  An automated method for software controlled cache prefetching (1998) [3 citations — 0 self]

by Daniel F. Zucker, Ruby B. Lee, Michael J. Flynn
In Proceedings of the Thirty-First Hawaii International Conference on System Sciences, 1998
http://www.ee.princeton.edu/~rblee/HPpapers/automatedSWcachePrefetching.pdf

Abstract:

As the gap between cycle time and main memory access time increases, memory system performance becomes increasingly important. The trend to higher instruction level parallelism with superscalar processors puts even higher demands on the memory system. Prefetching is a common strategy to tolerate this increased memory latency. This paper presents a software-only technique to prefetch data to the CPU cache before it is needed in order to combat this problem. The software prefetching technique presented is motivated by emulation of a hardware stride prediction table (SPT). Performance similar, and in some cases superior, to the hardware-based technique is achieved with no additional hardware costs. In the first step, a simulation of the hardware SPT is conducted to identify where useful prefetches are best added. In the next step, software prefetches are added to the executable code. The technique is automated and could be implemented by a compiler as a two-phase optimization of a profile step followed by an optimization step. Data is presented for both SPEC95 and multimedia benchmarks. In the best case, a performance improvement of 2.78X is observed over the same code with no prefetching, at no extra hardware cost.
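
The two-phase idea in the abstract (simulate a stride prediction table over a profile run, then pick the loads where prefetches look useful) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no table size, indexing scheme, or usefulness criterion, so the direct-mapped table, the 32-byte line size, the threshold, and all names below (`StridePredictionTable`, `profile`, `select_prefetch_sites`) are assumptions made purely for illustration. The trace is assumed to be a list of `(pc, address)` pairs for load instructions.

```python
from collections import defaultdict


class StridePredictionTable:
    """Toy direct-mapped SPT indexed by load-instruction address (pc).

    Hypothetical model: each slot remembers the last data address seen
    for one pc and predicts next = current + (current - last).
    """

    def __init__(self, entries=64):
        self.entries = entries
        self.table = {}  # slot -> (pc_tag, last_addr)

    def access(self, pc, addr):
        """Record a load; return the predicted next address, or None."""
        slot = pc % self.entries
        entry = self.table.get(slot)
        self.table[slot] = (pc, addr)
        if entry is None or entry[0] != pc:
            return None  # table miss: nothing to predict from
        stride = addr - entry[1]
        return addr + stride if stride != 0 else None


def profile(trace, line_size=32, entries=64):
    """Profile step: per pc, count predictions that matched the cache
    line touched by that pc's next load (assumed usefulness measure)."""
    spt = StridePredictionTable(entries)
    pending = {}  # pc -> cache line predicted on the previous access
    useful = defaultdict(int)
    for pc, addr in trace:
        if pending.get(pc) == addr // line_size:
            useful[pc] += 1
        predicted = spt.access(pc, addr)
        if predicted is None:
            pending.pop(pc, None)
        else:
            pending[pc] = predicted // line_size
    return dict(useful)


def select_prefetch_sites(useful, threshold):
    """Optimization-step input: pcs with enough useful predictions."""
    return sorted(pc for pc, n in useful.items() if n >= threshold)


if __name__ == "__main__":
    # One load walks memory with a fixed 64-byte stride; the other jumps around.
    strided = [0, 64, 128, 192, 256]
    irregular = [1000, 40, 7000, 300, 9000]
    trace = []
    for a, b in zip(strided, irregular):
        trace += [(0x100, a), (0x104, b)]
    counts = profile(trace)
    print(select_prefetch_sites(counts, threshold=2))
```

In a real tool the selected sites would drive the second phase, in which prefetch instructions are inserted ahead of those loads in the executable; that rewriting step depends on the target instruction set and is not shown here.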

Citations

680 Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and – Jouppi - 1990
664 ATOM: A system for building customized program analysis tools – Srivastava, Eustace - 1994
537 Cache Memories – Smith - 1982
455 Design and evaluation of a compiler algorithm for prefetching – Mowry, Lam, et al. - 1992
199 An effective on-chip preloading scheme to reduce data access penalty – Baer, Chen - 1991
165 Evaluating Stream Buffers as a Secondary Cache Replacement – Palacharla, Kessler - 1994
159 Effective Hardware-based Data Prefetching for High-performance Processors – Chen, Baer - 1995
135 Software methods for improvement of cache performance on supercomputer applications – Porterfield - 1989
120 A Performance Study of Software and Hardware Data Prefetching Schemes – Chen, Baer - 1994
110 Stride directed prefetching in scalar processors – Fu, Patel - 1992
100 Performance of a software MPEG video decoder – Patel, Smith, et al. - 1993
98 Data prefetching in multiprocessor vector cache memories – Fu, Patel - 1991
37 Prefetch unit for vector operations on scalar computers – Sklenar - 1992
27 A Comparison of Hardware Prefetching Techniques for Multimedia Benchmarks – Zucker, Flynn, et al. - 1995
7 RYO: a versatile instruction instrumentation tool for PA-RISC – Zucker, Karp - 1995
7 Hardware and software cache prefetching techniques for MPEG benchmarks – Zucker, Lee, et al. - 1997
7 Architecture and Arithmetic for Multimedia Enhanced Processors – Zucker - 1997
6 Reducing cache miss rates using prediction caches – Bennett, Flynn - 1996
4 Latency tolerance for dynamic processors – Bennett, Flynn - 1996
1 Data prefetching on the HP PA-8000 – Santhanam, Gornish, Hsu - 1997