  Impulse Adaptable Memory Controller system: whenever

by Lixin Zhang, Sally A. McKee, Wilson C. Hsieh, John B. Carter
ftp://ftp2.cs.utah.edu/pub/users/sam/isca00ws.ps.gz

Abstract:

Prefetching has long been used to mask the latency of memory loads. This paper presents results for an initial implementation of pointer-based prefetching within the Impulse adaptable memory controller. We conduct our experiments on a four-way issue superscalar machine. For the microbenchmarks we examine, we consistently realize about a 20% improvement in execution time for linked data structures accessed within medium to short loop iterations. This compares favorably to software prefetching when the data working set fits in cache, and exceeds the performance of the latter technique for large working sets. We also find that a superscalar, out-of-order processor hides the memory latency of linked data structures accessed in large loop iterations exceptionally well, which makes any pointer prefetching unnecessary.
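
For readers unfamiliar with the software prefetching baseline mentioned in the abstract, the following is a minimal sketch of what it commonly looks like for a linked list. It is not the paper's code: the `node` layout and the `sum_list_prefetch` function are hypothetical names chosen for illustration, and `__builtin_prefetch` is a GCC/Clang builtin, assumed here as the prefetch mechanism. The sketch prefetches only one node ahead, because the address of a later node is not known until the current node has been loaded.

```c
#include <stddef.h>

/* Hypothetical list node; the layout used in the paper's
 * microbenchmarks is not given in the abstract. */
struct node {
    struct node *next;
    long value;
};

/* Sum a linked list, issuing a software prefetch for the successor
 * node before doing the work for the current one. The prefetch is
 * only a hint to the hardware and does not change the result. */
long sum_list_prefetch(const struct node *head)
{
    long total = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
#if defined(__GNUC__) || defined(__clang__)
        /* Arguments: address, 0 = read access, 1 = low temporal locality.
         * Prefetching an invalid (e.g. NULL) address does not fault. */
        __builtin_prefetch(p->next, 0, 1);
#endif
        total += p->value;  /* stand-in for the real loop body */
    }
    return total;
}
```

When the loop body is short, a prefetch issued one node ahead has little time to complete before the node is needed, which is one commonly cited limitation of this style of software prefetching for linked data structures.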
