  Speeding up irregular applications in shared-memory multiprocessors: Memory binding and group prefetching (1995) [29 citations — 4 self]

by Zheng Zhang, Josep Torrellas
In Proceedings of the 22nd Annual International Symposium on Computer Architecture
http://polaris.cs.uiuc.edu/reports/1466.ps.gz

Abstract:

While many parallel applications exhibit good spatial locality, other important codes in areas like graph problem-solving or CAD do not. Often, these irregular codes contain small records accessed via pointers. Consequently, while the former applications benefit from long cache lines, the latter prefer short lines. One good solution is to combine short lines with prefetching. In this way, each application can exploit the amount of spatial locality that it has. However, prefetching, if provided, should also work for the irregular codes. This paper presents a new prefetching scheme that, while usable by regular applications, is specifically targeted to irregular ones:

Citations

705 SPLASH: Stanford Parallel Applications for Shared Memory – Singh, Weber, et al. - 1992
680 Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and – Jouppi - 1990
455 Design and evaluation of a compiler algorithm for prefetching – Mowry, Lam, et al. - 1992
264 Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors – Mowry, Gupta - 1991
240 Software prefetching – Callahan, Kennedy, et al. - 1991
199 An effective on-chip preloading scheme to reduce data access penalty – Baer, Chen - 1991
156 An architecture for software-controlled data prefetching – Klaiber, Levy - 1991
120 A Performance Study of Software and Hardware Data Prefetching Schemes – Chen, Baer - 1994
107 Analysis of cache invalidation patterns in multiprocessors – Weber, Gupta - 1989
98 Data prefetching in multiprocessor vector cache memories – Fu, Patel - 1991
97 False sharing and spatial locality in multiprocessor caches – Torrellas, Lam, et al. - 1994
88 Compiler-directed data prefetching in multiprocessors with memory hierarchies – Gornish, Granston, et al. - 1990
67 Data access microarchitectures for superscalar processors with compiler-assisted data prefetching – Chen, Mahlke, et al. - 1991
62 Simulation of Multiprocessors: Accuracy and Performance – Goldschmidt - 1993
55 Adjustable block size coherent caches – Dubnicki, LeBlanc - 1992
40 The Performance Advantages of Integrating Block Data Transfer – Woo, Singh, et al. - 1996
20 Fixed and adaptive sequential prefetching in shared-memory multiprocessors – Dahlgren, Dubois, et al. - 1993
18 Performance Evaluation of Hybrid Hardware and Software Distributed Shared Memory Protocols – Chandra, Gharachorloo, et al. - 1994