  A First Look at the Interplay of Code Reordering and Configurable Caches

by Ann Gordon-Ross, Frank Vahid
http://www.cs.ucr.edu/~vahid/pubs/glsvlsi05_reorder.pdf

Abstract:

The instruction cache is a popular target for optimizations of microprocessor-based systems because of the cache’s high impact on system performance and power, and because of the cache’s predictable temporal and spatial locality. Optimization techniques can be designed based on this predictability. We explore for the first time the interplay of two popular instruction cache optimization techniques: the long-known technique of code reordering and the relatively new technique of cache configuration. We address the question of whether these two optimizations complement each other or whether one dominates the other. Through experiments using embedded system benchmarks, we show that cache configuration dominates a particular category of code reordering techniques with respect to optimizing performance and energy, obviating the need for reordering. We also examine the modern scenario of synthesized custom caches, and show that combining cache configuration with code reordering results in cache size reductions of 13% on average, and up to 89% in some benchmarks, beyond cache configuration alone.
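
The two techniques named in the abstract can be illustrated with a toy model. The sketch below is not the authors' experimental setup. It assumes a hypothetical 64-byte instruction cache with 16-byte lines and LRU replacement, and two hypothetical one-line functions called alternately in a loop. All addresses, sizes, and function names are invented for illustration. In this toy case, conflict misses in a direct-mapped cache disappear either by moving one function (code reordering) or by switching the same-size cache to 2-way associativity (cache configuration).

```python
def count_misses(trace, num_sets, ways, line_size=16):
    """Count misses of an LRU set-associative cache over a list of byte addresses."""
    sets = [[] for _ in range(num_sets)]  # per set: resident tags, least recent first
    misses = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.remove(tag)          # hit: re-appended below as most recent
        else:
            misses += 1
            if len(lines) == ways:
                lines.pop(0)           # evict the least recently used line
        lines.append(tag)
    return misses


def loop_trace(func_a, func_b, iterations):
    """Instruction fetch trace for a loop that calls two one-line functions in turn."""
    trace = []
    for _ in range(iterations):
        trace.extend([func_a, func_b])
    return trace


# Hypothetical layouts: start addresses of the two functions.
original = loop_trace(0x000, 0x040, 10)   # both map to set 0 when direct-mapped
reordered = loop_trace(0x000, 0x010, 10)  # function B moved next to function A

# Same 64-byte capacity in both configurations: 4 sets x 1 way, or 2 sets x 2 ways.
baseline = count_misses(original, num_sets=4, ways=1)
after_reordering = count_misses(reordered, num_sets=4, ways=1)
after_configuration = count_misses(original, num_sets=2, ways=2)

print(baseline, after_reordering, after_configuration)
```

In this contrived trace, the original layout misses on every fetch in the direct-mapped cache, while either change leaves only the two cold misses. Real benchmarks are far less clear-cut, which is why the interplay has to be measured experimentally.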
