Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation (1999)

by K. Ghose, M. B. Kamble
Venue: In Low Power Electronics and Design
Results 11 - 20 of 135

Energy behavior of Java applications from the memory perspective

by N. Vijaykrishnan, S. Kim, S. Tomar, A. Sivasubramaniam, M. J. Irwin - in Usenix Java Virtual Machine Research and Technology Symposium (JVM'01), 2001
Abstract - Cited by 20 (4 self)
Permission is granted for noncommercial reproduction of the work for educational or research purposes.

Performance and Power Effectiveness in Embedded Processors -- Customizable Partitioned Caches

by Peter Petrov, Alex Orailoglu , 2001
Abstract - Cited by 18 (6 self)
This paper explores an application-specific customization technique for the data cache, one of the foremost area/power consuming and performance determining microarchitectural features of modern embedded processors. The automated methodology for customizing the processor microarchitecture that we propose results in increased performance, reduced power consumption and improved determinism of critical system parts while the fixed design ensures processor standardization. The resulting improvements help to enlarge the significant role of embedded processors in modern hardware–software codesign techniques by leading to increased processor utilization and reduced hardware cost. A novel methodology for static analysis and a microarchitecturally field-reprogrammable implementation of a customizable cache controller that implements a partitioned cache structure is proposed. Partitioning the load/store instructions eliminates cache interference; hence, precise knowledge about the hit/miss behavior of the references within each partition becomes available, resulting in significant reduction in tag reads and comparisons. Moreover, eliminating cache interference naturally leads to a significant reduction in the miss rate. The paper presents an algorithm for defining cache partitions, hardware support for customizable cache partitions, and a set of experimental results. The experimental results indicate significant improvements in both power consumption and miss rate.

Power-aware branch prediction: Characterization and design

by D Parikh, K Skadron, Y Zhang, M Stan - IEEE Transactions on Computers, 2004
Abstract - Cited by 18 (3 self)
Abstract not found

Citation Context

...educe cache energy dissipation. By exploring the performance and energy implications he showed that a small performance degradation can produce significant reduction in cache energy. Ghose and Kamble [12] looked at sub-banking and other organizational techniques for reducing energy in the cache. Zhu and Zhang [37] describe a low-power associative cache mode that performs tag match and data access sequ...

Profile-Based Energy Reduction for High-Performance Processors

by Michael Huang, Jose Renau, Josep Torrellas - 4TH. ACM WORKSHOP ON FEEDBACK DIRECTED AND DYNAMIC OPTIMIZATION (FDDO-4)
Abstract - Cited by 17 (0 self)
To reduce the energy consumption of modern processors, designers have proposed many energy-saving techniques. In many cases, these techniques are dynamically activated and deactivated. In systems that employ these techniques, to adapt to changes in application behavior, profiling can help determine how to manage the activation of techniques to improve a certain metric. In this paper

Dynamic Allocation Of Datapath Resources For Low Power

by Dmitry Ponomarev, Gurhan Kucuk, Kanad Ghose - in Proc. of Workshop on Complexity–Effective Design, held in conjunction with ISCA–28 , 2001
Abstract - Cited by 15 (3 self)
We show by profiling the execution of SPEC95 benchmarks that the usage of datapath resources in a modern superscalar processor is highly dynamic and correlated. The one-size-fits-all philosophy used for permanently allocating datapath resources in a modern superscalar CPU is thus complexity-ineffective due to the overcommitment of resources in general. We propose a strategy to dynamically and simultaneously adjust the sizes of two such correlated resources - the dispatch buffer (also known as an issue queue) and the reorder buffer - to reduce power dissipation in the datapath without significant impact on the performance. We also show how the resizing technique can be augmented with dynamic adaptation of the dispatch rate. Representative results show reductions in power dissipation of 69% for the dispatch buffer and 52% for the reorder buffer, with an average IPC loss below 8.5%.
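The occupancy-driven resizing described in the abstract can be sketched roughly as follows. This is an illustrative model only: the class name, the partition granularity, the sampling interval, and the low/high utilization thresholds are all assumptions, not the authors' actual algorithm or parameters.

```python
# Illustrative sketch of occupancy-driven buffer resizing: periodically
# sample how full a buffer is, then shrink or grow its active size in
# fixed-size partitions so unused entries can be powered down.

class ResizableBuffer:
    def __init__(self, max_size=64, partition=8):
        self.max_size = max_size
        self.partition = partition      # resize granularity (entries)
        self.active_size = max_size     # entries currently powered on
        self.occupancy_samples = []

    def sample(self, occupancy):
        """Record the buffer occupancy observed this cycle."""
        self.occupancy_samples.append(occupancy)

    def resize(self, low=0.5, high=0.9):
        """Shrink if average utilization is low, grow if it is high."""
        if not self.occupancy_samples:
            return self.active_size
        avg = sum(self.occupancy_samples) / len(self.occupancy_samples)
        util = avg / self.active_size
        if util < low and self.active_size > self.partition:
            self.active_size -= self.partition   # power down one partition
        elif util > high and self.active_size < self.max_size:
            self.active_size += self.partition   # power one back up
        self.occupancy_samples.clear()
        return self.active_size

buf = ResizableBuffer()
for occ in [10, 12, 8, 11]:     # a lightly used sampling interval
    buf.sample(occ)
print(buf.resize())             # shrinks one partition: 64 -> 56
```

The same control loop could drive the dispatch buffer and the reorder buffer together, since the abstract reports their occupancies are correlated.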

Partitioned Instruction Cache Architecture for Energy Efficiency

by Soontae Kim, N. Vijaykrishnan, Mahmut Kandemir, Anand Sivasubramaniam, Mary Jane Irwin - ACM Transactions on Embedded Computing Systems, 2003
Abstract - Cited by 14 (0 self)
This paper studies energy-efficient cache architectures in the memory hierarchy that can have a significant impact on the overall system energy consumption.

Citation Context

...ses, the cache configuration (e.g., capacity, associativity, line size), technology, and the extent of utilizing energy-efficient circuit implementation techniques (e.g., subbanking, bit line isolation [Ghose and Kamble 1999; Su and Despain 1995]). There are two important components in the cache, namely, the tag and the data arrays. The data array is shown pictorially in Figure 2. The major components that consume energy...

Low-Cost Embedded Program Loop Caching - Revisited

by Lea Hwang Lee, Bill Moyer, John Arends - University of Michigan , 1999
Abstract - Cited by 14 (1 self)
Many portable and embedded applications are characterized by spending a large fraction of their execution time on small program loops. In these applications, instruction fetch energy can be reduced by using a small instruction cache when executing these tight loops. Recent work has shown that it is possible to use a small instruction cache without incurring any performance penalty [4, 6]. In this paper, we will extend the work done in [6]. In the modified loop caching scheme proposed in this paper, when a program loop is larger than the loop cache size, the loop cache is capable of capturing only part of the program loop without having any cache conflict problem. For a given loop cache size, our loop caching scheme can reduce instruction fetch energy more than other loop cache schemes previously proposed. We will present some quantitative results on how much power can be saved on an integrated embedded design using this scheme.
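The partial-capture idea in the abstract can be sketched as below. This is a toy model under stated assumptions, not the scheme from [6]: it assumes a short backward branch signals a loop, and that only the first `size` instructions of the loop body are ever captured, so a too-large loop still hits on its captured prefix and falls through to the normal fetch path for the rest, with no conflict evictions.

```python
# Toy partial loop cache: capture only the prefix of a loop that fits,
# serve repeated fetches of that prefix cheaply, and never evict on
# conflicts (instructions past the prefix just use the main I-cache).

class LoopCache:
    def __init__(self, size=4):
        self.size = size
        self.loop_start = None      # address of the loop's first instruction
        self.lines = {}             # captured prefix: address -> instruction

    def on_backward_branch(self, target):
        """A short backward branch signals a loop; begin capture at target."""
        if self.loop_start != target:
            self.loop_start = target
            self.lines = {}

    def fetch(self, addr, instr):
        """Return True if served from the loop cache (cheap fetch)."""
        if addr in self.lines:
            return True             # hit: main I-cache stays idle
        if (self.loop_start is not None
                and self.loop_start <= addr < self.loop_start + self.size):
            self.lines[addr] = instr    # capture prefix on the first pass
        return False                # fetched from the main I-cache

lc = LoopCache(size=4)
lc.on_backward_branch(target=100)
hits = 0
for _ in range(3):                  # three iterations of the loop
    for addr in range(100, 106):    # loop body of 6 instructions > cache
        hits += lc.fetch(addr, f"op{addr}")
print(hits)                         # 4 prefix hits on passes 2 and 3 -> 8
```

Even with a 6-instruction loop and a 4-entry cache, two thirds of the fetches in steady state come from the small structure, which is the energy win the abstract describes.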

Citation Context

...n be reduced by using a small instruction cache when executing these tight loops. Recent work has shown that it is possible to use a small instruction cache without incurring any performance penalty [4, 6]. In this paper, we will extend the work done in [6]. In the modified loop caching scheme proposed in this paper, when a program loop is larger than the loop cache size, the loop cache is capable of c...

Power Reduction through Work Reuse

by Emil Talpes, Diana Marculescu - In Int’l Symp. on Low Power Electronics and Design , 2001
Abstract - Cited by 13 (4 self)
Power consumption has become one of the big challenges in designing high performance processors. The rapid increase in complexity and speed that comes with each new CPU generation causes greater problems with power consumption and heat dissipation. Traditionally, these concerns are addressed through semiconductor technology improvements such as voltage reduction and technology scaling. This work proposes an alternative solution to this problem, by dealing with the power consumption in the very early stage of the microarchitecture design. More precisely, we show that by modifying the well-established out-of-order, superscalar processor architecture, significant gains can be achieved in terms of power requirements without performance penalty. Our proposed approach relies on reusing as much as possible of the work done by the front-end of a typical pipelined, superscalar out-of-order processor via the use of a cache nested deeply into the processor structure. Experimental results show up to 52% (20% on average) savings in average energy per committed instruction for two different pipeline structures.

Citation Context

...However, the access pattern for this structure is highly predictable (most of the cycles we just increment the row address we used for the previous access). This behavior allows us to use sub-banking [14] for implementing this structure and turn off the banks that are not being used in each cycle. All the banks have to be validated at the same time only when we start a new trace (after accessing the T...
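The energy effect of the sub-banking cited here ([14], the Ghose and Kamble technique) can be illustrated with a first-order model. The function and its parameters are hypothetical round numbers chosen for the example, not figures from either paper: the point is only that driving one bank instead of the whole row scales dynamic read energy down by roughly the bank count.

```python
# First-order model of subbanking: a cache line is split across n_banks,
# and only the bank holding the requested word is driven on an access,
# so per-access dynamic energy scales with one bank, not the full row.

def access_energy(line_bytes, n_banks, energy_per_byte=1.0):
    """Energy (arbitrary units) to read one word, monolithic vs. subbanked."""
    full_row = line_bytes * energy_per_byte            # drive the whole line
    one_bank = (line_bytes / n_banks) * energy_per_byte  # drive one subbank
    return full_row, one_bank

full, banked = access_energy(line_bytes=32, n_banks=4)
print(full, banked)    # 32.0 8.0 -- roughly 4x less dynamic read energy
```

Real savings are smaller than this ideal ratio because tag arrays, decoders, and bank-select logic still burn energy on every access.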

Microarchitectural Power Modeling Techniques for Deep Sub-Micron Microprocessors

by Nam Sung Kim, Taeho Kgil, Valeria Bertacco, Todd Austin, Trevor Mudge , 2004
Abstract - Cited by 12 (1 self)
The need to perform early design studies that combine architectural simulation with power estimation has become critical as power has become a design constraint whose importance has moved to the fore. To satisfy this demand several microarchitectural power simulators have been developed around SimpleScalar, a widely used microarchitectural performance simulator. They have proven to be very useful at providing insights into power/performance trade-offs. However, they are neither parameterized nor technology scalable. In this paper, we propose more accurate parameterized power modeling techniques reflecting the actual technology parameters as well as input switching-events for memory and execution units. Compared to HSPICE, the proposed techniques show 93% and 91% accuracies for those blocks, but with a much faster simulation time. We also propose a more realistic power modeling technique for external I/O. In general, our approach includes more detailed microarchitectural and circuit modeling than has been the case in earlier simulators, without incurring a significant simulation time overhead—it can be as small as a few percent.

Fine-Grain CAM-Tag Cache Resizing Using Miss Tags

by Michael Zhang, Krste Asanovic
Abstract - Cited by 12 (1 self)
A new dynamic cache resizing scheme for low-power CAM-tag caches is introduced. A control algorithm that is only activated on cache misses uses a duplicate set of tags, the miss tags, to minimize active cache size while sustaining close to the same hit rate as a full size cache. The cache partitioning mechanism saves both switching and leakage energy in unused partitions with little impact on cycle time. Simulation results show that the scheme saves 28--56% of data cache energy and 34--49% of instruction cache energy with minimal performance impact.
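The miss-tag control idea in the abstract can be sketched as a small state machine. The structure is inferred from the abstract alone and the thresholds are invented for illustration: the duplicate tags model a full-size cache, so a miss that *would* have hit at full size signals that downsizing itself is costing hits, while a miss even at full size is unavoidable and argues for staying small.

```python
# Sketch of miss-tag-driven cache resizing: the controller runs only on
# misses. "Resizing misses" (full-size miss tags hit) mean the downsized
# cache is too small; "true misses" (miss tags also miss) mean growing
# would not help, so unused partitions can stay powered down.

class MissTagController:
    def __init__(self, max_parts=8):
        self.max_parts = max_parts
        self.active_parts = max_parts
        self.resizing_misses = 0    # misses a full-size cache would hit
        self.true_misses = 0        # misses even at full size

    def on_miss(self, hits_in_miss_tags):
        """Invoked only on cache misses; hit rate on hits is unaffected."""
        if hits_in_miss_tags:
            self.resizing_misses += 1
        else:
            self.true_misses += 1
        # Thresholds below are illustrative, not from the paper.
        if self.resizing_misses > 4 and self.active_parts < self.max_parts:
            self.active_parts += 1          # power a partition back up
            self.resizing_misses = 0
        elif self.true_misses > 16 and self.active_parts > 1:
            self.active_parts -= 1          # working set fits: shrink
            self.true_misses = 0

ctrl = MissTagController()
ctrl.active_parts = 4               # start downsized
for _ in range(5):                  # a burst of resizing misses...
    ctrl.on_miss(hits_in_miss_tags=True)
print(ctrl.active_parts)            # ...triggers growth: 4 -> 5
```

Keeping the controller off the hit path is what lets the scheme avoid cycle-time impact: the miss tags are consulted only when the cache has already missed.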

Citation Context

... than 43% [19] of overall power in caches. As a result, there has been great interest in reducing cache power consumption. Initial cache energy reduction techniques focused on dynamic switching power =-=[1, 2, 3, 4, 7, 10, 13, 22]-=-. With technology scaling, leakage current is increasing exponentially, and more attention has been paid to leakage power reduction [9, 11, 15, 16, 18, 20]. Permission to make digital or hard copies o...
