Results 1 - 10 of 37

Energy-Efficient Processor Design Using Multiple Clock Domains with Dynamic Voltage and Frequency Scaling

by Greg Semeraro, Grigorios Magklis, Rajeev Balasubramonian, David H. Albonesi, Sandhya Dwarkadas, Michael L. Scott - In Proceedings of the 8th International Symposium on High-Performance Computer Architecture , 2002
"... As clock frequency increases and feature size decreases, clock distribution and wire delays present a growing challenge to the designers of singly-clocked, globally synchronous systems. We describe an alternative approach, which we call a Multiple Clock Domain (MCD) processor, in which the chip is d ..."
Cited by 108 (13 self)
is divided into several (coarse-grained) clock domains, within which independent voltage and frequency scaling can be performed. Boundaries between domains are chosen to exploit existing queues, thereby minimizing inter-domain synchronization costs. We propose four clock domains, corresponding to the front

A Highly Configurable Cache Architecture for Embedded Systems

by Chuanjun Zhang, Frank Vahid, Walid Najjar , 2003
"... Energy consumption is a major concern in many embedded computing systems. Several studies have shown that cache memories account for about 50% of the total energy consumed in these systems. The performance of a given cache architecture is largely determined by the behavior of the application using t ..."
Cited by 82 (5 self)
Energy consumption is a major concern in many embedded computing systems. Several studies have shown that cache memories account for about 50% of the total energy consumed in these systems. The performance of a given cache architecture is largely determined by the behavior of the application using

Probabilistic Delay Budget Assignment for Synthesis of Soft Realtime Applications

by Soheil Ghiasi, Po-Kuan Huang, Roozbeh Jafari , 2005
"... Unlike their hard realtime counterparts, soft realtime applications are only expected to guarantee their "expected delay" over input data space. This paradigm shift calls for customized statistical design techniques to replace the conventional pessimistic worst case analysis methodologies. ..."
Cited by 2 (0 self)
improves the design area. Furthermore, it consistently outperforms optimal time budgeting under hard realtime constraint, which is the best existing competitor. Design area improvements were up to 26% and averaged about 17%, on several MediaBench applications.

An Optimistic and Conservative Register Assignment Heuristic for Chordal Graphs

by Philip Brisk, Ajay K. Verma, Paolo Ienne , 2007
"... This paper presents a new register assignment heuristic for procedures in SSA Form, whose interference graphs are chordal; the heuristic is called optimistic chordal coloring (OCC). Previous register assignment heuristics eliminate copy instructions via coalescing, in other words, merging nodes in t ..."
Cited by 4 (0 self)
; in this sense, OCC is conservative as well as optimistic. OCC is observed to eliminate at least as many dynamically executed copy instructions as iterated register coalescing (IRC) for a set of chordal interference graphs generated from several Mediabench and MiBench applications. In many cases, OCC and IRC

Power-Efficient Memory Bus Encoding Using Stride-Based Stream Reconstruction

by Kuei-chung Chang, Tsung-ming Hsieh, Tien-fu Chen , 2007
"... With the rapid increase in the complexity of chips and the popularity of portable devices, the performance demand is not any more the only important constraint in the embedded system. Instead, energy consumption has become one of the main design issues for contemporary embedded systems, especially f ..."
address streams, we partially compare the previous addresses of existing streams with the current address. Hence, the data transmitted on the bus can be minimally encoded. Experiments with several MediaBench benchmarks show that the scheme can achieve an average of 60% reduction in bus switching activity.

CARS: A new code generation framework for clustered ILP processors

by Krishnan Kailas, Ashok Agrawala - In HPCA , 2001
"... Clustered ILP processors are characterized by a large number of non-centralized on-chip resources grouped into clusters. Traditional code generation schemes for these processors consist of multiple phases for cluster assignment, register allocation and instruction scheduling. Most of these approache ..."
Cited by 51 (1 self)
of these approaches need additional re-scheduling phases because they often do not impose finite resource constraints in all phases of code generation. These phase-ordered solutions have several drawbacks, resulting in the generation of poor performance code. Moreover, the iterative/back-tracking algorithms used

A Self-Tuning Cache Architecture for Embedded Systems

by Chuanjun Zhang, Frank Vahid, Roman Lysecky - In DATE , 2004
"... Memory accesses can account for about half of a microprocessor system’s power consumption. Customizing a microprocessor cache’s total size, line size and associativity to a particular program is well known to have tremendous benefits for performance and power. Customizing caches has until recently b ..."
Cited by 38 (6 self)
been restricted to core-based flows, in which a new chip will be fabricated. However, several configurable cache architectures have been proposed recently for use in pre-fabricated microprocessor platforms. Tuning those caches to a program is still however a cumbersome task left for designers, assisted

Improving Software Performance with Configurable Logic

by Jason Villarreal, Dinesh Suresh, Greg Stitt, Frank Vahid, Walid Najjar - Kluwer Journal on Design Automation of Embedded Systems, November 2002, Volume 7, Issue , 2002
"... We examine the energy and performance benefits that can be obtained by re-mapping frequently executed loops from a microprocessor to reconfigurable logic. We present a design flow that finds critical software loops automatically and manually re-implements these in configurable logic by implementing ..."
Cited by 13 (4 self)
them in SA-C, a C language variation supporting a dataflow computation model and designed to specify and map DSP applications onto reconfigurable logic. We apply this design flow on several examples from the MediaBench benchmark suite and report the energy and performance improvements.

Dynamic Strands: Collapsing Speculative Dependence Chains for Reducing Pipeline Communication

by Peter G. Sassone, D. Scott Wills , 2004
"... In the modern era of wire-dominated architectures, specific effort must be made to reduce needless communication within out-of-order pipelines while still maintaining binary compatibility. To ease pressure on highly-connected elements such as the issue logic and bypass network, we propose the dynami ..."
Cited by 25 (5 self)
occupancy by over a third. Additionally, these strands have several properties which make them amenable to simple performance optimizations. Our experiments show average IPC increases of 17% on a four-wide machine and 20% on an eight-wide machine in Spec2000int and Mediabench applications. Finally, strands

A Super-Scheduler for Embedded Reconfigurable Systems

by S. Ogrenci Memik, E. Bozorgzadeh, R. Kastner, M. Sarrafzadeh , 2001
"... Emerging reconfigurable systems attain high performance with embedded optimized cores. For mapping designs on such special architectures, synthesis tools that are aware of the special capabilities of the underlying architecture are necessary. In this paper we are proposing an algorithm to perform si ..."
Cited by 19 (8 self)
, such as ASAP, ALAP, and list scheduling. Hence we refer to it as a super-scheduler. Our algorithm is a path-based scheduling algorithm. At each step, an individual path from the input DFG is scheduled. Our experiments with several DFGs extracted from the MediaBench suite indicate promising results. Our
