On the Use of Microarchitecture-Driven Dynamic Voltage Scaling
, 2000
Cited by 53 (0 self)
This paper proposes microarchitecture-driven dynamic voltage scaling as a viable route to power-efficient architectures, with little or no performance penalty. The run-time behavior of common applications, in which active periods alternate with stall periods due to cache misses, is exploited to reduce the dynamic component of power consumption via selective voltage scaling. As shown by experimental results, reductions of up to 20% in total energy consumption, 22% in average power, and 14% in peak power have been achieved with a performance penalty of less than 6%. The study shows that microarchitecture-driven dynamic voltage scaling can become an effective tool for energy reduction in high-performance processors. 1 Introduction Driven by the increased levels of complexity and emergence of mobile applications, power dissipation has become a critical design concern in recent years. In addition, power consumption has especially become important for designers of high-performanc...
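To see why lowering the supply voltage during stall periods targets the dynamic component of power, the standard first-order CMOS switching model is a useful reference. It is a textbook approximation, not a model taken from the paper; the symbols (activity factor α, switched capacitance C, supply voltage V_dd, clock frequency f, scaling factor s) are assumptions introduced here for illustration.

```latex
P_{\mathrm{dyn}} = \alpha\, C\, V_{dd}^{2}\, f,
\qquad
E_{\mathrm{cycle}} = \frac{P_{\mathrm{dyn}}}{f} = \alpha\, C\, V_{dd}^{2}
\\[4pt]
V_{dd} \to s\,V_{dd}
\;\Rightarrow\;
E_{\mathrm{cycle}} \to s^{2}\, E_{\mathrm{cycle}},
\qquad
s = 0.9 \;\Rightarrow\; s^{2} = 0.81
```

Under this approximation, a 10% reduction in supply voltage cuts dynamic energy per cycle by about 19%, which is why voltage is a stronger lever than frequency alone.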
An Integer Linear Programming Based Approach for Parallelizing Applications in On-Chip Multiprocessors
- In IEEE/ACM Design Automation Conference
, 2002
Cited by 24 (4 self)
With energy consumption becoming one of the first-class optimization parameters in computer system design, compilation techniques that consider performance and energy simultaneously are expected to play a central role. In particular, compiling a given application code under performance and energy constraints is becoming an important problem. In this paper, we focus on an on-chip multiprocessor architecture and present a parallelization strategy based on integer linear programming. Given an array-intensive application, our optimization strategy determines the number of processors to be used in executing each nest based on the objective function and additional compilation constraints provided by the user. Our initial experience with this strategy shows that it is very successful in optimizing array-intensive applications on on-chip multiprocessors under energy and performance constraints.
Profile-Driven Code Execution for Low Power Dissipation
, 2000
Cited by 23 (1 self)
This paper proposes a novel technique for power-performance trade-off based on a profile-driven code execution methodology. Specifically, we show that there is an optimal level of parallelism for energy consumption and propose a compiler-assisted technique for code annotation that can be used at run-time to adaptively trade off power and performance. As shown by experimental results, our approach is up to 23% better than clock throttling and is as efficient as voltage scaling (up to 10% better in some cases). The technique proposed in this paper can be used by an ACPI-compliant power manager for prolonging battery life or as a passive cooling feature for thermal management. 1 Introduction Power dissipation has become a critical design concern in recent years, driven by the increased levels of complexity and emergence of mobile applications. While it is generally agreed that tools for power estimation and optimization do exist for hardware specifications at different levels (circuit, g...
A design framework to efficiently explore energy-delay tradeoffs
- In CODES
, 2001
Cited by 11 (0 self)
Comprehensive exploration of the design space parameters at the system level is a crucial task for evaluating architectural tradeoffs that account for both energy and performance constraints. In this paper, we propose a system-level design methodology for the efficient exploration of the memory architecture from the combined energy-delay perspective. The aim is to find a sub-optimal configuration of the memory hierarchy without performing an exhaustive analysis of the parameter space. The target system architecture includes the processor, separate instruction and data level-one caches, the main memory, and the system buses. The methodology is based on the sensitivity analysis of the optimization function with respect to the tuning parameters of the cache architecture (mainly cache size, block size and associativity). The effectiveness of the proposed methodology has been demonstrated through the design space exploration of a real-world example: a MicroSPARC2-based system running the Mediabench suite. Experimental results have shown an optimization speedup of 329 times with respect to the full search, while the near-optimal system-level configuration lies within 10% of the optimal full-search configuration.
A flexible framework for fast multi-objective design space exploration of embedded systems
- Proceedings of 13th International Workshop, PATMOS 2003
Cited by 10 (4 self)
The evaluation of the best system-level architecture in terms of energy and performance is of primary importance for a broad range of embedded SOC platforms. In this paper, we address the problem of the efficient exploration of the architectural design space for parameterized microprocessor-based systems. The architectural design space is multi-objective, so our aim is to find all the Pareto-optimal configurations representing the best power-performance design trade-offs by varying the architectural parameters of the target system. In particular, the paper presents a Design Space Exploration (DSE) framework tuned to efficiently derive Pareto-optimal curves. The main characteristics of the proposed framework are its flexibility and modularity, mainly in terms of target architecture, related system-level executable models, exploration algorithms and system-level metrics. The analysis of the proposed framework has been carried out for a parameterized superscalar architecture executing a selected set of benchmarks. The reported results have shown a reduction of the simulation time of up to three orders of magnitude with respect to the full search strategy, while maintaining a good level of accuracy (under 4% on average).
Application-Driven Processor Design Exploration for Power-Performance Trade-off Analysis
, 2001
Cited by 9 (1 self)
This paper presents an efficient design exploration environment for high-end core processors. The heart of the proposed design exploration framework is a two-level simulation engine that combines detailed simulation for critical portions of the code with fast profiling for the rest. Our two-level simulation methodology relies on the inherent clustered structure of application programs and is completely general and applicable to any microarchitectural power/performance simulation engine. The proposed simulation methodology is 3-17x faster, while being sufficiently accurate (within 5%) when compared to the fully detailed simulator. The design exploration environment is able to vary different microarchitectural configurations and find the optimal one, as far as the energy×delay product is concerned, in a matter of minutes. The parameters found to drastically affect the core processor power/performance metrics are issue width, instruction window size, and pipeline depth, along with the correlated clock frequency. For very high-end configurations for which balanced pipelining may not be possible, opportunities exist for running faster stages at lower voltage. In such cases, by using up to 3 voltage levels, the energy×delay product is reduced by 23-30% when compared to the single-voltage implementation.
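As a rough illustration of ranking configurations by energy-delay product, the sketch below picks the candidate with the lowest product of energy and delay. The configuration names and the numbers are made up for illustration and are not results from the paper; `energy_delay_product` and `best_by_edp` are hypothetical helper names.

```python
def energy_delay_product(energy, delay):
    """Energy-delay product (EDP); lower is better."""
    return energy * delay


def best_by_edp(candidates):
    """Return (name, edp) for the candidate with the smallest EDP.

    `candidates` maps a configuration name to an (energy, delay) pair,
    e.g. as reported by some power/performance simulator.
    """
    scored = {
        name: energy_delay_product(energy, delay)
        for name, (energy, delay) in candidates.items()
    }
    name = min(scored, key=scored.get)
    return name, scored[name]


# Illustrative, made-up measurements (arbitrary units).
CANDIDATES = {
    "narrow": (1.0, 10.0),    # low energy, slow
    "balanced": (1.5, 5.0),   # moderate energy, moderate speed
    "wide": (4.0, 3.0),       # fast, but energy hungry
}

if __name__ == "__main__":
    print(best_by_edp(CANDIDATES))
```

The metric rewards neither the slowest low-energy design nor the fastest high-energy one, which is why it is commonly used when both objectives matter.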
Static Power Modeling of 32-bit Microprocessors
, 2002
Cited by 7 (1 self)
The paper presents a novel strategy for modelling the instruction energy consumption of 32-bit microprocessors. Unlike former approaches, the proposed instruction-level power model is founded on a functional decomposition of the activities accomplished by a generic microprocessor. The proposed model has significant generalization capabilities. It allows the power figures of the entire instruction set to be estimated from the analysis of a subset, and allows new processors to be power-characterized using the model obtained from other microprocessors. The model is formally presented and justified, and its application to five commercial microprocessors is included. This static characterization is the basic information for system-level power modelling of hardware/software architectures.
The shape of the processor design space and its implications for early stage explorations
- In 7th WSEAS International Conference on Automatic Control, Modeling and Simulation
Cited by 7 (0 self)
Designing a microprocessor involves determining the optimal microarchitecture for a given objective function and a given set of constraints. This paper studies the shape of the design space of superscalar out-of-order processors under different objective functions and constraints. We show that local optima exist whose objective function values are significantly worse than that of the global optimum, in several cases more than 20% off. We subsequently consider the implications of this observation for early design stage exploration studies. Four design space search algorithms (random descent, steepest descent, one-parameter-at-a-time and simulated annealing) are evaluated according to their ability to avoid local optima and their overall simulation time. We conclude that one-parameter-at-a-time achieves a good balance between both criteria. In addition, we study the usefulness of fast simulation techniques for early design stage exploration. A case study with statistical simulation shows that significant simulation speedups are achieved while incurring little inaccuracy (a few percent) on the optimal design point.
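The abstract does not spell out the one-parameter-at-a-time search, so the following is only one plausible reading of it: sweep each parameter in turn while holding the others fixed, keep any improvement, and repeat until a full pass changes nothing. The parameter names and the toy objective are hypothetical stand-ins for a real simulator.

```python
def one_parameter_at_a_time(space, objective, start):
    """Minimize `objective` by sweeping one parameter at a time.

    space:     dict mapping parameter name -> list of allowed values
    objective: function taking a config dict and returning a cost
    start:     initial config dict
    Returns (best_config, best_cost, number_of_evaluations).
    """
    current = dict(start)
    best_cost = objective(current)
    evaluations = 1
    improved = True
    while improved:  # repeat passes until nothing changes
        improved = False
        for name, values in space.items():
            for value in values:
                if value == current[name]:
                    continue  # already evaluated as the incumbent
                candidate = {**current, name: value}
                cost = objective(candidate)
                evaluations += 1
                if cost < best_cost:
                    current, best_cost = candidate, cost
                    improved = True
    return current, best_cost, evaluations


# Hypothetical design space and a toy, separable cost function.
SPACE = {"issue_width": [2, 4, 8], "pipeline_depth": [8, 12, 20]}


def toy_cost(config):
    return (config["issue_width"] - 4) ** 2 + (config["pipeline_depth"] - 12) ** 2


if __name__ == "__main__":
    start = {"issue_width": 8, "pipeline_depth": 20}
    print(one_parameter_at_a_time(SPACE, toy_cost, start))
```

On a separable objective like this toy one the sweep reaches the global optimum; on objectives with interacting parameters it can stop at a local optimum, which is the risk the abstract weighs against simulation time.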
A system-level methodology for fast multi-objective design space exploration
- In Proceedings of the 13th ACM Great Lakes symposium on VLSI
, 2003
Cited by 5 (1 self)
In this paper, we address the problem of the efficient exploration of the architectural design space for parameterized systems. Since the design space is multi-objective, our aim is to find all the Pareto-optimal configurations that represent the best design trade-offs by varying the architectural parameters of the target system. In particular, the paper proposes a Design Space Exploration (DSE) framework based on a random search algorithm that has been tuned to efficiently derive Pareto-optimal curves. The reported design space exploration results have shown a reduction of the simulation time of up to two orders of magnitude with respect to the full search strategy, while maintaining an average accuracy within 3%.
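A minimal sketch of what deriving an approximate Pareto set from randomly sampled configurations can look like is given below. It is a plain uniform random sampler followed by a dominance filter, not the tuned algorithm of the paper; the design space, the `toy_evaluate` model, and all function names are hypothetical.

```python
import random


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def random_search_front(space, evaluate, samples, seed=0):
    """Sample configurations uniformly at random and return the
    non-dominated (config, objectives) pairs among those sampled."""
    rng = random.Random(seed)
    seen = {}
    for _ in range(samples):
        config = tuple(rng.choice(values) for values in space.values())
        if config not in seen:  # avoid re-simulating a configuration
            seen[config] = evaluate(dict(zip(space, config)))
    objectives = list(seen.values())
    return [
        (config, obj)
        for config, obj in seen.items()
        if not any(dominates(other, obj) for other in objectives)
    ]


# Hypothetical space and toy (energy, delay) model standing in for a simulator.
SPACE = {"cache_kb": [8, 16, 32, 64], "issue_width": [1, 2, 4]}


def toy_evaluate(config):
    energy = config["cache_kb"] * 0.1 + config["issue_width"] * 1.0
    delay = 100.0 / (config["issue_width"] * (1 + config["cache_kb"] / 64))
    return (energy, delay)


if __name__ == "__main__":
    for config, obj in random_search_front(SPACE, toy_evaluate, samples=8):
        print(config, obj)
```

The front returned this way is only approximate: it is exact with respect to the sampled configurations, but an unsampled configuration could still dominate some of its members.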
Multi-objective Co-exploration of Source Code Transformations and Design Space Architectures for Low-power Embedded Systems
- In SAC ’04: Proceedings of the 2004 ACM Symposium on Applied computing
, 2004
Cited by 5 (2 self)
The exploration of the architectural design space in terms of energy and performance is of primary importance for a broad range of embedded platforms based on the System-On-Chip approach. This paper proposes a methodology for the co-exploration of the design space composed of architectural parameters and source program transformations. A heuristic technique based on Pareto Simulated Annealing (PSA) has been used to efficiently span the multi-objective co-design space composed of the product of the parameters related to the selected program transformations and the configurable architecture. The analysis of the proposed framework has been carried out for a parameterized superscalar architecture executing a selected set of benchmarks. The reported results show the effectiveness of the proposed co-exploration, compared with the independent exploration of the transformation and architectural spaces, in efficiently deriving approximate Pareto curves.