PowerPack: Energy profiling and analysis of high-performance systems and applications
- IEEE Transactions on Parallel and Distributed Systems
, 2009
Cited by 58 (10 self)
Abstract—Energy efficiency is a major concern in modern high-performance computing system design. In the past few years, there has been mounting evidence that power usage limits system scale and computing density, and thus, ultimately system performance. However, despite the impact of power and energy on the computer systems community, few studies provide insight to where and how power is consumed on high-performance systems and applications. In previous work, we designed a framework called PowerPack that was the first tool to isolate the power consumption of devices including disks, memory, NICs, and processors in a high-performance cluster and correlate these measurements to application functions. In this work, we extend our framework to support systems with multicore, multiprocessor-based nodes, and then provide in-depth analyses of the energy consumption of parallel applications on clusters of these systems. These analyses include the impacts of chip multiprocessing on power and energy efficiency, and its interaction with application executions. In addition, we use PowerPack to study the power dynamics and energy efficiencies of dynamic voltage and frequency scaling (DVFS) techniques on clusters. Our experiments reveal conclusively how intelligent DVFS scheduling can enhance system energy efficiency while maintaining performance. Index Terms—Distributed system, CMP-based cluster, energy efficiency, power measurement, system tools, power management, dynamic voltage and frequency scaling.
CPU MISER: A performance-directed, run-time system for power-aware clusters
- In ICPP ’07: Proceedings of the 2007 International Conference on Parallel Processing
, 2007
Bounding Energy Consumption in Large-Scale MPI Programs
Cited by 27 (2 self)
Power is now a first-order design constraint in large-scale parallel computing. Used carefully, dynamic voltage scaling can execute parts of a program at a slower CPU speed to achieve energy savings with a relatively small (possibly zero) time delay. However, the problem of when to change frequencies in order to optimize energy savings is NP-complete, which has led to many heuristic energy-saving algorithms. To determine how closely these algorithms approach optimal savings, we developed a system that determines a bound on the energy savings for an application. Our system uses a linear programming solver that takes as inputs the application communication trace and the cluster power characteristics and then outputs a schedule that realizes this bound. We apply our system to three scientific programs, two of which exhibit load imbalance—particle simulation and UMT2K. Results from our bounding technique show particle simulation is more amenable to energy savings than UMT2K.
A roofline model of energy
Cited by 8 (1 self)
Abstract—We describe an energy-based analogue of the time-based roofline model. We create this model from the perspective of algorithm designers and performance tuners, with the intent not of making exact predictions, but rather, developing high-level analytic insights into the possible relationships among the time, energy, and power costs of an algorithm. The model expresses algorithms in terms of operations, concurrency, and memory traffic; and characterizes the machine based on a small number of simple cost parameters, namely, the time and energy costs per operation or per word of communication. We confirm the basic form of the model experimentally. From this model, we suggest under what conditions we ought to expect an algorithmic time-energy trade-off, and show how algorithm properties may help inform power management. Keywords: performance analysis; power and energy modeling; computational intensity; machine balance; roofline model
Iso-energy-efficiency: An approach to scalable system power-performance optimization
- In International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), Ph.D. forum
, 2011
Cited by 4 (1 self)
Abstract—The power consumption of a large scale system ultimately limits its performance. Consuming less energy while preserving performance leads to better system utilization at scale. The iso-energy-efficiency model was proposed as a metric and methodology for explaining power and performance efficiency on scalable systems. For use in practice, we need to determine what parameters should be modified to maintain a desired efficiency. Unfortunately, without extension, the iso-energy-efficiency model cannot be used for this purpose. In this paper we extend the iso-energy-efficiency model to identify appropriate efficiency values for workload and power scaling on clusters. We propose the use of “correlation functions” to quantitatively explain the isolated and interacting effects of these two parameters for three representative applications: LINPACK, row-oriented matrix multiplication, and 3D Fourier transform. We show quantitatively that the iso-energy-efficiency model with correlation functions is effective at maintaining efficiency as system size scales. Keywords: iso-energy-efficiency; performance isoefficiency; system utilization; power aware computing
Practical Performance Prediction Under Dynamic Voltage Frequency Scaling
- In Int’l Green Computing Conf. and Workshops
, 2011
Cited by 3 (0 self)
Predicting performance under Dynamic Voltage Frequency Scaling (DVFS) remains an open problem. Current best practice explores available performance counters to serve as input to linear regression models that predict performance. However, the inaccuracies of these models require that large-scale DVFS runtime algorithms predict performance conservatively in order to avoid significant consequences of mispredictions. Recent theoretical work based on interval analysis advocates a more accurate and reliable solution based on a single new performance counter, Leading Loads. In this paper, we evaluate a processor-independent analytic framework for existing performance counters based on this interval analysis model. We begin with an analysis of the counters used in many published models. We then briefly describe the Leading Loads architectural model and describe how we can use Leading Loads Cycles to predict performance under DVFS. We validate this approach for the NAS Parallel Benchmarks and SPEC CPU 2006 benchmarks, demonstrating an order of magnitude improvement in both error and standard deviation compared to the best existing approaches.
The system power control unit based on the on-chip wireless communication system
- The Scientific World Journal, Article ID 939254, 9 pages, 2013
Cited by 1 (0 self)
Currently, the on-chip wireless communication system (OWCS) includes 2nd-generation (2G), 3rd-generation (3G), and long-term evolution (LTE) communication subsystems. To improve the power consumption of OWCS, a typical architecture design of system power control unit (SPCU) is given in this paper, which can not only make the 2G, 3G, and LTE subsystems enter sleep mode, but can also wake them up from sleep mode via an interrupt. During the sleep mode period, either the real-time sleep timer or the global system for mobile (GSM) communication sleep timer can be used individually to wake the corresponding subsystem. Compared to the previous single voltage supply on the OWCS, a 2G, 3G, or LTE subsystem can be independently configured with three different voltages and frequencies in normal work mode. In the meantime, the voltage supply monitor, which is an important part of the SPCU, can guard the voltage of the OWCS in real time. Finally, the SPCU may implement dynamic voltage and frequency scaling (DVFS) for a 2G, 3G, or LTE subsystem, which is automatically accomplished by the hardware.
Software Power Analysis And Optimization For Power-Aware Multicore Systems
, 2014
Transition Probability: A Novel Modeling Approach of Energy Consumption for Storage Subsystem
Abstract: In this paper, considering the transitions among different power states such as active, idle, and standby, we define the transition probability mathematically from active mode to idle mode, and propose a novel analytical approach to evaluate energy consumption and performance metrics, i.e., queue length, throughput, and response time. Simulation results indicate that our proposed model might motivate storage researchers to exploit a quick analysis toolkit and give some insights for the design of power-aware storage systems.
Energy Profiling and Analysis of the HPC Challenge Benchmarks
Future high performance systems must use energy efficiently to achieve PFLOPS computational speeds and beyond. To address this challenge, we must first understand the power and energy characteristics of high performance computing applications. In this paper, we use a power-performance profiling framework called PowerPack to study the power and energy profiles of the HPC Challenge benchmarks. We present detailed experimental results along with in-depth analysis of how each benchmark's workload characteristics affect power consumption and energy efficiency. This paper summarizes various findings using the HPC Challenge benchmarks including but not limited to: (1) identifying application power profiles by function and component in a high performance cluster; (2) correlating applications' memory access patterns to power consumption for these benchmarks; and (3) exploring how energy consumption scales with system size and workload.