On Energy Management, Load Balancing and Replication. (2010)

by W Lang, J M Patel, J F Naughton
Results 1 - 10 of 14

Energy Management for MapReduce Clusters

by Willis Lang, Jignesh M. Patel
Abstract - Cited by 52 (3 self)
The area of cluster-level energy management has attracted significant research attention over the past few years. One class of techniques to reduce the energy consumption of clusters is to selectively power down nodes during periods of low utilization to increase energy efficiency. One can think of a number of ways of selectively powering down nodes, each with varying impact on the workload response time and overall energy consumption. Since the MapReduce framework is becoming “ubiquitous”, the focus of this paper is on developing a framework for systematically considering various MapReduce node power down strategies, and their impact on the overall energy consumption and workload response time. We closely examine two extreme techniques that can be accommodated in this framework. The first is based on a recently proposed technique called “Covering Set” (CS) that keeps only a small fraction of the nodes powered up during periods of low utilization. At the other extreme is a technique that we propose in this paper, called the All-In Strategy (AIS). AIS uses all the nodes in the cluster to run a workload and then powers down the entire cluster. Using both actual evaluation and analytical modeling we bring out the differences between these two extreme techniques and show that AIS is often the right energy saving strategy.

Citation Context

...resented a study on MR operating variables [12]. Studies into shutting down online web servers were discussed in [30, 32]. Shutting down a replicated parallel database environment was analyzed in [20]. Other related methods [29, 31] either rely on learning request skew, specialized hardware, and data migration. Increasing utilization can be done by consolidation using a virtual machine (VM) soluti...

Wimpy Node Clusters: What About Non-Wimpy Workloads

by Willis Lang, Jignesh M. Patel, Srinath Shankar - In DaMoN, 2010
Abstract - Cited by 28 (6 self)
The high cost associated with powering servers has introduced new challenges in improving the energy efficiency of clusters running data processing jobs. Traditional high-performance servers are largely energy inefficient due to various factors such as the over-provisioning of resources. The increasing trend to replace traditional high-performance server nodes with low-power low-end nodes in clusters has recently been touted as a solution to the cluster energy problem. However, the key tacit assumption that drives such a solution is that the proportional scale-out of such low-power cluster nodes results in constant scaleup in performance. This paper studies the validity of such an assumption using measured price and performance results from a low-power Atom-based node and a traditional Xeon-based server and a number of published parallel scaleup results. Our results show that in most cases, computationally complex queries exhibit disproportionate scaleup characteristics, which potentially makes scale-out with low-end nodes an expensive and lower performance solution.

Citation Context

...ardware has been presented which targets the needs of datacenter operators [11]. Powering down cluster nodes has been one method in which these studies have looked at achieving energy proportionality [14, 15, 16]. In [14], a study on powering down a replicated parallel database cluster with an eye on load balancing was presented. Powering down MapReduce clusters was discussed in [15]. A new server architectur...

Towards Energy-Efficient Database Cluster Design

by Willis Lang, Mehul A. Shah, Stavros Harizopoulos, Dimitris Tsirogiannis, Jignesh M. Patel
Abstract - Cited by 11 (0 self)
Energy is a growing component of the operational cost for many “big data” deployments, and hence has become increasingly important for practitioners of large-scale data analysis who require scale-out clusters or parallel DBMS appliances. Although a number of recent studies have investigated the energy efficiency of DBMSs, none of these studies have looked at the architectural design space of energy-efficient parallel DBMS clusters. There are many challenges to increasing the energy efficiency of a DBMS cluster, including dealing with the inherent scaling inefficiency of parallel data processing, and choosing the appropriate energy-efficient hardware. In this paper, we experimentally examine and analyze a number of key parameters related to these challenges for designing energy-efficient database clusters. We explore the cluster design space using empirical results and propose a model that considers the key bottlenecks to energy efficiency in a parallel DBMS. This paper represents a key first step in designing energy-efficient database clusters, which is increasingly important given the trend toward parallel database appliances.

Citation Context

[Table 1: Cluster-V Configuration (partial): ...ks 8x300GB; TPC-H size 1TB (scale 1000); Network 1Gb/s; CPU Intel X5550, 2 sockets; SysPower 130.03C^0.2369, where C = CPU utilization] ...onto few servers and turn off unused servers [23, 24, 27]. However, switching servers on and off has direct costs such as increased query latency and decreased hardware reliability. Another approach is to consolidate the server use for a given task, and imp...

Toward multitenant performance SLOs

by Willis Lang, Srinath Shankar, Jignesh M. Patel, Ajay Kalhan - in Proc. IEEE 28th ICDE, 2012
Abstract - Cited by 8 (0 self)
As traditional and mission-critical relational database workloads migrate to the cloud in the form of Database-as-a-Service (DaaS), there is an increasing motivation to provide performance goals in Service Level Objectives (SLOs). Providing such performance goals is challenging for DaaS providers as they must balance the performance that they can deliver to tenants and the data center’s operating costs. In general, aggressively aggregating tenants on each server reduces the operating costs but degrades performance for the tenants, and vice versa. In this paper, we present a framework that takes as input the tenant workloads, their performance SLOs, and the server hardware that is available to the DaaS provider, and outputs a cost-effective recipe that specifies how much hardware to provision and how to schedule the tenants on each hardware resource. We evaluate our method and show that it produces effective solutions that can reduce the costs for the DaaS provider while meeting performance goals. Index Terms: Database management, relational databases

Citation Context

...ptions provides a rich direction for future work. One direction for future work is to include the impact of replication and load-balancing in our framework, perhaps building on the ideas presented in [28]. Additionally, while our experimental evaluation uses average performance as an SLO metric, it could be extended to include variance as well (as implied by the use of random variables in Definition 2...

A Case for Micro-Cellstores: Energy-Efficient Data Management on Recycled Smartphones

by Stavros Harizopoulos, Spiros Papadimitriou
Abstract - Cited by 4 (2 self)
Increased energy costs and concerns for sustainability make the following question more relevant than ever: can we turn old or unused computing equipment into cost- and energy-efficient modules that can be readily repurposed? We believe the answer is yes, and our proposal is to turn unused smartphones into micro-data center composable modules. In this paper, we introduce the concept of a Micro-Cellstore (MCS), a stand-alone data-appliance housing dozens of recycled smartphones. Through detailed power and performance measurements on a Linux-based current-generation smartphone, we assess the potential of MCSs as a data management platform. In this paper we focus on scan-based partitionable workloads. We show that smartphones are overall more energy efficient than recently proposed low-power alternatives, based on an initial evaluation over a wide range of single-node database scan workloads, and that the gains become more significant when operating on narrow tuples (i.e., column-stores, or compressed row-stores). Our initial results are very encouraging, showing efficiency gains of up to 6×, and indicate several promising future directions.

Citation Context

... of clusters and data centers include holistic redesigns that treat a data center as a single computer [4, 16], cluster workload consolidation to meet power constraints and reduce energy requirements [15, 13, 12], and considerations of low-power architectures [3, 19]. In this section we briefly discuss recent efforts in improving the energy efficiency of database applications. Energy efficiency in databases. ...

Energy proportionality for disk storage using replication

by Jinoh Kim, Doron Rotem, 2010
Abstract - Cited by 2 (0 self)
Saving energy for storage is of major importance as storage devices (and cooling them off) may contribute over 25 percent of the total energy consumed in a datacenter. Recent work introduced the concept of energy proportionality and argued that it is a more relevant metric than just energy saving as it takes into account the tradeoff between energy consumption and performance. In this paper, we present a novel approach, called FREP (Fractional Replication for Energy Proportionality), for energy management in large datacenters. FREP includes a replication strategy and basic functions to enable flexible energy management. Specifically, our method provides performance guarantees by adaptively controlling the power states of a group of disks based on observed and predicted workloads. Our experiments, using a set of real and synthetic traces, show that FREP dramatically reduces energy requirements with a minimal response time penalty. Categories and Subject Descriptors: C.4 [Performance of Systems]: Reliability, availability, and serviceability

Citation Context

...s it is reported that storage resources in datacenters are often considerably under-utilized and use only a small fraction of the total available capacity (less than 25% according to several studies) [14, 19, 21]. In this paper, we present a novel replication strategy that achieves energy benefits while maintaining performance and fault tolerance. In particular, our fractional replication enables flexible gra...

Exploiting redundancies and deferred writes to conserve energy in erasure-coded storage clusters

by Jianzhong Huang, Fenghao Zhang, Xiao Qin - Trans. Storage, 2013
Abstract - Cited by 2 (0 self)
We present a power-efficient scheme for erasure-coded storage clusters (ECS2), which aims to offer high energy efficiency with marginal reliability degradation. ECS2 utilizes data redundancies and deferred writes to conserve energy. In ECS2, parity blocks are buffered exclusively in active data nodes whereas parity nodes are placed into low-power mode. (k+r, k) RS-coded ECS2 can achieve (r+1)/2-fault tolerance for k active data nodes and r-fault tolerance for all k+r nodes. ECS2 employs the following three optimizing approaches to improve the energy efficiency of storage clusters: (1) an adaptive threshold policy takes system configurations and I/O workloads into account to maximize standby time periods; (2) a selective activation policy minimizes the number of power transitions in storage nodes; and (3) a region-based buffer policy speeds up the synchronization process by migrating parity blocks in a batch method. After implementing an ECS2-based prototype in a Linux cluster, we evaluated its energy efficiency and performance using four different types of I/O workloads. The experimental results indicate that, compared to energy-oblivious erasure-coded storage, ECS2 can save the energy used by storage clusters by up to 29.8% and 28.0% in read-intensive and write-dominated workloads, respectively, when k = 6 and r = 3. The results also show that ECS2 accomplishes high power efficiency in both normal and failed cases without noticeably affecting the I/O performance of storage clusters.

Citation Context

...ficient than that at the disk level in storage clusters. Cluster-oriented energy management has been addressed to achieve high scalability and cost effectiveness of data storage [Lang and Patel 2010; Lang et al. 2010]. Thus, we focus on conserving the energy of erasure-coded storage clusters by managing the power of storage nodes rather than disks. In an erasure-coded storage cluster, redundant data nodes (parity...

A Primary Shift Protocol for Improving Availability in Replication Systems

by Almetwally M. Mostafa, Ahmed E. Youssef
Abstract
Primary Backup Replication (PBR) is the most common technique to achieve availability in distributed systems. However, primary failure remains a crucial problem that threatens availability. When the primary fails, backup nodes in the system have to elect a new primary node in order to maintain adequate system operation. During election, the system suffers from transaction loss, communication overhead due to the message exchange necessary to preserve data consistency, and a notable delay caused by the execution of Leader Election Algorithms (LEA). Primary failures can be unpredictable (i.e., unplanned), such as primary node crashes and network outages, or predictable (i.e., planned), such as the primary’s scheduled shutdown to perform routine maintenance or a software upgrade. Traditionally, PBR employs LEA to recover from both unplanned and planned outages. In this paper, we propose a novel protocol, called Primary Shift Replication (PSR), to avoid election during planned outages. PSR shifts the primary role from the current primary to another scheduled node (without election) when a planned outage is about to occur. The number of messages and the communication time required to shift the primary role to another node are much less than the number of messages and time required to perform leader election; therefore, PSR improves the system’s availability. Moreover, PSR guarantees no transaction loss during the shift mode and hence preserves data consistency.

Citation Context

...uire a high degree of consistency and availability of shared data objects. PBR approach requires one node (i.e., leader or primary) to act as an organizer for other replicas in the distributed system [1, 2, 12, 16, 17, 18]. The primary maintains object store consistency by controlling access to the shared objects and executes the transactions that clients submit at different replicas. In Chain Replication (CR) [16], th...

E2ARS: An Energy-Effective Adaptive Replication Strategy in Cloud Storage System

by Xindong You, Li Zhou, Jie Huang, Jinli Zhang, Congfeng Jiang, Jian Wan, 2013
Abstract
To address the urgent issue of energy consumption in cloud storage systems, this paper proposes an Energy-Effective Adaptive Replication Strategy (E2ARS), in which a data partition mechanism, a minimal-replica determining model, replica placement strategies, and an adaptive gear-shifting mechanism are elaborately designed. The E2ARS scheme tries to reduce energy consumption while satisfying the users’ desired response time. Mathematical analysis shows that the E2ARS scheme saves energy when the system’s workload is light or the desired response time is loose. The simulation results demonstrate that E2ARS saves energy while satisfying QoS and guaranteeing data availability as the arrival rate, desired response time, number of replicas, and degree of parallelism vary.

Rethinking Query Processing for Energy Efficiency: Slowing Down to Win the Race

by Willis Lang, Ramakrishnan K, Jignesh M. Patel
Abstract
The biggest change in the TPC benchmarks in over two decades is now well underway, namely the addition of an energy efficiency metric along with traditional performance metrics. This change is fueled by the growing, real, and urgent demand for energy-efficient database processing. Database query processing engines must now consider becoming energy-aware, else they risk missing many opportunities for significant energy savings. While other recent work has focused on solely optimizing for energy efficiency, we recognize that such methods are only practical if they also consider performance requirements specified in SLAs. The focus of this paper is on the design and evaluation of a general framework for query optimization that considers both performance constraints and energy consumption as first-class optimization criteria. Our method recognizes and exploits the evolution of modern computing hardware that allows hardware components to operate in different energy and performance states. Our optimization framework considers these states and uses an energy consumption model for database query operations. We have also built a model for an actual commercial DBMS. Using our model the query optimizer can pick query plans that meet traditional performance goals (e.g., specified by SLAs), but result in lower energy consumption. Our experimental evaluations show that our system-wide energy savings can be significant and point toward greater opportunities with upcoming energy-aware technologies on the horizon.

Citation Context

...re studied by [31, 32], as it can be applied to optimizers for parallel DBMSs as well. Such parallel DBMSs also have system settings that include cluster configuration as well as server configuration [20, 21]. Such considerations for parallel DBMSs is part of future work. To show the potential and validity of this approach, in Figure 1 we show an actual ERP for a single equijoin query on two Wisconsin Ben...
