CiteSeerX
Load rebalancing for distributed file systems in clouds (2013)

by H Hsiao, H Chung, H Shen, Y Chao
Results 1 - 5 of 5

Load Balancing in Multi-Cloud with Performance Analysis

by Akankhya Gogoi , Mr A M J Muthukumaran
"... ..."
Abstract - Add to MetaCart
Abstract not found
(Show Context)

Citation Context

...at a specific time. The term is generally used in the context of the Internet; the whole Internet can be viewed as a cloud. Cloud computing can cut capital and operational costs. Despite its enormous advantages, a single-cloud environment is not efficient for many-task computing (MTC), since it cannot handle the parallel processing of MTC applications well. A single-cloud architecture does not minimize the sharing of hardware, memory, resources, and I/O, nor does it optimize response time and cost. The main aim of an MTC application is to exploit high-performance computing characteristics in parallel processing [1]. A common way to mitigate risk is to employ redundant, independent systems; public-cloud users can achieve this by using two or more different cloud providers. Public clouds carry both technical and business risks. Technical risks such as outages and other service failures are often at the forefront of IT concerns, and working with multiple cloud providers helps mitigate the risk of service disruption in a local data center. Many-task computing (MTC) denotes various types of high-performance applications involving many different tasks, which require a large number of computational resources over short peri...

Enhancing Throughput of Hadoop Distributed File System for Interaction-Intensive Tasks

by Xiayu Hua , Hao Wu , Shangping Ren
"... Abstract-The performance of the Hadoop Distributed File System (HDFS)decreases dramatically when handling interactionintensive files, i.e., files that have relatively small size but are accessed frequently. The paper analyzes the cause of throughput degradation issue when accessing interaction-inte ..."
Abstract - Add to MetaCart
Abstract

The performance of the Hadoop Distributed File System (HDFS) decreases dramatically when handling interaction-intensive files, i.e., files that have relatively small size but are accessed frequently. The paper analyzes the cause of the throughput degradation when accessing interaction-intensive files and presents an enhanced HDFS architecture, along with an associated storage allocation algorithm, that overcomes the performance degradation. Experiments show that with the proposed architecture and allocation algorithm, HDFS throughput for interaction-intensive files increases by 300% on average, with only a negligible performance decrease for large-data-set tasks.

Citation Context

...S, i.e., the single namenode that supervises and manages every access to datanodes [2]. If the number of datanodes is large, the single namenode can quickly become a bottleneck under a high frequency of I/O requests. To overcome this issue for interaction-intensive tasks, efforts are often made in three directions: (a) improve the metadata structure or use a cache to provide faster I/O with less overhead [3], [4]; (b) extend the namenode into a hierarchical structure [5], [6] to avoid overloading a single namenode; and (c) design a better storage allocation algorithm to improve data accessibility [7], [8]. In this paper, we present an integrated approach to addressing the HDFS performance degradation issue for interaction-intensive tasks. In particular, we extend the HDFS architecture by adding cache support and transforming the single namenode into an extended hierarchical namenode architecture. Based on the extended architecture, we develop a Particle Swarm Optimization (PSO) based storage allocation algorithm to improve HDFS throughput for interaction-intensive tasks. II. EXTENDED HDFS NAMENODE STRUCTURE To overcome the bottleneck existing in the original HDFS architecture for in...
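The context above names a PSO-based storage allocation algorithm but gives no details. The following is only a generic sketch of how particle swarm optimization could drive block placement: a particle encodes one real value per block, decoded to a datanode index, and the fitness is per-node load variance. The encoding, objective, and all parameter values (swarm size, inertia, acceleration coefficients) are illustrative assumptions, not the authors' design.

```python
import random

def pso_allocate(block_loads, n_nodes, n_particles=20, iters=50, seed=0):
    """Toy PSO: assign each block to a datanode so that per-node access
    load is balanced. A particle holds one real number per block;
    taking int(x) modulo n_nodes decodes it to a node index."""
    rng = random.Random(seed)
    n = len(block_loads)

    def decode(pos):
        return [int(x) % n_nodes for x in pos]

    def fitness(pos):
        # variance of per-node load: lower means better balanced
        load = [0.0] * n_nodes
        for b, node in enumerate(decode(pos)):
            load[node] += block_loads[b]
        mean = sum(load) / n_nodes
        return sum((l - mean) ** 2 for l in load)

    # initialize swarm positions, velocities, and personal/global bests
    pos = [[rng.uniform(0, n_nodes) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                # standard velocity update: inertia + cognitive + social terms
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = (pos[i][d] + vel[i][d]) % n_nodes
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return decode(gbest), gbest_f
```

For example, spreading six blocks with loads 4, 4, 2, 2, 1, 1 across two nodes should end far below the variance of 98 that putting every block on one node would give.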

Evaluating Storage Systems for Scientific Data in the

by Ketan Maheshwari, Justin M. Wozniak, Hao Yang, Daniel S. Katz, Matei Ripeanu, Victor Zavala, Michael Wilde
"... Infrastructure-as-a-Service (IaaS) clouds are an appealing resource for scientific computing. However, the bare-bones presentation of raw Linux virtual machines leaves much to the application developer. For many cloud applications, ef-fective data handling is critical to efficient application exe-cu ..."
Abstract - Add to MetaCart
Infrastructure-as-a-Service (IaaS) clouds are an appealing resource for scientific computing. However, the bare-bones presentation of raw Linux virtual machines leaves much to the application developer. For many cloud applications, effective data handling is critical to efficient application execution. This paper investigates the capabilities of a variety of POSIX-accessible distributed storage systems to manage data access patterns resulting from workflow application executions in the cloud. We leverage the expressivity of the Swift parallel scripting framework to benchmark the performance of a number of storage systems using synthetic workloads and three real-world applications. We characterize two representative commercial storage systems (Amazon S3 and HDFS) and two emerging research-based storage systems (Chirp/Parrot and MosaStore). We find the use of aggregated node-local resources effective and economical compared with remotely located S3 storage. Our experiments show that applications run at scale with MosaStore show up to 30% improvement in makespan time compared with those run with S3. We also find that storage-system-driven application deployments in the cloud result in better runtime performance than an on-demand data-staging-driven approach.

Citation Context

...istinct data management techniques. Work described in [11] addresses the need for data-oriented services specific to cloud environments, such as content-specific access and security. Work described in [9] presents algorithms to augment load balancing among file blocks on distributed storage systems such as HDFS. Other projects have used clouds to extend high-end computing, for example a federated ...
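Reference [9] is the surveyed Hsiao et al. paper, whose actual algorithm rebalances load in a distributed fashion among the storage nodes themselves. Purely as a hedged illustration of the load-rebalancing idea, here is a centralized toy that migrates blocks from over-loaded to under-loaded nodes until each is as close to the average load as whole-block granularity allows. The function name and data layout are invented for this sketch.

```python
def rebalance(node_blocks, block_size):
    """Toy centralized rebalancer: move file blocks from nodes above
    the average load to nodes below it. Returns the new placement and
    the list of (block, source, destination) migrations performed."""
    nodes = {n: list(blocks) for n, blocks in node_blocks.items()}
    total = sum(len(b) for b in nodes.values()) * block_size
    avg = total / len(nodes)  # ideal per-node load in bytes
    heavy = [n for n in nodes if len(nodes[n]) * block_size > avg]
    light = [n for n in nodes if len(nodes[n]) * block_size < avg]
    moves = []
    for h in heavy:
        for l in light:
            # shift blocks until either endpoint reaches the average
            while (len(nodes[h]) * block_size > avg and
                   len(nodes[l]) * block_size < avg):
                blk = nodes[h].pop()
                nodes[l].append(blk)
                moves.append((blk, h, l))
    return nodes, moves
```

With four 64-byte blocks on node A and none on node B, two migrations leave both nodes holding two blocks.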

Research Article An Effective Cache Algorithm for Heterogeneous Storage Systems

by Yong Li, Dan Feng, Zhan Shi
"... Copyright © 2013 Yong Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Modern storage environment is commonly composed of he ..."
Abstract - Add to MetaCart
Copyright © 2013 Yong Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The modern storage environment is commonly composed of heterogeneous storage devices. However, traditional cache algorithms exhibit performance degradation in heterogeneous storage systems because they were not designed to work with diverse performance characteristics. In this paper, we present a new cache algorithm called HCM for heterogeneous storage systems. The HCM algorithm partitions the cache among the disks and adopts an effective scheme to balance the work across the disks. Furthermore, it applies benefit-cost analysis to choose the best allocation of cache blocks to improve performance. Conducting simulations with a variety of traces and a wide range of cache sizes, our experiments show that HCM significantly outperforms existing state-of-the-art storage-aware cache algorithms.
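The abstract describes HCM only at a high level: the cache is partitioned per disk and a benefit-cost analysis picks allocations. The sketch below is a plausible greedy reading of that idea, not the paper's algorithm: each cache block in turn goes to the disk with the highest marginal benefit, modeled here as a diminishing hit-rate gain weighted by the disk's miss latency. The `rate`/`latency` benefit model is an assumption made for this illustration.

```python
def partition_cache(total_blocks, disks):
    """Greedy benefit-cost cache partitioning across heterogeneous
    disks (in the spirit of HCM; the paper's exact model is not
    reproduced here). `disks` maps a name to {"rate": access rate,
    "latency": miss penalty}; slower, hotter disks attract more cache."""
    alloc = {name: 0 for name in disks}
    for _ in range(total_blocks):
        # marginal benefit of giving disk d its next cache block:
        # diminishing hit-rate gain weighted by the disk's miss latency
        best = max(disks, key=lambda d: disks[d]["rate"] * disks[d]["latency"]
                                        / (1 + alloc[d]))
        alloc[best] += 1
    return alloc
```

With equal access rates, a disk whose miss latency is 100x higher captures the whole cache budget under this model, which matches the intuition that caching should shield the slowest device first.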

Minimization of Cloud Task Execution Length with Workload Prediction Errors

by Sheng Di, Cho-li Wang
"... Abstract—In cloud systems, it is non-trivial to optimize task’s execution performance under user’s affordable budget, especially with possible workload prediction errors. Based on an optimal algorithm that can minimize cloud task’s execution length with predicted workload and budget, we theoreticall ..."
Abstract - Add to MetaCart
Abstract

In cloud systems, it is non-trivial to optimize a task's execution performance under the user's affordable budget, especially with possible workload prediction errors. Based on an optimal algorithm that can minimize a cloud task's execution length given the predicted workload and budget, we theoretically derive an upper bound on the task execution length that takes the possible workload prediction errors into account. With such a bound, the worst-case performance of a task execution with a given workload prediction error is predictable. We also build a close-to-practice cloud prototype over a real cluster environment deployed with 56 virtual machines and evaluate our solution under different resource-contention degrees. Experiments show that task execution lengths under our solution, with estimates of worst-case performance, are close to their theoretical ideal values, in both the non-competitive situation with adequate resources and the competitive situation with limited available resources. We also observe fair treatment in the resource allocation among all tasks.
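The paper's actual bound is not reproduced in the abstract. As a minimal illustration of why worst-case execution length is predictable under a bounded prediction error, the toy functions below assume the execution speed is fixed once the budget is spent, so a true workload at most (1 + ε) times the prediction yields an execution length at most (1 + ε) times the predicted one. Both function names are invented for this sketch.

```python
def exec_length(workload, speed):
    """Execution length of a task running at a fixed speed."""
    return workload / speed

def worst_case_bound(predicted_workload, speed, max_rel_error):
    """Illustrative worst-case bound (not the paper's derivation):
    with speed fixed and the true workload no more than
    (1 + max_rel_error) times the prediction, the execution length
    cannot exceed (1 + max_rel_error) times the predicted length."""
    return (1 + max_rel_error) * exec_length(predicted_workload, speed)
```

For instance, with a predicted workload of 100 units at speed 10 and a 20% error bound, any true workload up to 115 units finishes within the bound of 12 time units.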

Citation Context

...K Cloud resource allocation has been extensively studied for years; however, most existing work overlooks the practical issue of possibly erroneous workload predictions. Hsiao et al. [21] proposed a distributed load rebalancing method for distributed file systems in clouds. Unlike a file system, where data size is relatively easy to predict precisely, we have to deal with erroneous p...


Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University