Results 1 - 10 of 30
Timecard: Controlling User-Perceived Delays in Server-Based Mobile Applications
"... Providing consistent response times to users of mobile applications is challenging because there are several variable delays between the start of a user’s request and the completion of the response. These delays include location lookup, sensor data acquisition, radio wake-up, network transmissions, ..."
Abstract - Cited by 10 (1 self)
Providing consistent response times to users of mobile applications is challenging because there are several variable delays between the start of a user's request and the completion of the response. These delays include location lookup, sensor data acquisition, radio wake-up, network transmissions, and processing on both the client and server. To allow applications to achieve consistent response times in the face of these variable delays, this paper presents the design, implementation, and evaluation of the Timecard system. Timecard provides two abstractions: the first returns the time elapsed since the user started the request, and the second returns an estimate of the time it would take to transmit the response from the server to the client and process the response at the client. With these abstractions, the server can adapt its processing time to control the end-to-end delay for the request. Implementing these abstractions requires Timecard to track delays across multiple asynchronous activities, handle time skew between client and server, and estimate network transfer times. Experiments with Timecard incorporated into two mobile applications show that the end-to-end delay is within 50 ms of the target delay of 1200 ms over 90% of the time.
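A minimal sketch of how the two abstractions might look to server code. The class and method names (TimecardSketch, elapsed_time, remaining_time, work_budget) are invented for illustration, and Timecard's cross-device delay tracking and clock-skew handling are collapsed into a single start timestamp:

```python
import time

class TimecardSketch:
    """Illustrative sketch of the two abstractions (names assumed).

    A real implementation must track the request across asynchronous
    activities and correct for client/server clock skew; here a single
    start timestamp stands in for that machinery.
    """

    def __init__(self, request_start: float, predicted_downlink_s: float,
                 predicted_client_processing_s: float):
        self.request_start = request_start          # when the user tapped
        self.downlink = predicted_downlink_s        # est. server->client transfer
        self.client_proc = predicted_client_processing_s  # est. client-side work

    def elapsed_time(self) -> float:
        """Abstraction 1: time elapsed since the user started the request."""
        return time.time() - self.request_start

    def remaining_time(self) -> float:
        """Abstraction 2: estimated time to deliver and process the response."""
        return self.downlink + self.client_proc

    def work_budget(self, target_e2e_s: float) -> float:
        """Server-side time left if the end-to-end target is to be met."""
        return target_e2e_s - self.elapsed_time() - self.remaining_time()

# Example: adapt server processing to a 1200 ms end-to-end target.
tc = TimecardSketch(request_start=time.time() - 0.4,
                    predicted_downlink_s=0.25,
                    predicted_client_processing_s=0.1)
budget = tc.work_budget(target_e2e_s=1.2)
print(f"server may spend {budget * 1000:.0f} ms before responding")
```

The work_budget value is what drives the adaptation the abstract describes: the server trims its own processing (for example, search depth or result quality) to fit whatever time the budget leaves.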
The Case for Tiny Tasks in Compute Clusters
"... To see the world in a grain of sand... ..."
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks
"... Tachyon is a distributed file system enabling reliable data sharing at memory speed across cluster computing frame-works. While caching today improves read workloads, writes are either network or disk bound, as replication is used for fault-tolerance. Tachyon eliminates this bottleneck by pushing li ..."
Abstract - Cited by 7 (1 self)
Tachyon is a distributed file system enabling reliable data sharing at memory speed across cluster computing frameworks. While caching today improves read workloads, writes are either network or disk bound, as replication is used for fault tolerance. Tachyon eliminates this bottleneck by pushing lineage, a well-known technique, into the storage layer. The key challenge in making a long-running lineage-based storage system is timely data recovery in case of failures. Tachyon addresses this issue by introducing a checkpointing algorithm that guarantees bounded recovery cost, and resource allocation strategies for recomputation under commonly used resource schedulers. Our evaluation shows that Tachyon outperforms in-memory HDFS by 110x for writes. It also improves the end-to-end latency of a realistic workflow by 4x. Tachyon is open source and is deployed at multiple companies.
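The core trade can be made concrete with a toy sketch. All names below are assumptions for illustration; real Tachyon tracks lineage across cluster frameworks and bounds recomputation depth with its checkpointing algorithm, which this sketch omits:

```python
# Illustrative lineage-based storage (all names assumed).
# Files live in memory; lineage records how to recompute them.

store = {}     # filename -> bytes held in memory
lineage = {}   # filename -> (function, input filenames)

def write(name, fn, inputs):
    """Write by recording lineage, then materializing in memory.
    No synchronous replication: the recipe, not the data, is durable."""
    lineage[name] = (fn, inputs)
    store[name] = fn(*(read(i) for i in inputs))

def read(name):
    """Read from memory, recomputing from lineage after a loss."""
    if name not in store:                     # e.g., a node failure evicted it
        fn, inputs = lineage[name]
        store[name] = fn(*(read(i) for i in inputs))
    return store[name]

# Example: derived data survives loss of its in-memory copy.
store["raw"] = b"1,2,3"
lineage["raw"] = (lambda: b"1,2,3", ())
write("doubled", lambda raw: bytes(2 * b for b in raw), ("raw",))
del store["doubled"]                          # simulate a failure
assert read("doubled") == bytes(2 * b for b in b"1,2,3")
```

Because a write only records the recipe and keeps one in-memory copy, it avoids the replication traffic that makes writes network or disk bound; the cost is recomputation on failure, which is why bounding recovery cost matters.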
The Quantcast File System
"... The Quantcast File System (QFS) is an efficient alternative to the Hadoop Distributed File System (HDFS). QFS is written in C++, is plugin compatible with Hadoop MapReduce, and offers several efficiency improvements relative to HDFS: 50 % disk space savings through erasure coding instead of replicat ..."
Abstract - Cited by 6 (1 self)
The Quantcast File System (QFS) is an efficient alternative to the Hadoop Distributed File System (HDFS). QFS is written in C++, is plugin-compatible with Hadoop MapReduce, and offers several efficiency improvements relative to HDFS: 50% disk space savings through erasure coding instead of replication, a resulting doubling of write throughput, a faster name node, support for faster sorting and logging through a concurrent append feature, a native command-line client much faster than hadoop fs, and global feedback-directed I/O device management. As QFS works out of the box with Hadoop, migrating data from HDFS to QFS involves simply executing hadoop distcp. QFS is being developed fully open source and is available under an Apache license.
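The 50% space figure and the write-throughput doubling follow from simple arithmetic, assuming a Reed-Solomon 6+3 layout (six data stripes, three parity) against HDFS's default 3x replication; the sketch below just works the numbers:

```python
# Worked space-overhead comparison (Reed-Solomon 6+3 is an assumption
# about the erasure-coding layout; 3x replication is HDFS's default).
data_blocks, parity_blocks = 6, 3

replication_bytes_per_byte = 3.0   # three full copies of every byte
rs_bytes_per_byte = (data_blocks + parity_blocks) / data_blocks  # 1.5

savings = 1 - rs_bytes_per_byte / replication_bytes_per_byte
print(f"erasure coding stores {rs_bytes_per_byte:.1f} B per logical byte "
      f"vs {replication_bytes_per_byte:.1f} B, a {savings:.0%} saving")

# Writing 1.5 B instead of 3 B per logical byte also halves the physical
# write traffic, which is where the doubled write throughput comes from.
```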
hatS: A Heterogeneity-Aware Tiered Storage for Hadoop
"... Abstract—Hadoop has become the de-facto large-scale data processing framework for modern analytics applications. A major obstacle for sustaining high performance and scalability in Hadoop is managing the data growth while meeting the ever higher I/O demand. To this end, a promising trend in storage ..."
Abstract - Cited by 5 (2 self)
Hadoop has become the de facto large-scale data processing framework for modern analytics applications. A major obstacle to sustaining high performance and scalability in Hadoop is managing data growth while meeting ever-higher I/O demand. To this end, a promising trend in storage systems is to utilize hybrid and heterogeneous devices such as Solid State Disks (SSDs), ramdisks, and Network Attached Storage (NAS), which can help achieve very high I/O rates at acceptable cost. However, the Hadoop Distributed File System (HDFS) is unable to exploit such heterogeneous storage: HDFS works on the assumption that the underlying devices are homogeneous storage blocks, disregarding their individual I/O characteristics, which leads to performance degradation. In this paper, we present hatS, a Heterogeneity-Aware Tiered Storage, a novel redesign of HDFS into a multi-tiered storage system that seamlessly integrates heterogeneous storage technologies into the Hadoop ecosystem. hatS also provides data placement and retrieval policies that improve the utilization of the storage devices based on their characteristics, such as I/O throughput and capacity. We evaluate hatS using an actual implementation on a medium-sized cluster consisting of HDDs and two types of SSDs (SATA SSD and PCIe SSD). Experiments show that hatS achieves 32.6% higher read bandwidth, on average, than HDFS for the test Hadoop jobs (such as Grep and TestDFSIO) by directing 64% of the I/O accesses to the SSD tiers. We also evaluate our approach with trace-driven simulations using synthetic Facebook workloads, and show that compared to the standard setup, hatS improves the average I/O rate by 36%, which results in a 26% improvement in job completion time.
Keywords: Tiered storage; Hadoop Distributed File System (HDFS); data placement and retrieval policy.
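A minimal sketch of heterogeneity-aware placement in the spirit of the abstract; the tier table, throughput numbers, and hot/cold rule are illustrative assumptions, not hatS's actual policy:

```python
# Illustrative tier-aware block placement (names and numbers assumed).
# Each tier advertises its I/O throughput and remaining capacity.
tiers = [
    {"name": "pcie_ssd", "mb_per_s": 1500, "free_gb": 100},
    {"name": "sata_ssd", "mb_per_s": 500,  "free_gb": 400},
    {"name": "hdd",      "mb_per_s": 120,  "free_gb": 4000},
]

def place_block(size_gb: float, hot: bool) -> str:
    """Put hot blocks on the fastest tier with room; put cold blocks
    on the tier with the most spare capacity."""
    candidates = [t for t in tiers if t["free_gb"] >= size_gb]
    key = (lambda t: t["mb_per_s"]) if hot else (lambda t: t["free_gb"])
    chosen = max(candidates, key=key)
    chosen["free_gb"] -= size_gb
    return chosen["name"]

print(place_block(0.128, hot=True))    # -> pcie_ssd
print(place_block(0.128, hot=False))   # -> hdd
```

The point of such a policy is exactly the effect the evaluation reports: steering the hot fraction of accesses to SSD tiers raises aggregate read bandwidth without paying SSD prices for cold data.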
Exalt: Empowering Researchers to Evaluate Large-Scale Storage Systems
"... This paper presents Exalt, a library that gives back to researchers the ability to test the scalability of today’s large storage systems. To that end, we introduce Tar-dis, a data representation scheme that allows data to be identified and efficiently compressed even at low-level storage layers that ..."
Abstract - Cited by 2 (0 self)
This paper presents Exalt, a library that gives back to researchers the ability to test the scalability of today's large storage systems. To that end, we introduce Tardis, a data representation scheme that allows data to be identified and efficiently compressed even at low-level storage layers that are not aware of the semantics and formatting used by higher levels of the system. This compression enables a high degree of node colocation, which makes it possible to run large-scale experiments on as few as a hundred machines. Our experience with HDFS and HBase shows that, by allowing us to run the real system code at an unprecedented scale, Exalt can help identify scalability problems that are not observable at lower scales: in particular, Exalt helped us pinpoint and resolve issues in HDFS that improved its aggregate throughput by an order of magnitude.
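A toy illustration of the Tardis idea, with the record format entirely assumed: records carry a small self-identifying header and a filler payload, so a storage layer that knows nothing about the upper system's semantics can still identify and compress them, which is what permits dense node colocation:

```python
import zlib

# Illustrative Tardis-style records (format assumed): a short unique
# header identifies each record at any layer, and the payload is a
# compressible filler pattern, so even a format-oblivious storage
# layer shrinks it to almost nothing.
def make_record(record_id: int, payload_len: int) -> bytes:
    header = b"TARDIS:%08d:" % record_id    # identifiable at any layer
    return header + b"\x00" * payload_len   # filler compresses away

record = make_record(42, payload_len=64_000)
stored = zlib.compress(record)              # what a low-level layer keeps
print(len(record), "->", len(stored))       # e.g. 64016 -> under 100 bytes

# High node colocation follows: once stored compressed, the data of many
# emulated nodes fits in the memory and disk of one physical machine.
```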
IBIS: Interposed Big-data I/O Scheduler
"... As the needs of data-intensive computing continue to grow in various disciplines, it becomes increasingly common to use shared infrastructure to build big-data systems for such applications. Although computing resources (CPUs) are relatively easy to be partitioned among concurrent applications, stor ..."
Abstract - Cited by 1 (0 self)
As the needs of data-intensive computing continue to grow in various disciplines, it is increasingly common to use shared infrastructure to build big-data systems for such applications. Although computing resources (CPUs) are relatively easy to partition among concurrent applications, storage resources (I/O bandwidth), which are critical to the performance of data-intensive applications, are difficult to allocate in a big-data system. Existing big-data systems (e.g., Hadoop/MapReduce) do not expose management of shared storage I/O resources, and as a result an application's performance may degrade in unpredictable ways under I/O contention. This paper proposes IBIS, a new Interposed Big-data I/O Scheduler, to provide performance differentiation for competing applications' I/Os in a shared MapReduce-type big-data system. IBIS transparently intercepts the I/Os of competing big-data applications and isolates and schedules them on every data node via an I/O interposition layer. It then allows the distributed I/O schedulers to coordinate with one another on global bandwidth allocation for the entire big-data system in a scalable manner. The proposed approach is implemented in Hadoop by interposing HDFS as well as local/network I/Os and scheduling them with an SFQ-based proportional-sharing algorithm. Experiments with a variety of representative big-data applications show that IBIS imposes low overhead (<5% of application runtime) and provides strong performance isolation for an application (WordCount) even under heavy contention from a highly I/O-intensive application (TeraGen) (<3% slowdown in total runtime). Results also show that IBIS achieves better proportional sharing of global bandwidth among competing parallel applications (with up to an 18% performance increase for TeraSort), while the coordination supported by distributed IBIS schedulers effectively deals with the uneven distribution of local services in big-data systems.
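The abstract names an SFQ-based proportional-sharing algorithm; below is a minimal textbook start-time fair queueing sketch (not IBIS's implementation), showing how per-application weights translate into bandwidth shares:

```python
import heapq

# Minimal start-time fair queueing (SFQ): each request gets a start tag
# of max(virtual time, the flow's last finish tag) and a finish tag of
# start + cost/weight; requests dispatch in start-tag order, so service
# divides in proportion to the weights.
class SFQ:
    def __init__(self, weights):               # app -> weight
        self.weights = weights
        self.vtime = 0.0                        # last dispatched start tag
        self.finish = {app: 0.0 for app in weights}
        self.queue = []                         # (start_tag, seq, app, cost)
        self.seq = 0                            # tie-breaker for the heap

    def submit(self, app, cost):
        start = max(self.vtime, self.finish[app])
        self.finish[app] = start + cost / self.weights[app]
        heapq.heappush(self.queue, (start, self.seq, app, cost))
        self.seq += 1

    def dispatch(self):
        start, _, app, cost = heapq.heappop(self.queue)
        self.vtime = start
        return app, cost

sched = SFQ({"WordCount": 2, "TeraGen": 1})     # 2:1 bandwidth split
for _ in range(6):
    sched.submit("WordCount", cost=1)
    sched.submit("TeraGen", cost=1)
print([sched.dispatch()[0] for _ in range(12)])
# While both apps have requests queued, WordCount is dispatched about
# twice as often as TeraGen, matching its weight.
```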
Tachyon: Memory Throughput I/O for Cluster Computing Frameworks
"... As ever more big data computations start to be in-memory, I/O throughput dominates the running times of many workloads. For distributed storage, the read throughput can be improved using caching, however, the write throughput is limited by both disk and network bandwidth due to data replication for ..."
Abstract - Cited by 1 (1 self)
As ever more big-data computations start to run in memory, I/O throughput dominates the running times of many workloads. For distributed storage, read throughput can be improved with caching; the write throughput, however, is limited by both disk and network bandwidth due to data replication for fault tolerance. This paper proposes a new file system architecture that enables frameworks to both read and write reliably at memory speed by avoiding synchronous data replication on writes.
MetaSync: File Synchronization Across Multiple Untrusted Storage Services
"... Cloud-based file synchronization services, such as Drop-box and OneDrive, are a worldwide resource for many millions of users. However, individual services often have tight resource limits, varying performance in re-gions of the world, temporary outages or even shut-downs, and sometimes silently cor ..."
Abstract - Cited by 1 (0 self)
Cloud-based file synchronization services, such as Dropbox and OneDrive, are a worldwide resource for many millions of users. However, individual services often have tight resource limits, varying performance in regions of the world, temporary outages or even shutdowns, and sometimes silently corrupt or leak user data. We design, implement, and evaluate MetaSync, a secure and reliable file synchronization service that uses multiple cloud synchronization services as untrusted storage providers. To make MetaSync work correctly, we devise a novel variant of Paxos that provides linearizable updates on top of the unmodified APIs exported by existing services. Our system automatically redistributes files upon adding, removing, or resizing a provider with a novel deterministic replication scheme. Our evaluation shows that MetaSync provides low update latency and high update throughput, close to the performance of commercial services, but is more reliable and available. For synchronization, MetaSync outperforms its underlying cloud services by 1.2x-10x on three realistic workloads.
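One way to make a deterministic replication scheme concrete is rendezvous (highest-random-weight) hashing, sketched below as an assumption rather than MetaSync's actual algorithm: every client hashes file names against provider names and picks the top R scores, so all clients agree on placement without coordination, and adding or removing a provider only moves the files whose top-R set changed:

```python
import hashlib

# Illustrative deterministic replication via rendezvous hashing
# (scheme assumed, not MetaSync's actual algorithm).
def placement(filename: str, providers: list[str], replicas: int = 2):
    """Return the providers that should hold this file's replicas."""
    def score(provider: str) -> str:
        return hashlib.sha256(f"{provider}/{filename}".encode()).hexdigest()
    # Every client computes the same ranking, hence the same placement.
    return sorted(providers, key=score, reverse=True)[:replicas]

providers = ["dropbox", "onedrive", "gdrive", "box"]
print(placement("notes.txt", providers))
print(placement("notes.txt", providers + ["extra"]))  # most files stay put
```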
Ph.D. dissertation, 2014
"... The mobile application ("app") ecosystem has grown at a tremendous pace with millions of apps and hundreds of thousands of app developers. Mobile apps run across a wide range of network, hardware, location, and usage conditions that are hard for developers to emulate or even anticipate dur ..."
Abstract
The mobile application ("app") ecosystem has grown at a tremendous pace, with millions of apps and hundreds of thousands of app developers. Mobile apps run across a wide range of network, hardware, location, and usage conditions that are hard for developers to emulate or even anticipate during lab testing. Hence, app failures and performance problems are common in the wild. Scarce resources, a shift away from familiar synchronous programming models, and poor development support have made it more difficult for app developers to overcome these problems. This dissertation focuses on systems that make it significantly easier for app developers to diagnose and improve their mobile apps. To reduce user annoyance and survive the brutally competitive mobile app marketplace, developers need systems that (i) identify potential failures before the app is released, (ii) diagnose problems after the app is deployed in the wild, and (iii) provide reliable app performance in the face of varying conditions in the wild. This dissertation presents systems that satisfy these needs. VanarSena makes it easy to diagnose common failures in mobile apps before deployment, AppInsight makes it easy to monitor mobile apps after deployment,