The Case for VM-based Cloudlets in Mobile Computing
"... Mobile computing is at a fork in the road. After two decades of sustained effort by many researchers, we have developed the core concepts, techniques and mechanisms to provide a solid foundation for this still fast-growing ..."
Abstract
-
Cited by 229 (23 self)
- Add to MetaCart
(Show Context)
Mobile computing is at a fork in the road. After two decades of sustained effort by many researchers, we have developed the core concepts, techniques and mechanisms to provide a solid foundation for this still fast-growing …
Experiences building PlanetLab
- In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006.
"... Abstract. This paper reports our experiences building PlanetLab over the last four years. It identifies the requirements that shaped PlanetLab, explains the design decisions that resulted from resolving conflicts among these requirements, and reports our experience implementing and supporting the sy ..."
Abstract
-
Cited by 90 (11 self)
- Add to MetaCart
(Show Context)
This paper reports our experiences building PlanetLab over the last four years. It identifies the requirements that shaped PlanetLab, explains the design decisions that resulted from resolving conflicts among these requirements, and reports our experience implementing and supporting the system. Due in large part to the nature of the “PlanetLab experiment,” the discussion focuses on synthesis rather than new techniques, on balancing system-wide considerations rather than improving performance along a single dimension, and on learning from feedback from a live system rather than from controlled experiments using synthetic workloads.
Scale and performance in the CoBlitz large-file distribution service
- In Proceedings of the 3rd USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI), 2006.
"... Scalable distribution of large files has been the area of much research and commercial interest in the past few years. In this paper, we describe the CoBlitz system, which efficiently distributes large files using a content distribution network (CDN) designed for HTTP. As a result, CoBlitz is able t ..."
Abstract
-
Cited by 62 (6 self)
- Add to MetaCart
(Show Context)
Scalable distribution of large files has been an area of much research and commercial interest in the past few years. In this paper, we describe the CoBlitz system, which efficiently distributes large files using a content distribution network (CDN) designed for HTTP. As a result, CoBlitz is able to serve large files without requiring any modifications to standard Web servers and clients, making it an interesting option for end users as well as for infrastructure services. Over the 18 months that CoBlitz and its partner service, CoDeploy, have been running on PlanetLab, we have had the opportunity to observe its algorithms in practice and to evolve its design. These changes stem not only from observations of its use, but also from a better understanding of its behavior in real-world conditions. This utilitarian approach has led us to better understand the effects of scale, peering policies, replication behavior, and congestion, giving us new insights into how to improve the system's performance. With these changes, CoBlitz is able to deliver in excess of 1 Gbps on PlanetLab and to outperform a range of systems, including research systems as well as the widely used BitTorrent.
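To make the chunk-level idea concrete, here is a minimal sketch: a large file is fetched as many small HTTP Range requests, each routed through a CDN proxy chosen by hashing the URL and chunk index, so that unmodified HTTP caches absorb the load. The proxy names, port, chunk size, and hashing scheme are illustrative assumptions, not details of the actual CoBlitz design.

```python
# Sketch of chunk-level large-file fetching over an HTTP CDN.
# All names and constants here are illustrative assumptions.
import hashlib
import urllib.request

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk (assumed)
CDN_PROXIES = ["proxy-a.example.org:3128", "proxy-b.example.org:3128"]  # assumed

def proxy_for_chunk(url: str, index: int) -> str:
    """Deterministically spread chunks across proxies so their caches share the load."""
    digest = hashlib.sha1(f"{url}#{index}".encode()).digest()
    return CDN_PROXIES[int.from_bytes(digest[:4], "big") % len(CDN_PROXIES)]

def fetch_chunk(url: str, index: int) -> bytes:
    """Fetch one chunk with a standard HTTP Range request through the chosen proxy."""
    start = index * CHUNK_SIZE
    req = urllib.request.Request(url)
    req.set_proxy(proxy_for_chunk(url, index), "http")
    req.add_header("Range", f"bytes={start}-{start + CHUNK_SIZE - 1}")
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def fetch_file(url: str, size: int) -> bytes:
    """Reassemble the file from its independently fetched (and cached) chunks."""
    n = (size + CHUNK_SIZE - 1) // CHUNK_SIZE
    return b"".join(fetch_chunk(url, i) for i in range(n))
```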
PRACTI replication
- In Proceedings of NSDI, 2006.
"... We present PRACTI, a new approach for large-scale replication. PRACTI systems can replicate or cache any subset of data on any node (Partial Replication), provide a broad range of consistency guarantees (Arbitrary Consistency), and permit any node to send information to any other node (Topology Inde ..."
Abstract
-
Cited by 61 (15 self)
- Add to MetaCart
(Show Context)
We present PRACTI, a new approach for large-scale replication. PRACTI systems can replicate or cache any subset of data on any node (Partial Replication), provide a broad range of consistency guarantees (Arbitrary Consistency), and permit any node to send information to any other node (Topology Independence). A PRACTI architecture yields two significant advantages. First, by providing all three PRACTI properties, it enables better trade-offs than existing mechanisms that support at most two of the three desirable properties. The PRACTI approach thus exposes new points in the design space for replication systems. Second, the flexibility of PRACTI protocols simplifies the design of replication systems by allowing a single architecture to subsume a broad range of existing systems and to reduce development costs for new ones. To illustrate both advantages, we use our PRACTI prototype to emulate existing server replication, client-server, and object replication systems and to implement novel policies that improve performance for mobile users, web edge servers, and grid computing by as much as an order of magnitude.
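The three PRACTI properties can be illustrated with a toy model: nodes log updates, exchange logs with any peer (Topology Independence), and keep bodies only for objects they subscribe to (Partial Replication), retaining invalidations for the rest. This is a hedged sketch under those assumptions; the class and method names are invented, and the real PRACTI prototype uses far richer logical clocks and consistency machinery.

```python
# Toy model of the three PRACTI properties. Names are invented for illustration.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Update:
    obj_id: str
    version: int               # real PRACTI uses richer logical clocks
    body: Optional[bytes]      # None means "invalidation only"

@dataclass
class Node:
    interest: set              # objects this node replicates (Partial Replication)
    store: dict = field(default_factory=dict)   # obj_id -> latest Update
    log: list = field(default_factory=list)     # everything seen, for forwarding

    def apply(self, upd: Update) -> None:
        cur = self.store.get(upd.obj_id)
        if cur is None or upd.version > cur.version:
            # Keep the body only for objects we subscribe to; otherwise keep
            # just the invalidation so ordering checks still work.
            body = upd.body if upd.obj_id in self.interest else None
            self.store[upd.obj_id] = Update(upd.obj_id, upd.version, body)
            self.log.append(upd)  # retained so it can be relayed to any peer

    def sync_to(self, other: "Node") -> None:
        # Topology Independence: any node may push its log to any other node.
        for upd in self.log:
            other.apply(upd)

# Example: an edge node learns about all objects but stores bodies for "a" only.
server = Node(interest={"a", "b"})
server.apply(Update("a", 1, b"alpha"))
server.apply(Update("b", 1, b"beta"))
edge = Node(interest={"a"})
server.sync_to(edge)
assert edge.store["a"].body == b"alpha" and edge.store["b"].body is None
```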
Exploiting Similarity for Multi-Source Downloads using File Handprints
- In Proceedings of the 4th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2007.
"... Many contemporary approaches for speeding up large file transfers attempt to download chunks of a data ob-ject from multiple sources. Systems such as BitTorrent quickly locate sources that have an exact copy of the de-sired object, but they are unable to use sources that serve similar but non-identi ..."
Abstract
-
Cited by 58 (9 self)
- Add to MetaCart
(Show Context)
Many contemporary approaches for speeding up large file transfers attempt to download chunks of a data object from multiple sources. Systems such as BitTorrent quickly locate sources that have an exact copy of the desired object, but they are unable to use sources that serve similar but non-identical objects. Other systems automatically exploit cross-file similarity by identifying sources for each chunk of the object. These systems, however, require a number of lookups proportional to the number of chunks in the object, and a mapping from each unique chunk in every identical and similar object to its corresponding sources. Thus, the lookups and mappings in such a system can be quite large, limiting its scalability. This paper presents a hybrid system that provides the …
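The paper's core device is a "handprint": a small deterministic sample of a file's chunk hashes that can be published and looked up at constant cost per file rather than per chunk. The sketch below assumes fixed-size chunking and takes the K smallest chunk hashes as the sample; both are simplifications of the paper's content-based chunking and tuned parameters.

```python
# Sketch of a file "handprint": a small deterministic sample of chunk hashes.
# Fixed-size chunking and the value of K are simplifying assumptions.
import hashlib

K = 8  # handprint size (assumed)

def chunk_hashes(data: bytes, chunk_size: int = 4096) -> list:
    return [hashlib.sha1(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]

def handprint(data: bytes) -> list:
    # The K numerically smallest chunk hashes: files sharing many chunks are
    # very likely to share handprint entries, so one constant-size lookup per
    # file (instead of one per chunk) suffices to find similar sources.
    return sorted(chunk_hashes(data))[:K]

def likely_similar(a: bytes, b: bytes) -> bool:
    # An overlapping handprint suggests the files share many chunks, so a
    # holder of one can serve as an extra download source for the other.
    return bool(set(handprint(a)) & set(handprint(b)))
```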
Antfarm: Efficient Content Distribution with Managed Swarms
"... This paper describes Antfarm, a content distribution system based on managed swarms. A managed swarm couples peer-to-peer data exchange with a coordinator that directs bandwidth allocation at each peer. Antfarm achieves high throughput by viewing content distribution as a global optimization problem ..."
Abstract
-
Cited by 55 (1 self)
- Add to MetaCart
(Show Context)
This paper describes Antfarm, a content distribution system based on managed swarms. A managed swarm couples peer-to-peer data exchange with a coordinator that directs bandwidth allocation at each peer. Antfarm achieves high throughput by viewing content distribution as a global optimization problem, where the goal is to minimize download latencies for participants subject to bandwidth constraints and swarm dynamics. The system is based on a wire protocol that enables the Antfarm coordinator to gather information on swarm dynamics, detect misbehaving hosts, and direct the peers’ allotment of upload bandwidth among multiple swarms. Antfarm’s coordinator grants autonomy and local optimization opportunities to participating nodes while guiding the swarms toward an efficient allocation of resources. Extensive simulations and a PlanetLab deployment show that the system can significantly outperform centralized distribution services as well as swarming systems such as BitTorrent.
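One way to picture the coordinator's optimization is a greedy allocator that repeatedly gives the next unit of bandwidth to the swarm with the highest marginal throughput gain. The response curves and unit-based budget below are toy assumptions; Antfarm derives real curves from measurements gathered over its wire protocol.

```python
# Toy greedy allocator for a coordinator splitting a bandwidth budget across
# swarms. response[s][k] = aggregate swarm throughput given k units of
# coordinator bandwidth; the curves are invented stand-ins for measurements.

def allocate(budget_units: int, response: dict) -> dict:
    alloc = {s: 0 for s in response}
    for _ in range(budget_units):
        def gain(s):
            k, curve = alloc[s], response[s]
            # Marginal throughput of giving swarm s one more unit.
            return curve[k + 1] - curve[k] if k + 1 < len(curve) else 0.0
        best = max(response, key=gain)  # greedy: highest marginal benefit wins
        alloc[best] += 1
    return alloc

# Swarm "a" saturates quickly; swarm "b" keeps benefiting from extra bandwidth.
curves = {"a": [0, 10, 12, 13, 13], "b": [0, 6, 12, 18, 24]}
print(allocate(4, curves))  # {'a': 1, 'b': 3}
```

For concave response curves, this greedy rule yields an optimal split, which is why marginal-benefit allocation is a natural fit for the global optimization the abstract describes.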
Fabric: A platform for secure distributed computation and storage
- In Proceedings of SOSP ’09, Big Sky, MT, 2009.
"... Abstract Fabric is a new system and language for building secure distributed information systems. It is a decentralized system that allows heterogeneous network nodes to securely share both information and computation resources despite mutual distrust. Its high-level programming language makes dist ..."
Abstract
-
Cited by 53 (12 self)
- Add to MetaCart
(Show Context)
Fabric is a new system and language for building secure distributed information systems. It is a decentralized system that allows heterogeneous network nodes to securely share both information and computation resources despite mutual distrust. Its high-level programming language makes distribution and persistence largely transparent to programmers. Fabric supports data-shipping and function-shipping styles of computation: both computation and information can move between nodes to meet security requirements or to improve performance. Fabric provides a rich, Java-like object model, but data resources are labeled with confidentiality and integrity policies that are enforced through a combination of compile-time and run-time mechanisms. Optimistic, nested transactions ensure consistency across all objects and nodes. A peer-to-peer dissemination layer helps to increase availability and to balance load. Results from applications built using Fabric suggest that Fabric has a clean, concise programming model, offers good performance, and enforces security.
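As a loose illustration of label-guarded data movement, the toy check below ships an object to a node only if that node is trusted to hold the object's confidentiality label. This is not Fabric's label model, which is a full decentralized information-flow system integrated into a Java-like language; the trust table and function names are invented for illustration.

```python
# Toy illustration only: data moves to a node just when the node is trusted
# to enforce the data's confidentiality label. Names are invented.
TRUST = {
    "alice-store": {"alice", "public"},   # labels each node may hold (assumed)
    "public-cache": {"public"},
}

def can_ship(obj_label: str, dest_node: str) -> bool:
    """Permit data- or function-shipping only to sufficiently trusted nodes."""
    return obj_label in TRUST.get(dest_node, set())

assert can_ship("public", "public-cache")
assert not can_ship("alice", "public-cache")  # secret data stays off untrusted nodes
```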
Extreme Binning: Scalable, parallel deduplication for chunk-based file backup
- In Proceedings of the 17th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2009.
"... Abstract—Data deduplication is an essential and critical component of backup systems. Essential, because it reduces storage space requirements, and critical, because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams w ..."
Abstract
-
Cited by 46 (2 self)
- Add to MetaCart
(Show Context)
Data deduplication is an essential and critical component of backup systems: essential because it reduces storage space requirements, and critical because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques rely on to provide reasonable throughput. We present Extreme Binning, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time. Due to this lack of locality, existing techniques perform poorly on these workloads. Extreme Binning exploits file similarity instead of locality, and makes only one disk access for chunk lookup per file, which gives reasonable throughput. Multi-node backup systems built with Extreme Binning scale gracefully with the amount of input data; more backup nodes can be added to boost throughput. Each file is allocated using a stateless routing algorithm to only one node, allowing for maximum parallelization, and each backup node is autonomous with no dependency across nodes, making data management tasks robust with low overhead.
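The one-disk-access property comes from binning by a representative chunk ID: a file's minimum chunk hash selects a single bin, and only that bin is consulted when deduplicating the file's chunks. Below is a minimal sketch, assuming fixed-size chunking (the paper uses content-defined chunks) and an in-memory bin table.

```python
# Sketch of Extreme Binning's per-file bin lookup. Fixed-size chunking and an
# in-memory bin table are simplifications of the paper's design.
import hashlib

bins = {}  # representative chunk ID -> set of chunk IDs already stored

def chunk_ids(data: bytes, chunk_size: int = 4096) -> list:
    return [hashlib.sha1(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]

def backup_file(data: bytes) -> int:
    """Deduplicate one file against a single bin; returns chunks newly stored."""
    if not data:
        return 0
    ids = chunk_ids(data)
    rep = min(ids)  # representative = minimum chunk hash; in a multi-node
                    # system, hashing rep would also pick the one backup node
                    # (stateless routing, no cross-node state)
    bin_ = bins.setdefault(rep, set())  # one bin load per file, not per chunk
    new = [cid for cid in ids if cid not in bin_]
    bin_.update(new)
    return len(new)
```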
Towards understanding modern web traffic (extended abstract)
- In Proceedings of ACM SIGMETRICS, 2011.
"... As the nature of Web traffic evolves over time, we must update our understanding of underlying nature of today’s Web, which is necessary to improve response time, understand caching effectiveness, and to design intermediary systems, such as firewalls, security analyzers, and reporting or management ..."
Abstract
-
Cited by 42 (4 self)
- Add to MetaCart
(Show Context)
As the nature of Web traffic evolves over time, we must update our understanding of the underlying nature of today’s Web; such an understanding is necessary to improve response time, to understand caching effectiveness, and to design intermediary systems such as firewalls, security analyzers, and reporting or management systems. In this paper, we analyze five years (2006-2010) of real Web traffic from a globally distributed proxy system, which captures the browsing behavior of over 70,000 daily users from 187 countries. Using this data set, we examine major changes in Web traffic characteristics during this period, and we also investigate the redundancy of this traffic using both traditional object-level caching and content-based approaches.
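The content-based redundancy analysis mentioned above can be sketched as chunk-level fingerprinting over response payloads: bytes whose chunk fingerprint has already been seen are bytes a content-aware cache could have avoided transferring. The fixed 4 KB chunking below is a simplifying assumption; such studies typically use content-defined chunking.

```python
# Sketch of measuring content-level redundancy in a trace of response bodies.
# Fixed-size 4 KB chunks are an assumed simplification.
import hashlib

def redundancy(payloads: list, chunk_size: int = 4096) -> float:
    """Fraction of trace bytes whose chunk was seen earlier in the trace."""
    seen, total, dup = set(), 0, 0
    for body in payloads:
        for i in range(0, len(body), chunk_size):
            chunk = body[i:i + chunk_size]
            total += len(chunk)
            fp = hashlib.sha1(chunk).digest()
            if fp in seen:
                dup += len(chunk)   # bytes a content-aware cache could save
            else:
                seen.add(fp)
    return dup / total if total else 0.0
```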
The Effectiveness of Deduplication on Virtual Machine Disk Images
"... Virtualization is becoming widely deployed in servers to efficiently provide many logically separate execution environments while reducing the need for physical servers. While this approach saves physical CPU resources, it still consumes large amounts of storage because each virtual machine (VM) ins ..."
Abstract
-
Cited by 40 (1 self)
- Add to MetaCart
(Show Context)
Virtualization is becoming widely deployed in servers to efficiently provide many logically separate execution environments while reducing the need for physical servers. While this approach saves physical CPU resources, it still consumes large amounts of storage because each virtual machine (VM) instance requires its own multi-gigabyte disk image. Moreover, existing systems do not support ad hoc block sharing between disk images, instead relying on techniques such as overlays to build multiple VMs from a single “base” image. Instead, we propose the use of deduplication both to reduce the total storage required for VM disk images and to increase the ability of VMs to share disk blocks. To test the effectiveness of deduplication, we conducted extensive evaluations on different sets of virtual machine disk images with different chunking strategies. Our experiments found that the amount of stored data grows very slowly after the first few virtual disk images if only the locale or software configuration is changed, with the compression rate suffering when different versions of an operating system or different operating systems are included. We also show that fixed-length chunks work well, achieving nearly the same compression rate as variable-length chunks. Finally, we show that simply identifying zero-filled blocks, even in ready-to-use virtual machine disk images available online, can provide significant savings in storage.
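The fixed-length strategy the paper evaluates is straightforward to sketch: split each image into fixed-size blocks, treat zero-filled blocks specially, and count unique fingerprints to estimate deduplicated storage. The chunk size and function names below are assumptions for illustration, not the paper's tooling.

```python
# Sketch of fixed-length VM-image deduplication with zero-block detection.
# CHUNK is an assumed block size.
import hashlib

CHUNK = 4096
ZERO = b"\x00" * CHUNK

def dedup_stats(images: list) -> tuple:
    """Returns (total_bytes, stored_bytes_after_dedup, zero_bytes_skipped)."""
    unique = set()
    total = stored = zero = 0
    for img in images:
        for i in range(0, len(img), CHUNK):
            block = img[i:i + CHUNK]
            total += len(block)
            if block == ZERO[:len(block)]:
                zero += len(block)   # zero-filled blocks need no storage at all
                continue
            fp = hashlib.sha1(block).digest()
            if fp not in unique:     # store each distinct block exactly once
                unique.add(fp)
                stored += len(block)
    return total, stored, zero
```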