Results 1 - 10 of 342
Object Replication Strategies in Content Distribution Networks
- Computer Communications, 2001
"... content distribution networks (CDNs). In this paper we study the problem of optimally replicating objects in CDN servers. In our model, each Internet Au- tonomous System (AS) is a node with finite storage ca- pacity for replicating objects. The optimization problem is to replicate objects so that wh ..."
Abstract
-
Cited by 174 (0 self)
- Add to MetaCart
(Show Context)
content distribution networks (CDNs). In this paper we study the problem of optimally replicating objects in CDN servers. In our model, each Internet Autonomous System (AS) is a node with finite storage capacity for replicating objects. The optimization problem is to replicate objects so that when clients fetch objects from the nearest CDN server with the requested object, the average number of ASs traversed is minimized. We formulate this problem as a combinatorial optimization problem. We show that this optimization problem is NP-complete. We develop four natural heuristics and compare them numerically using real Internet topology data. We find that the best results are obtained with heuristics that have all the CDN servers cooperating in making the replication decisions. We also develop a model for studying the benefits of cooperation between nodes, which provides insight into peer-to-peer content distribution.
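The cooperative heuristics this abstract favors can be pictured with a small greedy sketch: all servers jointly pick, one replica at a time, the (node, object) placement that most reduces the demand-weighted average AS-hop count. This is a minimal sketch under assumptions of my own (a hop-count matrix, per-node demand rates, per-node slot capacities); none of the names below come from the paper.

def avg_hops(dist, demand, placed, nodes, objects):
    # Demand-weighted hop count to the nearest replica of each object.
    return sum(demand[v][o] * min(dist[v][r] for r in placed[o])
               for v in nodes for o in objects)

def greedy_replicate(dist, demand, capacity, objects, origin):
    nodes = list(range(len(dist)))
    placed = {o: {origin[o]} for o in objects}  # the origin always holds o
    free = dict(capacity)                       # remaining replica slots per node
    while True:
        base = avg_hops(dist, demand, placed, nodes, objects)
        best, best_gain = None, 0.0
        for v in nodes:
            if free[v] == 0:
                continue
            for o in objects:
                if v in placed[o]:
                    continue
                placed[o].add(v)  # tentatively place, measure the gain, undo
                gain = base - avg_hops(dist, demand, placed, nodes, objects)
                placed[o].remove(v)
                if gain > best_gain:
                    best, best_gain = (v, o), gain
        if best is None:
            return placed         # no placement reduces average hops further
        v, o = best
        placed[o].add(v)
        free[v] -= 1

Because every candidate placement is scored against the global objective, this corresponds to the "all CDN servers cooperate" family of heuristics rather than each node caching greedily for itself.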
Greedy Facility Location Algorithms analyzed using Dual Fitting with Factor-Revealing LP
- Journal of the ACM, 2001
"... We present a natural greedy algorithm for the metric uncapacitated facility location problem and use the method of dual fitting to analyze its approximation ratio, which turns out to be 1.861. The running time of our algorithm is O(m log m), where m is the total number of edges in the underlying c ..."
Abstract
-
Cited by 148 (12 self)
- Add to MetaCart
(Show Context)
We present a natural greedy algorithm for the metric uncapacitated facility location problem and use the method of dual fitting to analyze its approximation ratio, which turns out to be 1.861. The running time of our algorithm is O(m log m), where m is the total number of edges in the underlying complete bipartite graph between cities and facilities. We use our algorithm to improve recent results for some variants of the problem, such as the fault tolerant and outlier versions. In addition, we introduce a new variant which can be seen as a special case of the concave cost version of this problem.
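The "natural greedy" in question can be stated compactly: repeatedly pick the cheapest star, a facility together with a set of still-unconnected cities, measured as cost per city. The sketch below is a simplified, unoptimized rendering under invented names; it follows the standard star-greedy formulation rather than the paper's O(m log m) implementation, charging each facility's opening cost only once.

def greedy_facility_location(open_cost, dist):
    # open_cost[i]: cost of opening facility i.
    # dist[i][j]: metric distance from facility i to city j.
    n_fac, n_city = len(open_cost), len(dist[0])
    unconnected = set(range(n_city))
    opened, assignment = set(), {}
    while unconnected:
        best_ratio, best_star = float("inf"), None
        for i in range(n_fac):
            # Prefixes of the cities sorted by distance to i are the only
            # candidate stars worth checking for this facility.
            order = sorted(unconnected, key=lambda j: dist[i][j])
            total = 0.0 if i in opened else open_cost[i]  # pay opening once
            for k, j in enumerate(order, start=1):
                total += dist[i][j]
                if total / k < best_ratio:
                    best_ratio, best_star = total / k, (i, order[:k])
        i, cities = best_star
        opened.add(i)
        for j in cities:
            assignment[j] = i
            unconnected.discard(j)
    return opened, assignment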
A new greedy approach for facility location problems
"... We present a simple and natural greedy algorithm for the metric uncapacitated facility location problem achieving an approximation guarantee of 1.61 whereas the best previously known was 1.73. Furthermore, we will show that our algorithm has a property which allows us to apply the technique of Lagra ..."
Abstract
-
Cited by 143 (9 self)
- Add to MetaCart
We present a simple and natural greedy algorithm for the metric uncapacitated facility location problem achieving an approximation guarantee of 1.61, whereas the best previously known was 1.73. Furthermore, we will show that our algorithm has a property which allows us to apply the technique of Lagrangian relaxation. Using this property, we can find better approximation algorithms for many variants of the facility location problem, such as the capacitated facility location problem with soft capacities and a common generalization of the k-median and facility location problems. We will also prove a lower bound on the approximability of the k-median problem.
Chain Replication for Supporting High Throughput and Availability
"... Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication pro ..."
Abstract
-
Cited by 108 (5 self)
- Add to MetaCart
Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication protocols themselves, simulation experiments explore the performance characteristics of a prototype implementation. Throughput, availability, and several object-placement strategies (including schemes based on distributed hash table routing) are discussed.
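The core of the protocol can be sketched in a few lines: clients send updates to the head of the chain, each server applies the update and forwards it to its successor, and the tail both acknowledges updates and serves all reads, so a read can only observe writes the entire chain has applied. This toy version ignores failure handling and chain reconfiguration, and the class and method names are invented.

class ChainNode:
    def __init__(self):
        self.store = {}       # replicated key/value state
        self.next = None      # successor in the chain (None at the tail)

    def apply(self, key, value):
        self.store[key] = value
        if self.next is not None:
            return self.next.apply(key, value)   # forward down the chain
        return "ack"                             # only the tail acknowledges

def make_chain(length):
    nodes = [ChainNode() for _ in range(length)]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    return nodes

chain = make_chain(3)
head, tail = chain[0], chain[-1]
assert head.apply("x", 42) == "ack"   # writes are sent to the head
assert tail.store["x"] == 42          # reads are served by the tail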
Insight and Perspective for Content Delivery Networks
- Communications of the ACM, 2006
"... Striking a balance between the costs for Web content providers and the quality of service for Web customers. More efficient content delivery over the Web has become an important element of improving Web performance. Content Delivery Networks (CDNs) have been proposed to maximize bandwidth, improve a ..."
Abstract
-
Cited by 90 (10 self)
- Add to MetaCart
(Show Context)
Striking a balance between the costs for Web content providers and the quality of service for Web customers. More efficient content delivery over the Web has become an important element of improving Web performance. Content Delivery Networks (CDNs) have been proposed to maximize bandwidth, improve accessibility, and maintain correctness through content replication [11]. With CDNs, content is distributed to cache servers located close to users, resulting in fast, reliable applications and Web services for the users. More specifically, CDNs maintain multiple Points of Presence (PoP) with clusters of (the so-called surrogate) servers that store copies of identical content, such that users’ requests are satisfied by the most appropriate site (see the figure here). Typically, a CDN topology involves:
• A set of surrogate servers (distributed around the world) that cache the origin servers’ content;
• Routers and network elements that deliver ...
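The "most appropriate site" selection above can be illustrated with a trivially small request-routing sketch; here appropriateness is approximated purely by measured user-to-PoP latency, and the PoP names and numbers are invented for the example.

pop_latency_ms = {            # hypothetical user-to-PoP measurements
    "pop-frankfurt": 18.0,
    "pop-virginia": 92.0,
    "pop-singapore": 210.0,
}

def route_request(latency_by_pop):
    # Direct the user to the surrogate cluster with the smallest latency.
    return min(latency_by_pop, key=latency_by_pop.get)

assert route_request(pop_latency_ms) == "pop-frankfurt"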
Dynamic Replica Placement for Scalable Content Delivery
- In Proceedings of IPTPS’02, 2002
"... In this paper, we propose the dissemination tree, a dynamic content distribution system built on top of a peer-to-peer location service. We present a replica placement protocol that builds the tree while meeting QoS and server capacity constraints. The number of replicas as well as the delay and ban ..."
Abstract
-
Cited by 88 (1 self)
- Add to MetaCart
(Show Context)
In this paper, we propose the dissemination tree, a dynamic content distribution system built on top of a peer-to-peer location service. We present a replica placement protocol that builds the tree while meeting QoS and server capacity constraints. The number of replicas as well as the delay and bandwidth consumption for update propagation are significantly reduced. Simulation results show that the dissemination tree has close to the optimal number of replicas, good load distribution, small delay and bandwidth penalties for update multicast compared with the ideal case: static replica placement on IP multicast.
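One way to picture QoS-constrained placement on a tree is the heuristic sketched below: serve each client from an ancestor replica if one already meets the delay bound; otherwise place a new replica at the client's highest ancestor still within the bound, so one replica covers as much of the subtree as possible. The tree encoding and all names are invented, and this deliberately simplifies the paper's protocol (only ancestors may serve a client here).

def path_to_root(parent, n):
    # n, parent(n), ..., root (the root's parent is None)
    path = [n]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def place_replicas(parent, edge_delay, clients, root, bound):
    replicas = {root}                   # the origin server sits at the root
    for c in clients:
        anc = path_to_root(parent, c)
        cum, d = {anc[0]: 0.0}, 0.0     # delay from c up to each ancestor
        for child, par in zip(anc, anc[1:]):
            d += edge_delay[(par, child)]
            cum[par] = d
        if any(r in cum and cum[r] <= bound for r in replicas):
            continue                    # an existing replica already meets QoS
        feasible = [a for a in anc if cum[a] <= bound]  # nonempty: cum[c] == 0
        replicas.add(feasible[-1])      # highest ancestor within the bound
    return replicas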
Replication for web hosting systems
- ACM Computing Surveys, 2004
"... Replication is a well-known technique to improve the accessibility of Web sites. It generally offers reduced client latencies and increases a site’s availability. However, applying replication techniques is not trivial, and various Content Delivery Networks (CDNs) have been created to facilitate rep ..."
Abstract
-
Cited by 60 (9 self)
- Add to MetaCart
Replication is a well-known technique to improve the accessibility of Web sites. It generally offers reduced client latencies and increases a site’s availability. However, applying replication techniques is not trivial, and various Content Delivery Networks (CDNs) have been created to facilitate replication for digital content providers. ...
Selfish Caching in Distributed Systems: A Game-Theoretic Analysis
- In Proc. ACM Symposium on Principles of Distributed Computing (ACM PODC), 2004
"... We analyze replication of resources by server nodes that act selfishly, using a game-theoretic approach. We refer to this as the selfish caching problem. In our model, nodes incur either cost for replicating resources or cost for access to a remote replica. We show the existence of pure strategy Nas ..."
Abstract
-
Cited by 57 (2 self)
- Add to MetaCart
(Show Context)
We analyze replication of resources by server nodes that act selfishly, using a game-theoretic approach. We refer to this as the selfish caching problem. In our model, nodes incur either cost for replicating resources or cost for access to a remote replica. We show the existence of pure strategy Nash equilibria and investigate the price of anarchy, which is the relative cost of the lack of coordination. The price of anarchy can be high due to undersupply problems, but with certain network topologies it has better bounds. With a payment scheme the game can always implement the social optimum in the best case by giving servers incentive to replicate.
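The game is easy to state in code: a node caches the resource exactly when its placement cost is below the cost of reaching the nearest remote replica, and a pure-strategy Nash equilibrium is a configuration where no node wants to flip. The best-response loop below is an invented illustration (with a round cap as a termination safeguard), not the paper's analysis.

def nearest_remote_cost(dist, v, caching):
    # Cost for v to access the closest replica held by some other node.
    return min((dist[v][u] for u in range(len(dist)) if caching[u] and u != v),
               default=float("inf"))

def best_response_dynamics(dist, place_cost, max_rounds=1000):
    n = len(dist)
    caching = [False] * n
    for _ in range(max_rounds):
        changed = False
        for v in range(n):
            want = place_cost[v] < nearest_remote_cost(dist, v, caching)
            if want != caching[v]:
                caching[v] = want       # v's best response to the others
                changed = True
        if not changed:
            break                       # no node wants to deviate: pure Nash
    return caching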
Antfarm: Efficient Content Distribution with Managed Swarms
"... This paper describes Antfarm, a content distribution system based on managed swarms. A managed swarm couples peer-to-peer data exchange with a coordinator that directs bandwidth allocation at each peer. Antfarm achieves high throughput by viewing content distribution as a global optimization problem ..."
Abstract
-
Cited by 53 (1 self)
- Add to MetaCart
(Show Context)
This paper describes Antfarm, a content distribution system based on managed swarms. A managed swarm couples peer-to-peer data exchange with a coordinator that directs bandwidth allocation at each peer. Antfarm achieves high throughput by viewing content distribution as a global optimization problem, where the goal is to minimize download latencies for participants subject to bandwidth constraints and swarm dynamics. The system is based on a wire protocol that enables the Antfarm coordinator to gather information on swarm dynamics, detect misbehaving hosts, and direct the peers’ allotment of upload bandwidth among multiple swarms. Antfarm’s coordinator grants autonomy and local optimization opportunities to participating nodes while guiding the swarms toward an efficient allocation of resources. Extensive simulations and a PlanetLab deployment show that the system can significantly outperform centralized distribution services as well as swarming systems such as BitTorrent.
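The coordinator's role can be caricatured with a tiny allocation sketch: hand out a peer's upload capacity one unit at a time to whichever of its swarms is currently most under-served. This invented greedy stands in for Antfarm's actual optimizer and wire protocol, which it does not attempt to model.

def allocate_upload(capacity_units, swarm_demand):
    # capacity_units: integer units of upload bandwidth to hand out.
    # swarm_demand[s]: remaining unserved demand of swarm s, in units.
    alloc = {s: 0 for s in swarm_demand}
    remaining = dict(swarm_demand)
    for _ in range(capacity_units):
        s = max(remaining, key=remaining.get)  # most under-served swarm
        if remaining[s] <= 0:
            break          # all demand met; leftover capacity stays idle
        alloc[s] += 1
        remaining[s] -= 1
    return alloc

# e.g. one peer splitting 10 units across three hypothetical swarms
print(allocate_upload(10, {"swarm-a": 6, "swarm-b": 3, "swarm-c": 1}))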
Choosing Replica Placement Heuristics for Wide-Area Systems
- In ICDCS ’04: Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS ’04), 2004
"... Data replication is used extensively in wide-area distributed systems to achieve low data-access latency. A large number of heuristics have been proposed to perform replica placement. Practical experience indicates that the choice of heuristic makes a big difference in terms of the cost of required ..."
Abstract
-
Cited by 49 (0 self)
- Add to MetaCart
(Show Context)
Data replication is used extensively in wide-area distributed systems to achieve low data-access latency. A large number of heuristics have been proposed to perform replica placement. Practical experience indicates that the choice of heuristic makes a big difference in terms of the cost of required infrastructure (e.g., storage capacity and network bandwidth), depending on system topology, workload and performance goals.