Results 1 - 10
of
30
A Scalable Distributed Information Management System
- In Proc SIGCOMM
, 2003
"... We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary ..."
Abstract
-
Cited by 127 (18 self)
- Add to MetaCart
We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary views of global information. To serve as a basic building block, a SDIMS should have four properties: scalability to many nodes and attributes, flexibility to accommodate a broad range of applications, administrative isolation for security and availability, and robustness to node and network failures. We design, implement and evaluate a SDIMS that (1) leverages Distributed Hash Tables (DHT) to create scalable aggregation trees, (2) provides flexibility through a simple API that lets applications control propagation of reads and writes, (3) provides administrative isolation through simple extensions to current DHT algorithms, and (4) achieves robustness to node and network reconfigurations through lazy reaggregation, on-demand reaggregation, and tunable spatial replication. Through extensive simulations and micro-benchmark experiments, we observe that our system is an order of magnitude more scalable than existing approaches, achieves isolation properties at the cost of modestly increased read latency in comparison to flat DHTs, and gracefully handles failures.
End-to-end WAN Service Availability
- In Proc. 3rd USITS
, 2001
"... This study seeks to understand how network failures affect the availability of service delivery across wide area networks and to evaluate classes of techniques for improving end-to-end service availability. Using several large-scale connectivity traces, we develop a model of network unavailability t ..."
Abstract
-
Cited by 96 (14 self)
- Add to MetaCart
This study seeks to understand how network failures affect the availability of service delivery across wide area networks and to evaluate classes of techniques for improving end-to-end service availability. Using several large-scale connectivity traces, we develop a model of network unavailability that includes key parameters such as failure location and failure duration. We then use trace-based simulation to evaluate several classes of techniques for coping with network unavailability. We find that caching alone is seldom effective at insulating services from failures but that the combination of mobile extension code and prefetching can improve average unavailability by as much as an order of magnitude for classes of service whose semantics support disconnected operation. We find that routing-based techniques may provide significant improvements, but that the improvements of many individual techniques are limited because they do not address all significant categories of network failures. By combining the techniques we examine, some systems may be able to reduce average unavailability by as much as one or two orders of magnitude.
Model-Based Resource Provisioning in a Web Service Utility
- In Proceedings of the Fourth USENIX Symposium on Internet Technologies and Systems (USITS
, 2003
"... Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. ..."
Abstract
-
Cited by 91 (7 self)
- Add to MetaCart
Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
TCP Nice: A Mechanism for Background Transfers
, 2002
"... background transfers transfers of data that humans are not waiting for to improve availability, reliability, latency or consistency. However, given the rapid fluctuations of available network bandwidth and changing resource costs due to technology trends, hand tuning the aggressiveness of background ..."
Abstract
-
Cited by 77 (12 self)
- Add to MetaCart
background transfers transfers of data that humans are not waiting for to improve availability, reliability, latency or consistency. However, given the rapid fluctuations of available network bandwidth and changing resource costs due to technology trends, hand tuning the aggressiveness of background transfers risks (1) complicating applications, (2) being too aggressive and interfering with other applications, and (3) being too timid and not gaining the benefits of background transfers. Our goal is for the operating system to manage network resources in order to provide a simple abstraction of near zero-cost background transfers. Our system, TCP Nice, can provably bound the interference inflicted by background flows on foreground flows in a restricted network model. And our microbenchmarks and case study applications suggest that in practice it interferes little with foreground flows, reaps a large fraction of spare network bandwidth, and simplifies application construction and deployment. For example, in our prefetching case study application, aggressive prefetching improves demand performance by a factor of three when Nice manages resources; but the same prefetching hurts demand performance by a factor of six under standard network congestion control.
Insight and Perspective for Content Delivery Networks
- in Communications of the ACM
, 2006
"... Striking a balance between the costs for Web content providers and the quality of service for Web customers. More efficient content delivery over the Web has become an important element of improving Web performance. Content Delivery Networks (CDNs) have been proposed to maximize bandwidth, improve a ..."
Abstract
-
Cited by 40 (7 self)
- Add to MetaCart
Striking a balance between the costs for Web content providers and the quality of service for Web customers. More efficient content delivery over the Web has become an important element of improving Web performance. Content Delivery Networks (CDNs) have been proposed to maximize bandwidth, improve accessibility, and maintain correctness through content replication [11]. With CDNs, content is distributed to cache servers located close to users, resulting in fast, reliable applications and Web services for the users. More specifically, CDNs maintain multiple Points of Presence (PoP) with clusters of (the so-called surrogate) servers that store copies of identical content, such that users ’ requests are satisfied by the most appropriate site (see the figure here). Typically, a CDN topology involves: • A set of surrogate servers (distributed around the world) that cache the origin servers ’ content; • Routers and network elements that deliver
Efficient and Adaptive Web Replication using Content Clustering
- IEEE Journal on Selected Areas in Communications
, 2003
"... Recently there has been an increasing deployment of content distribution networks (CDNs) that offer hosting services to Web content providers. In this paper, we first compare the uncooperative pulling of Web contents used by commercial CDNs with the cooperative pushing. Our results show that the lat ..."
Abstract
-
Cited by 30 (3 self)
- Add to MetaCart
Recently there has been an increasing deployment of content distribution networks (CDNs) that offer hosting services to Web content providers. In this paper, we first compare the uncooperative pulling of Web contents used by commercial CDNs with the cooperative pushing. Our results show that the latter can achieve comparable users' perceived performance with only 4 - 5% of replication and update traffic compared to the former scheme. Therefore we explore how to efficiently push content to CDN nodes. Using trace-driven simulation, we show that replicating content in units of URLs can yield 60 - 70% reduction in clients' latency, compared to replicating in units of Web sites. However, it is very expensive to perform such a fine-grained replication.
Transparent information dissemination
- In Proc. Middleware
, 2004
"... Abstract. This paper describes Transparent Replication through Invalidation and Prefetching (TRIP), a self tuning data replication middleware system that enables transparent replication of large-scale information dissemination services. The TRIP middleware is a key building block for constructing in ..."
Abstract
-
Cited by 21 (11 self)
- Add to MetaCart
Abstract. This paper describes Transparent Replication through Invalidation and Prefetching (TRIP), a self tuning data replication middleware system that enables transparent replication of large-scale information dissemination services. The TRIP middleware is a key building block for constructing information dissemination services, a class of services where updates occur at an origin server and reads occur at a number of replicas; examples information dissemination services include content distribution networks such as Akamai [1] and IBM’s Sport and Event replication system [2]. Furthermore, the TRIP middleware can be used to build key parts of general applications that distribute content such as file systems, distributed databases, and publish-subscribe systems. Our data replication middleware supports transparent replication by providing two crucial properties: (1) sequential consistency to avoid introducing anomalous behavior to increasingly complex services and (2) selftuning transmission of updates to maximize performance and availability given available system resources. Our analysis of simulations and our evaluation of a prototype support the hypothesis that it is feasible to provide transparent replication for dissemination services. For example, in simulations, our system’s performance is a factor of three to four faster than a demand-based middleware system for a wide range of configurations. 1
Cooperative Leases: Scalable Consistency Maintenance in Content Distribution Networks
, 2002
"... In this paper, we argue that cache consistency mechanisms designed for stand-alone proxies do not scale to the large number of proxies in a content distribution network and are not flexible enough to allow consistency guarantees to be tailored to object needs. To meet the twin challenges of scalabil ..."
Abstract
-
Cited by 21 (6 self)
- Add to MetaCart
In this paper, we argue that cache consistency mechanisms designed for stand-alone proxies do not scale to the large number of proxies in a content distribution network and are not flexible enough to allow consistency guarantees to be tailored to object needs. To meet the twin challenges of scalability and flexibility, we introduce the notion of cooperative consistency along with a mechanism, called cooperative leases, to achieve it. By supporting -consistency semantics and by using a single lease for multiple proxies, cooperative leases allows the notion of leases to be applied in a flexible, scalable manner to CDNs. Further, the approach employs application-level multicast to propagate server notifications to proxies in a scalable manner. We implement our approach in the Apache web server and the Squid proxy cache and demonstrate its efficacy using a detailed experimental evaluation. Our results show a factor of 2.5 reduction in server message overhead and a 20% reduction in server state space overhead when compared to original leases albeit at an increased inter-proxy communication overhead.
Scalable Consistency Maintenance in Content Distribution Networks Using Cooperative Leases
- IEEE Transactions on Knowledge and Data Engineering
, 2001
"... In this paper, we argue that cache consistency mechanisms designed for stand-alone proxies do not scale to the large number of proxies in a content distribution network and are not flexible enough to allow consistency guarantees to be tailored to object needs. To meet the twin challenges of scala ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
In this paper, we argue that cache consistency mechanisms designed for stand-alone proxies do not scale to the large number of proxies in a content distribution network and are not flexible enough to allow consistency guarantees to be tailored to object needs. To meet the twin challenges of scalability and flexibility, we introduce the notion of cooperative consistency along with a mechanism, called cooperative leases, to achieve it. By supporting #-consistency semantics and by using a single lease for multiple proxies, cooperative leases allows the notion of leases to be applied in a flexible, scalable manner to CDNs. Further, the approach employs application-level multicast to propagate server notifications to proxies in a scalable manner. We implement our approach in the Apache web server and the Squid proxy cache and demonstrate its efficacy using a detailed experimental evaluation. Our results show a factor of 2.5 reduction in server message overhead and a 20% reduction in server state space overhead when compared to original leases albeit at an increased inter-proxy communication overhead.
PRACTI Replication for Large-Scale Systems
, 2004
"... Many replication mechanisms for large scale distributed systems exist, but they require a designer to compromise a system's replication policy (e.g., by requiring full replication of all data to all nodes), consistency policy (e.g., by supporting per-object coherence but not multiobject consistency) ..."
Abstract
-
Cited by 10 (7 self)
- Add to MetaCart
Many replication mechanisms for large scale distributed systems exist, but they require a designer to compromise a system's replication policy (e.g., by requiring full replication of all data to all nodes), consistency policy (e.g., by supporting per-object coherence but not multiobject consistency), or topology policy (e.g., by assuming a hierarchical organization of nodes.) In this paper, we present the first PRACTI (Partial Replication, Arbitrary Consistency, and Topology Independence) mechanisms for replication in large scale systems. These new mechanisms allow construction of systems that replicate or cache any data on any node, that provide a broad range of consistency and coherence guarantees, and that permit any node to communicate with any other node at any time. Our evaluation of a prototype suggests that by disentangling mechanism from policy, PRACTI replication enables better trade-offs for system designers than possible with existing mechanisms. For example, for one workload we study, PRACTI's partial replication reduces bandwidth requirements by over an order of magnitude compared to full replication for nodes that only care about a subset of the system's data.

