Overcast: Reliable Multicasting with an Overlay Network, 2000
Cited by 435 (10 self)
Overcast is an application-level multicasting system that can be incrementally deployed using today's Internet infrastructure. These properties stem from Overcast's implementation as an overlay network. An overlay network consists of a collection of nodes placed at strategic locations in an existing network fabric. These nodes implement a network abstraction on top of the network provided by the underlying substrate network.
Enhancing the Web's Infrastructure -- From Caching to Replication - IEEE Internet Computing, 1997
Selection Algorithms for Replicated Web Servers - Performance Evaluation Review, 1998
Cited by 65 (4 self)
Replication of documents on geographically distributed servers can improve both performance and reliability of the Web service. Server selection algorithms allow Web clients to select one of the replicated servers which is "close" to them and thereby minimize the response time of the Web service. Using client proxy server traces, we compare the effectiveness of several "proximity" metrics including the number of hops between the client and server, the ping round trip time and the HTTP request latency. Based on this analysis, we design two new algorithms for selection of replicated servers and compare their performance against other existing algorithms. We show that the new server selection algorithms improve on the performance of other existing algorithms by 55% on average. In addition, the new algorithms improve on the performance of existing nonreplicated Web servers by 69% on average.
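As a rough illustration of selecting a replica by one proximity metric, the minimal Python sketch below picks the server with the lowest median round-trip time from a set of probe samples. It is not one of the two algorithms the abstract refers to; the function name `select_replica`, the server names, and the sample values are all hypothetical.

```python
from statistics import median


def select_replica(rtt_samples_ms):
    """Return the replica whose median probe RTT is lowest.

    rtt_samples_ms maps a replica name to a list of RTT samples (ms).
    Replicas with no samples are skipped. Hypothetical helper, for
    illustration only.
    """
    candidates = {name: median(s) for name, s in rtt_samples_ms.items() if s}
    if not candidates:
        raise ValueError("no replica has any RTT samples")
    # The key function looks up each candidate's median RTT.
    return min(candidates, key=candidates.get)


# Made-up probe data: replica-b has one slow outlier but the best median.
samples = {
    "replica-a.example": [120.0, 95.0, 110.0],
    "replica-b.example": [40.0, 300.0, 45.0],
    "replica-c.example": [],
}
best = select_replica(samples)
```

Using the median rather than the mean is a design choice in this sketch: it keeps a single slow probe from disqualifying an otherwise close replica.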
The Architectural Design of Globe: A Wide-Area Distributed System, 1997
Cited by 62 (7 self)
Developing large-scale wide-area applications requires an infrastructure that is presently lacking entirely. Currently, applications have to be built on top of raw communication services, such as TCP connections. All additional services, including those for naming, replication, migration, persistence, fault tolerance, and security, have to be implemented for each application anew. Not only is this a waste of effort, it also makes interoperability between different applications difficult or even impossible. We present a novel, object-based framework for developing wide-area distributed applications. The framework is based on the concept of a distributed shared object, which has the characteristic feature that its state can be physically distributed across multiple machines at the same time. All implementation aspects, including communication protocols, replication strategies, and distribution and migration of state, are part of an object and are hidden behind its interface. The curren...
Semantic cache mechanism for heterogeneous Web querying, 1999
Cited by 24 (2 self)
In Web-based searching systems that access distributed information providers, efficient query processing requires an advanced caching mechanism to reduce the query response time. Keyword-based querying is often the only way to retrieve data from Web providers, and therefore standard page-based and tuple-based caching mechanisms are poorly suited to the task. In this work, we develop a mechanism for efficient caching of Web queries and the answers received from heterogeneous Web providers. We also report results of experiments and show how the caching mechanism is implemented in the Knowledge Broker system. Published by Elsevier Science B.V. All rights reserved.
Improving the WWW: Caching or Multicast - Computer Networks and ISDN Systems, 1998
Cited by 22 (1 self)
We consider two schemes for the distribution of Web documents. In the first scheme the sender repeatedly transmits the Web document into a multicast address, and receivers asynchronously join the corresponding multicast tree to receive a copy. In the second scheme, the document is distributed to the receivers through a hierarchy of Web caches. We develop analytical models for both schemes, and use the models to compare the two schemes in terms of latency and bandwidth usage. We find that except for documents that change very frequently, hierarchical caching gives lower latency and uses less bandwidth than multicast. For rapidly changing documents, multicast distribution reduces latency, saves network bandwidth, and reduces the load on the origin server. Furthermore, if a document is updated randomly rather than periodically, the relative performance of the multicast scheme (CMP) improves. Therefore, the best overall performance is achieved when the Internet implements both solutions, hierarchical caching and multicast.
Mitigating Server-Side Congestion in the Internet Through Pseudoserving, 1999
Cited by 18 (0 self)
Server-side congestion arises when a large number of users wish to retrieve files from a server over a short period of time. Under such conditions, users are in a unique position to benefit enormously by sharing retrieved files. Pseudoserving, a new paradigm for Internet access, provides incentives for users to contribute to the speedy dissemination of server files through a contract set by a “superserver.” Under this contract, the superserver grants a user a referral to where a copy of the requested file may be retrieved in exchange for the user’s assurance to serve other users for a specified period of time. Simulations that consider only network congestion occurring near the server show that: 1) pseudoserving is effective because it self-scales to handle very high request rates; 2) pseudoserving is feasible because a user who participates as a pseudoserver benefits enormously in return for a relatively small contribution of the user’s resources; 3) pseudoserving is robust under realistic user behavior because it can tolerate a large percentage of contract breaches; and 4) pseudoserving can exploit locality to reduce usage of network resources. Experiments performed on a local area network, which account for the processing of additional layers of protocols and the finite processing and storage capacities of the server and the clients, corroborate the simulation results. They also demonstrate the benefits of exploiting network locality in reducing download times and network traffic while making referrals to a pseudoserver. Limitations of pseudoserving and potential solutions to them are also discussed in this paper.
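The referral contract described above can be pictured with a toy model. The Python sketch below is only an illustration under simplifying assumptions: the class `Superserver`, its `request` method, and the rule of referring to the most recently registered pseudoserver are hypothetical, and a requester is treated as able to serve the file as soon as its contract is recorded (download time is ignored).

```python
import time


class Superserver:
    """Toy model of the referral contract; all names are hypothetical."""

    def __init__(self, origin="origin"):
        self.origin = origin
        # file name -> list of (pseudoserver, contract expiry time)
        self.pseudoservers = {}

    def request(self, user, filename, serve_seconds, now=None):
        """Grant a referral in exchange for a promise to serve the file."""
        now = time.time() if now is None else now
        # Keep only pseudoservers whose contract is still active.
        active = [
            (u, expiry)
            for u, expiry in self.pseudoservers.get(filename, [])
            if expiry > now
        ]
        # Refer to the most recently registered active pseudoserver,
        # falling back to the origin when none is available.
        source = active[-1][0] if active else self.origin
        # The requester becomes a pseudoserver for the agreed period.
        active.append((user, now + serve_seconds))
        self.pseudoservers[filename] = active
        return source


ss = Superserver()
first = ss.request("alice", "file.iso", serve_seconds=60, now=0)
second = ss.request("bob", "file.iso", serve_seconds=60, now=10)
third = ss.request("carol", "file.iso", serve_seconds=60, now=100)
```

In this run the first request is referred to the origin, the second to alice, and the third back to the origin because both earlier contracts have expired by then.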
Partial Collection Replication versus Caching for Information Retrieval Systems - In the ACM International Conference on Research and Development in Information Retrieval, 2000
Cited by 16 (0 self)
The explosion of content in distributed information retrieval (IR) systems requires new mechanisms to attain timely and accurate retrieval of unstructured text. In this paper, we compare two mechanisms to improve IR system performance: partial collection replication and caching. When queries have locality, both mechanisms return results more quickly than sending queries to the original collection(s). Caches return results when queries exactly match a previous one. Partial replicas are a form of caching that return results when the IR technology determines the query is a good match. Caches are simpler and faster, but replicas can increase locality by detecting similarity between queries that are not exactly the same. We use real traces from THOMAS and Excite to measure query locality and similarity. With a very restrictive definition of query similarity, similarity improves query locality up to 15% over exact match. We use a validated simulator to compare the performance of the two mechanisms, and find that even if the partial replica hit rate increases only 3 to 6%, it will outperform simple caching under a variety of configurations. A combined approach will probably yield the best performance.
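The difference between exact-match and similarity-based hits can be shown with a small Python sketch. It is a stand-in rather than the paper's method: term-set Jaccard overlap replaces the replica's relevance judgment, stored results are returned instead of searching a replicated subset of the collection, and the names `ExactCache`, `SimilarityStore`, and the 0.5 threshold are assumptions made for illustration.

```python
def terms(query):
    """Lower-cased set of whitespace-separated query terms."""
    return frozenset(query.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two term sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


class ExactCache:
    """Hits only when the query string matches a previous one exactly."""

    def __init__(self):
        self.store = {}

    def put(self, query, results):
        self.store[query] = results

    def get(self, query):
        return self.store.get(query)


class SimilarityStore:
    """Hits when a stored query is similar enough (hypothetical rule)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.entries = []  # list of (term set, results)

    def put(self, query, results):
        self.entries.append((terms(query), results))

    def get(self, query):
        q = terms(query)
        best = max(self.entries, key=lambda e: jaccard(q, e[0]), default=None)
        if best is not None and jaccard(q, best[0]) >= self.threshold:
            return best[1]
        return None


cache, store = ExactCache(), SimilarityStore()
for target in (cache, store):
    target.put("health care reform bill", ["doc-17", "doc-42"])

exact_hit = cache.get("health care bill")
similar_hit = store.get("health care bill")
```

Here the reworded query misses the exact-match cache but shares three of four terms with the stored query, so the similarity store returns the stored results.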
Partial Replica Selection Based on Relevance for Information Retrieval - In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999
Cited by 11 (4 self)
Partial collection replication improves performance and scalability of a large-scale distributed information retrieval system by distributing excessive workloads, reducing network latency, and restricting some searches to a small percentage of data. In this paper, we first examine queries from real system logs and show that there is sufficient query locality in real systems to justify partial collection replication. We then present a method for constructing a hierarchy of partial replicas from a collection where each replica is a subset of all larger replicas, and extend the inference network model to rank and select partial replicas. We compare our new selection algorithm to previous work on collection selection over a range of tuning parameters. For a given query, our replica selection algorithm correctly determines the most relevant of the replicas or original collection, and thus maintains the highest retrieval effectiveness while searching the least data as compared with the other...

