The Grid Protocol: A High Performance Scheme for Maintaining Replicated Data
IEEE Transactions on Knowledge and Data Engineering, 1990
Cited by 108 (4 self)
We present a new protocol for maintaining replicated data that can provide both high data availability and low response time. In the protocol, the nodes are organized in a logical grid. Existing protocols are designed primarily to achieve high availability by updating a large fraction of the copies, which provides some (although not significant) load sharing. In the new protocol, transaction processing is shared effectively among nodes storing copies of the data, and both the response time experienced by transactions and the system throughput are improved significantly. We present an analysis of the availability of the new protocol and use simulation to study the effect of load sharing on the response time of transactions. We also compare the new protocol with a voting-based scheme. This work was supported in part by NSF grants NCR-8604850 and CCR-8806358, and by the University Research Committee of Emory University. 1 Introduction A distributed system consists of cooperating process...
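The abstract above says only that nodes are organized in a logical grid; it does not state the quorum rules. As a minimal sketch, assuming the commonly described grid quorum rule (a read needs one reachable node in every column, and a write needs that plus every node of some single column), the checks could look like the following. The function names and the independent-failure model with per-node availability `p` are illustrative assumptions, not taken from the paper.

```python
def has_read_quorum(up):
    """up[r][c] is True when the node at row r, column c is reachable.

    Assumed rule: a read quorum needs at least one reachable node per column.
    """
    columns = list(zip(*up))
    return all(any(column) for column in columns)


def has_write_quorum(up):
    """Assumed rule: a read quorum plus all nodes of at least one column."""
    columns = list(zip(*up))
    return has_read_quorum(up) and any(all(column) for column in columns)


def read_availability(rows, cols, p):
    """Probability of a read quorum when each node is up independently with
    probability p (an assumption of this sketch): every column must contain
    at least one live node."""
    return (1.0 - (1.0 - p) ** rows) ** cols
```

Under this assumed rule, any read quorum intersects any write quorum (the write holds a full column, and the read touches every column), which is what lets reads be spread across the grid.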
Application-Layer Anycasting
In Proceedings of IEEE Infocom, 1997
Cited by 42 (0 self)
The anycasting communication paradigm is designed to support server replication by allowing applications to easily select and communicate with the "best" server, according to some performance or policy criteria, in a group of content-equivalent servers. We examine the definition and support of the anycasting paradigm at the application layer, providing a service that maps anycast domain names into one or more IP addresses using anycast resolvers. In addition to being independent from network-layer support, our definition includes the notion of filters, functions that are applied to groups of addresses to affect the selection process. We consider both metric-based filters (e.g., server response time) and policy-based filters. An expanded version of this work can be found as a technical report. 1 Introduction The Internet is increasingly being viewed as providing services, and not just connectivity. As this view becomes more prevalent, it becomes important to provide, within the ...
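The idea of filters applied to a group of addresses can be illustrated with a small sketch. Everything here is hypothetical: the resolver function, the filter constructors, the anycast name `service.example`, and the documentation-range addresses are invented for illustration and are not the paper's interface.

```python
def make_metric_filter(response_time_ms):
    """Metric-based filter: keep the address(es) with the lowest measured
    response time. `response_time_ms` maps address -> milliseconds."""
    def metric_filter(addresses):
        if not addresses:
            return []
        best = min(response_time_ms[a] for a in addresses)
        return [a for a in addresses if response_time_ms[a] == best]
    return metric_filter


def make_policy_filter(allowed):
    """Policy-based filter: keep only addresses permitted by a policy set."""
    def policy_filter(addresses):
        return [a for a in addresses if a in allowed]
    return policy_filter


def resolve(anycast_name, groups, filters):
    """Map an anycast name to addresses, applying each filter in order."""
    candidates = list(groups[anycast_name])
    for apply_filter in filters:
        candidates = apply_filter(candidates)
    return candidates


# Hypothetical data for illustration only.
GROUPS = {"service.example": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
RESPONSE_TIME_MS = {"192.0.2.1": 40.0, "192.0.2.2": 12.0, "192.0.2.3": 25.0}
ALLOWED = {"192.0.2.1", "192.0.2.3"}

selected = resolve(
    "service.example",
    GROUPS,
    [make_policy_filter(ALLOWED), make_metric_filter(RESPONSE_TIME_MS)],
)
```

Filter order matters in this sketch: applying the policy filter first restricts the metric comparison to permitted servers.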
Are Quorums an Alternative for Data Replication?
ACM Transactions on Database Systems, 2003
Cited by 32 (10 self)
... this article, we analyze several quorum types in order to better understand their behavior in practice. The results obtained challenge many of the assumptions behind quorum-based replication. Our evaluation indicates that the conventional read-one/write-all-available approach is the best choice for a large range of applications requiring data replication. We believe this is an important result for anybody developing code for computing clusters, as the read-one/write-all-available strategy is much simpler to implement and more flexible than quorum-based approaches. In this article, we show that, in addition, it is also the best choice using a number of other selection criteria.
Improving the Throughput of Point-to-Multipoint ARQ Protocols Through Destination Set Splitting
In Proceedings of IEEE Infocom, 1992
Cited by 31 (4 self)
Point-to-Multipoint ARQ protocols are required to guarantee the correct delivery and proper sequencing of messages being sent from one source to multiple destinations. A source of inefficiency for such protocols is that the transmitter requires acknowledgements from all receivers before it considers that a message has been correctly received. To address this inefficiency, we propose a scheme in which the set of destinations is split into disjoint groups. The transmitter carries a separate conversation with each group. The conversations are time multiplexed over a single channel, which leads to a tradeoff between the increased throughput as a result of the destination set splitting and the wasted bandwidth required for the multiplexing. We consider memoryless, limited-memory and full-memory versions of Stop-and-Wait and Go-Back-N protocols. We evaluate the maximum throughput achievable with our protocols, and address the issue of selecting the grouping of the destinations that maximizes...
An Efficient Scheme for Dynamic Data Replication
1993
Cited by 30 (0 self)
This paper presents an efficient scheme for dynamic replication of data in distributed environments. The aim of the scheme is to increase system performance by intelligent data placement so as to optimize the message traffic in the network. Research in the recent past has focused comparatively little on using replication to increase performance and has instead been directed more at improving system availability through replication. However, with the advent of mobile or nomadic computing, research in replication needs to change direction: the underlying assumption of high-speed networks no longer holds true. Wireless networks not only have lower bandwidth but are also very expensive to use. In such an environment, it is imperative that data be distributed intelligently to achieve good system performance in terms of message costs and turnaround time. Besides, with mobility introduced in the system, earlier static schemes for improving performance (e.g., the File Alloc...
Multi-Destination Communication over Tunable-Receiver Single-Hop WDM Networks
1997
Cited by 22 (9 self)
We address the issue of providing efficient mechanisms for multi-destination communication over one class of lightwave WDM architectures, namely, single-hop networks with tunability provided only at the receiving side. We distinguish a number of multicast traffic types, we present a number of alternative broadcast/multicast TDMA schedules for each type, and we develop heuristics to obtain schedules that result in low average packet delay. One of our major contributions is the development of a suite of adaptive multicast protocols which are simple to implement, and have good performance under changing multicast traffic conditions.
Optimizing Vote and Quorum Assignments for Reading and Writing Replicated Data
IEEE Transactions on Knowledge and Data Engineering, 1989
Cited by 18 (1 self)
In the weighted voting protocol, which is used to maintain the consistency of replicated data, the availability of the data to read and write operations depends not only on the availability of the nodes storing the data but also on the vote and quorum assignments used. We consider the problem of determining the vote and quorum assignments that yield the best performance in a distributed system where node availabilities can be different and the mix of the read and write operations is arbitrary. The optimal vote and quorum assignments depend not only on the system parameters such as node availability and operation mix, but also on the performance measure. We present an enumeration algorithm that can be used to find the vote and quorum assignments that need to be considered for achieving optimal performance. When the performance measure is data availability, an analytical method is derived to evaluate it for any vote and quorum assignment. This method and the enumeration algorithm are used ...
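To make the dependence of availability on vote and quorum assignments concrete, here is a minimal sketch. It is not the paper's analytical method or its enumeration algorithm: it brute-forces all up/down node states, assumes nodes fail independently, and uses the standard weighted-voting constraints (read quorum plus write quorum exceeds the total votes, and twice the write quorum exceeds the total votes). All function and parameter names are illustrative.

```python
from itertools import product


def is_valid_assignment(votes, read_quorum, write_quorum):
    """Standard weighted-voting constraints: reads intersect writes,
    and any two writes intersect."""
    total = sum(votes)
    return read_quorum + write_quorum > total and 2 * write_quorum > total


def quorum_availability(votes, node_availability, quorum):
    """Probability that the reachable nodes hold at least `quorum` votes,
    assuming independent node failures (an assumption of this sketch)."""
    probability = 0.0
    for state in product([False, True], repeat=len(votes)):
        state_probability = 1.0
        votes_up = 0
        for is_up, vote, avail in zip(state, votes, node_availability):
            state_probability *= avail if is_up else 1.0 - avail
            if is_up:
                votes_up += vote
        if votes_up >= quorum:
            probability += state_probability
    return probability


def data_availability(votes, node_availability, read_quorum, write_quorum,
                      read_fraction):
    """Availability weighted by the operation mix (fraction of reads)."""
    if not is_valid_assignment(votes, read_quorum, write_quorum):
        raise ValueError("quorums do not satisfy the intersection constraints")
    read_av = quorum_availability(votes, node_availability, read_quorum)
    write_av = quorum_availability(votes, node_availability, write_quorum)
    return read_fraction * read_av + (1.0 - read_fraction) * write_av
```

The enumeration is exponential in the number of nodes, so it is only practical for small systems; it is meant to show how different quorum choices trade read availability against write availability for a given operation mix.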
Partial Collection Replication versus Caching for Information Retrieval Systems
In the ACM International Conference on Research and Development in Information Retrieval, 2000
Cited by 16 (0 self)
The explosion of content in distributed information retrieval (IR) systems requires new mechanisms to attain timely and accurate retrieval of unstructured text. In this paper, we compare two mechanisms to improve IR system performance: partial collection replication and caching. When queries have locality, both mechanisms return results more quickly than sending queries to the original collection(s). Caches return results when a query exactly matches a previous one. Partial replicas are a form of caching that returns results when the IR technology determines the query is a good match. Caches are simpler and faster, but replicas can increase locality by detecting similarity between queries that are not exactly the same. We use real traces from THOMAS and Excite to measure query locality and similarity. With a very restrictive definition of query similarity, similarity improves query locality up to 15% over exact match. We use a validated simulator to compare their performance, and find that even if the partial replica hit rate increases only 3 to 6%, it will outperform simple caching under a variety of configurations. A combined approach will probably yield the best performance.
Voting with Regenerable Volatile Witnesses
1991
Cited by 13 (3 self)
Voting protocols ensure the consistency of replicated objects by requiring all read and write requests to collect an appropriate quorum of replicas. We propose to replace some of these replicas by volatile witnesses that have no data and require no stable storage, and to regenerate them instead of waiting for recovery. The small size of volatile witnesses allows them to be regenerated much more easily than full replicas. Regeneration attempts are also much more likely to succeed since volatile witnesses can be stored on diskless sites. We show that, under standard Markovian assumptions, two full replicas and one regenerable volatile witness managed by a two-tier dynamic voting protocol provide a higher data availability than three full replicas managed by majority consensus voting or optimistic dynamic voting, provided site failures can be detected significantly faster than they can be repaired. Keywords: distributed file systems, replicated data, voting, witnesses. 1. INTRODUCTION Fault-tol...
Partial Replica Selection Based on Relevance for Information Retrieval
In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999
Cited by 11 (4 self)
Partial collection replication improves performance and scalability of a large-scale distributed information retrieval system by distributing excessive workloads, reducing network latency, and restricting some searches to a small percentage of data. In this paper, we first examine queries from real system logs and show that there is sufficient query locality in real systems to justify partial collection replication. We then present a method for constructing a hierarchy of partial replicas from a collection where each replica is a subset of all larger replicas, and extend the inference network model to rank and select partial replicas. We compare our new selection algorithm to previous work on collection selection over a range of tuning parameters. For a given query, our replica selection algorithm correctly determines the most relevant of the replicas or original collection, and thus maintains the highest retrieval effectiveness while searching the least data as compared with the other...

