Results 1 - 10 of 36
A new approach to developing and implementing eager database replication protocols
- ACM TODS
Cited by 101 (12 self)
Database replication is traditionally seen as a way to increase the availability and performance of distributed databases. Although a large number of protocols providing data consistency and fault-tolerance have been proposed, few of these ideas have ever been used in commercial products due to their complexity and performance implications. Instead, current products allow inconsistencies and often resort to centralized approaches, which eliminate some of the advantages of replication. As an alternative, we propose a suite of replication protocols that addresses the main problems related to database replication. On the one hand, our protocols maintain data consistency and the same transactional semantics found in centralized systems. On the other hand, they provide flexibility and reasonable performance. To do so, our protocols take advantage of the rich semantics of group communication primitives and the relaxed isolation guarantees provided by most databases. This allows us to eliminate the possibility of deadlocks, reduce the message overhead, and increase performance. A detailed simulation study shows the feasibility of the approach and the flexibility with which different types of bottlenecks can be circumvented.
Middle-R: Consistent Database Replication at the Middleware Level
- ACM Trans. Comput. Syst
, 2005
Cited by 59 (7 self)
The widespread use of clusters and web farms has increased the importance of data replication. In this paper, we show how to implement consistent and scalable data replication at the middleware level. We do this by combining transactional concurrency control with group communication primitives. The paper presents different replication protocols, argues their correctness, describes their implementation as part of a generic middleware tool, and proves their feasibility with an extensive performance evaluation. The solution proposed is well suited for a variety of applications including web farms and distributed object platforms.
Efficient Concurrency Control for Broadcast Environments
Cited by 47 (1 self)
A crucial consideration in environments where data is broadcast to clients is the low bandwidth available for clients to communicate with servers. Advanced applications in such environments do need to read data that is mutually consistent as well as current. However, given the asymmetric communication capabilities and the needs of clients in mobile environments, traditional serializability-based approaches are too restrictive, unnecessary, and impractical. We thus propose the use of a weaker correctness criterion called update consistency and outline mechanisms based on this criterion that ensure (1) the mutual consistency of data maintained by the server and read by clients, and (2) the currency of data read by clients. Using these mechanisms, clients can obtain data that is current and mutually consistent "off the air", i.e., without contacting the server to, say, obtain locks. Experimental results show a substantial reduction in response times as compared to existing (serializability-based) approaches. A further attractive feature of the approach is that if caching is possible at a client, weaker forms of currency can be obtained while still satisfying the mutual consistency of data.
Making Snapshot Isolation Serializable
, 2000
Cited by 45 (2 self)
Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the common concurrency anomalies, and has been implemented by Oracle and Microsoft SQL Server (with certain minor variations). SI does not guarantee serializability in all cases, but the TPC-C benchmark application [TPC-C], for example, executes under SI without serialization anomalies. All major database system products are delivered with default nonserializable isolation levels, often ones that encounter serialization anomalies more commonly than SI, and we suspect that numerous isolation errors occur each day at many large sites because of this, leading to corrupt data sometimes noted in data warehouse applications. The classical justification for lower isolation levels is that applications can be run under such levels to improve efficiency when they can be shown not to result in serious errors, but little or no guidance has been offered to application programmers and DBAs by vendors as to how to avoid such errors. This article develops a theory that characterizes when nonserializable executions of applications can occur under SI. Near the end of the article, we apply this theory to demonstrate that the TPC-C benchmark application has no serialization anomalies under SI, and then discuss how this demonstration can be generalized to other applications. We also present a discussion on how to modify the program logic of applications that are nonserializable under SI so that serializability will be guaranteed.
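The kind of nonserializable execution that SI permits can be illustrated with the well-known "write skew" pattern. The following is a minimal sketch, not the article's formalism: the `SnapshotStore` and `Txn` classes, the two-account example, and the `withdraw` rule are all hypothetical names introduced here. The store gives each transaction a private snapshot and applies a first-committer-wins check on written keys only, which is the usual textbook description of SI.

```python
class SnapshotStore:
    """Toy in-memory store with snapshot reads (hypothetical, for illustration)."""

    def __init__(self, initial):
        self.data = dict(initial)
        # Commit timestamp of the last write to each key.
        self.version = {key: 0 for key in initial}
        self.clock = 0

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start = store.clock            # snapshot timestamp
        self.snapshot = dict(store.data)    # private copy of committed state
        self.writes = {}

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: abort if any key we wrote was committed
        # by another transaction after our snapshot was taken.
        for key in self.writes:
            if self.store.version[key] > self.start:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.clock
        return True


def withdraw(txn, account, amount):
    # Assumed application rule: the combined balance must cover the withdrawal.
    if txn.read("x") + txn.read("y") >= amount:
        txn.write(account, txn.read(account) - amount)


def run_write_skew():
    store = SnapshotStore({"x": 50, "y": 50})
    t1, t2 = store.begin(), store.begin()
    withdraw(t1, "x", 60)   # checks x + y on its snapshot, writes only x
    withdraw(t2, "y", 60)   # checks x + y on its snapshot, writes only y
    return t1.commit(), t2.commit(), store.data
```

In this sketch both transactions commit because their write sets are disjoint, yet no serial order of the two withdrawals would leave a negative combined balance. If both transactions wrote the same account instead, the second commit would be rejected by the first-committer-wins check.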
Epidemic Algorithms for Replicated Databases
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 2003
Cited by 22 (0 self)
We present a family of epidemic algorithms for maintaining replicated database systems. The algorithms are based on the causal delivery of log records where each record corresponds to one transaction instead of one operation. The first algorithm in this family is a pessimistic protocol that ensures serializability and guarantees strict executions. Since we expect the epidemic algorithms to be used in environments with low probability of conflicts among transactions, we develop a variant of the pessimistic algorithm which is optimistic in that transactions commit as soon as they terminate locally and inconsistencies are detected asynchronously as the effects of committed transactions propagate through the system. The last member of the family of epidemic algorithms is pessimistic and uses voting with quorums to resolve conflicts and improve transaction response time. A simulation study evaluates the performance of the protocols.
Epidemic Quorums for Managing Replicated Data
- In Proc. 19th IEEE Intl. Performance, Computing, and Communications Conf. (IPCCC
, 1999
Cited by 17 (4 self)
In the epidemic model an update is initiated on a single site and is propagated to other sites in a lazy manner. When combined with version vectors and event logs, this propagation mechanism delivers updates in causal order despite communication failures. We integrate quorums into the epidemic model to process transactions on replicated data while ensuring global serializability. We present a detailed simulation of a distributed replicated database and demonstrate the performance improvements.
1 Introduction
Asynchronous replication has been deployed successfully for maintaining control information in distributed systems and computer networks. For example, name servers, yellow pages, and server directories are maintained redundantly on multiple sites and updates are incorporated in a lazy manner through gossip messages, epidemic propagation, and anti-entropy [12]. In this paper we use the epidemic communication model as the basis for an algorithm that supports transaction processing ...
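As a rough illustration of how version vectors and an event log can drive lazy propagation, the sketch below compares two vectors and selects the log records a peer has not yet seen. It is a generic textbook construction under stated assumptions, not the paper's algorithm: the function names, the dictionary representation (site name to update counter), and the `(site, sequence, payload)` record layout are all hypothetical.

```python
def dominates(a, b):
    """True if vector a has seen at least every update that b has seen."""
    return all(a.get(site, 0) >= count for site, count in b.items())


def compare(a, b):
    """Classify the causal relation of a relative to b."""
    a_ge, b_ge = dominates(a, b), dominates(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"
    if b_ge:
        return "before"
    return "concurrent"


def merge(a, b):
    """Pointwise maximum: the knowledge held after the two sites exchange logs."""
    return {site: max(a.get(site, 0), b.get(site, 0))
            for site in a.keys() | b.keys()}


def missing_records(log, peer_vector):
    """Records (site, sequence, payload) that the peer has not yet received."""
    return [record for record in log
            if record[1] > peer_vector.get(record[0], 0)]
```

A site that contacts a peer would send `missing_records(local_log, peer_vector)` and then both sides would adopt `merge` of their vectors. Two vectors that compare as "concurrent" indicate updates made without knowledge of each other, which is the case where a conflict-resolution mechanism such as the quorums described above becomes relevant.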
Support for Speculative Update Propagation and Mobility in Deno
Cited by 11 (2 self)
This paper presents the replication framework of Deno, an object replication system specifically designed for mobile and weakly-connected environments. Deno uses weighted voting for availability and pair-wise, epidemic information flow for flexibility. This combination allows the protocols to operate with less than full connectivity, to easily adapt to changes in group membership, and to make few assumptions about the underlying network topology. Deno has been implemented and runs on top of Linux and Win32 platforms. We use the Deno prototype to characterize the performance of two versions of Deno's protocol. The first version enables globally serializable execution of update transactions. The second supports a weaker consistency level that still guarantees transactionally-consistent access to replicated data. We demonstrate that the incremental cost of providing global serializability is low, and that speculative dissemination of updates can significantly improve commit performance.
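The general idea of weighted voting can be sketched as follows. This is a generic majority-of-weight check, assumed here for illustration only; it is not Deno's actual voting protocol, and the function name and replica weights are hypothetical.

```python
def has_weighted_majority(weights, voters):
    """True if the distinct voters hold strictly more than half the total weight.

    weights: dict mapping replica name to its (positive) voting weight.
    voters:  iterable of replica names that voted for the same outcome.
    """
    total = sum(weights.values())
    gathered = sum(weights[name] for name in set(voters))
    # Integer comparison avoids floating-point rounding at exactly half.
    return 2 * gathered > total


# Hypothetical weights: one well-connected replica and two mobile ones.
weights = {"server": 3, "laptop": 1, "phone": 1}
server_alone = has_weighted_majority(weights, ["server"])
mobiles_only = has_weighted_majority(weights, ["laptop", "phone"])
```

Because any two sets that each hold more than half the total weight must share at least one replica, two conflicting outcomes cannot both gather a majority. Unequal weights let a replica that is reachable most of the time decide alone, which is one way such a scheme can keep working with less than full connectivity.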
FlashLogging: Exploiting Flash Devices for Synchronous Logging Performance
Cited by 11 (2 self)
Synchronous transactional logging is the central mechanism for ensuring data persistency and recoverability in database systems. Unfortunately, magnetic disks are ill-suited for the small sequential write pattern of synchronous logging. Alternative solutions (e.g., backup servers or sophisticated battery-backed write caches in high-end disk arrays) are either expensive or complicated. In this paper, we exploit flash devices for synchronous logging based on the observation that flash devices support small sequential writes well. Comparing a wide variety of flash devices, we find that USB flash drives are a good match for this task because of their unique characteristics: widely available USB ports, hot-plug capability useful for coping with flash wear, and low price so that multiple drives are affordable. We propose FlashLogging, a logging solution that exploits multiple (USB) flash drives for synchronous logging. We identify and address four challenges: (i) efficiently exploiting multiple flash drives for logging; (ii) coping with the large variance of write latencies because of device erasure operations; (iii) efficient recovery processing; and (iv) combining flash drives and disks for better logging and recovery performance. We implemented our solution within MySQL-InnoDB. Our real machine experiments running online transaction processing workloads (TPC-C) show that FlashLogging achieves up to 5.7X improvements over magnetic-disk-based logging, and obtains up to 98.6% of the ideal performance. We further compare our design with one that uses Solid-State Drives (SSDs), and find that although SSDs improve logging performance, multiple USB flash drives can achieve comparable or better performance with much lower price.
Fine-Grained Replication and Scheduling with Freshness and Correctness Guarantees
- In Proceedings of the 31st International Conference on Very Large Data Bases
, 2005
Cited by 10 (0 self)
Lazy replication protocols provide good scalability properties by decoupling transaction execution from the propagation of new values to replica sites while guaranteeing a correct and more efficient transaction processing and replica maintenance.
Database Replication Using Epidemic Update
, 2000
Cited by 9 (2 self)
Due to severe performance penalties associated with synchronous replication, there is an increasing interest in asynchronous replica management protocols in which database transactions are executed locally, and the effects of these transactions are incorporated asynchronously on remote database copies. However, the asynchronous protocols currently in use either do not guarantee consistency and serializability as needed by transactional semantics or they impose restrictions on placement of data and on which data objects can be updated. In this paper we investigate an epidemic update protocol that guarantees consistency and serializability in spite of a write-anywhere capability. We conducted experiments on a detailed simulation of a distributed, replicated database to evaluate this protocol. Our results establish that this epidemic approach is indeed a viable alternative to traditional eager update protocols for a distributed database environment where consistency and full seri...

