Results 1 - 10 of 14
Zyzzyva: Speculative Byzantine Fault Tolerance
In Symposium on Operating Systems Principles (SOSP), 2007. Cited by 188 (16 self).
We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault tolerant state machine replication. In Zyzzyva, replicas respond to a client’s request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima.
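The fast path sketched in this abstract can be illustrated with a small client-side decision function. This is a minimal sketch, not the authors' code: the function name and return values are invented, and it only captures the standard n = 3f + 1 thresholds (all replies match: act immediately; 2f + 1 match: fall back to gathering a commit certificate; fewer: retry).

```python
# Hypothetical sketch of Zyzzyva-style client-side speculation.
# Each replica replies immediately with the primary's proposed sequence
# number and a digest of its speculative result; the client decides how
# to proceed from how many replies agree.

from collections import Counter

def speculative_commit(responses, f):
    """responses: list of (seq_no, result_digest), one per replica.

    Returns ("fast", digest) if all 3f+1 replies agree,
    ("commit", digest) if at least 2f+1 agree (slower two-phase path),
    or ("retry", None) if orders diverged too far.
    """
    n = 3 * f + 1
    best, votes = Counter(responses).most_common(1)[0]
    if votes == n:
        return ("fast", best[1])    # speculation succeeded at every replica
    if votes >= 2 * f + 1:
        return ("commit", best[1])  # client must collect a commit certificate
    return ("retry", None)          # inconsistent orders: resend / view change
```

With f = 1 (four replicas), one faulty or slow replica merely pushes the client from the fast path onto the commit path; it does not block progress.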
SPORC: Group Collaboration using Untrusted Cloud Resources
In 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’10), 2010. Cited by 80 (6 self).
Cloud-based services are an attractive deployment model for user-facing applications like word processing and calendaring. Unlike desktop applications, cloud services allow multiple users to edit shared state concurrently and in real-time, while being scalable, highly available, and globally accessible. Unfortunately, these benefits come at the cost of fully trusting cloud providers with potentially sensitive and important data. To overcome this strict tradeoff, we present SPORC, a generic framework for building a wide variety of collaborative applications with untrusted servers. In SPORC, a server observes only encrypted data and cannot deviate from correct execution without being detected. SPORC allows concurrent, low-latency editing of shared state, permits disconnected operation, and supports dynamic access control even in the presence of concurrency. We demonstrate SPORC’s flexibility through two prototype applications: a causally-consistent key-value store and a browser-based collaborative text editor. Conceptually, SPORC illustrates the complementary benefits of operational transformation (OT) and fork* consistency. The former allows SPORC clients to execute concurrent operations without locking and to resolve any resulting conflicts automatically. The latter prevents a misbehaving server from equivocating about the order of operations unless it is willing to fork clients into disjoint sets. Notably, unlike previous systems, SPORC can automatically recover from such malicious forks by leveraging OT’s conflict resolution mechanism.
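The operational-transformation half of this design can be sketched in a few lines. This is an illustrative toy, not SPORC's machinery: it handles only concurrent string inserts and ignores tie-breaking when two inserts target the same position, but it shows how transforming one operation against another lets both sites converge without locking.

```python
# Toy OT sketch (insert-only; SPORC's actual OT is richer and also
# handles deletes and position ties via site identifiers).

def transform_insert(op, against):
    """Shift op's position past a concurrent insert it did not see."""
    pos, text = op
    a_pos, a_text = against
    if a_pos <= pos:
        return (pos + len(a_text), text)
    return op

def apply_insert(doc, op):
    pos, text = op
    return doc[:pos] + text + doc[pos:]

doc = "shared"
op_a = (0, "my ")     # client A prepends
op_b = (6, " state")  # client B appends, concurrently with A

# Site A applies its own op, then B's op transformed against A's;
# site B does the mirror image. Both converge to "my shared state".
at_a = apply_insert(apply_insert(doc, op_a), transform_insert(op_b, op_a))
at_b = apply_insert(apply_insert(doc, op_b), transform_insert(op_a, op_b))
```

The same conflict-resolution step is what lets SPORC repair a malicious fork: once clients learn of each other's diverged operations, they can transform and replay them instead of discarding work.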
Increasing performance in Byzantine fault-tolerant systems with on-demand replica consistency
In Proceedings of the 6th ACM SIGOPS/EuroSys European Conference on Computer Systems, Salzburg, 2011. Cited by 12 (4 self).
Traditional agreement-based Byzantine fault-tolerant (BFT) systems process all requests on all replicas to ensure consistency. In addition to the overhead of the BFT protocol and state-machine replication, this practice degrades performance and prevents throughput scalability. In this paper, we propose an extension to existing BFT architectures that increases performance for the default number of replicas by optimizing the resource utilization of their execution stages. Our approach executes a request on only a selected subset of replicas, using a selector component co-located with each replica. As this leads to divergent replica states, a selector updates outdated objects on the local replica on demand, prior to processing a request. Our evaluation shows that with each replica executing only a part of all requests, the overall performance of a Byzantine fault-tolerant NFS can be almost doubled; our prototype even outperforms unreplicated NFS.
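The selection-plus-on-demand-update idea can be sketched as follows. The selector policy and function names here are invented for illustration; the paper's selectors are protocol components, not this modulo rule.

```python
# Toy sketch of on-demand replica consistency: each request runs on only
# a window of replicas, and a lagging replica refreshes just the object
# a request touches before executing it.

def selected_replicas(req_id, subset_size, n_replicas):
    """Deterministically map a request onto a window of replica ids."""
    start = req_id % n_replicas
    return {(start + i) % n_replicas for i in range(subset_size)}

def execute(replica_store, obj, new_value, fresh_store):
    """Apply an update locally, first pulling a newer version of the
    touched object from an up-to-date replica if one exists."""
    local_ver = replica_store.get(obj, (0, None))[0]
    fresh_ver, fresh_val = fresh_store.get(obj, (0, None))
    if fresh_ver > local_ver:
        replica_store[obj] = (fresh_ver, fresh_val)  # on-demand update
        local_ver = fresh_ver
    replica_store[obj] = (local_ver + 1, new_value)
    return replica_store[obj]
```

The key point the abstract makes is visible here: only the objects a request actually reads or writes are brought up to date, so replicas avoid executing every request just to stay consistent.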
High Performance State-Machine Replication
2010. Cited by 12 (2 self).
State-machine replication is a well-established approach to fault tolerance. The idea is to replicate a service on multiple servers so that it remains available despite the failure of one or more servers. From a performance perspective, state-machine replication has two limitations. First, it introduces some overhead in service response time, due to the requirement to totally order commands. Second, service throughput cannot be augmented by adding replicas to the system. We address the two issues in this paper. We use speculative execution to reduce the response time and state partitioning to increase the throughput of state-machine replication. We illustrate these techniques with a highly available parallel B-tree service.
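The state-partitioning idea can be made concrete with a routing sketch. The hash layout and names below are invented, not the paper's design; the point is only that commands are routed by key to independent replica groups, so adding a group adds execution capacity instead of adding another copy of the same work.

```python
# Illustrative key-partitioned routing across replica groups.

import hashlib

def group_for(key, n_groups):
    """Deterministically map a command's key to one replica group."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return h % n_groups

def route(commands, n_groups):
    """Partition a batch of (key, op) commands across replica groups,
    so each group totally orders and executes only its own partition."""
    batches = {g: [] for g in range(n_groups)}
    for key, op in commands:
        batches[group_for(key, n_groups)].append((key, op))
    return batches
```

Routing must be deterministic and stable (the same key always lands in the same group), otherwise a partitioned B-tree could not keep each subtree's commands totally ordered within one group.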
Improving data center resource management, deployment, and availability with virtualization
2009. Cited by 6 (0 self).
The increasing demand for storage and computation has driven the growth of large data centers, the massive server farms that run many of today’s Internet and business applications. A data center can comprise many thousands of servers and can use as much energy as a small city. The massive amounts of computation power required to drive these systems result in many challenging and interesting distributed systems and resource management problems. In this thesis I investigate challenges related to data centers, with a particular emphasis on how new virtualization technologies can be used to simplify deployment, improve resource efficiency, and reduce the cost of reliability. I first study problems related to the initial capacity planning required when deploying applications into a virtualized data center. I demonstrate how models of virtualization overheads can be utilized to accurately predict the resource needs of virtualized applications, allowing them to be smoothly transitioned into a data center. I next study how memory similarity can be used to guide placement when …
Prophecy: Using History for High-Throughput Fault Tolerance
In 7th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’10), 2010. Cited by 6 (2 self).
Byzantine fault-tolerant (BFT) replication has enjoyed a series of performance improvements, but remains costly due to its replicated work. We eliminate this cost for read-mostly workloads through Prophecy, a system that interposes itself between clients and any replicated service. At Prophecy’s core is a trusted sketcher component, designed to extend the semi-trusted load balancer that mediates access to an Internet service. The sketcher performs fast, load-balanced reads when results are historically consistent, and slow, replicated reads otherwise. Despite its simplicity, Prophecy provides a new form of consistency called delay-once consistency. Along the way, we derive a distributed variant of Prophecy that achieves the same consistency but without any trusted components. A prototype implementation demonstrates Prophecy’s high throughput compared to BFT systems. We also describe and evaluate Prophecy’s ability to scale out to support large replica groups or multiple replica groups. As Prophecy is most effective when state updates are rare, we finally present a measurement study of popular websites that demonstrates a large proportion of static data.
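The sketcher's fast path described above can be sketched as a digest cache. This is an illustrative model with invented names, not Prophecy's API: it keeps a compact digest of each request's last reply, serves a single load-balanced read when the fresh reply matches history, and falls back to a full replicated read otherwise.

```python
# Hypothetical model of a Prophecy-style trusted sketcher.

import hashlib

def digest(data):
    return hashlib.sha256(data.encode()).hexdigest()

class Sketcher:
    def __init__(self, replicated_read):
        self.history = {}                       # request -> digest of last reply
        self.replicated_read = replicated_read  # slow, BFT-agreed read path

    def read(self, request, fast_read):
        reply = fast_read(request)              # one load-balanced replica
        if self.history.get(request) == digest(reply):
            return reply                        # historically consistent: fast path
        reply = self.replicated_read(request)   # fall back to replicated read
        self.history[request] = digest(reply)
        return reply
```

Storing only digests is what keeps the sketcher small enough to live inside a load balancer: it never needs the full replies, just enough to recognize an unchanged one.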
Designing distributed systems using approximate synchrony in data center networks
In NSDI, 2015. Cited by 5 (2 self).
Distributed systems are traditionally designed independently from the underlying network, making worst-case assumptions (e.g., complete asynchrony) about its behavior. However, many of today’s distributed applications are deployed in data centers, where the network is more reliable, predictable, and extensible. In these environments, it is possible to co-design distributed systems with their network layer, and doing so can offer substantial benefits. This paper explores network-level mechanisms for providing Mostly-Ordered Multicast (MOM): a best-effort ordering property for concurrent multicast operations. Using this primitive, we design Speculative Paxos, a state machine replication protocol that relies on the network to order requests in the normal case. This approach leads to substantial performance benefits: under realistic data center conditions, Speculative Paxos can provide 40% lower latency and 2.6× higher throughput than the standard Paxos protocol. It offers lower latency than a latency-optimized protocol (Fast Paxos) with the same throughput as a throughput-optimized protocol (batching).
PipeCloud: Using Causality to Overcome Speed-of-Light Delays in Cloud-Based Disaster Recovery
Cited by 4 (3 self).
Disaster Recovery (DR) is a desirable feature for all enterprises, and a crucial one for many. However, adoption of DR remains limited due to the stark tradeoffs it imposes. To recover an application to the point of crash, one is limited by financial considerations, substantial application overhead, or minimal geographical separation between the primary and recovery sites. In this paper, we argue for cloud-based DR and pipelined synchronous replication as an antidote to these problems. Cloud hosting promises economies of scale and on-demand provisioning that are a perfect fit for the infrequent yet urgent needs of DR. Pipelined synchrony addresses the impact of WAN replication latency on performance, by efficiently overlapping replication with application processing for multi-tier servers. By tracking the consequences of the disk modifications that are persisted to a recovery site all the way to client-directed messages, applications realize forward progress while retaining full consistency guarantees for client-visible state in the event of a disaster. PipeCloud, our prototype, is able to sustain these guarantees for multi-node servers composed of black-box VMs, with no need for application modification, resulting in a perfect fit for the arbitrary nature of VM-based cloud hosting. We demonstrate disaster failover to the Amazon EC2 platform, and show that PipeCloud can increase throughput by an order of magnitude and reduce response times by more than half compared to synchronous replication, all while providing the same zero data loss consistency guarantees.
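The causality tracking described in this abstract can be modeled in a few lines. This is a toy with invented names (PipeCloud itself operates on black-box VMs, not application code): processing continues past a disk write, but any client-visible message is held until every write it causally depends on has been acknowledged by the recovery site.

```python
# Toy model of pipelined synchrony: overlap replication with processing,
# and release client-bound messages only once their causal writes are
# acknowledged remotely.

class PipelinedReplicator:
    def __init__(self):
        self.next_write = 0   # id assigned to each persisted disk write
        self.acked = -1       # highest write id acked by the recovery site
        self.held = []        # (depends_on_write, message) awaiting release

    def disk_write(self):
        wid = self.next_write
        self.next_write += 1  # replication to the recovery site is async
        return wid

    def send_to_client(self, message):
        # A reply causally depends on all writes issued before it.
        self.held.append((self.next_write - 1, message))
        return self.release()

    def recovery_ack(self, write_id):
        self.acked = max(self.acked, write_id)
        return self.release()

    def release(self):
        ready = [m for dep, m in self.held if dep <= self.acked]
        self.held = [(d, m) for d, m in self.held if d > self.acked]
        return ready
```

Unlike fully synchronous replication, the server never blocks on the WAN round trip; only the externally visible message is delayed, which is what preserves zero-data-loss guarantees for client-visible state.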
Privacy and Integrity in the Untrusted Cloud
Cloud computing has become increasingly popular because it offers users the illusion of having infinite computing resources, of which they can use as much as they need, without having to worry about how those resources are provided. It also provides greater scalability, availability, and reliability than users could achieve with their own resources. Unfortunately, adopting cloud computing has required users to cede control of their data to cloud providers, and a malicious provider could compromise the data’s confidentiality and integrity. Furthermore, the history of leaks, breaches, and misuse of customer information at providers has highlighted the failure of government regulation and market incentives to fully mitigate this threat. Thus, users have had to choose between trusting providers or forgoing cloud computing’s benefits entirely. This dissertation aims to overcome this trade-off. We present two systems, SPORC and Frientegrity, that enable users to benefit from cloud deployment without having to trust the cloud provider. Their security is rooted not in the provider’s good behavior, but in the users’ cryptographic keys. In both systems, the provider only observes encrypted data and cannot deviate from correct execution without detection. Moreover, …
Education
2012.
I am seeking a tenure-track position as an assistant professor in your department. I am a Ph.D.