Results 1 - 10
of
156
SEDA: An Architecture for Well-Conditioned, Scalable Internet Services
, 2001
"... We propose a new design for highly concurrent Internet services, whichwe call the staged event-driven architecture (SEDA). SEDA is intended ..."
Abstract
-
Cited by 522 (10 self)
- Add to MetaCart
(Show Context)
We propose a new design for highly concurrent Internet services, whichwe call the staged event-driven architecture (SEDA). SEDA is intended
Skipnet: A scalable overlay network with practical locality properties
, 2003
"... Abstract: Scalable overlay networks such as Chord, Pastry, and Tapestry have recently emerged as a flexible infrastructure for building large peer-to-peer systems. In practice, two disadvantages of such systems are that it is difficult to control where data is stored and difficult to guarantee that ..."
Abstract
-
Cited by 359 (5 self)
- Add to MetaCart
(Show Context)
Abstract: Scalable overlay networks such as Chord, Pastry, and Tapestry have recently emerged as a flexible infrastructure for building large peer-to-peer systems. In practice, two disadvantages of such systems are that it is difficult to control where data is stored and difficult to guarantee that routing paths remain within an administrative domain. SkipNet is a scalable overlay network that provides controlled data placement and routing locality guarantees by organizing data primarily by lexicographic key ordering. SkipNet also allows for both fine-grained and coarsegrained control over data placement, where content can be placed either on a pre-determined node or distributed uniformly across the nodes of a hierarchical naming subtree. An additional useful consequence of SkipNet’s locality properties is that partition failures, in which an entire organization disconnects from the rest of the system, result in two disjoint, but well-connected overlay networks. 1
Viceroy: A Scalable and Dynamic Emulation of the Butterfly
, 2002
"... We propose a family of constant-degree routing networks of logarithmic diameter, with the additional property that the addition or removal of a node to the network requires no global coordination, only a constant number of linkage changes in expectation, and a logarithmic number with high probabilit ..."
Abstract
-
Cited by 343 (16 self)
- Add to MetaCart
(Show Context)
We propose a family of constant-degree routing networks of logarithmic diameter, with the additional property that the addition or removal of a node to the network requires no global coordination, only a constant number of linkage changes in expectation, and a logarithmic number with high probability. Our randomized construction improves upon existing solutions, such as balanced search trees, by ensuring that the congestion of the network is always within a logarithmic factor of the optimum with high probability. Our construction derives from recent advances in the study of peer-to-peer lookup networks, where rapid changes require e#cient and distributed maintenance, and where the lookup e#ciency is impacted both by the lengths of paths to requested data and the presence or elimination of bottlenecks in the network.
Symphony: Distributed Hashing in a Small World
- IN PROCEEDINGS OF THE 4TH USENIX SYMPOSIUM ON INTERNET TECHNOLOGIES AND SYSTEMS
, 2003
"... We present Symphony, a novel protocol for maintaining distributed hash tables in a wide area network. The key idea is to arrange all participants along a ring and equip them with long distance contacts drawn from a family of harmonic distributions. Through simulation, we demonstrate that our constr ..."
Abstract
-
Cited by 211 (13 self)
- Add to MetaCart
(Show Context)
We present Symphony, a novel protocol for maintaining distributed hash tables in a wide area network. The key idea is to arrange all participants along a ring and equip them with long distance contacts drawn from a family of harmonic distributions. Through simulation, we demonstrate that our construction is scalable, flexible, stable in the presence of frequent updates and offers small average latency with only a handful of long distance links per node. The cost of updates when hosts join and leave is small.
The Ninja architecture for robust Internet-scale systems and services
- Computer Networks
, 2001
"... ..."
(Show Context)
Dynamic virtual clusters in a grid site manager
- In Proceedings of the Twelfth International Symposium on High Performance Distributed Computing (HPDC-12
, 2003
"... This paper presents new mechanisms for dynamic resource management in a cluster manager called Clusteron-Demand (COD). COD allocates servers from a common pool to multiple virtual clusters (vclusters), with independently configured software environments, name spaces, user access controls, and networ ..."
Abstract
-
Cited by 154 (28 self)
- Add to MetaCart
(Show Context)
This paper presents new mechanisms for dynamic resource management in a cluster manager called Clusteron-Demand (COD). COD allocates servers from a common pool to multiple virtual clusters (vclusters), with independently configured software environments, name spaces, user access controls, and network storage volumes. We present experiments using the popular Sun GridEngine batch scheduler to demonstrate that dynamic virtual clusters are an enabling abstraction for advanced resource management in computing utilities and grids. In particular, they support dynamic, policy-based cluster sharing between local users and hosted grid services, resource reservation and adaptive provisioning, scavenging of idle resources, and dynamic instantiation of grid services. These goals are achieved in a direct and general way through a new set of fundamental cluster management functions, with minimal impact on the grid middleware itself. 1
Sinfonia: a new paradigm for building scalable distributed systems
- In SOSP
, 2007
"... We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols—a major complication in existing distributed systems. Instead, developers just design and manipulate data structures within our service called Sinfonia. Sinfonia ..."
Abstract
-
Cited by 153 (12 self)
- Add to MetaCart
(Show Context)
We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols—a major complication in existing distributed systems. Instead, developers just design and manipulate data structures within our service called Sinfonia. Sinfonia keeps data for applications on a set of memory nodes, each exporting a linear address space. At the core of Sinfonia is a novel minitransaction primitive that enables efficient and consistent access to data, while hiding the complexities that arise from concurrency and failures. Using Sinfonia, we implemented two very different and complex applications in a few months: a cluster file system and a group communication service. Our implementations perform well and scale to hundreds of machines.
Adaptive Overload Control for Busy Internet Servers
, 2003
"... As Internet services become more popular and pervasive, a critical problem that arises is managing the performance of services under extreme overload. This paper presents a set of techniques for managing overload in complex, dynamic Internet services. These techniques are based on an adaptive admiss ..."
Abstract
-
Cited by 150 (2 self)
- Add to MetaCart
As Internet services become more popular and pervasive, a critical problem that arises is managing the performance of services under extreme overload. This paper presents a set of techniques for managing overload in complex, dynamic Internet services. These techniques are based on an adaptive admission control mechanism that attempts to bound the 90th-percentile response time of requests flowing through the service. This is accomplished by internally monitoring the performance of the service, which is decomposed into a set of event-driven stages connected with request queues. By controlling the rate at which each stage admits requests, the service can perform focused overload management, for example, by filtering only those requests that lead to resource bottlenecks. We present two extensions of this basic controller that provide class-based service differentiation as well as application-specific service degradation. We evaluate these mechanisms using a complex Webbased e-mail service that is subjected to a realistic user load, as well as a simpler Web server benchmark.
Fault-scalable Byzantine fault-tolerant services
- In Proceedings of the 20th ACM Symposium on Operating Systems Principles
, 2005
"... A fault-scalable service can be configured to tolerate increasing numbers of faults without significant decreases in performance. The Query/Update (Q/U) protocol is a new tool that enables construction of fault-scalable Byzantine faulttolerant services. The optimistic quorum-based nature of the Q/U ..."
Abstract
-
Cited by 144 (6 self)
- Add to MetaCart
(Show Context)
A fault-scalable service can be configured to tolerate increasing numbers of faults without significant decreases in performance. The Query/Update (Q/U) protocol is a new tool that enables construction of fault-scalable Byzantine faulttolerant services. The optimistic quorum-based nature of the Q/U protocol allows it to provide better throughput and fault-scalability than replicated state machines using agreement-based protocols. A prototype service built using the Q/U protocol outperforms the same service built using a popular replicated state machine implementation at all system sizes in experiments that permit an optimistic execution. Moreover, the performance of the Q/U protocol decreases by only 36 % as the number of Byzantine faults tolerated increases from one to five, whereas the performance of the replicated state machine decreases by 83%.
Boxwood: Abstractions as the Foundation for Storage Infrastructure
, 2004
"... Writers of complex storage applications such as distributed file systems and databases are faced with the challenges of building complex abstractions over simple storage devices like disks. These challenges are exacerbated due to the additional requirements for faulttolerance and scaling. This paper ..."
Abstract
-
Cited by 132 (8 self)
- Add to MetaCart
Writers of complex storage applications such as distributed file systems and databases are faced with the challenges of building complex abstractions over simple storage devices like disks. These challenges are exacerbated due to the additional requirements for faulttolerance and scaling. This paper explores the premise that high-level, fault-tolerant abstractions supported directly by the storage infrastructure can ameliorate these problems. We have built a system called Boxwood to explore the feasibility and utility of providing high-level abstractions or data structures as the fundamental storage infrastructure. Boxwood currently runs on a small cluster of eight machines. The Boxwood abstractions perform very close to the limits imposed by the processor, disk, and the native networking subsystem. Using these abstractions directly, we have implemented an NFSv2 file service that demonstrates the promise of our approach.