An Integrated Experimental Environment for Distributed Systems and Networks
- In Proc. of the Fifth Symposium on Operating Systems Design and Implementation
, 2002
Cited by 688 (41 self)
Three experimental environments traditionally support network and distributed systems research: network emulators, network simulators, and live networks. The continued use of multiple approaches highlights both the value and inadequacy of each. Netbed, a descendant of Emulab, provides an experimentation facility that integrates these approaches, allowing researchers to configure and access networks composed of emulated, simulated, and wide-area nodes and links. Netbed's primary goals are ease of use, control, and realism, achieved through consistent use of virtualization and abstraction.
Handling Churn in a DHT
- In Proceedings of the USENIX Annual Technical Conference
, 2004
Cited by 450 (22 self)
This paper addresses the problem of churn (the continuous process of node arrival and departure) in distributed hash tables (DHTs). We argue that DHTs should perform lookups quickly and consistently under churn rates at least as high as those observed in deployed P2P systems such as Kazaa. We then show through experiments on an emulated network that current DHT implementations cannot handle such churn rates. Next, we identify and explore three factors affecting DHT performance under churn: reactive versus periodic failure recovery, message timeout calculation, and proximity neighbor selection. We work in the context of a mature DHT implementation called Bamboo, using the ModelNet network emulator, which models in-network queuing, cross-traffic, and packet loss. These factors are typically missing in earlier simulation-based DHT studies, and we show that careful attention to them in Bamboo's design allows it to function effectively at churn rates at or higher than that observed in P2P file-sharing applications, while using lower maintenance bandwidth than other DHT implementations.
Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh
, 2003
Cited by 424 (22 self)
In recent years, overlay networks have become an effective alternative to IP multicast for efficient point to multipoint communication across the Internet. Typically, nodes self-organize with the goal of forming an efficient overlay tree, one that meets performance targets without placing undue burden on the underlying network. In this paper, we target high-bandwidth data distribution from a single source to a large number of receivers. Applications include large-file transfers and real-time multimedia streaming. For these applications, we argue that an overlay mesh, rather than a tree, can deliver fundamentally higher bandwidth and reliability relative to typical tree structures. This paper presents Bullet, a scalable and distributed algorithm that enables nodes spread across the Internet to self-organize into a high bandwidth overlay mesh. We construct Bullet around the insight that data should be distributed in a disjoint manner to strategic points in the network. Individual Bullet receivers are then responsible for locating and retrieving the data from multiple points in parallel. Key contributions of this work include: i) an algorithm that sends data to different points in the overlay such that any data object is equally likely to appear at any node, ii) a scalable and decentralized algorithm that allows nodes to locate and recover missing data items, and iii) a complete implementation and evaluation of Bullet running across the Internet and in a large-scale emulation environment reveals up to a factor two bandwidth improvements under a variety of circumstances. In addition, we find that, relative to tree-based solutions, Bullet reduces the need to perform expensive bandwidth probing.
Pip: Detecting the Unexpected in Distributed Systems
- In NSDI’06: Proceedings of the 3rd Symposium on Networked Systems Design & Implementation
Cited by 141 (7 self)
Bugs in distributed systems are often hard to find. Many bugs reflect discrepancies between a system’s behavior and the programmer’s assumptions about that behavior. We present Pip, an infrastructure for comparing actual behavior and expected behavior to expose structural errors and performance problems in distributed systems. Pip allows programmers to express, in a declarative language, expectations about the system’s communications structure, timing, and resource consumption. Pip includes system instrumentation and annotation tools to log actual system behavior, and visualization and query tools for exploring expected and unexpected behavior. Pip allows a developer to quickly understand and debug both familiar and unfamiliar systems. We applied Pip to several applications, including FAB, SplitStream, Bullet, and RanSub. We generated most of the instrumentation for all four applications automatically. We found the needed expectations easy to write, starting in each case with automatically generated expectations. Pip found unexpected behavior in each application, and helped to isolate the causes of poor performance and incorrect behavior.
SimGrid: a Generic Framework for Large-Scale Distributed Experiments
, 2008
Cited by 138 (28 self)
Distributed computing is a very broad and active research area comprising fields such as cluster computing, computational grids, desktop grids and peer-to-peer (P2P) systems. Unfortunately, it is often impossible to obtain theoretical or analytical results to compare the performance of algorithms targeting such systems. One possibility is to conduct large numbers of back-to-back experiments on real platforms. While this is possible on tightly coupled platforms, it is infeasible on modern distributed platforms as experiments are labor-intensive and results typically not reproducible. Consequently, one must resort to simulations, which enable reproducible results and also make it possible to explore wide ranges of platform and application scenarios. In this paper we describe the SimGrid framework, a simulation-based framework for evaluating cluster, grid and P2P algorithms and heuristics. This paper focuses on SimGrid v3, which greatly improves on previous versions thanks to a novel and validated modular simulation engine that achieves higher simulation speed without hindering simulation accuracy. Also, two new user interfaces were added to broaden the targeted research community. After surveying existing tools and methodologies we describe the key features and benefits of SimGrid.
A Solver for the Network Testbed Mapping Problem
- SIGCOMM Computer Communications Review
, 2002
Cited by 113 (12 self)
In this paper, we explore this problem, which we call the network testbed mapping problem. We describe the interesting challenges that characterize this problem, and explore its application to other spaces, such as distributed simulation. We present the design, implementation, and evaluation of a solver for this problem, which is currently in use on the Netbed network testbed. It builds on simulated annealing to find very good solutions in a few seconds for our historical workload, and scales gracefully on large well-connected synthetic topologies.
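To make the simulated-annealing idea in the abstract above concrete, here is a minimal sketch. It is not the solver the paper describes: the cost function (the number of virtual links whose endpoints land on different switches), the cooling schedule, and the names `mapping_cost`, `anneal_mapping`, and `switch_of` are all assumptions chosen for illustration.

```python
import math
import random


def mapping_cost(assignment, virtual_links, switch_of):
    """Toy cost (an assumption): count virtual links that cross switches."""
    return sum(
        1
        for a, b in virtual_links
        if switch_of[assignment[a]] != switch_of[assignment[b]]
    )


def anneal_mapping(virtual_nodes, virtual_links, switch_of, steps=5000, seed=0):
    """Map each virtual node to a distinct physical node by simulated annealing.

    switch_of maps a physical node name to the switch it hangs off.
    Returns (best_assignment, best_cost).
    """
    rng = random.Random(seed)
    physical = list(switch_of)
    # Start from a random one-to-one assignment.
    current = dict(zip(virtual_nodes, rng.sample(physical, len(virtual_nodes))))
    current_cost = mapping_cost(current, virtual_links, switch_of)
    best, best_cost = dict(current), current_cost
    temperature = 1.0
    for _ in range(steps):
        # Propose a move: put one virtual node on a random physical node,
        # swapping with the current occupant so the mapping stays one-to-one.
        candidate = dict(current)
        v = rng.choice(virtual_nodes)
        p = rng.choice(physical)
        for other, host in current.items():
            if host == p:
                candidate[other] = current[v]
                break
        candidate[v] = p
        cost = mapping_cost(candidate, virtual_links, switch_of)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = dict(candidate), cost
        temperature = max(temperature * 0.999, 1e-3)  # geometric cooling
    return best, best_cost
```

A real testbed mapper would also have to respect constraints such as link bandwidth and node types; this sketch only shows the accept-or-reject loop that simulated annealing is built around.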
Design and Implementation Tradeoffs for Wide-Area Resource Discovery
- In Proceedings of 14th IEEE Symposium on High Performance, Research Triangle Park
, 2005
Cited by 98 (13 self)
We describe the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intra-group, inter-group, and per-node characteristics, along with the utility that the application derives from specified ranges of metric values. This design gives users the flexibility to find geographically distributed resources for applications that are sensitive to both node and network characteristics, and allows the system to rank acceptable configurations based on their quality for that application. Rather than evaluating a single implementation of SWORD, we explore a variety of architectural designs that deliver the required functionality in a scalable and highly-available manner. We discuss the tradeoffs of using a centralized architecture as compared to a fully decentralized design to perform wide-area resource discovery. To summarize our results, we found that a centralized architecture based on 4-node server cluster sites at network peering facilities outperforms a decentralized DHT-based resource discovery infrastructure with respect to query latency for all but the smallest number of sites. However, although a centralized architecture shows significant promise in stable environments, we find that our decentralized implementation has acceptable performance and also benefits from the DHT’s self-healing properties in more volatile environments. We evaluate the advantages and disadvantages of centralized and distributed resource discovery architectures on 1000 hosts in emulation and on approximately 200 PlanetLab nodes spread across the Internet.
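The idea of ranking acceptable configurations by the utility derived from ranges of metric values can be sketched as follows. This is a hypothetical per-node illustration, not SWORD's query language or scoring: the two-level utility (1.0 inside a preferred range, 0.5 inside a wider required range), the names `range_utility` and `rank_nodes`, and the metric names are assumptions, and the group topology described in the abstract is left out.

```python
def range_utility(value, required, preferred):
    """Return None if value is outside the required range, 1.0 if it is
    inside the preferred range, and 0.5 otherwise (assumed scoring)."""
    req_lo, req_hi = required
    pref_lo, pref_hi = preferred
    if not (req_lo <= value <= req_hi):
        return None
    if pref_lo <= value <= pref_hi:
        return 1.0
    return 0.5


def rank_nodes(nodes, constraints):
    """Drop nodes that violate any required range, then sort the rest by
    total utility (highest first, ties broken by name)."""
    ranked = []
    for name, metrics in nodes.items():
        total = 0.0
        for metric, (required, preferred) in constraints.items():
            utility = range_utility(metrics[metric], required, preferred)
            if utility is None:
                break  # node is unacceptable for this query
            total += utility
        else:
            ranked.append((total, name))
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in ranked]


# Hypothetical measurements and query.
nodes = {
    "n1": {"load": 0.2, "latency_ms": 30},
    "n2": {"load": 0.9, "latency_ms": 20},
    "n3": {"load": 3.0, "latency_ms": 10},
}
constraints = {
    "load": ((0.0, 2.0), (0.0, 0.5)),
    "latency_ms": ((0, 100), (0, 50)),
}
print(rank_nodes(nodes, constraints))  # ['n1', 'n2']
```

Here `n3` is rejected because its load falls outside the required range, and `n1` ranks above `n2` because its load also falls inside the preferred range.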
In VINI veritas: realistic and controlled network experimentation
- in Proc. ACM SIGCOMM
, 2006
Cited by 96 (4 self)
This paper describes VINI, a virtual network infrastructure that allows network researchers to evaluate their protocols and services in a realistic environment that also provides a high degree of control over network conditions. VINI allows researchers to deploy and evaluate their ideas with real routing software, traffic loads, and network events. To provide researchers flexibility in designing their experiments, VINI supports simultaneous experiments with arbitrary network topologies on a shared physical infrastructure. This paper tackles the following important design question: What set of concepts and techniques facilitate flexible, realistic, and controlled experimentation (e.g., multiple topologies and the ability to tweak routing algorithms) on a fixed physical infrastructure? We first present VINI’s high-level design and the challenges of virtualizing a single network. We then present PL-VINI, an implementation of VINI on PlanetLab, running the “Internet In a Slice”. Our evaluation of PL-VINI shows that it provides a realistic and controlled environment for evaluating new protocols and services.
MACEDON: Methodology for Automatically Creating, Evaluating, and Designing Overlay Networks
- In NSDI
, 2004
Cited by 84 (10 self)
Currently, researchers designing and implementing large-scale overlay services employ disparate techniques at each stage in the production cycle: design, implementation, experimentation, and evaluation. As a result, complex and tedious tasks are often duplicated leading to ineffective resource use and difficulty in fairly comparing competing algorithms. In this paper, we present MACEDON, an infrastructure that provides facilities to: i) specify distributed algorithms in a concise domain-specific language; ii) generate code that executes in popular evaluation infrastructures and in live networks; iii) leverage an overlay-generic API to simplify the interoperability of algorithm implementations and applications; and iv) enable consistent experimental evaluation. We have used MACEDON to implement and evaluate a number of algorithms, including AMMO, Bullet, Chord, NICE, Overcast, Pastry, Scribe, and SplitStream, typically with only a few hundred lines of MACEDON code. Using our infrastructure, we are able to accurately reproduce or exceed published results and behavior demonstrated by current publicly available implementations.
Using PlanetLab for network research: Myths, realities, and best practices
- ACM SIGOPS OSR
, 2006
Cited by 73 (5 self)
PlanetLab is a research testbed that supports 428 experiments on 276 sites, with 583 nodes in 30 countries. It has lowered the barrier to distributed experimentation in network measurement, peer-to-peer networks, content distribution,