Results 1 - 10
of
1,251
A Scalable Content-Addressable Network
- IN PROC. ACM SIGCOMM 2001
, 2001
"... Hash tables – which map “keys ” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infra ..."
Abstract
-
Cited by 3371 (32 self)
- Add to MetaCart
(Show Context)
Hash tables – which map “keys ” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low-latency properties through simulation.
Wide-area cooperative storage with CFS
, 2001
"... The Cooperative File System (CFS) is a new peer-to-peer readonly storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers pr ..."
Abstract
-
Cited by 999 (53 self)
- Add to MetaCart
(Show Context)
The Cooperative File System (CFS) is a new peer-to-peer readonly storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers. CFS is implemented using the SFS file system toolkit and runs on Linux, OpenBSD, and FreeBSD. Experience on a globally deployed prototype shows that CFS delivers data to clients as fast as FTP. Controlled tests show that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric
, 2002
"... We describe a peer-to-peer system which has provable consistency and performance in a fault-prone environment. Our system routes queries and locates nodes using a novel XOR-based metric topology that simplifies the algorithm and facilitates our proof. The topology has the property that every message ..."
Abstract
-
Cited by 834 (3 self)
- Add to MetaCart
(Show Context)
We describe a peer-to-peer system which has provable consistency and performance in a fault-prone environment. Our system routes queries and locates nodes using a novel XOR-based metric topology that simplifies the algorithm and facilitates our proof. The topology has the property that every message exchanged conveys or reinforces useful contact information. The system exploits this information to send parallel, asynchronous query messages that tolerate node failures without imposing timeout delays on users.
Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications
- ACM SIGCOMM
, 2001
"... A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto ..."
Abstract
-
Cited by 809 (15 self)
- Add to MetaCart
(Show Context)
A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable: communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.
Scalable Application Layer Multicast
, 2002
"... We describe a new scalable application-layer multicast protocol, specifically designed for low-bandwidth, data streaming applications with large receiver sets. Our scheme is based upon a hierarchical clustering of the application-layer multicast peers and can support a number of different data deliv ..."
Abstract
-
Cited by 731 (21 self)
- Add to MetaCart
(Show Context)
We describe a new scalable application-layer multicast protocol, specifically designed for low-bandwidth, data streaming applications with large receiver sets. Our scheme is based upon a hierarchical clustering of the application-layer multicast peers and can support a number of different data delivery trees with desirable properties. We present extensive simulations of both our protocol and the Narada application-layer multicast protocol over Internet-like topologies. Our results show that for groups of size 32 or more, our protocol has lower link stress (by about 25%), improved or similar endto-end latencies and similar failure recovery properties. More importantly, it is able to achieve these results by using orders of magnitude lower control traffic. Finally, we present results from our wide-area testbed in which we experimented with 32-100 member groups distributed over 8 different sites. In our experiments, averagegroup members established and maintained low-latency paths and incurred a maximum packet loss rate of less than 1 % as members randomly joined and left the multicast group. The average control overhead during our experiments was less than 1 Kbps for groups of size 100.
Search and replication in unstructured peer-to-peer networks
, 2002
"... Abstract Decentralized and unstructured peer-to-peer networks such as Gnutella are attractive for certain applicationsbecause they require no centralized directories and no precise control over network topologies and data placement. However, the flooding-based query algorithm used in Gnutella does n ..."
Abstract
-
Cited by 692 (6 self)
- Add to MetaCart
(Show Context)
Abstract Decentralized and unstructured peer-to-peer networks such as Gnutella are attractive for certain applicationsbecause they require no centralized directories and no precise control over network topologies and data placement. However, the flooding-based query algorithm used in Gnutella does not scale; each individual query gener-ates a large amount of traffic and, as it grows, the system quickly becomes overwhelmed with the query-induced load. This paper explores, through simulation, various alternatives to gnutella's query algorithm, data replicationmethod, and network topology. We propose a query algorithm based on multiple random walks that resolves queries almost as quickly as gnutella's flooding method while reducing the network traffic by two orders of mag-nitude in many cases. We also present a distributed replication strategy that yields close-to-optimal performance. Finally, we find that among the various network topologies we consider, uniform random graphs yield the bestperformance. 1 Introduction The computer science community has become accustomed to the Internet's continuing rapid growth, but even tosuch jaded observers the explosive increase in Peer-to-Peer (P2P) network usage has been astounding. Within a few months of Napster's [12] introduction in 1999 the system had spread widely, and recent measurement data suggeststhat P2P applications are having a very significant and rapidly growing impact on Internet traffic [11, 15]. Therefore, it is important to study the performance and scalability of these P2P networks. Currently, there are several different architectures for P2P networks:
Tapestry: A Resilient Global-scale Overlay for Service Deployment
- IEEE Journal on Selected Areas in Communications
, 2004
"... We present Tapestry, a peer-to-peer overlay routing infrastructure offering efficient, scalable, locationindependent routing of messages directly to nearby copies of an object or service using only localized resources. Tapestry supports a generic Decentralized Object Location and Routing (DOLR) API ..."
Abstract
-
Cited by 598 (14 self)
- Add to MetaCart
(Show Context)
We present Tapestry, a peer-to-peer overlay routing infrastructure offering efficient, scalable, locationindependent routing of messages directly to nearby copies of an object or service using only localized resources. Tapestry supports a generic Decentralized Object Location and Routing (DOLR) API using a self-repairing, softstate based routing layer. This paper presents the Tapestry architecture, algorithms, and implementation. It explores the behavior of a Tapestry deployment on PlanetLab, a global testbed of approximately 100 machines. Experimental results show that Tapestry exhibits stable behavior and performance as an overlay, despite the instability of the underlying network layers. Several widely-distributed applications have been implemented on Tapestry, illustrating its utility as a deployment infrastructure.
Bayeux: An architecture for scalable and fault-tolerant wide-area data dissemination
, 2001
"... The demand for streaming multimedia applications is growing at an incredible rate. In this paper, we propose Bayeux, an efficient application-level multicast system that scales to arbitrarily large receiver groups while tolerating failures in routers and network links. Bayeux also includes specific ..."
Abstract
-
Cited by 465 (12 self)
- Add to MetaCart
(Show Context)
The demand for streaming multimedia applications is growing at an incredible rate. In this paper, we propose Bayeux, an efficient application-level multicast system that scales to arbitrarily large receiver groups while tolerating failures in routers and network links. Bayeux also includes specific mechanisms for load-balancing across replicate root nodes and more efficient bandwidth consumption. Our simulation results indicate that Bayeux maintains these properties while keeping transmission overhead low. To achieve these properties, Bayeux leverages the architecture of Tapestry, a fault-tolerant, wide-area overlay routing and location network.
Astrolabe: A Robust and Scalable Technology for Distributed System Monitoring, Management, and Data Mining
- ACM Transactions on Computer Systems
, 2001
"... this paper, we describe a new information management service called Astrolabe. Astrolabe monitors the dynamically changing state of a collection of distributed resources, reporting summaries of this information to its users. Like DNS, Astrolabe organizes the resources into a hierarchy of domains, wh ..."
Abstract
-
Cited by 452 (27 self)
- Add to MetaCart
this paper, we describe a new information management service called Astrolabe. Astrolabe monitors the dynamically changing state of a collection of distributed resources, reporting summaries of this information to its users. Like DNS, Astrolabe organizes the resources into a hierarchy of domains, which we call zones to avoid confusion, and associates attributes with each zone. Unlike DNS, zones are not bound to specific servers, the attributes may be highly dynamic, and updates propagate quickly; typically, in tens of seconds
Handling Churn in a DHT
- In Proceedings of the USENIX Annual Technical Conference
, 2004
"... This paper addresses the problem of churn---the continuous process of node arrival and departure---in distributed hash tables (DHTs). We argue that DHTs should perform lookups quickly and consistently under churn rates at least as high as those observed in deployed P2P systems such as Kazaa. We then ..."
Abstract
-
Cited by 450 (22 self)
- Add to MetaCart
(Show Context)
This paper addresses the problem of churn---the continuous process of node arrival and departure---in distributed hash tables (DHTs). We argue that DHTs should perform lookups quickly and consistently under churn rates at least as high as those observed in deployed P2P systems such as Kazaa. We then show through experiments on an emulated network that current DHT implementations cannot handle such churn rates. Next, we identify and explore three factors affecting DHT performance under churn: reactive versus periodic failure recovery, message timeout calculation, and proximity neighbor selection. We work in the context of a mature DHT implementation called Bamboo, using the ModelNet network emulator, which models in-network queuing, cross-traffic, and packet loss. These factors are typically missing in earlier simulationbased DHT studies, and we show that careful attention to them in Bamboo's design allows it to function effectively at churn rates at or higher than that observed in P2P file-sharing applications, while using lower maintenance bandwidth than other DHT implementations.