Results 1 - 10 of 94
An experimental study of the Skype peer-to-peer VoIP system
, 2006
"... Despite its popularity, relatively little is known about the traf-fic characteristics of the Skype VoIP system and how they differ from other P2P systems. We describe an experimen-tal study of Skype VoIP traffic conducted over a one month period, where over 30 million datapoints were collected re-ga ..."
Cited by 193 (0 self)
Despite its popularity, relatively little is known about the traffic characteristics of the Skype VoIP system and how they differ from other P2P systems. We describe an experimental study of Skype VoIP traffic conducted over a one-month period, where over 30 million datapoints were collected regarding the population of online clients, the number of supernodes, and their traffic characteristics. The results indicate that although the structure of the Skype system appears to be similar to other P2P systems, particularly KaZaA, there are several significant differences in traffic. The number of active clients shows diurnal and work-week behavior, correlating with normal working hours regardless of geography. The population of supernodes in the system tends to be relatively stable; thus node churn, a significant concern in other systems, seems less problematic in Skype. The typical bandwidth load on a supernode is relatively low, even if the supernode is relaying VoIP traffic. The paper aims to aid further understanding of a significant, successful P2P VoIP system, as well as provide experimental data that may be useful for design and modeling of such systems. These results also imply that the nature of a VoIP P2P system like Skype differs fundamentally from earlier P2P systems that are oriented toward file-sharing and music and video download applications, and deserves more attention from the research community.
Design and Implementation Tradeoffs for Wide-Area Resource Discovery
- In Proceedings of the 14th IEEE Symposium on High Performance Distributed Computing (HPDC-14), Research Triangle Park
, 2005
"... We describe the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intra-group, inter-group, and per-node c ..."
Cited by 98 (13 self)
We describe the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intra-group, inter-group, and per-node characteristics, along with the utility that the application derives from specified ranges of metric values. This design gives users the flexibility to find geographically distributed resources for applications that are sensitive to both node and network characteristics, and allows the system to rank acceptable configurations based on their quality for that application. Rather than evaluating a single implementation of SWORD, we explore a variety of architectural designs that deliver the required functionality in a scalable and highly-available manner. We discuss the tradeoffs of using a centralized architecture as compared to a fully decentralized design to perform wide-area resource discovery. To summarize our results, we found that a centralized architecture based on 4-node server cluster sites at network peering facilities outperforms a decentralized DHT-based resource discovery infrastructure with respect to query latency for all but the smallest number of sites. However, although a centralized architecture shows significant promise in stable environments, we find that our decentralized implementation has acceptable performance and also benefits from the DHT’s self-healing properties in more volatile environments. We evaluate the advantages and disadvantages of centralized and distributed resource discovery architectures on 1000 hosts in emulation and on approximately 200 PlanetLab nodes spread across the Internet.
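To make the query model concrete: a SWORD request describes groups of nodes with per-node, intra-group, and inter-group constraints, each tied to a utility. SWORD's actual query language is XML-based; the Python mock-up below is only a sketch of the shape of such a query, and all class names, metric names, and values are invented for illustration.

from dataclasses import dataclass, field

@dataclass
class Range:
    """Acceptable metric range and the utility of landing in it."""
    low: float
    high: float
    utility: float = 1.0

@dataclass
class Group:
    name: str
    size: int
    per_node: dict = field(default_factory=dict)     # metric -> Range, e.g. cpu_load
    intra_group: dict = field(default_factory=dict)  # pairwise metric -> Range

@dataclass
class Query:
    groups: list
    inter_group: dict = field(default_factory=dict)  # (group, group) -> {metric: Range}

# Hypothetical request: two 4-node clusters, tight latency within each,
# looser latency between them, preferring lightly loaded machines.
query = Query(
    groups=[
        Group("east", 4, per_node={"cpu_load": Range(0.0, 0.5)},
              intra_group={"latency_ms": Range(0, 10)}),
        Group("west", 4, per_node={"cpu_load": Range(0.0, 0.5)},
              intra_group={"latency_ms": Range(0, 10)}),
    ],
    inter_group={("east", "west"): {"latency_ms": Range(0, 100)}},
)
print(query.groups[0].name, query.inter_group[("east", "west")])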
On Unbiased Sampling for Unstructured Peer-to-Peer Networks
- In Proc. ACM IMC
, 2006
"... This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer ..."
Cited by 81 (8 self)
This paper addresses the difficult problem of selecting representative samples of peer properties (e.g., degree, link bandwidth, number of files shared) in unstructured peer-to-peer systems. Due to the large size and dynamic nature of these systems, measuring the quantities of interest on every peer is often prohibitively expensive, while sampling provides a natural means for estimating system-wide behavior efficiently. However, commonly-used sampling techniques for measuring peer-to-peer systems tend to introduce considerable bias for two reasons. First, the dynamic nature of peers can bias results towards short-lived peers, much as naively sampling flows in a router can lead to bias towards short-lived flows. Second, the heterogeneous nature of the overlay topology can lead to bias towards high-degree peers. We present a detailed examination of the ways that the behavior of peer-to-peer systems can introduce bias and suggest the Metropolized Random Walk with Backtracking (MRWB) as a viable and promising technique for collecting nearly unbiased samples. We conduct an extensive simulation study to demonstrate that the proposed technique works well for a wide variety of common peer-to-peer network conditions. Using the Gnutella network, we empirically show that our implementation of the MRWB technique yields more accurate samples than relying on commonly-used sampling techniques. Furthermore, we provide insights into the causes of the observed differences. The tool we have developed, ion-sampler, selects peer addresses uniformly at random using the MRWB technique. These addresses may then be used as input to another measurement tool to collect data on a particular property.
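The degree correction at the heart of the technique is a Metropolis-Hastings random walk: accept a step from peer u to peer v with probability min(1, degree(u)/degree(v)), which makes the walk's stationary distribution uniform over peers rather than biased toward high-degree ones. A minimal sketch in Python, assuming a static graph given as an adjacency dict; the backtracking that handles departed peers, and the rest of ion-sampler's machinery, are omitted, and the toy topology is invented:

import random

def mh_random_walk_sample(graph, start, steps):
    """Metropolis-Hastings random walk over an undirected graph.

    graph: dict mapping node -> list of neighbors.
    Accepting a move from u to v with probability min(1, deg(u)/deg(v))
    gives the walk a uniform stationary distribution over nodes.
    """
    current = start
    for _ in range(steps):
        candidate = random.choice(graph[current])
        # Accept with probability min(1, deg(current) / deg(candidate));
        # otherwise stay put (a self-loop step).
        if random.random() < len(graph[current]) / len(graph[candidate]):
            current = candidate
    return current

# Example: draw several peer samples from a toy topology.
g = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
print([mh_random_walk_sample(g, "a", steps=200) for _ in range(5)])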
Minimizing churn in distributed systems
, 2006
"... A pervasive requirement of distributed systems is to deal with churn — change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of ..."
Cited by 80 (3 self)
A pervasive requirement of distributed systems is to deal with churn — change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of available nodes to use. First, we provide a comparison of the performance of a range of different node selection strategies in five real-world traces. Among our findings is that the simple strategy of picking a uniform-random replacement whenever a node fails performs surprisingly well. We explain its performance through analysis in a stochastic model. Second, we show that a class of strategies, which we call “Preference List” strategies, arise commonly as a result of optimizing for a metric other than churn, and produce high churn relative to more randomized strategies under realistic node failure patterns. Using this insight, we demonstrate and explain differences in performance for designs that incorporate varying degrees of randomization. We give examples from a variety of protocols, including anycast, overlay multicast, and distributed hash tables. In many cases, simply adding some randomization can go a long way towards reducing churn.
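The paper's central comparison can be reproduced in miniature. The toy harness below is not the paper's trace-driven evaluation: the failure probabilities and the preference ranking are invented, and failed nodes rejoin the candidate pool immediately as a deliberate simplification. When the fixed ranking happens to favor unreliable nodes, the preference-list strategy keeps re-selecting nodes that soon fail again, while uniform-random replacement drifts toward stable ones:

import random

FLAKY_P, STABLE_P = 0.3, 0.02   # hypothetical per-round failure probabilities

def churn(rounds, k, select, seed=1):
    """Maintain k active nodes out of 10 and count replacements.
    Nodes 0-4 fail with probability FLAKY_P per round, 5-9 with STABLE_P."""
    rng = random.Random(seed)
    fail_p = lambda n: FLAKY_P if n < 5 else STABLE_P
    active = set(range(k))
    replacements = 0
    for _ in range(rounds):
        active = {n for n in active if rng.random() >= fail_p(n)}
        while len(active) < k:
            # Failed nodes may be re-picked immediately (a simplification).
            candidates = [n for n in range(10) if n not in active]
            active.add(select(candidates, rng))
            replacements += 1
    return replacements

uniform_random = lambda c, rng: rng.choice(c)  # pick any available node
preference_list = lambda c, rng: min(c)        # always the top-ranked node
print(churn(1000, 3, uniform_random),          # low churn
      churn(1000, 3, preference_list))         # high churn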
Bandwidth-efficient management of DHT routing tables
, 2005
"... Today an application developer using a distributed hash table (DHT) with n nodes must choose a DHT protocol from the spectrum between O(1) lookup protocols [9, 18] and O(log n) protocols [20–23,25,26]. O(1) protocols achieve low latency lookups on small or low-churn networks because lookups take onl ..."
Cited by 64 (3 self)
Today an application developer using a distributed hash table (DHT) with n nodes must choose a DHT protocol from the spectrum between O(1) lookup protocols [9, 18] and O(log n) protocols [20–23, 25, 26]. O(1) protocols achieve low latency lookups on small or low-churn networks because lookups take only a few hops, but incur high maintenance traffic on large or high-churn networks. O(log n) protocols incur less maintenance traffic on large or high-churn networks but require more lookup hops in small networks. Accordion is a new routing protocol that does not force the developer to make this choice: Accordion adjusts itself to provide the best performance across a range of network sizes and churn rates while staying within a bounded bandwidth budget. The key challenges in the design of Accordion are the algorithms that choose the routing table’s size and content. Each Accordion node learns of new neighbors opportunistically, in a way that causes the density of its neighbors to be inversely proportional to their distance in ID space from the node. This distribution allows Accordion to vary the table size along a continuum while still guaranteeing at most O(log n) lookup hops. The user-specified bandwidth budget controls the rate at which a node learns about new neighbors. Each node limits its routing table size by evicting neighbors that it judges likely to have failed. High churn (i.e., short node lifetimes) leads to a high eviction rate. The equilibrium between the learning and eviction processes determines the table size. Simulations show that Accordion maintains an efficient lookup latency versus bandwidth tradeoff over a wider range of operating conditions than existing DHTs.
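The distributional idea in the abstract is simple to state: draw each neighbor so that its distance in ID space has density proportional to 1/distance, i.e. log-uniform in the distance. A sketch under stated assumptions (the 32-bit ID space and node ID are invented; Accordion's budget-driven learning and eviction are not modeled):

import math
import random

ID_BITS = 32
ID_SPACE = 1 << ID_BITS

def sample_neighbor(node_id, rng):
    """Draw a neighbor whose ID-space distance d from node_id has
    density proportional to 1/d (log-uniform between 1 and ID_SPACE):
    dense nearby, sparse far away, the shape that keeps lookups within
    O(log n) hops at any table size."""
    log_d = rng.uniform(0, math.log(ID_SPACE))
    d = int(math.exp(log_d))
    return (node_id + d) % ID_SPACE

rng = random.Random(7)
me = 123456
print(sorted(sample_neighbor(me, rng) for _ in range(16)))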
Non-Transitive Connectivity and DHTs
- In Proceedings of the Second Workshop on Real, Large Distributed Systems (WORLDS ’05)
, 2005
"... ..."
(Show Context)
Profiling a million user DHT
- In Proc. of Internet Measurement Conference
, 2007
"... Distributed hash tables (DHTs) provide scalable, key-based lookup of objects in dynamic network environments. Although DHTs have been studied extensively from an analytical perspective, only recently have wide deployments enabled empirical examination. This paper reports measurement results obtained ..."
Cited by 47 (7 self)
Distributed hash tables (DHTs) provide scalable, key-based lookup of objects in dynamic network environments. Although DHTs have been studied extensively from an analytical perspective, only recently have wide deployments enabled empirical examination. This paper reports measurement results obtained from profiling the Azureus BitTorrent client’s DHT, which is in active use by more than 1 million nodes on a daily basis. The Azureus DHT operates on untrusted, unreliable end-hosts, offering a glimpse into the implementation challenges associated with making structured overlays work in practice. Our measurements provide characterizations of churn, overhead, and performance in this environment. We leverage these measurements to drive the design of a modified DHT lookup algorithm that reduces median DHT lookup time by an order of magnitude for a nominal increase in overhead.
Structured and unstructured overlays under the microscope - a measurement-based view of two p2p systems that people use
- In Proceedings of the USENIX Annual Technical Conference
, 2006
"... measurement-based view of two P2P systems that people use ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
(Show Context)
measurement-based view of two P2P systems that people use
A Distributed Hash Table
, 2005
"... DHash is a new system that harnesses the storage and network resources of computers distributed across the Internet by providing a wide-area storage service, DHash. DHash frees applications from re-implementing mechanisms common to any system that stores data on a collection of machines: it maintain ..."
Cited by 26 (2 self)
DHash is a new system that harnesses the storage and network resources of computers distributed across the Internet to provide a wide-area storage service. DHash frees applications from re-implementing mechanisms common to any system that stores data on a collection of machines: it maintains a mapping of objects to servers, replicates data for durability, and balances load across participating servers. Applications access data stored in DHash through a familiar hash-table interface: put stores data in the system under a key; get retrieves the data. DHash has proven useful to a number of application builders and has been used to build a content-distribution system [34], a Usenet replacement [118], and new Internet naming architectures [133, 132]. These applications demand low-latency, high-throughput access
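The interface named in the abstract is just put and get. Below is a single-process stand-in; the class name is hypothetical, and a real DHash maps keys to servers via consistent hashing and replicates blocks across them. DHash-style systems key blocks by their content hash, which makes retrieved data self-certifying, as the sketch shows:

import hashlib

class ToyDHash:
    """In-process stand-in for a put/get storage interface like DHash's.
    A single dict plays the role of the whole server collection."""
    def __init__(self):
        self.store = {}

    def put(self, data):
        # Content-hash key: the key is the SHA-1 of the block, so the
        # data can be verified against its key on retrieval.
        key = hashlib.sha1(data).digest()
        self.store[key] = data
        return key

    def get(self, key):
        data = self.store.get(key)
        # Verify integrity before returning (self-certifying blocks).
        if data is not None and hashlib.sha1(data).digest() != key:
            return None
        return data

dht = ToyDHash()
k = dht.put(b"hello, wide-area storage")
print(dht.get(k))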
An Analysis of BitTorrent’s Two Kademlia-Based DHTs
, 2007
"... Despite interest in structured peer-to-peer overlays and their scalability to millions of nodes, few, if any, overlays operate at that scale. This paper considers the distributed hash table extensions supported by modern BitTorrent clients, which implement a Kademlia-style structured overlay network ..."
Cited by 24 (0 self)
Despite interest in structured peer-to-peer overlays and their scalability to millions of nodes, few, if any, overlays operate at that scale. This paper considers the distributed hash table extensions supported by modern BitTorrent clients, which implement a Kademlia-style structured overlay network among millions of BitTorrent users. As there are two disjoint Kademlia-based DHTs in use, we collected two weeks of traces from each DHT. We examine churn, reachability, latency, and liveness of nodes in these overlays, and identify a variety of problems, such as median lookup times of over a minute. We show that Kademlia’s choice of iterative routing and its lack of a preferential refresh of its local neighborhood cause correctness problems and poor performance. We also identify implementation bugs, design issues, and security concerns that limit the effectiveness of these DHTs and we offer possible solutions for their improvement.
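Since the paper's findings hinge on iterative routing, a skeleton of an iterative Kademlia-style lookup helps fix the idea: the caller drives every step, querying the closest unqueried candidates until no closer node appears. The query function stands in for a FIND_NODE RPC; the node IDs, toy routing tables, and fan-out constants are invented, and timeouts and failed-node handling, which the paper shows matter enormously, are omitted:

import random

def xor_distance(a, b):
    return a ^ b                      # Kademlia's XOR metric on IDs

def iterative_find(target, start_nodes, query, alpha=3, k=8):
    """Iteratively converge on the k closest known nodes to target.
    query(node, target) stands in for a FIND_NODE RPC returning that
    node's closest known contacts to the target."""
    shortlist = sorted(set(start_nodes), key=lambda n: xor_distance(n, target))[:k]
    queried = set()
    while True:
        pending = [n for n in shortlist if n not in queried][:alpha]
        if not pending:               # no unqueried candidates remain
            return shortlist
        for node in pending:
            queried.add(node)
            shortlist.extend(c for c in query(node, target) if c not in shortlist)
        shortlist = sorted(set(shortlist), key=lambda n: xor_distance(n, target))[:k]

# Toy network: 16 node IDs, each knowing a small random subset of peers.
NODES = list(range(0, 256, 16))
TABLES = {n: random.Random(n).sample(NODES, 6) for n in NODES}
def fake_query(node, target):
    return sorted(TABLES[node], key=lambda n: xor_distance(n, target))[:4]

print(iterative_find(target=77, start_nodes=[0, 240], query=fake_query))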