Results 1 - 10 of 361
The structure and function of complex networks
SIAM Review, 2003
Cited by 2600 (7 self)
Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.
Modeling and performance analysis of BitTorrent-like peer-to-peer networks
In SIGCOMM, 2004
Cited by 574 (3 self)
In this paper, we develop simple models to study the performance of BitTorrent, a second-generation peer-to-peer (P2P) application. We first present a simple fluid model and study the scalability, performance, and efficiency of such a file-sharing mechanism. We then consider the built-in incentive mechanism of BitTorrent and study its effect on network performance. We also provide numerical results based on both simulations and real traces obtained from the Internet.
Making Gnutella-like P2P Systems Scalable
2003
Cited by 429 (1 self)
Napster pioneered the idea of peer-to-peer file sharing, and supported it with a centralized file search facility. Subsequent P2P systems like Gnutella adopted decentralized search algorithms. However, Gnutella's notoriously poor scaling led some to propose distributed hash table solutions to the wide-area file search problem. Contrary to that trend, we advocate retaining Gnutella's simplicity while proposing new mechanisms that greatly improve its scalability. Building upon prior research [1, 12, 22], we propose several modifications to Gnutella's design that dynamically adapt the overlay topology and the search algorithms in order to accommodate the natural heterogeneity present in most peer-to-peer systems. We test our design through simulations and the results show three to five orders of magnitude improvement in total system capacity. We also report on a prototype implementation and its deployment on a testbed.
Categories and Subject Descriptors: C.2 [Computer Communication Networks]: Distributed Systems
General Terms: Algorithms, Design, Performance, Experimentation
Keywords: Peer-to-peer, distributed hash tables, Gnutella
An Analysis of Internet Content Delivery Systems
2002
Cited by 318 (9 self)
In the span of only a few years, the Internet has experienced an astronomical increase in the use of specialized content delivery systems, such as content delivery networks and peer-to-peer file sharing systems. Therefore, an understanding of content delivery on the Internet now requires a detailed understanding of how these systems are used in practice. This paper examines content delivery from the point of view of four content delivery systems: HTTP web traffic, the Akamai content delivery network, and Kazaa and Gnutella peer-to-peer file sharing traffic. We collected a trace of all incoming and outgoing network traffic at the University of Washington, a large university with over 60,000 students, faculty, and staff. From this trace, we isolated and characterized traffic belonging to each of these four delivery classes. Our results (1) quantify the rapidly increasing importance of new content delivery systems, particularly peer-to-peer networks, (2) characterize the behavior of these systems from the perspectives of clients, objects, and servers, and (3) derive implications for caching in these systems.
Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems
2003
Cited by 290 (2 self)
Locating content in decentralized peer-to-peer systems is a challenging problem. Gnutella, a popular file-sharing application, relies on flooding queries to all peers. Although flooding is simple and robust, it is not scalable. In this paper, we explore how to retain the simplicity of Gnutella while addressing its inherent weakness: scalability. We propose a content location solution in which peers loosely organize themselves into an interest-based structure on top of the existing Gnutella network. Our approach exploits a simple yet powerful principle called interest-based locality, which posits that if a peer has a particular piece of content that one is interested in, it is very likely that it will have other items that one is interested in as well. When using our algorithm, called interest-based shortcuts, a significant amount of flooding can be avoided, making Gnutella a more competitive solution. In addition, shortcuts are modular and can be used to improve the performance of other content location mechanisms, including distributed hash table schemes.
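The interest-based locality idea in this abstract can be illustrated with a small sketch. This is not the authors' algorithm: the `Peer` class, the `locate` method, and the way flooding is simulated (a linear scan over all peers) are assumptions made purely for illustration. The sketch only shows the general pattern of trying previously useful peers first and falling back to flooding on a miss.

```python
class Peer:
    """Hypothetical peer that remembers who answered its earlier queries."""

    def __init__(self, name, items):
        self.name = name
        self.items = set(items)
        self.shortcuts = []  # peers that satisfied a previous query

    def locate(self, item, all_peers):
        """Return (peer, how), where how is 'shortcut', 'flood', or 'miss'."""
        # Step 1: ask shortcut peers first (cheap, no flooding).
        for peer in self.shortcuts:
            if item in peer.items:
                return peer, "shortcut"
        # Step 2: fall back to flooding, simulated here as a scan of all peers.
        for peer in all_peers:
            if peer is not self and item in peer.items:
                if peer not in self.shortcuts:
                    self.shortcuts.append(peer)  # remember the helpful peer
                return peer, "flood"
        return None, "miss"


if __name__ == "__main__":
    a = Peer("a", [])
    b = Peer("b", ["song1", "song2"])
    c = Peer("c", ["paper1"])
    peers = [a, b, c]
    print(a.locate("song1", peers)[1])  # first lookup needs a flood
    print(a.locate("song2", peers)[1])  # second lookup is served by a shortcut
```

In this toy setup, the second lookup avoids flooding only because peer `b` happens to hold both items, which is exactly the locality assumption the abstract describes.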
Peer-to-Peer Architecture Case Study: Gnutella Network
2001
Cited by 274 (1 self)
Despite recent excitement generated by the P2P paradigm and despite surprisingly fast deployment of some P2P applications, there are few quantitative evaluations of P2P system behavior. Due to its open architecture and achieved scale, Gnutella is an interesting P2P architecture case study. Gnutella, like most other P2P applications, builds at the application level a virtual network with its own routing mechanisms. The topology of this virtual network and the routing mechanisms used have a significant influence on application properties such as performance, reliability, and scalability. We built a 'crawler' to extract the topology of Gnutella's application-level network. In this paper we analyze the topology graph and evaluate generated network traffic. We find that although Gnutella is not a pure power-law network, its current configuration has the benefits and drawbacks of a power-law structure. These findings lead us to propose changes to the Gnutella protocol and implementations that bring significant performance and scalability improvements.
Statistical properties of community structure in large social and information networks
Cited by 246 (14 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
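The abstract refers to the conductance measure without defining it. A minimal sketch follows, assuming the common definition for an undirected, unweighted graph: the number of edges leaving a node set divided by the smaller of the two volumes (sums of degrees) on either side of the cut. The function name `conductance` and the adjacency-dictionary representation are illustrative choices, not taken from the paper.

```python
def conductance(adj, nodes):
    """Conductance of `nodes` in an undirected graph.

    `adj` maps each node to the set of its neighbours (assumed symmetric).
    Lower values indicate a more community-like set under this measure.
    """
    inside = set(nodes)
    # Edges with exactly one endpoint inside the set.
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    # Volume = sum of degrees on each side of the cut.
    vol_inside = sum(len(adj[u]) for u in inside)
    vol_outside = sum(len(adj[u]) for u in adj if u not in inside)
    smaller = min(vol_inside, vol_outside)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller


def build_adjacency(edges):
    """Build a symmetric adjacency dictionary from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    adj = build_adjacency(edges)
    print(conductance(adj, {0, 1, 2}))  # one cut edge over volume 7
    print(conductance(adj, {0}))        # every edge of node 0 leaves the set
```

A network community profile plot in the sense of the abstract would record, for each set size, the lowest conductance found among sets of that size; finding those sets is the hard part and is not attempted here.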
Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
2008
Cited by 208 (17 self)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as “real” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales. We study over 100 large real-world networks, ranging from traditional and on-line social networks, to technological and information networks and
RDFPeers: A Scalable Distributed RDF Repository Based on a Structured Peer-to-Peer Network
2004
Cited by 169 (2 self)
Centralized Resource Description Framework (RDF) repositories have limitations both in their failure tolerance and in their scalability. Existing Peer-to-Peer (P2P) RDF repositories either cannot guarantee to find query results, even if these results exist in the network, or require up-front definition of RDF schemas and designation of super peers. We present a scalable distributed RDF repository ("RDFPeers") that stores each triple at three places in a multi-attribute addressable network by applying globally known hash functions to its subject, predicate, and object. Thus, all nodes know which node is responsible for storing triple values they are looking for, and both exact-match and range queries can be efficiently routed to those nodes. RDFPeers has neither a single point of failure nor elevated peers, and does not require the prior definition of RDF schemas. Queries are guaranteed to find matched triples in the network if the triples exist. In RDFPeers, both the number of neighbors per node and the number of routing hops for inserting RDF triples and for resolving most queries are logarithmic in the number of nodes in the network. We further performed experiments that show that the triple-storing load in RDFPeers differs by less than an order of magnitude between the most and the least loaded nodes for real-world RDF data.
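The placement rule described in this abstract (each triple stored at three places, chosen by hashing its subject, predicate, and object) can be sketched as follows. This is a simplified illustration, not the RDFPeers implementation: the SHA-1 hash, the modulo mapping onto `NUM_NODES`, and the in-memory `storage` dictionary are all assumptions, and the sketch ignores routing and range queries entirely.

```python
import hashlib

NUM_NODES = 16  # hypothetical network size, chosen for illustration


def node_for(value, num_nodes=NUM_NODES):
    """Map a string value to a node index with a globally known hash."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


def store_triple(storage, triple):
    """Store the triple at the nodes responsible for each of its three parts."""
    for value in triple:  # subject, predicate, object
        storage.setdefault(node_for(value), set()).add(triple)


def lookup_by_subject(storage, subject):
    """Go directly to the node responsible for the subject and filter there."""
    candidates = storage.get(node_for(subject), set())
    return {t for t in candidates if t[0] == subject}


if __name__ == "__main__":
    storage = {}  # node index -> set of triples held by that node
    store_triple(storage, ("ex:alice", "foaf:knows", "ex:bob"))
    print(lookup_by_subject(storage, "ex:alice"))
```

Because every node can compute `node_for` locally, a query on any single attribute knows where to go without a central index, which is the property the abstract emphasizes.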
Epidemic Spreading in Real Networks: An Eigenvalue Viewpoint
In SRDS, 2003
Cited by 167 (19 self)
How will a virus propagate in a real network? Does an epidemic threshold exist for a finite power-law graph, or any finite graph? How long does it take to disinfect a network given particular values of infection rate and virus death rate? We answer the first question by providing equations that accurately model virus propagation in any network, including real and synthesized network graphs. We propose a general epidemic threshold condition that applies to arbitrary graphs: we prove that, under reasonable approximations, the epidemic threshold for a network is closely related to the largest eigenvalue of its adjacency matrix. Finally, for the last question, we show that infections tend to zero exponentially below the epidemic threshold. We show that our epidemic threshold model subsumes many known thresholds for special-case graphs (e.g., Erdős-Rényi, BA power-law, homogeneous); we show that the threshold tends to zero for infinite power-law graphs. Finally, we illustrate the predictive power of our model with extensive experiments on real and synthesized graphs. We show that our threshold condition holds for arbitrary graphs.
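The link between the epidemic threshold and the largest adjacency eigenvalue can be made concrete with a small sketch. The abstract only says the two are "closely related"; the code below assumes the commonly cited form, threshold = 1 / (largest eigenvalue), and that assumption should be checked against the paper itself. The power iteration on A + I and the function names are illustrative choices.

```python
def largest_eigenvalue(adj_matrix, iterations=1000, tol=1e-12):
    """Largest eigenvalue of a symmetric 0/1 adjacency matrix.

    Power iteration is run on A + I so that bipartite graphs, whose
    spectrum is symmetric around zero, do not make the iteration oscillate.
    """
    n = len(adj_matrix)
    vec = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        # Multiply the current vector by (A + I).
        nxt = [vec[i] + sum(adj_matrix[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        norm = max(abs(x) for x in nxt)
        vec = [x / norm for x in nxt]
        converged = abs(norm - estimate) < tol
        estimate = norm
        if converged:
            break
    return estimate - 1.0  # undo the +I shift


def epidemic_threshold(adj_matrix):
    """Assumed threshold 1 / lambda_max; infinite for a graph with no edges."""
    lam = largest_eigenvalue(adj_matrix)
    return float("inf") if lam <= 0 else 1.0 / lam


if __name__ == "__main__":
    # Complete graph on 4 nodes: the largest eigenvalue is 3.
    k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
    print(epidemic_threshold(k4))
```

Under this assumed form, an infection whose ratio of infection rate to death rate stays below the returned value would be expected to die out, which matches the abstract's statement that infections decay exponentially below the threshold.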