Results 1 - 10 of 406
A Scalable Content-Addressable Network
- IN PROC. ACM SIGCOMM 2001
, 2001
"... Hash tables – which map “keys ” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infra ..."
Abstract
-
Cited by 3371 (32 self)
Hash tables – which map “keys ” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low-latency properties through simulation.
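The hash-table analogy can be made concrete. The sketch below shows the kind of deterministic key-to-coordinate mapping a CAN-style system relies on: keys are hashed to points in a d-dimensional space and ownership of the space is divided into zones. The 2-dimensional layout, the fixed zones, and the node names are illustrative assumptions, not the paper's actual implementation.

```python
import hashlib

def key_to_point(key: str, dims: int = 2):
    """Hash a key to a point in the unit d-dimensional space (illustrative;
    a CAN-like system uses a deterministic hash into its coordinate space)."""
    digest = hashlib.sha1(key.encode()).digest()
    # Carve the digest into `dims` chunks and scale each to [0, 1).
    chunk = len(digest) // dims
    return tuple(
        int.from_bytes(digest[i * chunk:(i + 1) * chunk], "big") / 2 ** (8 * chunk)
        for i in range(dims)
    )

# Hypothetical zones: each node owns a rectangular region of the 2-d space.
zones = {
    "node-A": ((0.0, 0.0), (0.5, 1.0)),   # left half
    "node-B": ((0.5, 0.0), (1.0, 0.5)),   # bottom-right quadrant
    "node-C": ((0.5, 0.5), (1.0, 1.0)),   # top-right quadrant
}

def owner(point):
    """Return the node whose zone contains the point."""
    for node, (lo, hi) in zones.items():
        if all(l <= p < h for p, l, h in zip(point, lo, hi)):
            return node
    raise ValueError("point not covered by any zone")

p = key_to_point("some-file.mp3")
print(p, "->", owner(p))
```

In the actual system, zones are created by splitting as nodes join and (key, value) pairs are stored at and retrieved from the zone owner; this sketch only illustrates the deterministic mapping that makes the infrastructure content-addressable.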
The structure and function of complex networks
- SIAM REVIEW
, 2003
"... Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, ..."
Abstract
-
Cited by 2600 (7 self)
Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.
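Of the model classes the review covers, growth with preferential attachment is the easiest to illustrate in a few lines. The sketch below is a bare-bones Barabási-Albert-style generator in which each new node attaches to m existing nodes with probability proportional to their current degree; the parameters and the degree-count check at the end are arbitrary choices for illustration.

```python
import random

def preferential_attachment_graph(n: int, m: int = 2, seed: int = 0):
    """Grow a graph where each new node links to m existing nodes chosen
    with probability proportional to degree (Barabasi-Albert style)."""
    rng = random.Random(seed)
    edges = []
    # A node appears in `targets` once per unit of degree, so sampling
    # uniformly from this list is sampling proportionally to degree.
    targets = list(range(m))          # start from m seed nodes
    for new in range(m, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((new, old))
            targets.extend([new, old])
    return edges

edges = preferential_attachment_graph(10_000)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
# The resulting degree distribution is heavy-tailed (roughly a power law).
print("max degree:", max(degree.values()), "min degree:", min(degree.values()))
```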
Search and replication in unstructured peer-to-peer networks
, 2002
"... Abstract Decentralized and unstructured peer-to-peer networks such as Gnutella are attractive for certain applicationsbecause they require no centralized directories and no precise control over network topologies and data placement. However, the flooding-based query algorithm used in Gnutella does n ..."
Abstract
-
Cited by 692 (6 self)
Decentralized and unstructured peer-to-peer networks such as Gnutella are attractive for certain applications because they require no centralized directories and no precise control over network topologies and data placement. However, the flooding-based query algorithm used in Gnutella does not scale; each individual query generates a large amount of traffic and, as it grows, the system quickly becomes overwhelmed with the query-induced load. This paper explores, through simulation, various alternatives to Gnutella's query algorithm, data replication method, and network topology. We propose a query algorithm based on multiple random walks that resolves queries almost as quickly as Gnutella's flooding method while reducing the network traffic by two orders of magnitude in many cases. We also present a distributed replication strategy that yields close-to-optimal performance. Finally, we find that among the various network topologies we consider, uniform random graphs yield the best performance.
1 Introduction: The computer science community has become accustomed to the Internet's continuing rapid growth, but even to such jaded observers the explosive increase in Peer-to-Peer (P2P) network usage has been astounding. Within a few months of Napster's [12] introduction in 1999 the system had spread widely, and recent measurement data suggests that P2P applications are having a very significant and rapidly growing impact on Internet traffic [11, 15]. Therefore, it is important to study the performance and scalability of these P2P networks. Currently, there are several different architectures for P2P networks:
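The core idea of the random-walk alternative to flooding is easy to sketch: instead of forwarding a query to every neighbour, a node launches k independent walkers that each step to one random neighbour at a time. The graph, the walker count, and the simple termination rule below are illustrative assumptions, not the paper's exact protocol (which, for instance, has walkers periodically check back with the requester).

```python
import random

def random_walk_search(adj, start, has_object, k=8, ttl=1024, seed=0):
    """Launch k independent random walkers from `start`; each walker takes
    at most `ttl` hops. Return (found_node, messages) or (None, messages)."""
    rng = random.Random(seed)
    messages = 0
    for _ in range(k):
        node = start
        for _ in range(ttl):
            if has_object(node):
                return node, messages
            neighbours = adj[node]
            if not neighbours:
                break
            node = rng.choice(neighbours)
            messages += 1          # one query message per hop
    return None, messages

# Toy example: a ring of 100 nodes, object stored at node 42.
adj = {i: [(i - 1) % 100, (i + 1) % 100] for i in range(100)}
print(random_walk_search(adj, start=0, has_object=lambda n: n == 42))
```

The contrast with flooding is that the message count here is bounded by k times the TTL, whereas flooding generates a message per edge reached, which grows rapidly with the flooding depth.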
Power-law distributions in empirical data
- ISSN 0036-1445. doi:10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
Abstract
-
Cited by 607 (7 self)
Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the tail of the distribution. In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances. Here we describe statistical techniques for making accurate parameter estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out.
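In the continuous case the maximum-likelihood estimator mentioned above has a closed form, alpha_hat = 1 + n / sum_i ln(x_i / x_min), and the Kolmogorov-Smirnov statistic is the largest gap between the empirical and fitted cumulative distributions. The sketch below implements only this fixed-x_min case on synthetic data; the full method in the paper also selects x_min by minimizing the KS distance and adds a goodness-of-fit test via resampling.

```python
import math
import random

def fit_power_law(xs, xmin):
    """MLE exponent for a continuous power law with known lower cutoff:
    alpha = 1 + n / sum(ln(x / xmin)), plus the Kolmogorov-Smirnov distance
    between the empirical and fitted CDFs (tail x >= xmin only)."""
    tail = sorted(x for x in xs if x >= xmin)
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    # Fitted CDF: P(X <= x) = 1 - (x / xmin) ** -(alpha - 1).
    ks = max(
        abs((i + 1) / n - (1.0 - (x / xmin) ** (-(alpha - 1.0))))
        for i, x in enumerate(tail)
    )
    return alpha, ks

# Synthetic check: draw from a pure power law with alpha = 2.5 via inverse CDF.
rng = random.Random(1)
data = [1.0 * (1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(50_000)]
print(fit_power_law(data, xmin=1.0))   # alpha should come out near 2.5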
BRITE: An approach to universal topology generation
- in Proceedings of the IEEE Ninth International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
, 2001
"... Abstract Effective engineering of the Internet is predicated upon a detailed understanding of issues such as the large-scale structure of its underlying physical topology, the manner in which it evolves over time, and the way in which its constituent components contribute to its overall function. U ..."
Abstract
-
Cited by 448 (12 self)
Effective engineering of the Internet is predicated upon a detailed understanding of issues such as the large-scale structure of its underlying physical topology, the manner in which it evolves over time, and the way in which its constituent components contribute to its overall function. Unfortunately, developing a deep understanding of these issues has proven to be a challenging task, since it in turn involves solving difficult problems such as mapping the actual topology, characterizing it, and developing models that capture its emergent behavior. Consequently, even though there are a number of topology models, it is an open question as to how representative the topologies they generate are of the actual Internet. Our goal is to produce a topology generation framework which improves the state of the art and is based on the design principles of representativeness, inclusiveness, and interoperability. Representativeness leads to synthetic topologies that accurately reflect many aspects of the actual Internet topology (e.g. hierarchical structure, node degree distribution, etc.). Inclusiveness combines the strengths of as many generation models as possible in a single generation tool. Interoperability provides interfaces to widely-used simulation applications such as ns and SSF and visualization tools like Otter. We call such a tool a universal topology generator.
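One of the generation models an inclusive tool in this space typically offers is the Waxman model, where nodes are scattered on a plane and an edge between two nodes appears with probability a * exp(-d / (b * L)), with d their distance and L the maximum possible distance. The sketch below is a minimal, stand-alone version of that single model; the parameter values and the flat (non-hierarchical) layout are assumptions for illustration, not BRITE's defaults.

```python
import math
import random

def waxman_topology(n=100, alpha=0.4, beta=0.2, size=1000.0, seed=0):
    """Scatter n nodes uniformly in a size x size plane and connect each pair
    (u, v) with probability alpha * exp(-d(u, v) / (beta * L)), where L is
    the maximum possible distance (the diagonal of the plane)."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n)]
    L = math.hypot(size, size)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            d = math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])
            if rng.random() < alpha * math.exp(-d / (beta * L)):
                edges.append((u, v))
    return pos, edges

pos, edges = waxman_topology()
print(len(edges), "edges among 100 nodes")
```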
A Brief History of Generative Models for Power Law and Lognormal Distributions
- INTERNET MATHEMATICS
"... Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a a lognormal distribution. In trying ..."
Abstract
-
Cited by 414 (7 self)
Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying ...
Power laws, Pareto distributions and Zipf’s law
"... Many of the things that scientists measure have a typical size or “scale”—a typical value around which individual measurements are centred. A simple example would be the heights of human beings. Most adult human beings are about 180cm tall. There is some variation around this figure, notably dependi ..."
Abstract
-
Cited by 413 (0 self)
Many of the things that scientists measure have a typical size or “scale”—a typical value around which individual measurements are centred. A simple example would be the heights of human beings. Most adult human beings are about 180 cm tall. There is some variation around this figure, notably depending on sex, but we never see people who are 10 cm tall, or 500 cm. To make this observation more quantitative, one can plot a histogram of people’s heights, as I have done in Fig. 1a. The figure shows the heights in centimetres of adult men in the United States measured between 1959 and 1962, and indeed the distribution is relatively narrow and peaked around 180 cm. Another telling observation is the ratio of the heights of the tallest and shortest people.
Topologically-aware overlay construction and server selection
, 2002
"... A number of large-scale distributed Internet applications could potentially benefit from some level of knowledge about the relative proximity between its participating host nodes. For example, the performance of large overlay networks could be improved if the application-level connectivity between ..."
Abstract
-
Cited by 341 (3 self)
A number of large-scale distributed Internet applications could potentially benefit from some level of knowledge about the relative proximity between their participating host nodes. For example, the performance of large overlay networks could be improved if the application-level connectivity between the nodes in these networks is congruent with the underlying IP-level topology. Similarly, in the case of replicated web content, client nodes could use topological information in selecting one of multiple available servers. For such applications, one need not find the optimal solution in order to achieve significant practical benefits. Thus, these applications, and presumably others like them, do not require exact topological information and can instead use sufficiently informative hints about the relative positions of Internet hosts. In this paper, we present a binning scheme whereby nodes partition themselves into bins such that nodes that fall within a given bin are relatively close to one another in terms of network latency. Our binning strategy is simple (requiring minimal support from any measurement infrastructure), scalable (requiring no form of global knowledge, each node only needs knowledge of a small number of well-known landmark nodes) and completely distributed (requiring no communication or cooperation between the nodes being binned). We apply this binning strategy to the two applications mentioned above: overlay network construction and server selection. We test our binning strategy and its application using simulation and Internet measurement traces. Our results indicate that the performance of these applications can be significantly improved by even the rather coarse-grained knowledge of topology offered by our binning scheme.
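A landmark-based bin of this kind fits in a few lines: each node measures its latency to a small set of well-known landmarks, and its bin is essentially the ordering of those landmarks by measured latency, optionally refined by coarse latency levels. The sketch below uses made-up latency numbers in place of real measurements, and the level boundaries are illustrative, not taken from the paper.

```python
def bin_for_node(latencies_ms, levels=(100, 200)):
    """Compute a node's bin from its measured latency to each landmark.

    The primary part of the bin is the landmark ordering (nearest first);
    the secondary part coarsely quantises each latency into a level
    (here: <100 ms, 100-200 ms, >=200 ms -- illustrative boundaries).
    """
    order = tuple(sorted(latencies_ms, key=latencies_ms.get))
    level = tuple(sum(latencies_ms[lm] >= b for b in levels) for lm in order)
    return order, level

# Hypothetical measurements (ms) from two nodes to three landmarks.
node_a = {"L1": 35, "L2": 180, "L3": 90}
node_b = {"L1": 40, "L2": 210, "L3": 80}
print(bin_for_node(node_a))   # ('L1', 'L3', 'L2') with levels (0, 0, 1)
print(bin_for_node(node_b))   # ('L1', 'L3', 'L2') with levels (0, 0, 2)
```

Nodes that compute the same ordering (and similar levels) land in the same bin and are likely to be close to one another, which is the coarse-grained hint the overlay-construction and server-selection applications need, with no coordination between the nodes being binned.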
Graph structure in the web
, 2003
"... The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution. We report on experiments on local and global properties of ..."
Abstract
-
Cited by 304 (7 self)
The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution. We report on experiments on local and global properties of the web graph using two AltaVista crawls, each with over 200 million pages and 1.5 billion links. Our study indicates that the macroscopic structure of the web is considerably more intricate than suggested by earlier experiments on a smaller scale.
Stochastic Models for the Web Graph
, 2000
"... The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph m ..."
Abstract
-
Cited by 291 (12 self)
The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph models differ from the traditional G(n,p) models in two ways:
1. Independently chosen edges do not result in the statistics (degree distributions, clique multitudes) observed on the web. Thus, edges in our model are statistically dependent on each other.
2. Our model introduces new vertices in the graph as time evolves. This captures the fact that the web is changing with time.
Our results are twofold: we show that graphs generated using our model exhibit the statistics observed on the web graph, and additionally, that natural graph models proposed earlier do not exhibit them. This remains true even when these earlier models are generalized to account for the arrival of vertices over time. In particular, the sparse random graphs in our models exhibit properties that do not arise in far denser random graphs generated by Erdős-Rényi models.
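A concrete way to obtain both the edge dependence and the arrival of new vertices described above is a copying-style growth rule: each new page picks an existing "prototype" page and, for each of its out-links, either copies the prototype's corresponding destination or links to a uniformly random existing page. The sketch below is a minimal version of that idea with arbitrary parameters and an ad hoc seed graph, not the paper's exact model.

```python
import random

def copying_model(n, out_degree=3, copy_prob=0.8, seed=0):
    """Directed graph grown one page at a time. Each new page picks an
    existing prototype page; every out-link either copies the prototype's
    corresponding destination (prob. copy_prob) or points to a uniformly
    random existing page. Copying makes edges statistically dependent and
    produces heavy-tailed in-degrees."""
    rng = random.Random(seed)
    # Seed the graph with a small clique so early prototypes have out-links.
    out = {i: [(i + j) % (out_degree + 1) for j in range(1, out_degree + 1)]
           for i in range(out_degree + 1)}
    for new in range(out_degree + 1, n):
        proto = rng.randrange(new)
        links = []
        for i in range(out_degree):
            if rng.random() < copy_prob:
                links.append(out[proto][i])       # copy the prototype's link
            else:
                links.append(rng.randrange(new))  # fresh uniform link
        out[new] = links
    return out

out = copying_model(20_000)
indeg = {}
for links in out.values():
    for v in links:
        indeg[v] = indeg.get(v, 0) + 1
print(max(indeg.values()))   # a few pages end up with very large in-degree
```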