Results 1 - 10 of 37
Network Topology Generators: Degree-Based vs. Structural
, 2002
Cited by 140 (12 self)
Following the long-held belief that the Internet is hierarchical, the network topology generators most widely used by the Internet research community, Transit-Stub and Tiers, create networks with a deliberately hierarchical structure. However, in 1999 a seminal paper by Faloutsos et al. revealed that the Internet's degree distribution is a power-law. Because the degree distributions produced by the Transit-Stub and Tiers generators are not power-laws, the research community has largely dismissed them as inadequate and proposed new network generators that attempt to generate graphs with power-law degree distributions.
Indexing and Mining Free Trees
- Proceedings of the 2003 IEEE International Conference on Data Mining (ICDM’03)
, 2003
Cited by 34 (7 self)
Tree structures are used extensively in domains such as computational biology, pattern recognition, computer networks, and so on. In this paper, we present an indexing technique for free trees and apply this indexing technique to the problem of mining frequent subtrees. We first define a novel representation, the canonical form, for rooted trees and extend the definition to free trees. We also introduce another concept, the canonical string, as a simpler representation for free trees in their canonical forms. We then apply our tree indexing technique to the frequent subtree mining problem and present FreeTreeMiner, a computationally efficient algorithm that discovers all frequently occurring subtrees in a database of free trees. Our mining algorithm is a variation of the traditional a priori method for mining frequent itemsets. We study the performance and the scalability of our algorithms through extensive experiments based on both synthetic data and datasets from two real applications: a dataset of chemical compounds and a dataset of Internet multicast trees. The experiments show that our algorithm scales linearly in the cardinality of the database.
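The abstract above does not spell out how its canonical form or canonical string is defined. As an illustration only, the following sketch uses a well-known alternative (a sorted parenthesis encoding, rooted at the tree's center) to show how isomorphic unlabeled free trees can be mapped to one string that could serve as an index key. The function names and the adjacency-dictionary input format are assumptions made for this example, not the paper's API.

```python
from typing import Dict, Hashable, List, Optional

Adjacency = Dict[Hashable, List[Hashable]]


def tree_centers(adj: Adjacency) -> List[Hashable]:
    """Return the one or two center nodes of a free tree by peeling leaves."""
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = len(adj)
    leaves = [v for v, d in degree.items() if d <= 1]
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for n in adj[leaf]:
                degree[n] -= 1
                if degree[n] == 1:
                    next_leaves.append(n)
            degree[leaf] = 0  # mark the leaf as removed
        leaves = next_leaves
    return leaves


def encode_rooted(adj: Adjacency, node: Hashable,
                  parent: Optional[Hashable] = None) -> str:
    """Encode the subtree under `node`; sorting makes child order irrelevant."""
    parts = sorted(encode_rooted(adj, n, node) for n in adj[node] if n != parent)
    return "(" + "".join(parts) + ")"


def free_tree_string(adj: Adjacency) -> str:
    """Root the tree at each center and keep the smallest encoding."""
    return min(encode_rooted(adj, c) for c in tree_centers(adj))


# Two differently numbered 4-node paths, and a 4-node star.
path_a = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
path_b = {7: [3], 3: [7, 9], 9: [3, 5], 5: [9]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(free_tree_string(path_a) == free_tree_string(path_b))  # same shape
print(free_tree_string(path_a) == free_tree_string(star))    # different shape
```

Rooting at the center removes the arbitrary choice of root that a free tree otherwise leaves open. The recursion is fine for small trees; very deep trees would need an iterative version.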
Network topologies, power laws, and hierarchy
- Comput. Commun. Rev
Cited by 28 (3 self)
It has long been thought that the Internet, and its constituent networks, are hierarchical in nature. Consequently, the network topology generators most widely used by the Internet research community, GT-ITM [7] and Tiers [11], create networks with a deliberately hierarchical structure. However, recent work by Faloutsos et al. [13] revealed that the Internet’s degree distribution, that is, the distribution of the number of connections that routers or Autonomous Systems (ASs) have, is a power-law. The degree distributions produced by the GT-ITM and Tiers generators are not power-laws. To rectify this problem, several new network generators have recently been proposed that produce more realistic degree distributions; these new generators do not attempt to create a hierarchical structure but instead focus solely on the degree distribution. There are thus two families of network generators: structural generators that treat hierarchy as fundamental, and degree-based generators that treat the degree distribution as fundamental. In this paper we use several topology metrics to compare the networks produced by these two families of generators to current measurements of the Internet graph. We find that the degree-based generators produce better models, at least according to our topology metrics, of both the AS-level and router-level Internet graphs. We then seek to resolve the seeming paradox that while the Internet certainly has hierarchy, it appears that the Internet graphs are better modeled by generators that do not explicitly construct hierarchies. We conclude our paper with a brief study of other network structures, such as the pointer structure in the web and the set of airline routes, some of which turn out to have metric properties similar to those of the Internet.
A Novel Approach to Managing Consistency in Content Distribution Networks
, 2001
Cited by 24 (1 self)
A content distribution network (CDN) pushes content from the origin server to geographically distributed replicas, usually located at the edge of the network where the clients are attached. One of the important problems in CDNs is how to keep the content at replicas consistent with that at the origin server, especially for documents that change dynamically. In the traditional propagation approach, the updated version of a document is delivered to all replicas whenever the document changes at the origin server. This may generate significant levels of unnecessary traffic if documents are updated more frequently than they are accessed. Another approach is invalidation, in which an invalidation message is sent to all replicas when a document is changed at the origin server. This approach does not make full use of the distribution network for content delivery, and each replica needs to fetch an updated version individually at a later time, which can also make consistency management inefficient. In this paper, we propose a novel hybrid approach that generates less traffic than either the propagation approach or the invalidation approach. The origin server decides whether to use propagation or invalidation for each document, based on statistics about the update frequency at the origin server and the request rates collected by replicas. We develop a technique that reduces the burden of request rate collection at replicas and avoids the implosion problem when replicas send the statistics to the origin server. Extensive simulations are performed to examine how the traffic generated and the freshness rate at replicas are affected by various parameters. We experiment with a wide range of request rate, update fr...
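The abstract says the origin server chooses between propagation and invalidation per document, but it does not give the decision rule. The sketch below is a minimal illustration under an assumed rule taken from the abstract's own observation that propagation wastes traffic when a document is updated more often than it is accessed. The function name, the use of a mean over replicas, and the simple comparison are all assumptions, not the paper's method.

```python
from statistics import mean
from typing import Sequence


def choose_consistency_method(update_rate: float,
                              replica_request_rates: Sequence[float]) -> str:
    """Pick a per-document consistency method at the origin server.

    update_rate: updates per unit time observed at the origin server.
    replica_request_rates: requests per unit time reported by each replica.
    Assumed rule: invalidate when updates outpace the typical request rate,
    otherwise propagate the new version to all replicas.
    """
    if not replica_request_rates:
        # No statistics yet: avoid pushing content that nobody may request.
        return "invalidation"
    typical_request_rate = mean(replica_request_rates)
    if update_rate > typical_request_rate:
        return "invalidation"
    return "propagation"


# A frequently updated, rarely read document versus a popular, stable one.
print(choose_consistency_method(10.0, [1.0, 2.0]))
print(choose_consistency_method(0.5, [3.0, 5.0]))
```

A fuller model would also weigh document size against invalidation message size and the number of replicas; those parameters are omitted here because the abstract does not describe them.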
On the Efficiency of Multicast
- IEEE/ACM Transactions on Networking
, 2001
Cited by 24 (2 self)
The average number of joint hops in a shortest-path multicast tree from a root to m arbitrarily chosen group member nodes is studied. A general theory for all graphs, hence including the graph representation of the Internet, is presented which quantifies the multicast reduction in network links compared to m times unicast. For two special types of graphs, the random graph G_p(N) and the k-ary tree, exact and asymptotic results are derived. Comparing these explicit results with previously published Internet measurements [13] indicates that the number of routers in the Internet that can be reached from a root grows exponentially in the number of hops with an effective degree of approximately 3.2.
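To make the comparison between multicast and repeated unicast concrete, the following sketch counts links on a complete tree in which every internal node has the same number of children. It is an illustration under assumptions, not the paper's derivation: nodes are numbered in heap order (the root is 0 and node i has parent (i - 1) // k), and the names `k`, `multicast_links`, and `unicast_hops` are chosen for this example.

```python
from typing import Iterable, List, Tuple


def path_to_root(node: int, k: int) -> List[Tuple[int, int]]:
    """Return the (child, parent) links on the path from `node` to root 0."""
    links = []
    while node != 0:
        parent = (node - 1) // k  # heap-order numbering, an assumption
        links.append((node, parent))
        node = parent
    return links


def multicast_links(members: Iterable[int], k: int) -> int:
    """Links in the shortest-path tree: shared links are counted once."""
    return len({link for m in members for link in path_to_root(m, k)})


def unicast_hops(members: Iterable[int], k: int) -> int:
    """Total hops when the root sends a separate copy to every member."""
    return sum(len(path_to_root(m, k)) for m in members)


# Binary tree, two sibling members at depth 2 (nodes 3 and 4 under node 1).
members = [3, 4]
print(multicast_links(members, k=2), unicast_hops(members, k=2))
```

In this small case the two paths share the link from node 1 to the root, so multicast uses 3 links where unicast uses 4 hops. Averaging the ratio over random member sets gives an empirical view of the reduction that the abstract analyzes.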
Supporting Multicast Deployment Efforts: A Survey of Tools for Multicast Monitoring
- Journal of High Speed Networking--Special Issue on Management of Multimedia Networking
, 2000
Cited by 23 (9 self)
As the Internet is expected to better support multimedia applications, new services will need to be deployed. An example of one of these next-generation services is multicast communication, the one-to-many delivery of data. Over the last ten years, multicast research as well as deployment efforts have both been major areas of interest. In order to bridge the gap between the initial deployment experiments and the availability of multicast as a robust network service, there needs to be a full complement of multicast monitoring tools. In this paper we first survey the debugging, modeling, and management tools that have evolved alongside the Internet's multicast infrastructure. Through this survey, we have observed important generalizations in three areas: (1) the challenges unique to monitoring multicast, (2) a methodology common to many multicast monitoring tools/systems, and (3) a set of considerations important to the development of new tools/systems. Using these generalizations we present two of our efforts to evaluate multicast reachability in the Internet. We also use these generalizations to evaluate some of the more recent efforts to develop large-scale management platforms.
Frequent Subtree Mining - An Overview
, 2005
Cited by 22 (1 self)
Mining frequent subtrees from databases of labeled trees is a new research field that has many practical applications in areas such as computer networks, Web mining, bioinformatics, XML document mining, etc. These applications share a requirement for the more expressive power of labeled trees to capture the complex relations among data entities. Although frequent subtree mining is a more difficult task than frequent itemset mining, most existing frequent subtree mining algorithms borrow techniques from the relatively mature association rule mining area. This paper provides an overview of a broad range of tree mining algorithms. We focus on the common theoretical foundations of the current frequent subtree mining algorithms and their relationship with their counterparts in frequent itemset mining. When comparing the algorithms, we categorize them according to their problem definitions and the techniques employed for solving various subtasks of the subtree mining problem. In addition, we also present a thorough performance study for a representative family of algorithms.
Cache-and-Relay Streaming Media Delivery for Asynchronous Clients
- in International Workshop on Networked Group Communication (NGC
, 2002
Cited by 21 (1 self)
We consider the problem of delivering popular streaming media to a large number of asynchronous clients. We propose and evaluate a cache-and-relay end-system multicast approach, whereby a client joining a multicast session caches the stream, and if needed, relays that stream to neighboring clients which may join the multicast session at some later time. This cache-and-relay approach is fully distributed, scalable, and efficient in terms of network link cost. In this paper we analytically derive bounds on the network link cost of our cache-and-relay approach, and we evaluate its performance under assumptions of limited client bandwidth and limited client cache capacity. When client bandwidth is limited, we show that although finding an optimal solution is NP-hard, a simple greedy algorithm performs surprisingly well in that it incurs network link costs that are very close to a theoretical lower bound. When client cache capacity is limited, we show that our cache-and-relay approach can still significantly reduce network link cost. We have evaluated our cache-and-relay approach using simulations over large, synthetic random networks, power-law degree networks, and small-world networks, as well as over large real router-level Internet maps.
Characterizing Overlay Multicast Networks
- IN PROCEEDINGS OF IEEE ICNP
, 2003
Cited by 21 (3 self)
Overlay networks among cooperating hosts have recently emerged as a viable solution to several challenging problems, including multicasting, routing, content distribution, and peer-to-peer services. Application-level overlays, however, incur a performance penalty over router-level solutions. This paper characterizes this performance penalty for overlay multicast trees via experimental data, simulations, and theoretical models. Experimental data and simulations illustrate that (i) the average delay and the number of hops between parent and child hosts in overlay trees generally decrease, and (ii) the degree of hosts generally decreases, as the level of the host in the overlay tree increases. Overlay multicast routing strategies, together with power-law and small-world Internet topology characteristics, are causes of the observed phenomena. We compare three overlay multicast protocols with respect to latency, bandwidth, router degrees, and host degrees. We also quantify the overlay tree cost. Results reveal that for small n, where L(n) is the total number of hops in all overlay links, U(n) is the average number of hops on the source to receiver unicast paths, and n is the number of members in the overlay multicast session.
Small-World Internet Topologies: Possible Causes and Implications on Scalability of End-System Multicast
, 2002
Cited by 21 (6 self)
Recent work has shown the prevalence of small-world phenomena [28] in many networks. Small-world graphs exhibit a high degree of clustering, yet have typically short path lengths between arbitrary vertices. Internet AS-level graphs have been shown to exhibit small-world behaviors [9]. In this paper, we show that both Internet AS-level and router-level graphs exhibit small-world behavior. We attribute such behavior to two possible causes, namely the high variability of vertex degree distributions (which were found to follow approximately a power law [15]) and the preference of vertices to have local connections. We show that both factors contribute with different relative degrees to the small-world behavior of AS-level and router-level topologies. Our findings underscore the inefficacy of the Barabasi-Albert model [6] in explaining the growth process of the Internet, and provide a basis for more promising approaches to the development of Internet topology generators. We present such a generator and show the resemblance of the synthetic graphs it generates to real Internet AS-level and router-level graphs. Using these graphs, we have examined how small-world behaviors affect the scalability of end-system multicast. Our findings indicate that lower variability of vertex degree and stronger preference for local connectivity in small-world graphs result in slower network neighborhood expansion, and in longer average path length between two arbitrary vertices, which in turn results in better scaling of end-system multicast.
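The two properties that the abstract uses to characterize small-world graphs, high clustering and short paths, can be measured directly. The sketch below is a minimal illustration and not the paper's tooling: it assumes an undirected, connected graph given as a dictionary of neighbor sets, and the function names are chosen for this example.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def average_clustering(adj: Graph) -> float:
    """Mean over vertices of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # convention: vertices with fewer than 2 neighbors score 0
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * linked / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj: Graph) -> float:
    """Mean shortest-path hop count over all vertex pairs, using BFS."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# A triangle (0, 1, 2) with one extra vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(round(average_clustering(graph), 4), round(average_path_length(graph), 4))
```

A graph is usually called small-world when its clustering is much higher than that of a random graph with the same number of vertices and edges while its average path length stays comparable. The all-pairs BFS here is quadratic in the graph size, so large Internet maps would call for sampling source vertices.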

