Results 1-10 of 2,969
On the Self-Similar Nature of Ethernet Traffic (Extended Version)
1994
Cited by 2213 (46 self)
We demonstrate that Ethernet LAN traffic is statistically self-similar, that none of the commonly used traffic models is able to capture this fractal-like behavior, that such behavior has serious implications for the design, control, and analysis of high-speed, cell-based networks, and that aggregating streams of such traffic typically intensifies the self-similarity (“burstiness”) instead of smoothing it. Our conclusions are supported by a rigorous statistical analysis of hundreds of millions of high-quality Ethernet traffic measurements collected between 1989 and 1992, coupled with a discussion of the underlying mathematical and statistical properties of self-similarity and their relationship with actual network behavior. We also present traffic models based on self-similar stochastic processes that provide simple, accurate, and realistic descriptions of traffic scenarios expected during B-ISDN deployment.
Statistical mechanics of complex networks
Rev. Mod. Phys.
Cited by 2148 (11 self)
Complex networks describe a wide range of systems in nature and society, much quoted examples including the cell, a network of chemicals linked by chemical reactions, or the Internet, a network of routers and computers connected by physical links. While traditionally these systems were modeled as random graphs, it is increasingly recognized that the topology and evolution of real
Wide-Area Traffic: The Failure of Poisson Modeling
IEEE/ACM Transactions on Networking, 1995
Cited by 1775 (24 self)
Network arrivals are often modeled as Poisson processes for analytic simplicity, even though a number of traffic studies have shown that packet interarrivals are not exponentially distributed. We evaluate 24 wide-area traces, investigating a number of wide-area TCP arrival processes (session and connection arrivals, FTP data connection arrivals within FTP sessions, and TELNET packet arrivals) to determine the error introduced by modeling them using Poisson processes. We find that user-initiated TCP session arrivals, such as remote-login and file-transfer, are well-modeled as Poisson processes with fixed hourly rates, but that other connection arrivals deviate considerably from Poisson; that modeling TELNET packet interarrivals as exponential grievously underestimates the burstiness of TELNET traffic, but using the empirical Tcplib [Danzig et al., 1992] interarrivals preserves burstiness over many time scales; and that FTP data connection arrivals within FTP sessions come bunched into “connection bursts,” the largest of which are so large that they completely dominate FTP data traffic. Finally, we offer some results regarding how our findings relate to the possible self-similarity of wide-area traffic.
Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes
1996
Cited by 1416 (26 self)
Recently the notion of self-similarity has been shown to apply to wide-area and local-area network traffic. In this paper we examine the mechanisms that give rise to the self-similarity of network traffic. We present a hypothesized explanation for the possible self-similarity of traffic by using a particular subset of wide-area traffic: traffic due to the World Wide Web (WWW). Using an extensive set of traces of actual user executions of NCSA Mosaic, reflecting over half a million requests for WWW documents, we examine the dependence structure of WWW traffic. While our measurements are not conclusive, we show evidence that WWW traffic exhibits behavior that is consistent with self-similar traffic models. Then we show that the self-similarity in such traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local area network. To do this we rely on empirically measured distributions both from our traces and from data independently collected at over thirty WWW sites.
Self-Similarity Through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level
IEEE/ACM Transactions on Networking, 1997
Cited by 743 (24 self)
A number of recent empirical studies of traffic measurements from a variety of working packet networks have convincingly demonstrated that actual network traffic is self-similar or long-range dependent in nature (i.e., bursty over a wide range of time scales), in sharp contrast to commonly made traffic modeling assumptions. In this paper, we provide a plausible physical explanation for the occurrence of self-similarity in LAN traffic. Our explanation is based on new convergence results for processes that exhibit high variability (i.e., infinite variance) and is supported by detailed statistical analyses of real-time traffic measurements from Ethernet LANs at the level of individual sources. This paper is an extended version of [53] and differs from it in significant ways. In particular, we develop here the mathematical results concerning the superposition of strictly alternating ON/OFF sources. Our key mathematical result states that the superposition of many ON/OFF sources (also k...
Singularity Detection and Processing with Wavelets
IEEE Transactions on Information Theory, 1992
Cited by 595 (13 self)
Most of a signal's information is often found in irregular structures and transient phenomena. We review the mathematical characterization of singularities with Lipschitz exponents. The main theorems that estimate local Lipschitz exponents of functions from the evolution across scales of their wavelet transform are explained. We then prove that the local maxima of a wavelet transform detect the location of irregular structures and provide numerical procedures to compute their Lipschitz exponents. The wavelet transform of singularities with fast oscillations has a different behavior that we study separately. We show that the size of the oscillations can be measured from the wavelet transform local maxima. It has been shown that one- and two-dimensional signals can be reconstructed from the local maxima of their wavelet transform [14]. As an application, we develop an algorithm that removes white noises by discriminating the noise and the signal singularities through an analysis of their ...
Analysis, Modeling and Generation of Self-Similar VBR Video Traffic
1994
Cited by 548 (6 self)
We present a detailed statistical analysis of a 2-hour-long empirical sample of VBR video. The sample was obtained by applying a simple intraframe video compression code to an action movie. The main findings of our analysis are (1) the tail behavior of the marginal bandwidth distribution can be accurately described using "heavy-tailed" distributions (e.g., Pareto); (2) the autocorrelation of the VBR video sequence decays hyperbolically (equivalent to long-range dependence) and can be modeled using self-similar processes. We combine our findings in a new (non-Markovian) source model for VBR video and present an algorithm for generating synthetic traffic. Trace-driven simulations show that statistical multiplexing results in significant bandwidth efficiency even when long-range dependence is present. Simulations of our source model show long-range dependence and heavy-tailed marginals to be important components which are not accounted for in currently used VBR video traffic models.
Efficient similarity search in sequence databases
1994
Cited by 515 (19 self)
We propose an indexing method for time sequences for processing similarity queries. We use the Discrete Fourier Transform (DFT) to map time sequences to the frequency domain, the crucial observation being that, for most sequences of practical interest, only the first few frequencies are strong. Another important observation is Parseval's theorem, which specifies that the Fourier transform preserves the Euclidean distance in the time or frequency domain. Having thus mapped sequences to a lower-dimensionality space by using only the first few Fourier coefficients, we use R*-trees to index the sequences and efficiently answer similarity queries. We provide experimental results which show that our method is superior to search based on sequential scanning. Our experiments show that a few coefficients (1-3) are adequate to provide good performance. The performance gain of our method increases with the number and length of sequences.
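The filtering idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`dft_features`, `feature_distance`, `range_query`) are hypothetical, NumPy is assumed to be available, and a plain linear scan over the feature vectors stands in for the spatial index. With a unitary DFT, Parseval's theorem means the distance over the first few coefficients can never exceed the true Euclidean distance, so the filter step should produce no false dismissals.

```python
import numpy as np

def dft_features(seq, k=3):
    """Return the first k coefficients of the unitary DFT of a sequence."""
    coeffs = np.fft.fft(np.asarray(seq, dtype=float), norm="ortho")
    return coeffs[:k]

def feature_distance(fa, fb):
    """Euclidean distance between two (complex) feature vectors."""
    return float(np.linalg.norm(fa - fb))

def range_query(query, sequences, eps, k=3):
    """Indices of sequences within Euclidean distance eps of the query.

    Filter-and-verify: the feature distance is a lower bound on the true
    distance, so candidates rejected by the filter cannot be true matches.
    A linear scan replaces the spatial index here (an assumption made for
    brevity, not part of the described method).
    """
    q = np.asarray(query, dtype=float)
    fq = dft_features(q, k)
    hits = []
    for i, s in enumerate(sequences):
        s = np.asarray(s, dtype=float)
        if feature_distance(fq, dft_features(s, k)) > eps:
            continue  # filter step: safely discarded
        if np.linalg.norm(q - s) <= eps:
            hits.append(i)  # verify step: exact distance check
    return hits
```

All sequences are assumed to have the same length as the query. In practice the feature vectors would be computed once and stored in an index rather than recomputed per query as in this sketch.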
Recursive Distributed Representations
Artificial Intelligence, 1990
Cited by 414 (9 self)
A longstanding difficulty for connectionist modeling has been how to represent variable-sized recursive data structures, such as trees and lists, in fixed-width patterns. This paper presents a connectionist architecture which automatically develops compact distributed representations for such compositional structures, as well as efficient accessing mechanisms for them. Patterns which stand for the internal nodes of fixed-valence trees are devised through the recursive use of backpropagation on three-layer auto-associative encoder networks. The resulting representations are novel, in that they combine apparently immiscible aspects of features, pointers, and symbol structures. They form a bridge between the data structures necessary for high-level cognitive tasks and the associative, pattern recognition machinery provided by neural networks.
J. B. Pollack. 1. Introduction. One of the major stumbling blocks in the application of Connectionism to higher-level cognitive tasks, such as Na...
Concept Decompositions for Large Sparse Text Data using Clustering
Machine Learning, 2000
Cited by 407 (27 self)
Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as high-dimensional and sparse vectors: a few thousand dimensions and a sparsity of 95 to 99% is typical. In this paper, we study a certain spherical k-means algorithm for clustering such document vectors. The algorithm outputs k disjoint clusters each with a concept vector that is the centroid of the cluster normalized to have unit Euclidean norm. As our first contribution, we empirically demonstrate that, owing to the high-dimensionality and sparsity of the text data, the clusters produced by the algorithm have a certain "fractal-like" and "self-similar" behavior. As our second contribution, we introduce concept decompositions to approximate the matrix of document vectors; these decompositions are obtained by taking the least-squares approximation onto the linear subspace spanned...