M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In DIMACS series in Discrete Mathematics and Theoretical Computer Science, volume 50, pages 107--118, 1999.
....have been developed for comparing data streams under various L_p distances, or clustering them. The stream model from the database perspective is investigated in the Stanford stream data management project [82] (see [10] for an overview and algorithmic considerations). In the interesting work of [52], where the stream model was formalized, a data stream is defined as a sequence of data items v_1, v_2, ..., v_n which are assumed to be read by an algorithm only once (or very few times) in increasing order of the indices i. The number P of passes over the data stream and the workspace W (in ....
....a large space in one pass and a small space in two passes. (ii) There can be an exponential gap in space bounds between Monte Carlo and Las Vegas algorithms. (iii) For some problems, an algorithm for an approximate solution requires substantially less space than an exact solution algorithm. In [52], the lower bounds on the workspace of limited-pass algorithms are shown using tools from the communication complexity area. The space complexity of estimating the frequency moments of a sequence of elements in one pass was studied in [7], where communication complexity was also used to prove ....
M. R. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. available at: http://www.research.digital.com/SRC/, 1998.
....approximation algorithm which maintains an approximate k-median solution for the last N data points using O( N) memory, where # 1 2 is a parameter which trades off the space bound with the approximation factor of O(2 ). 1. INTRODUCTION The data stream model of computation [14] is useful for modeling massive data sets (much larger than available main memory) that need to be processed in a single pass. Motivating applications include networking (traffic engineering, network monitoring, intrusion detection), telecommunications (fraud detection, data mining), financial ....
M.R. Henzinger, P. Raghavan, and S. Rajagopalan "Computing on Data Streams.", Technical Report 1998-011, Compaq Systems Research Center, Palo Alto, CA, May, 1998.
....all windows, of the ratio of the diameter to the minimum nonzero distance between any two points in the window. 1 Introduction In recent years, massive data sets have become increasingly important in a wide range of applications. In many applications, the input can be viewed as a data stream [12, 7] that the algorithm reads in one pass. The algorithm should take little time to process each data element and should use little space in comparison to the input size. In some scenarios, the input stream may be infinite, and the application may only care about recent data. In this case, the ....
....algorithm in the streaming model that uses O(1 #) space and processes each point in O(log(1 #) time. We also present an approximate sliding window algorithm to maintain the diameter in 2 d using 3 2 log # ) bits of space. 2 Models and Related Work The streaming model was introduced in [12, 7]. A data stream is a sequence of data elements a_1, a_2, ..., a_n. We will denote by n the number of data elements in the stream. In this paper, the data elements are points. A streaming algorithm is an algorithm that computes some function over a data stream and has the following ....
Henzinger, M.R., Raghavan, P., Rajagopalan, S.: Computing on data streams. Technical Report 1998-011, DEC Sys. Research. (1998)
....Data Streams (CDS) has been stimulating increasing interest in the database community lately. Most current research efforts are either on Database Management System (DBMS) support, such as Stream [2], Fjords [17], and NiagaraCQ [4], or on query processing and data mining issues, such as [8, 12, 13, 14]. Data sequences have been used in many applications, such as stock prices, biomedical measurements, weather data, DNA sequences, and sensor data from robotics. New emerging applications, such as data mining and information retrieval by content, require the capability of finding similar patterns, ....
M. R. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. Technical report tr-1998.
....detailed call data every day or 27TB a year. With this huge amount of source data, even the specialized aggregation index under a single time granularity will soon grow too large. Saving storage space is especially useful for applications under the stream model, which was formalized recently by [HRR98]. A stream is an ordered sequence of points that are read in increasing order. The performance of an algorithm that operates on streams is measured by the number of passes the algorithm must make over the stream, when constrained by the size of available storage space. The model is very ....
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport 1998-011, DEC, May 1998.
....detailed call data every day or 27TB a year. With this huge amount of source data, even the specialized aggregation index under a single time granularity will soon grow too large. Saving storage space is especially useful for applications under the stream model, which was formalized recently by [17]. A stream is an ordered sequence of points that are read in increasing order. The performance of an algorithm that operates on streams is measured by the number of passes the algorithm must make over the stream, when constrained by the size of available storage space. The model is very ....
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport 1998-011, DEC, May 1998.
....detailed call data every day or 27TB a year. With this huge amount of source data, even the specialized aggregation index under a single time granularity will soon grow too large. Saving storage space is especially useful for applications under the stream model, which was formalized recently by [17]. A stream is an ordered sequence of points that are read in increasing order. The performance of an algorithm that operates on streams is measured by the number of passes the algorithm must make over the stream, when constrained by the size of available storage space. The model is very ....
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport 1998-011, DEC, May 1998.
....the communication by sending a single message to the second player; the second sends one message to the third, and so forth; the last player announces the output of the protocol. This model arises very naturally in the context of proving lower bounds for algorithms in the data stream model [HRR99, AMS99, FKSV99], time-space tradeoffs [Bor93], and data structure complexity [MNSW98]. Indeed, specific functions for which we study lower bound questions emerge from the data stream model, where data arrives in a stream, and an algorithm is efficient if it uses very little space. Since these algorithms are ....
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In DIMACS series in Discrete Mathematics and Theoretical Computer Science, volume 50, pages 107--118, 1999.
....Abstract. Inversions are used as a fundamental quantity to measure the sortedness of data, to evaluate different ranking methods for databases, and in the context of rank aggregation. Considering the volume of the data sets in these applications, the data stream model [16, 2] is a natural setting to design efficient algorithms. We obtain a suite of space-efficient streaming algorithms for approximating the number of inversions in a permutation to within a factor of ε. The best space bound we achieve for this problem is O(log n log log n) through a deterministic ....
....Recent interest in computing with massive data sets has led to much emphasis on restricted computational models and where the usual notions of time and space efficiency have been refined to become much sharper. A model that has become quite well established is that of computing with data streams [16, 2, 12]. In this model, an array of data arrives in a stream (often in arbitrary order) and an algorithm is considered efficient if it uses very little space and very little processing time per data item. Especially attractive are algorithms whose space and processing time per data item are both ....
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. DIMACS series in Discr. Math. & Theor. Comp. Sc., 50:107--118, 1999.
....her message for this round in just a single pass over her input, with little space and little time required to process each item. In theory, this allows the parties to regard their inputs as unbuffered data feeds that need not be stored at all, i.e. the inputs may be regarded as data streams (cf. [20]). Our protocols indeed satisfy these properties, requiring a single pass over the raw data, and sublinear storage, proportional to the communication complexity. 3 Sublinear Private Approximation for the Hamming Distance In this section we present a two-party private protocol for the ....
M. Rauch Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. Technical Report
....continuously and it is either unnecessary or impractical to store the data in some form of memory. Data streams are also appropriate as a model of access to large data sets stored in secondary memory where performance requirements necessitate access via linear scans. In the data stream model [17], the data points can only be accessed in the order in which they arrive. Random access to the data is not allowed; memory is assumed to be small relative to the number of points, and so only a limited amount of information can be stored. In general, algorithms operating on streams will be ....
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams, 1998.
....also applications where traditional (nonstreaming) data is treated as a stream due to performance constraints. In data mining applications, for example, the volume of data stored on disk is so large that it is only possible to make one pass (or perhaps a very small number of passes) over the data [10, 9]. The objective is to perform the required computations using the stream generated by a single scan of the data, using only a bounded amount of memory and without recourse to indexes, hash tables, or other precomputed summaries of the data. Another example is that data streams are generated as ....
M. R. Henzinger, P. Raghavan, S. Rajagopalan. Computing on data streams. Technical Report TR
....detailed call data every day or 27TB a year. With this huge amount of source data, even the specialized aggregation index under a single time granularity will soon grow too large. Saving storage space is especially useful for applications under the stream model, which was formalized recently by [17]. A stream is an ordered sequence of points that are read in increasing order. The performance of an algorithm that operates on streams is measured by the number of passes the algorithm must make over the stream, when constrained by the size of available storage space. The model is very ....
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport
....detailed call data every day or 27TB a year. With this huge amount of source data, even the specialized aggregation index under a single time granularity will soon grow too large. Saving storage space is especially useful for applications under the stream model, which was formalized recently by [HRR98]. A stream is an ordered sequence of points that are read in increasing order. The performance of an algorithm that operates on streams is measured by the number of passes the algorithm must make over the stream, when constrained by the size of available storage space. The model is very ....
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport
....arbitrary approximation factors. Independently, Trevisan [Tre01] has solved this problem via a different approach; our algorithm has the advantage of being list efficient. 1 Introduction In the context of computing with massive data sets, algorithms designed to work in the streaming model [HRR99, AMS99, FKSV99] are gaining popularity, both for their theoretical significance and for their usefulness in practice. In this model, data arrives in a stream, one item at a time, and algorithms have fairly stringent requirements to be considered efficient: they are required to use very little space and per item ....
....counting triangles) Counting triangles in graphs. Finally, in Section 6, we present streaming algorithms to compute the number of triangles in a graph. To the best of our knowledge, these are the first algorithms for any natural graph problem in the streaming model of computation (see also [HRR99]). We consider two models: the adjacency stream, where the graph is presented as a sequence of edges in arbitrary order, and there is no bound on the degree of any vertex, and the incidence stream, where we consider bounded degree graphs and where all edges incident to a vertex are presented ....
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In DIMACS series in Discrete Mathematics and Theoretical Computer Science, volume 50, pages 107-118, 1999.
....must be built in order to gather this information at very high speed with high accuracy. This type of data input also creates problems from a data summary and analysis perspective. A summary method must be able to work with data presented in a number of streaming forms. Prabhakar et al. [20] and Gilbert et al. [13] formally define different data stream models, including the cash register format, which is most relevant given the data collection we have described above. In this data input model, packets (or flows) from different IP sources (or to different IP destinations) arrive in ....
P. Raghavan M. Henzinger and S. Rajagopalan. Computing on data streams. SRC Technical Note,
....in monitoring Internet network elements such as routers, web servers, etc., where traffic is potentially far more voluminous. The need for processing data streams is beginning to be understood, and, consequently, there is effort underway in the data mining [8, 10], database [32], and algorithms [21] communities to address the outstanding problems that arise. Within the database community, it is understood that today's database systems and data processing algorithms (e.g. data mining) are ill equipped to handle data streams effectively, and many aspects of data management and processing ....
....Section 7 we present concluding remarks. The proofs of many of our formal claims will be available in the full version of this paper. 2 Related Work Streaming or one-pass algorithms have been studied in different areas. In the area of theoretical algorithms, streaming models have been studied in [21, 3, 12, 13, 22, 20], where methods have been developed for comparing data streams under various L_p distances, or clustering them. Within the database community, one-pass algorithms have been designed for getting median, quantiles and other order statistics [25, 17], correlated aggregate queries [15], mining [14] ....
M. Henzinger, P. Raghavan and S. Rajagopalan. Computing on Data Streams. DEC SRC TR
No context found.
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In DIMACS series in Discrete Mathematics and Theoretical Computer Science, volume 50, pages 107--118, 1999.
No context found.
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport 1998-011, DEC, May 1998.
No context found.
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In TR1998-011 Compaq System Research Center, 1998.
No context found.
M. R. Henzinger, P. Raghavan and S. Rajagopalan, "Computing on Data Streams", TechReport 1998-011, DEC, May 1998.
No context found.
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In Technical Note 1998-011, Digital Systems Research.
No context found.
M. R. Henzinger, P. Raghavan, and S. Rajagopalan, "Computing on data streams", SRC Technical Note 1998-011, Digital Research Center, May 26, 1998.
No context found.
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In Technical Note 1998-011, Digital Systems Research.
No context found.
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. In DIMACS series in Discrete Mathematics and Theoretical Computer Science, volume 50, pages 107--118, 1999.