P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel. AQUA: System and techniques for approximate query answering. Technical report, Bell Laboratories, Murray Hill, NJ, Feb. 1998.
.... 1 Introduction Maintaining compact and accurate statistics on data distributions is of crucial importance for a number of tasks: 1) traditional query optimization that aims to find a good execution plan for a given query [5, 21], 2) approximate query answering and initial data exploration [13, 1, 18, 4, 12], 3) prediction of run times and result sizes of complex data extraction and data analysis tasks on data mining platforms, where absolute predictions with decent accuracy are mandatory for prioritization and scheduling of long running tasks (sometimes including the decision whether a given data ....
P. B. Gibbons, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, V. Poosala, S. Ramaswamy, and T. Suel. Aqua: System and techniques for approximate query answering. Technical report, Bell Labs, 1998.
....estimation algorithms also find a use in large data recording and warehousing environments. There, the goal is to provide an approximate response in time that is orders of magnitude less than what computing an exact answer would require: see the description of the Aqua Project by Gibbons et al. in [8]. The analysis of traffic in routers, as already mentioned, benefits greatly from cardinality estimators; this is lucidly exposed by Estan et al. in [2, 3]. Certain types of attacks (e.g. denial of service and port scans) are betrayed by alarmingly high counts of certain characteristic events at ....
....very large number of events that take place even in a relatively small time window. Probabilistic counting algorithms can also be used within other algorithms whenever the final answer is the cardinality of a large set and a small tolerance on the quality of the answer is acceptable. Palmer et al. [8] describe the use of such algorithms in an extensive connectivity analysis of the internet topology. For instance, one of the tasks needed there is to determine, for each distance h, the number of pairs of nodes that are at distance at most h in the internet graph. Since the graph studied by [8] ....
[Article contains additional citation context not shown here]
Gibbons, P. B., Poosala, V., Acharya, S., Bartal, Y., Matias, Y., Muthukrishnan, S., Ramaswamy, S., and Suel, T. AQUA: System and techniques for approximate query answering. Tech. report, Bell Laboratories, Murray Hill, New Jersey, Feb. 1998.
....the online time taken to answer a query, and the accuracy of the resulting estimate. Previous work on the query approximation problem has mainly focused on nonprobabilistic approaches to the query selectivity problem [20, 31, 7], multidimensional histograms ([22, 27]), and sampling ([16, 13]). Mixtures of Gaussian independence models were proposed for selectivity query estimation on real valued data sets from relatively low dimensional data cubes (5 or fewer dimensions) [30]. Generalized queries were considered by [28] in the context of language modeling using context free grammars. ....
P. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel. Aqua: System and techniques for approximate query answering. Technical report, Bell Laboratories, 1998.
....of each approach in terms of memory requirements, the online time taken to answer a query, and the accuracy of the resulting estimate. Previous work on this problem has mainly investigated the use of wavelets (see, e.g., [4], [5], [6]), multidimensional histograms ([7], [1]), and sampling ([8], [9]). However, there have been no systematic investigations of using different probabilistic modeling techniques for the query selectivity problem. In this paper we explore several variants of probabilistic models and investigate different aspects of their performance. A specific class of ....
P. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel, "Aqua: System and techniques for approximate query answering," Technical report, Bell Laboratories, 1998.
.... sampling that take samples from the database to statistically estimate the intermediate result sizes for the query at hand (for survey material see [20, 1, 24, 2]). From a generalized perspective, all these approaches can be viewed as constructing an approximate representation, or synopsis [11, 9], of the data for the purpose of estimation (or even giving approximative query answers, which is not considered in this paper, however). With modern OLAP tools and other forms of decision support query generators, the query optimization takes place within the critical path of the query execution ....
Phillip B. Gibbons, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, V. Poosala, S. Ramaswamy, and T. Suel. AQUA: System and techniques for approximate query answering. Technical report, Bell Labs, 1998.
No context found.
P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel, AQUA: System and techniques for approximate query answering, Tech. report, Bell Laboratories, Murray Hill, New Jersey, February 1998.
No context found.
P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel. AQUA: System and techniques for approximate query answering. Technical report, Bell Laboratories, Murray Hill, New Jersey, February 1998.
No context found.
P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel. AQUA: System and techniques for approximate query answering. Technical report, Bell Laboratories, Murray Hill, New Jersey, February 1998.
No context found.
P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel. AQUA: System and techniques for approximate query answering. Technical report, Bell Laboratories, Murray Hill, New Jersey, February 1998.
....of the art in data reduction techniques. Our work on synopsis data structures also includes the use of multifractals and wavelets for synopsis data structures [4, 13] and join synopses for queries on the join of multiple sets. This work is part of the Approximate query answering (Aqua) project [9, 11] at Bell Labs; Aqua seeks to provide fast, approximate answers to queries using synopsis data structures. While synopsis data structures have been proposed and studied for a number of query problems (see the full paper for additional examples), many more open questions remain, and we hope that ....
P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel, AQUA: System and techniques for approximate query answering, tech. rep., Bell Laboratories, Murray Hill, New Jersey, Feb. 1998.
No context found.
P. B. Gibbons, V. Poosala, S. Acharya, Y. Bartal, Y. Matias, S. Muthukrishnan, S. Ramaswamy, and T. Suel. AQUA: System and techniques for approximate query answering. Technical report, Bell Laboratories, Murray Hill, NJ, Feb. 1998.
No context found.
P. B. Gibbons, V. Poosala, S. Acharya, Y. Matias, Y. Bartal, S. Muthukrishnan, S. Ramaswamy, and T. Suel. AQUA: System and Techniques for Approximate Query Answering. Technical report, Bell Laboratories, Murray Hill, NJ, February 1998.