S. Acharya, P.B. Gibbons, V. Poosala, and S. Ramaswamy (1999), Join synopses for approximate query answering, Proc. of the ACM SIGMOD Conference.
....the length of the segments they join. 72 5.8 Definition of cmax i and cmin i for computing MBRs . 76 5.9 The M Regions associated with a 2M dimensional MBR. The boundary of a region G is denoted by G = G[1], G[2] G[3] G[4] 78 5.10 Computation of MINDIST . 80 5.11 The time taken (in seconds) to build an index using various transformations over a range of query lengths and ....
....concurrency control in multidimensional AMs. 2.8 Approximate Query Answering Techniques. Approximate query processing has recently emerged as a viable, cost-effective solution for dealing with the huge data volumes and stringent response time requirements of today's Decision Support Systems (DSS) [1, 51, 53, 61, 64, 70, 115, 144, 145]. The general approach is to first construct compact synopses of the interesting relations in the database (using a data reduction technique) and then answer the user queries by using just the synopsis. [Figure 2.10: Data reduction techniques for approximate query answering.] Data reduction ....
[Article contains additional citation context not shown here]
Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, and Sridhar Ramaswamy. "Join Synopses for Approximate Query Answering". In Proceedings of the 1999.
....and guide the subsequent clustering effort. In data mining, Toivonen [Toi96] examined the problem of using sampling during the discovery of association rules. Sampling has also been recently successfully applied in query optimization [CMN99, CMN98] as well as approximate query answering [GM98, AGPR99]. Independently, Palmer and Faloutsos [PF00] developed an algorithm to sample for clusters by using density information, under the assumption that clusters have a Zipfian distribution. Their technique is designed to find clusters when they differ a lot in size and density, and there is no noise. ....
S. Acharya, P. Gibbons, V. Poosala, and S. Ramaswamy. Join Synopses For Approximate Query Answering. Proceedings of SIGMOD, pages 275-286, June 1999.
.... 1 Introduction. Maintaining compact and accurate statistics on data distributions is of crucial importance for a number of tasks: 1) traditional query optimization that aims to find a good execution plan for a given query [5, 21], 2) approximate query answering and initial data exploration [13, 1, 18, 4, 12], 3) prediction of run times and result sizes of complex data extraction and data analysis tasks on data mining platforms, where absolute predictions with decent accuracy are mandatory for prioritization and scheduling of long-running tasks (sometimes including the decision whether a given data ....
....for a single data distribution such as one database table with pre-selected relevant attributes. The equally important problem of which combination of synopses to maintain on the application's various datasets and how to divide the available memory between them has received only little attention [1, 8, 23], putting the burden of selecting and tuning appropriate synopses on the database administrator. This creates a physical design problem for data synopses, which can be very difficult in advanced settings such as predicting run times of data analysis tasks or information wealth of Web sources by a ....
[Article contains additional citation context not shown here]
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join Synopses for Approximate Query Answering. In Proceedings of the ACM SIGMOD Conference, pages 275--286. ACM Press, 1999.
....of processing general, possibly multi-join, aggregate queries over continuous data streams. On the other hand, efficient approximate multi-join processing has received considerable attention in the context of approximate query answering, a very active area of database research in recent years [1, 6, 12, 19, 20, 24]. The vast majority of existing proposals, however, rely on the assumption of a static data set which enables either several passes over the data to construct effective, multi-dimensional data synopses (e.g., histograms [20] and Haar wavelets [6, 24]) or intelligent strategies for randomizing the ....
....approximate query processing tools inapplicable in a data stream setting. (Note that, even though random sample data summaries can be easily constructed in a single pass [23], it is well known that such summaries typically give very poor result estimates for queries involving one or more joins [1, 6, 2].) Our Contributions. In this paper, we tackle the hard technical problems involved in the approximate processing of complex (possibly multi-join) aggregate decision support queries over continuous data streams with limited memory. Our approach is based on randomizing techniques that compute ....
[Article contains additional citation context not shown here]
S. Acharya, P.B. Gibbons, V. Poosala, and S. Ramaswamy. "Join Synopses for Approximate Query Answering". In Proc. of the 1999.
....6]. Much less work has been done on estimating the selectivity of joins. Commercial DBMSs commonly make the uniform join assumption. One approach that has been suggested is based on random sampling: randomly sample the two tables, and compute their join. This approach is flawed in several ways [1], and some work has been devoted to alternative approaches that generate samples in a more targeted way [20]. An alternative recent approach is the work of Acharya et al. [1] on join synopses, which maintains statistics for a few distinguished joins. To our knowledge, no work has been done on ....
....is based on random sampling: randomly sample the two tables, and compute their join. This approach is flawed in several ways [1], and some work has been devoted to alternative approaches that generate samples in a more targeted way [20]. An alternative recent approach is the work of Acharya et al. [1] on join synopses, which maintains statistics for a few distinguished joins. To our knowledge, no work has been done on approaches that support selectivity estimation for queries containing both select and join operations in real-world domains. In this paper, we propose an alternative approach ....
S. Acharya, P. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In SIGMOD. ACM Press, 1999.
....this architecture to derive the benefits that the architecture provides, while at the same time addressing some of its limitations. One of the important limitations addressed in our work is their assumption that there is little variability in the data. Acharya, Gibbons, Poosala, and Ramaswamy [2] proposed the use of synopses (i.e. precomputed samples of relations) for answering aggregation queries. Gibbons and Matias [9] developed techniques for the fast incremental maintenance of summary statistics, and considered their application to providing approximate query answers. A key ....
Acharya S., Gibbons P., Poosala V., and Ramaswamy S. Join synopses for approximate query answering. In Proceedings of the ACM SIGMOD Conference, 1999.
....the work of Faloutsos et al. in multiple dimensions. Maximum entropy has also been used for the identification of interesting correlations in data [Tho98]. There exists a sizeable bibliography in histogramming techniques and approximate query answering [IP95, PIHS96, JKM+98, VWI98, AGPR99, SFB99, BS97, BW00]. Our approach is fundamentally different. Previous work focused on the problem of data reconstruction by constructing specialized summarized representations (typically histograms) of the data. We argue that, since data are already stored in an aggregated form in the ....
Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, and Sridhar Ramaswamy. Join Synopses for Approximate Query Answering. In ACM SIGMOD, pages 275--286, Philadelphia, PA, USA, June 1999.
....1. INTRODUCTION Histograms capture distribution statistics in a space-efficient fashion. They have been designed to work well for numeric value domains, and have long been used to support cost-based query optimization [22, 11, 12, 25, 27, 26, 23, 14, 13, 15, 20, 17], approximate query answering [7, 2, 1, 29, 28, 24], data mining [16], and map simplification [3]. Query optimization is a problem of central interest to database systems. A database query is translated by a parser into a tree of physical database operators (denoting the dependencies between operators) that have to be executed and form the query ....
....is that for very large data sets on which execution of complex queries is time consuming, it is much better to provide a fast approximate answer. This is very useful for quick and approximate analysis of large data sets [2]. Research has been conducted on the construction of histograms for this task [7, 1] as well as efficient approximations of datacubes [8] via histograms [28, 29, 24]. An additional application of histograms is data mining of large time series datasets. Histograms are an alternate way to compress time series information. Through the application of the minimum description length ....
S. Acharya, P. Gibbons, V. Poosala, and S. Ramaswamy. Join Synopses For Approximate Query Answering. Proceedings of ACM SIGMOD, pages 275-286, June 1999.
....[BDF+97] for a recent survey. [GM99] presented a formal framework for evaluating such sublinear space synopsis data structures, and a survey of some of the results in this area. There has been a flurry of recent work in approximate query answering (e.g., [VL93, Olk93, BDF+97, HHW97, GM98, AGPR99, HH99, VW99, IP99, AGP00, GLR00, CCMN00, CGRS00, MVW00, CDN01, LM01, Gib01, GKS01]). The work in [HHW97, AGPR99, HH99, IP99, CGRS00] looked at the problem of providing approximate answers to queries seeking aggregates (e.g., count, sum, avg) of attribute values for the tuples satisfying a ....
....data structures, and a survey of some of the results in this area. There has been a flurry of recent work in approximate query answering (e.g., [VL93, Olk93, BDF+97, HHW97, GM98, AGPR99, HH99, VW99, IP99, AGP00, GLR00, CCMN00, CGRS00, MVW00, CDN01, LM01, Gib01, GKS01]). The work in [HHW97, AGPR99, HH99, IP99, CGRS00] looked at the problem of providing approximate answers to queries seeking aggregates (e.g., count, sum, avg) of attribute values for the tuples satisfying a predicate that occur in the join of multiple relations. The count aggregate (over joins but with no other predicates) ....
[Article contains additional citation context not shown here]
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proc. ACM SIGMOD International Conf. on Management of Data, pages 275--286, June 1999.
....queries. Approximate query answering is becoming an indispensable means for providing fast response times to decision support queries over large data warehouses. Fast, approximate answers are often provided from small synopses of the data (such as samples, histograms, wavelet decompositions, etc.) [14, 37, 3, 25, 33, 36, 1, 6, 12, 8]. Commercial data warehouses are approaching 100 terabytes, and new decision support arenas such as click stream analysis and IP traffic analysis only increase the demand for high-speed query processing over terabytes of data. Thus it is crucial to provide highly accurate approximate answers to an ....
....approximate answers to an increasingly rich set of queries. Distinct values queries are an important class of decision support queries, and good quality estimates for such queries may be returned to users as part of an online aggregation system [20, 17] or an approximate query answering system [14, 37, 2, 3, 25, 33, 36, 1, 6, 12, 8, 26]. Because the answers are returned to the users, the estimates must be highly accurate (say within 10% or better with 95% confi .... [Figure 1: Distinct Values Query template: select count(distinct target attr) from rel where P] select count(distinct o_custkey) from orders where o_orderdate = ....
[Article contains additional citation context not shown here]
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proc. ACM SIGMOD International Conf. on Management of Data, pages 275--286, June 1999.
....to the function being estimated, e.g. the number of occurrences of the label within the stream. We could also store e(i) if desired. Maintaining a distinct labels sample in the presence of new data is useful for approximate query answering systems for data warehouses, such as the Aqua system [1, 2]. Average interarrival gap. Our relative error approximation of U permits a relative error approximation of the average interarrival gap I in the union of multiple streams, for the common case where time is discretized, i.e. I = (number of time slots) / (size of union). Set resemblance. The set ....
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proc. ACM SIGMOD International Conf. on Management of Data, pages 275--286, June 1999.
....no accumulation value, and the hash table is used merely to ensure that each distinct label is stored only once in the sample at a party. Maintaining a distinct labels sample in the presence of new data is useful for approximate query answering systems for data warehouses, such as the Aqua system [AGPR99a, AGPR99b]. As discussed in Section 1, industry benchmarks have many queries and reports over distinct values. Average interarrival gap. Our relative error approximation of U permits a relative error approximation of the average interarrival gap G in the union of multiple streams, for the common case where ....
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proc. ACM SIGMOD International Conf. on Management of Data, pages 275--286, June 1999.
No context found.
S. Acharya, P.B. Gibbons, V. Poosala, and S. Ramaswamy (1999 ), Join synopses for approximate query answering, Proc. of the ACM SIGMOD Conference.
No context found.
Acharya, S., Gibbons, P. B., Poosala, V., Ramaswamy, S., Join Synopses for Approximate Query Answering, In Proc. of the 1999.
No context found.
S. Acharya, P.B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 275--286, Philadelphia, PA, June 1999.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. "Join Synopses for Approximate Query Answering". In Proc. of the 1999 ACM SIGMOD Intl. Conf. on Management of Data.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join Synopses for Approximate Query Answering. In Proc. of the 1999.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 275--286, 1999.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In SIGMOD Conference, 1999.
No context found.
S. Acharya, P. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In SIGMOD. ACM Press, 1999.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proc. of the ACM SIGMOD 1999.
No context found.
S. Acharya, P. Gibbons, V. Poosala, and S. Ramaswamy, "Join Synopses for Approximate Query Answering," Proc. SIGMOD, pp. 275-286, June 1999.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In SIGMOD 1999.
No context found.
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In SIGMOD Proceedings, pages 275--286, 1999.
No context found.
Acharya S., Gibbons P., Poosala V., Ramaswamy S.: Join Synopses for Approximate Query Answering. SIGMOD Conf. (1999) 275-286