| Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-based histograms for selectivity estimation. In Proc. of SIGMOD, pages 448--459, Seattle, WA, 1998. |
....low time and memory requirements [25]. However, the independence model has long been criticized for being an inappropriately simple model for real world data sets [8] and, in addition, it is not flexible in the sense of the definition above. More sophisticated techniques, such as wavelet models [18, 5] and multidimensional histograms [25, 20], have trouble scaling up to high dimensions because of the phenomenon known as the curse of dimensionality. For this reason all papers mentioned above only conduct experiments on low dimensional (up to 15 attributes) data sets. Random sampling ....
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998.
....are addressed in StatStream [22]. In this paper we extend the sliding window model to the elastic sliding window model, making the choice of sliding window size more automatic. Wavelets are heavily used in the context of data management and data mining, including selectivity estimation [17], approximate query processing [20, 5], dimensionality reduction [6], and streaming data analysis [12]. However, their use in elastic burst detection is innovative. We achieve efficient detection of subsequences with burst in a time series by filtering lots of subsequences that are unlikely to have ....
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In L. M. Haas and A. Tiwary, editors, SIGMOD 1998.
....# buckets to histogram h # , 1 M ) e # ) where e # is the error of histogram h # and p # is the probability of attribute X # being queried. 2.1 Error Metrics. From an application perspective, the error metrics we most care about are absolute error and relative error [14] in the estimate returned by a histogram. Let S # be the actual size of a query q # and let S # be the estimated size of the query by a histogram. The absolute error of the query is defined as: And the relative error of the query is defined as: #S Absolute error and relative ....
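The two formulas in the excerpt above did not survive extraction. As a hedged reading only, assuming the conventional definitions and writing $S$ for the actual query size and $\hat{S}$ for the histogram's estimate (symbols chosen here for illustration, not recovered from the citing paper), they would take this form:

```latex
\[
  \text{absolute error} = \bigl| S - \hat{S} \bigr| ,
  \qquad
  \text{relative error} = \frac{\bigl| S - \hat{S} \bigr|}{S}
  \quad (S > 0).
\]
```

For example, under these assumed definitions an actual size of 200 and an estimate of 180 give an absolute error of 20 and a relative error of 0.1.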
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of the ACM SIGMOD conference, pages 448-459, 1998.
....Wavelet based techniques provide a mathematical tool for the hierarchical decomposition of functions, with a long history of successful applications in signal and image processing [74, 100, 137]. Recent studies have demonstrated the applicability of wavelets to selectivity estimation [91] and the approximation of range sum queries over OLAP data cubes [144, 145]. The idea is to apply wavelet decomposition to the input data collection (attribute column(s) or OLAP cube) and retain the best few wavelet coefficients as a compact synopsis of the input data. The results of Vitter et al. ....
....based on the multi dimensional Haar wavelet decomposition. Haar wavelets are conceptually simple, very fast to compute, and have been found to perform well in practice for a variety of applications ranging from image editing and querying [100, 137] to selectivity estimation and OLAP approximations [91, 144]. Recent work has also investigated methods for dynamically maintaining Haar based data representations [92]. In this section, we discuss Haar wavelets in both one and multiple dimensions. One Dimensional Haar Wavelets. Suppose we are given a one dimensional data vector A containing the following ....
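The two excerpts above describe applying a Haar decomposition and retaining the "best few" coefficients as a synopsis. The following is a minimal sketch of that idea in one dimension, not the cited authors' implementation. It assumes the unnormalized averaging-and-differencing form of the Haar transform, and it assumes "best" means largest magnitude after weighting each coefficient by the square root of its support, a common choice for minimizing squared reconstruction error. All function names and the sample vector are hypothetical.

```python
import math


def haar_decompose(data):
    """Unnormalized 1-D Haar transform by pairwise averaging and differencing.

    Returns [overall average, detail coefficients from coarsest to finest].
    The input length must be a power of two.
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(x) for x in data]
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels end up in front
        current = averages
    return current + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose: expand each average into (a + d, a - d)."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(current)]
        pos += len(current)
        expanded = []
        for a, d in zip(current, diffs):
            expanded.extend([a + d, a - d])
        current = expanded
    return current


def keep_largest(coeffs, k):
    """Zero out all but the k coefficients with the largest weighted magnitude.

    Assumption: coefficient i sits at level floor(log2(i)) (index 0 is treated
    like level 0) and is weighted by sqrt(support), support = n / 2**level.
    """
    n = len(coeffs)

    def weight(i):
        level = max(i, 1).bit_length() - 1
        return abs(coeffs[i]) * math.sqrt(n / 2 ** level)

    keep = set(sorted(range(n), key=weight, reverse=True)[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    vector = [2, 2, 0, 2, 3, 5, 4, 4]  # arbitrary illustrative data
    coeffs = haar_decompose(vector)
    synopsis = keep_largest(coeffs, 2)
    print(coeffs)
    print(haar_reconstruct(synopsis))
```

With the sample vector, the full coefficient list is `[2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]`. Keeping only two coefficients reconstructs the data as two flat halves (1.5 and 4.0), which shows how a small synopsis preserves coarse structure while discarding fine detail.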
Yossi Matias, Jeffrey Scott Vitter, and Min Wang. "Wavelet-Based Histograms for Selectivity Estimation". In Proceedings of the 1998.
....important role in spatial query processing optimization by providing selectivity estimation [1, 2, 14, 19]. A variety of techniques [7, 8] have been recently developed for effectively summarizing data in relational datasets. The most common techniques are samples [16], histograms [18, 7], wavelets [17], and sketches [7]. In contrast, techniques for summarizing topological relations against spatial datasets are relatively few. In this paper, we will investigate the problem of summarizing rectangular objects for range (window) queries. Several histogram based summarization techniques have ....
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD 1998.
....There are a number of methods suggested in the literature employing different ways to find the density estimator for multi dimensional datasets. These include computing multi dimensional histograms [PI97], [CMN99], [KJJ99], [APR99]; using various transforms, like the wavelet transformation [VWI98], [MVW98], SVD [PI97], or the discrete cosine transform [LKC99], on the data; using kernel estimators [BKS99], [Sco92], [Sil86]; as well as sampling [OR90], [LNS90], [HS92]. Although our biased sampling technique can use any density estimation method, using kernel density estimators is a good choice. Work on ....
Y. Matias, J. Scott Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. Proc. of the 1998 ACM SIGMOD Intern. Conf. on Management of Data, 1998.
....best known and most effective multiresolution coding techniques. In this section, we sketch a TAG aggregate function for encoding a set of readings in a sensor network using Haar wavelets, the simplest and most widely used wavelet encoding. Our discussion here focuses on wavelet histograms [20], which capture information about the statistical distribution of sensor values, without placing significance on any ordering of the values. We drop coefficients with low absolute values (thresholding) to keep the communication costs down, but always retain the value of coefficient 0; in Haar ....
.... : In this case, we do not compress, but simply store all the values. We concatenate the values from r2.data to the end of r1.data, and update the offsets and count of r1 accordingly. In the interest of brevity, we do not overview wavelets here; the interested reader is referred to [23] for a good practical overview of wavelets, or to [20] for a simple introduction to Haar wavelets. Note that the choice of ordering r1 before r2 is rather arbitrary: for now, we assume that the network topology and scheduling determines which input is first, and which is second. The loglen variable ....
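The thresholding rule quoted in this entry's first excerpt (drop coefficients with low absolute values, but always retain coefficient 0) can be sketched as follows. This is an illustration under stated assumptions, not the cited system's code: the function names, the `budget` parameter, and the sparse (index, value) representation are all hypothetical, and coefficients are ranked by raw absolute value as the excerpt describes.

```python
def threshold_coefficients(coeffs, budget):
    """Keep coefficient 0 plus the (budget - 1) largest-magnitude others.

    Returns a sparse list of (index, value) pairs sorted by index, which is
    what would be transmitted instead of the full coefficient vector.
    """
    if budget < 1:
        raise ValueError("budget must allow at least coefficient 0")
    if not coeffs:
        return []
    # Rank every coefficient except index 0 by absolute value, largest first.
    ranked = sorted(range(1, len(coeffs)),
                    key=lambda i: abs(coeffs[i]), reverse=True)
    # Zero-valued coefficients carry no information, so they are never sent.
    kept = [0] + [i for i in ranked[:budget - 1] if coeffs[i] != 0]
    return sorted((i, coeffs[i]) for i in kept)


def expand(pairs, length):
    """Rebuild a dense coefficient vector, treating dropped entries as zero."""
    dense = [0.0] * length
    for index, value in pairs:
        dense[index] = value
    return dense


if __name__ == "__main__":
    coefficients = [2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]  # sample input
    message = threshold_coefficients(coefficients, budget=3)
    print(message)
    print(expand(message, len(coefficients)))
```

Sending index-value pairs rather than a dense vector is one way to realize the communication saving the excerpt mentions; the receiver fills the dropped positions with zeros before inverting the transform.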
[Article contains additional citation context not shown here]
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD, pages 448--459, Seattle, Washington, June 1998.
.... [15] have coined the general term data synopses: advanced forms of histograms [30, 16, 20], spline synopses [22, 23], sampling [6, 17, 14], and parametric curve fitting techniques [34, 9], all the way to highly sophisticated methods based on kernel estimators [2] or wavelets and other transforms [26, 25, 4]. However, most of these techniques take the local viewpoint of optimizing the approximation error for a single data distribution such as one database table with pre selected relevant attributes. The equally important problem which combination of synopses to maintain on the application's various ....
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of the ACM SIGMOD Conference, pages 448--459, 1998.
.... a histogram bucket. In addition to the previous techniques, window query selectivity on non uniform data can be estimated using fractals and power laws [BF95, PF98, PF01], sampling [OR90, PK00, CDD 01, WAE01], kernel estimation [BKS99], singular value decomposition [PI97], compressed histograms [MVW98, LKC99, MVW00, TGIK02], maximal independence [DGR01], Euler formula [SAE02b], etc. Furthermore, Aboulnaga and Naughton [AN00] discuss the problem on general polygon objects. Nearest neighbor distance. The cost of a kNN query is closely related to the nearest distance ND k between the query and its k-th NN. Figure ....
Matias, Y., Vitter, J., Wang, M. Wavelet-Based Histograms for Selectivity Estimation. ACM SIGMOD, 1998.
....a relation for a tuple, to evaluate one or more selection predicates. Most selectivity estimation techniques proposed so far for multidimensional feature spaces work well only for low dimensional spaces, but are not accurate in the high dimensional spaces commonly used to represent image features [87, 69, 1]. More suitable techniques (based, for instance, on fractals) are beginning to appear in the literature [8, 36]. Work on cost models for range and k-NN searches on multidimensional index structures includes earlier proposals for low dimensional index structures (i.e. R tree) in [34, 109] and more ....
Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-based histograms for selectivity estimation. In Proc. 1998.
....quantities of data. The most popular approximate processing techniques include histograms, random sampling and wavelets. In recent years there has been a flurry of research on the application of these techniques to such areas as selectivity estimation and approximate query processing. The work in [4, 10, 18] demonstrated that wavelets can achieve increased accuracy to queries over histograms and random sampling. However, one of the shortcomings of wavelets is that they cannot easily extend to datasets containing multiple measures. Such datasets are very common in many database applications. For ....
....algorithms for multiple measures, to which we refer as the Individual and Combined algorithms, can be used for compression of colored images. More recently, wavelets have been applied successfully in answering range sum aggregate queries over data cubes [18, 19], in selectivity estimation [10], and in approximate query processing [4]. In the aforementioned work, wavelets were shown to produce results with increased accuracy to queries when compared to histograms and random sampling. The work in [4, 10, 18] clearly demonstrated that wavelets can be accurate even in high dimensional ....
[Article contains additional citation context not shown here]
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In ACM SIGMOD 1998.
....techniques and the differences among them. Finally, we discuss related work for the adaptive query estimation problem. 2.1 Preliminaries. Given a relation R with an attribute X, we are interested in estimating the size of a range query on attribute X. We adopt the notation developed in [PIHS96, MVW98], where information about the attribute domain is summarized and used in estimating the size of the query. We assume that the discrete domain D of attribute X is $\{d_0, d_1, d_2, \ldots, d_{N-1}\}$, which contains all the possible values of X. A range query is specified using a range predicate of ....
Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pages 448--459, Seattle, 1998.
....study allows us to draw a variety of conclusions on the general characteristics of each approach in terms of memory requirements, the online time taken to answer a query, and the accuracy of the resulting estimate. Previous work on this problem has investigated the use of wavelets (see, e.g., [20], [33], [7]), multidimensional histograms ([23], [30]), and sampling ([16], [1]) for query approximation. The use of inclusion exclusion for query approximation has been previously mentioned in [18]. Probabilistic models of various forms have also been investigated in limited contexts. Mixtures ....
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD'98), pages 448-459. New York, NY: ACM Press, 1998.
....the attribute Ai is partitioned into). The value in each slot is the number of times this value combination appears in R. One approach to find an approximation to the joint data distribution is to approximate the array D directly. The Singular Value Decomposition (SVD) [38], the wavelet transform [55, 32], and the Discrete Cosine Transform (DCT) [31] have been proposed to find such an approximation. The basic operation of all decomposition techniques is to essentially perform a change of bases. This paper, prepared for a committee of the Computer Science and Telecommunications Board, should not be ....
Y. Matias, J. Scott Vitter, M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proc. of the 1998.
....7. 2. RELATED WORK. Extensive research in the literature has studied static datasets, especially for query selectivity estimation. As our approach uses summary representation of a data stream and provides approximation answers, it is related to histograms [23, 15], sampling [27], and wavelets [19]. However, because these techniques work in static settings, they are not applicable to data streams. As the need for dynamic datasets arises for an increasing number of applications, much research has been directed to incrementally maintaining summary representations. Babu et al. [2] presented ....
Y. Matias, J.S. Vitter, and M. Wang. Wavelet-Based histograms for Selectivity Estimation. In Proceeding of the ACM SIGMOD Conference, 1998.
....Many other algorithms have been proposed for query estimation with different precision, e.g. histogram [8], random sampling [9], Quasi Cubes [10], wavelet transform [11], etc. Vitter et al. [11] are the first to propose using wavelet transforms on data cubes to answer range sum queries. In [11, 12, 13], wavelet transforms are used to provide quick estimations of range sum queries. By storing only a few wavelet transform coefficients to approximate the entire data cubes, Vitter et al. achieve low space requirement and short query response time. However, the estimation techniques proposed in [11, ....
....information of the original cubes. For example, sales data may have very small variations from day-to-day values within a month. In such cases, the different values are likely to be very small. Proposals have been made to ignore small values and compress the space needed to store the data cubes [12]. In this paper, however, we do not consider the general problem of data compression using wavelets. 2.2 Progressive Query Processing and Updates. The wavelet decomposition results in a hierarchical structure of wavelet coefficients at increasing resolution levels. Given a range sum query, we can ....
[Article contains additional citation context not shown here]
Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of the 1998.
....constructing multidimensional histograms and study the effectiveness of different techniques. They also propose an approach based on singular value decomposition, applicable only in the two dimensional case. A newer approach is the use of wavelets to approximate the underlying joint distribution [21, 27, 6]. Much less work has been done on estimating the selectivity of joins. Commercial DBMSs commonly make the uniform join assumption. One approach that has been suggested is based on random sampling: randomly sample the two tables, and compute their join. This approach is flawed in several ways [1] ....
....grows exponentially in the number of attributes, so that explicitly representing the joint distribution Pw is almost always intractable. Several approaches have been proposed to circumvent this issue by approximating the joint distribution (or projections of it) using a more compact structure [25, 21]. We also propose the use of statistical models that approximate the full joint distribution. However in order to represent the distribution in a compact manner, we exploit the conditional independence that often holds in a joint distribution over real world data. By decomposing the representation ....
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD. ACM Press, 1998.
.... [13]. Some variations over histograms include the use of parametric curve fitting techniques inside buckets [10], self tuning histograms [1], and lately, multidimensional histograms for dealing with real valued attributes [9]. Other multidimensional density estimation techniques are wavelets [12] and fractal dimension concepts [8, 2]. 7 Conclusions. In this paper, we have presented a new robust scheme for answering multiattribute top k queries by mapping them to relational selection queries. We have reported the first evaluation of the performance of top k mapping techniques over a ....
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD, pages 448--459, Seattle, Washington, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998.
No context found.
Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-based histograms for selectivity estimation. In Procs. of ACM-SIGMOD, pages 448-459, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pages 448--459, Seattle, WA, June 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pages 448--459, Seattle, WA, June 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pages 448--459, Seattle, WA, June 1998.
....has attracted much attention recently [GMP97, AC99, LKC99, DIR99]. 3. Extension to multidimensional data. For multidimensional histograms that capture joint distribution of correlated attributes, achieving good accuracy and designing efficient maintenance methods becomes particularly difficult [MD88, PI97, MVW98, AC99, VWI98, VW99]. Equi depth histograms [PSC84] are the most popular histograms and are used in many commercial DBMSs. They are also easy to implement. In some cases they provide good guidance in selectivity estimation and other data processing tasks. Several recent works have dealt with their maintenance ....
....maintained by simple mechanisms to keep partition structures. It is very difficult to capture complex multidimensional data distributions with simple partitions. If more advanced partition methods are used, the accuracy becomes better, but the implementation and maintenance become difficult. In [MVW98], we introduce a new type of histogram that is based upon the powerful mathematical tool of wavelets and multiresolution analysis. The wavelet based histogram is fundamentally different from traditional approaches and offers noticeable improvements in accuracy over traditional equi depth ....
[Article contains additional citation context not shown here]
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the
No context found.
Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-based histograms for selectivity estimation. In Proc. of SIGMOD, pages 448--459, Seattle, WA, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of ACM SIGMOD Conference, 1998.
No context found.
Y. Matias, J.S. Vitter, M. Wang. Wavelet-Based Histograms for Selectivity Estimation. SIGMOD, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proc. of SIGMOD Conf., pages 448--459, 1998.
No context found.
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings ACM SIGMOD Conference, pages 448--459, 1998.
No context found.
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proceedings of the 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In L. M. Haas and A. Tiwary, editors, SIGMOD 1998.
No context found.
Matias, Y., Vitter, J. S., Wang, M., Wavelet-based histograms for selectivity estimation, Proc. of the 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. Proc. of ACM SIGMOD, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of the ACM SIGMOD conference, pages 448-459, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conference, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. "Wavelet-Based Histograms for Selectivity Estimation". In Proc. of the 1998 ACM SIGMOD Intl. Conf. on Management of Data.
No context found.
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD. ACM Press, 1998.
No context found.
Y. Matias, J. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proc. of the ACM SIGMOD Intl. Conf. on Management of Data, 1998.
No context found.
Y. Matias, J.S. Vitter, and M. Wang, "Wavelet-Based Histograms for Selectivity Estimation," Proc. ACM SIGMOD, pp. 448-459, 1998.
No context found.
Y. Matias, J.S. Vitter, and M. Wang, "Wavelet-Based Histograms for Selectivity Estimation," Proc. 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD Proceedings, pages 448--459, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang, "Wavelet-based histograms for selectivity estimation," in Proc. of ACM SIGMOD '98, 1998, pp. 448--459.
No context found.
Matias Y., Vitter J., Wang M.: Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conf. (1998) 448-459
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of ACM SIGMOD Conference, 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In SIGMOD, pages 448--459, Seattle, Washington, June 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In L. M. Haas and A. Tiwary, editors, SIGMOD 1998.
No context found.
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. SIGMOD pages 448-459, 1998.
CiteSeer.IST - Copyright Penn State and NEC