by Ashraf Aboulnaga, Jeffrey F. Naughton
In ICDE'00, Proceedings of the 16th International Conference on Data Engineering
http://www.cs.wisc.edu/~ashraf/pubs/spatselec.ps
Abstract:
Optimizing queries that involve operations on spatial data requires estimating the selectivity and cost of these operations. In this paper, we focus on estimating the cost of spatial selections, or window queries, where the query windows and data objects are general polygons. Cost estimation techniques previously proposed in the literature only handle rectangular query windows over rectangular data objects, thus ignoring the very significant cost of exact geometry comparison (the refinement step in a "filter and refine" query processing strategy). The cost of the exact geometry comparison depends on the selectivity of the filtering step and the average number of vertices in the candidate objects identified by this step. In this paper, we introduce a new type of histogram for spatial data that captures the complexity and size of the spatial objects as well as their location. Capturing these attributes makes this type of histogram useful for accurate estimation, as we experimentally demonstrate. We also investigate sampling-based estimation approaches. Sampling can yield better selectivity estimates than histograms for polygon data, but at the high cost of performing exact geometry comparisons for all the sampled objects.
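To make the dependence on filter selectivity and vertex counts concrete, here is a minimal sketch of a histogram-style estimate. It is not the histogram introduced in the paper: the `Bucket` layout, the uniform-distribution assumption inside each bucket, and the linear cost model with its `cost_per_vertex` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical histogram bucket (not the paper's structure)."""
    box: tuple           # bucket extent as (xmin, ymin, xmax, ymax)
    count: int           # number of objects assigned to this bucket
    avg_vertices: float  # average vertex count of those objects


def overlap_fraction(box, window):
    """Fraction of the area of `box` covered by the rectangle `window`."""
    w = min(box[2], window[2]) - max(box[0], window[0])
    h = min(box[3], window[3]) - max(box[1], window[1])
    if w <= 0 or h <= 0:
        return 0.0
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (w * h) / area


def estimate(buckets, window_mbr, window_vertices, cost_per_vertex=1.0):
    """Return (expected filter-step candidates, estimated refinement cost).

    Assumes objects are spread uniformly within each bucket, and that one
    exact geometry comparison costs an amount linear in the vertices of
    the query window plus the vertices of the candidate object.
    """
    candidates = 0.0
    candidate_vertices = 0.0
    for b in buckets:
        expected = b.count * overlap_fraction(b.box, window_mbr)
        candidates += expected
        candidate_vertices += expected * b.avg_vertices
    cost = cost_per_vertex * (candidates * window_vertices + candidate_vertices)
    return candidates, cost


buckets = [
    Bucket((0, 0, 10, 10), count=100, avg_vertices=20.0),
    Bucket((10, 0, 20, 10), count=50, avg_vertices=8.0),
]
candidates, cost = estimate(buckets, window_mbr=(5, 0, 15, 10), window_vertices=4)
```

In this toy setup the two buckets contribute the same fraction of their objects, but the bucket holding more complex polygons dominates the refinement cost, which is the effect that a rectangle-only cost model would miss.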