by Swarup Acharya, Viswanath Poosala, Sridhar Ramaswamy
In Proc. ACM SIGMOD Int. Conf. on Management of Data
http://www.bell-labs.com/user/acharya/papers/spatial.ps.gz
Abstract:
Selectivity estimation of queries is an important and well-studied problem in relational database systems. In this paper, we examine selectivity estimation in the context of Geographic Information Systems, which manage spatial data such as points, lines, poly-lines and polygons. In particular, we focus on point and range queries over two-dimensional rectangular data. We propose several techniques based on using spatial indices, histograms, binary space partitionings (BSPs), and the novel notion of spatial skew. Our techniques carefully partition the input rectangles into subsets and approximate each partition accurately. We present a detailed experimental study comparing the proposed techniques and the best-known sampling and parametric techniques. We evaluate them using synthetic as well as real-life TIGER datasets. Based on our experiments, we identify a BSP-based partitioning that we call Min-Skew, which consistently provides the most accurate selectivity estimates for spatial queries. The Min-Skew partitioning can be constructed efficiently, occupies very little space, and provides accurate selectivity estimates over a broad range of spatial queries.
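As an illustration of the general idea of estimating selectivity from a partitioning, here is a minimal sketch in Python. It is not the paper's Min-Skew construction: it assumes the partitioning has already been built, and that each bucket stores only a bounding box, a rectangle count, and the average rectangle width and height. The names `Rect`, `Bucket`, and `estimate_count`, and the uniformity assumption inside each bucket, are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # An "inverted" rectangle (no overlap) has zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    def intersect(self, other: "Rect") -> "Rect":
        return Rect(max(self.x1, other.x1), max(self.y1, other.y1),
                    min(self.x2, other.x2), min(self.y2, other.y2))


@dataclass(frozen=True)
class Bucket:
    box: Rect      # region covered by this bucket
    count: int     # number of input rectangles assigned to the bucket
    avg_w: float   # average width of those rectangles
    avg_h: float   # average height of those rectangles


def estimate_count(buckets: Iterable[Bucket], query: Rect) -> float:
    """Estimate how many rectangles intersect `query`.

    Assumption (not from the paper): rectangle centers are spread
    uniformly inside each bucket, and every rectangle has the bucket's
    average size. A rectangle then intersects the query when its center
    falls inside the query grown by half the average width and height.
    """
    total = 0.0
    for b in buckets:
        bucket_area = b.box.area()
        if bucket_area == 0:
            continue
        grown = Rect(query.x1 - b.avg_w / 2, query.y1 - b.avg_h / 2,
                     query.x2 + b.avg_w / 2, query.y2 + b.avg_h / 2)
        total += b.count * grown.intersect(b.box).area() / bucket_area
    return total
```

Dividing the returned estimate by the total number of input rectangles gives a selectivity between 0 and 1. The estimate is only as good as the uniformity assumption within each bucket, which is why the choice of partitioning matters.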