  Trading Quality for Time with Nearest-Neighbor Search

by Roger Weber, Klemens Böhm
http://mercator.inf.ethz.ch/paper/EDBT00Long.ps.gz

Abstract:

In many situations, users would readily accept an approximate query result if the query is evaluated faster. In particular, this holds true for Nearest-Neighbor Search (NN-Search), a typical implementation of similarity search. In this article, we investigate approximate NN-query evaluation techniques based on the VA-File. This data structure efficiently supports NN-query evaluation in high dimensions. The VA-File contains an approximation of each point. VA-File based NN-query evaluation computes bounds on the distance between each point and the query to filter out the vast majority of points. Then, a second phase identifies the NN by computing the exact distances of all remaining points. To develop approximate query-evaluation techniques, we proceed in two steps. First, we derive an analytic model for VA-File based NN-search in order to investigate the relationship between approximation granularity, effectiveness of the filtering step, and search performance. In more detail, we develop formulae for the distribution of the error of the bounds and for the duration of the different phases of query evaluation. Second, based on these results, we develop different approximate query-evaluation techniques. The first one adapts the bounds to obtain a more rigid filtering; the second one skips the computation of the exact distances. Experiments show that these techniques have the desired effect: for instance, when allowing for a small but specific reduction of result quality, we observed a speedup by a factor of 7 in 50-NN search.
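
The two-phase evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes points in the unit hypercube, a uniform grid with the same number of bits per dimension, and Euclidean distance. The function names (`build_va_file`, `bounds`, `va_knn`) and the parameters `shrink` and `skip_exact` are hypothetical; in particular, `shrink` is only one plausible way to tighten the bounds, and ranking by lower bound when `skip_exact` is set is likewise an assumption rather than the rule derived in the paper.

```python
import heapq
import math


def build_va_file(points, bits=4):
    """Approximate each coordinate in [0, 1] by the index of its grid cell."""
    cells = 1 << bits
    return [[min(int(x * cells), cells - 1) for x in p] for p in points]


def bounds(query, approx, bits=4):
    """Lower and upper bound on the distance from query to any point in the cell."""
    width = 1.0 / (1 << bits)
    lo2 = hi2 = 0.0
    for q, c in zip(query, approx):
        left, right = c * width, (c + 1) * width
        # Nearest edge of the cell (0 if the query coordinate lies inside it).
        near = left - q if q < left else q - right if q > right else 0.0
        # Farthest edge of the cell.
        far = max(abs(q - left), abs(q - right))
        lo2 += near * near
        hi2 += far * far
    return math.sqrt(lo2), math.sqrt(hi2)


def va_knn(points, va, query, k, bits=4, shrink=0.0, skip_exact=False):
    """Return indices of the (approximate) k nearest neighbors of query.

    shrink (0 <= shrink <= 0.5) is a hypothetical knob: it moves both bounds
    toward each other by that fraction of their gap, so more points are
    filtered out at the risk of missing true neighbors. shrink=0 is exact.
    """
    bnds = []
    for a in va:
        lo, hi = bounds(query, a, bits)
        gap = hi - lo
        bnds.append((lo + shrink * gap, hi - shrink * gap))

    # Phase 1: filter. At least k points lie within the k-th smallest upper
    # bound, so any point whose lower bound exceeds it can be discarded.
    threshold = heapq.nsmallest(k, (hi for _, hi in bnds))[-1]
    cands = sorted((lo, i) for i, (lo, _) in enumerate(bnds) if lo <= threshold)

    if skip_exact:
        # Approximate variant: rank by lower bound, never touch the vectors.
        return [i for _, i in cands[:k]]

    # Phase 2: refine. Visit candidates by increasing lower bound and stop
    # once the next lower bound exceeds the current k-th best exact distance.
    best = []  # max-heap of (-distance, index) holding the k best so far
    for lo, i in cands:
        if len(best) == k and lo > -best[0][0]:
            break
        d = math.dist(query, points[i])
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return [i for _, i in sorted((-nd, i) for nd, i in best)]
```

With `shrink=0.0` and `skip_exact=False` the sketch returns the same neighbors as a linear scan, while reading exact vectors only for the candidates that survive the filter. Raising `shrink` or setting `skip_exact=True` trades result quality for less work in the second phase, which is the trade-off the abstract refers to.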
