@MISC{Indyk04nearestneighbors,
    author = {Piotr Indyk},
    title = {Nearest Neighbors In High-Dimensional Spaces},
    year = {2004}
}

Abstract

In this chapter we consider the following problem: given a set P of points in a high-dimensional space, construct a data structure which, given any query point q, finds the point in P closest to q. This problem, called nearest neighbor search, is of significant importance to several areas of computer science, including pattern recognition, searching in multimedia data, vector compression [GG91], computational statistics [DW82], and data mining. Many of these applications involve data sets which are very large (e.g., a database containing Web documents could contain over one billion documents). Moreover, the dimensionality of the points is usually large as well (e.g., on the order of a few hundred). Therefore, it is crucial to design algorithms which scale well with the database size as well as with the dimension. The nearest-neighbor problem is an example of a large class of proximity problems, which, roughly speaking, are problems whose definitions involve the notion of...