Results 1-10 of 34
An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions
 ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS
, 1994
Cited by 984 (32 self)
Consider a set S of n data points in real d-dimensional space, $\mathbb{R}^d$, where distances are measured using any Minkowski metric. In nearest neighbor searching we preprocess S into a data structure, so that given any query point $q \in \mathbb{R}^d$, the closest point of S to q can be reported quickly. Given any positive real $\epsilon$, a data point p is a $(1 + \epsilon)$-approximate nearest neighbor of q if its distance from q is within a factor of $(1 + \epsilon)$ of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of n points in $\mathbb{R}^d$ in $O(dn \log n)$ time and $O(dn)$ space, so that given a query point $q \in \mathbb{R}^d$ and $\epsilon > 0$, a $(1 + \epsilon)$-approximate nearest neighbor of q can be computed in $O(c_{d,\epsilon} \log n)$ time, where $c_{d,\epsilon} \le d \lceil 1 + 6d/\epsilon \rceil^d$ is a factor depending only on dimension and $\epsilon$. In general, we show that given an integer $k \ge 1$, $(1 + \epsilon)$-approximations to the k nearest neighbors of q can be computed in additional $O(kd \log n)$ time.
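The definition of an approximate nearest neighbor in the abstract above can be illustrated with a brute-force check. This is only a sketch of the definition, not the data structure from the paper: the function names `minkowski` and `is_approx_nearest` and the parameter `r` (the Minkowski exponent) are assumptions introduced for illustration, and the check scans every point instead of using any preprocessing.

```python
def minkowski(p, q, r=2.0):
    """Minkowski (L_r) distance between two points of equal dimension."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)


def is_approx_nearest(candidate, query, points, eps, r=2.0):
    """Return True if `candidate` is a (1 + eps)-approximate nearest
    neighbor of `query` among `points`, by exhaustive comparison."""
    # Distance from the query to its true nearest neighbor (linear scan).
    true_nn_dist = min(minkowski(p, query, r) for p in points)
    # The candidate qualifies if it is within a factor (1 + eps) of that.
    return minkowski(candidate, query, r) <= (1.0 + eps) * true_nn_dist
```

With `eps = 0` the check reduces to exact nearest neighbor membership; larger values of `eps` accept points that are proportionally farther away.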
Data structures for mobile data
 JOURNAL OF ALGORITHMS
, 1997
Cited by 257 (53 self)
A kinetic data structure (KDS) maintains an attribute of interest in a system of geometric objects undergoing continuous motion. In this paper we develop a conceptual framework for kinetic data structures, propose a number of criteria for the quality of such structures, and describe a number of fundamental techniques for their design. We illustrate these general concepts by presenting kinetic data structures for maintaining the convex hull and the closest pair of moving points in the plane; these structures behave well according to the proposed quality criteria for KDSs.
Incremental Distance Join Algorithms for Spatial Databases
, 1998
Cited by 145 (12 self)
Two new spatial join operations, distance join and distance semijoin, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples. Incremental algorithms are presented for computing these operations, which can be used in a pipelined fashion, thereby obviating the need to wait for their completion when only a few tuples are needed. The algorithms can be used with a large class of hierarchical spatial data structures and arbitrary spatial data types in any dimensions. In addition, any distance metric may be employed. A performance study using R-trees shows that the incremental algorithms outperform non-incremental approaches by an order of magnitude if only a small part of the result is needed, while the penalty, if any, for the incremental processing is modest if the entire join result is required.
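The pipelined interface described in the abstract above, where a consumer pulls only as many distance-ordered pairs as it needs, can be sketched with a Python generator. This is not the paper's algorithm: it is a brute-force stand-in that places every pair in a heap up front, whereas the paper works over hierarchical spatial structures such as R-trees. The name `distance_join` and the use of Euclidean distance are assumptions made for illustration.

```python
import heapq
import math
from itertools import islice


def distance_join(left, right):
    """Yield (distance, a, b) for every pair (a, b) from left x right,
    in nondecreasing order of Euclidean distance."""
    # Brute force: materialize all pairs, then heapify in linear time.
    heap = [(math.dist(a, b), a, b) for a in left for b in right]
    heapq.heapify(heap)
    # Each pop is logarithmic, so a consumer that stops early
    # never pays for sorting the remainder of the result.
    while heap:
        yield heapq.heappop(heap)


# A consumer that needs only the two closest pairs.
left = [(0, 0), (5, 5)]
right = [(1, 0), (5, 6)]
closest_two = list(islice(distance_join(left, right), 2))
```

The generator form mirrors the point made in the abstract: the caller decides how much of the join to consume, and the cost of ordering the remaining pairs is deferred until they are requested.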
Similarity Indexing: Algorithms and Performance
 In Proceedings SPIE Storage and Retrieval for Image and Video Databases
, 1996
Cited by 125 (1 self)
Efficient indexing support is essential to allow content-based image and video databases using similarity-based retrieval to scale to large databases (tens of thousands up to millions of images). In this paper, we take an in-depth look at this problem. One of the major difficulties in solving this problem is the high dimension (6-100) of the feature vectors that are used to represent objects. We provide an overview of the work in computational geometry on this problem and highlight the results we found are most useful in practice, including the use of approximate nearest neighbor algorithms. We also present a variant of the optimized k-d tree we call the VAM k-d tree, and provide algorithms to create an optimized R-tree we call the VAMSplit R-tree. We found that the VAMSplit R-tree provided better overall performance than all competing structures we tested for main memory and secondary memory applications. We observed large improvements in performance relative to the R*-tree and SS-tree in secondary memory applications, and modest improvements relative to optimized k-d tree variants.
Closest-Point Problems in Computational Geometry
, 1997
Cited by 73 (14 self)
This is the preliminary version of a chapter that will appear in the Handbook on Computational Geometry, edited by J.-R. Sack and J. Urrutia. A comprehensive overview is given of algorithms and data structures for proximity problems on point sets in $\mathbb{R}^D$. In particular, the closest pair problem, the exact and approximate post-office problem, and the problem of constructing spanners are discussed in detail.
Contents:
1 Introduction
2 The static closest pair problem
2.1 Preliminary remarks
2.2 Algorithms that are optimal in the algebraic computation tree model
2.2.1 An algorithm based on the Voronoi diagram
2.2.2 A divide-and-conquer algorithm
2.2.3 A plane sweep algorithm
2.3 A deterministic algorithm that uses indirect addressing
2.3.1 The degraded grid ...
Closest-point problems simplified on the RAM
 IN PROC. 13TH ACM-SIAM SYMPOS. ON DISCRETE ALGORITHMS
, 2002
Cited by 37 (5 self)
Basic proximity problems for low-dimensional point sets, such as closest pair (CP) and approximate nearest neighbor (ANN), have been studied extensively in the computational geometry literature, with well over a hundred papers published (we merely cite the survey by Smid [10] and omit most references). Generally, optimal algorithms designed for worst-case input require hierarchical spatial structures with sophisticated balancing conditions (we mention, for example, the BBD trees of Arya et al., balanced quadtrees, and Callahan and Kosaraju's fair-split trees); dynamization of these structures is even more involved (relying on Sleator and Tarjan's dynamic trees or Frederickson's topology trees). In this note, we point out that much simpler algorithms with the same performance are possible using standard, though nonalgebraic, RAM operations. This is interesting, considering that nonalgebraic operations have been used before in the literature (e.g., in the original version of the BBD tree [2], as well as in various randomized CP methods). The CP algorithm can be stated completely in one paragraph. Assume coordinates are positive integers bounded by $U = 2^w$. Given a point p in a constant dimension d where the i-th coordinate $p_i$ is the number $p_{iw} \cdots p_{i0}$ in binary, define its shuffle $\sigma(p)$ to be the number $p_{1w} \cdots p_{dw} \cdots p_{10} \cdots p_{d0}$ in binary, and define shifts $\tau_i(p) = (p_1 + \lfloor i 2$ ...
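The bit-interleaving "shuffle" described in the abstract above (also known as a Morton or Z-order code) can be sketched as follows. The abstract is cut off before the rest of the algorithm, so this covers only the shuffle itself. The function name `shuffle` and the `bits` parameter (the number of bits read from each coordinate) are assumptions made for illustration.

```python
def shuffle(point, bits):
    """Interleave the binary digits of the coordinates of `point`.

    Bits are taken from the most significant position down to the least;
    at each position, one bit is taken from each coordinate in order.
    Assumes every coordinate is a non-negative integer below 2**bits.
    """
    code = 0
    for j in range(bits - 1, -1, -1):   # bit position, high to low
        for coord in point:             # coordinates 1..d, in order
            code = (code << 1) | ((coord >> j) & 1)
    return code


# Two coordinates of two bits each: 0b11 and 0b00 interleave to 0b1010.
example = shuffle((0b11, 0b00), bits=2)
```

Because the high-order bits of every coordinate come first in the result, points whose shuffles share a long common prefix lie in the same small grid cell.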
Online Discovery and Maintenance of Time Series Motifs
Cited by 23 (4 self)
The detection of repeated subsequences, time series motifs, is a problem which has been shown to have great utility for several higher-level data mining algorithms, including classification, clustering, segmentation, forecasting, and rule discovery. In recent years there has been significant research effort spent on efficiently discovering these motifs in static offline databases. However, for many domains, the inherent streaming nature of time series demands online discovery and maintenance of time series motifs. In this paper, we develop the first online motif discovery algorithm which monitors and maintains motifs exactly in real time over the most recent history of a stream. Our algorithm has a worst-case update time which is linear in the window size and is extendible to maintain more complex pattern structures. In contrast, the current offline algorithms either need significant update time or require very costly preprocessing steps which online algorithms simply cannot afford. Our core ideas allow useful extensions of our algorithm to deal with arbitrary data rates and discovering multidimensional motifs. We demonstrate the utility of our algorithms with a variety of case studies in the domains of robotics, acoustic monitoring and online compression.
Topology B-Trees and Their Applications
Cited by 16 (0 self)
The well-known B-tree data structure provides a mechanism for dynamically maintaining balanced binary trees in external memory. We present an external-memory dynamic data structure for maintaining arbitrary binary trees. Our data structure, which we call the topology B-tree, is an external-memory analogue to the internal-memory topology tree data structure of Frederickson. It allows for dynamic expression evaluation and updates as well as various tree searching and evaluation queries. We show how to apply this data structure to a number of external-memory dynamic problems, including approximate nearest-neighbor searching and closest-pair maintenance. 1 Introduction The B-tree [8, 12, 14, 15] data structure is a very efficient and powerful way for maintaining balanced binary trees in external memory [1, 11, 13, 18, 19, 21, 22, 2]. Indeed, in his well-known survey paper [8], Comer calls B-trees "ubiquitous," for they are found in a host of different applications. Nevertheless, there ar...
Computational Geometry
 in optimization 2.5D and 3D NC surface machining. Computers in Industry
, 1996
Cited by 12 (0 self)
Introduction Computational geometry evolves from the classical discipline of design and analysis of algorithms, and has received a great deal of attention in the last two decades since its inception in 1975 by M. Shamos [108]. It is concerned with the computational complexity of geometric problems that arise in various disciplines such as pattern recognition, computer graphics, computer vision, robotics, VLSI layout, operations research, statistics, etc. In contrast with the classical approach to proving mathematical theorems about geometry-related problems, this discipline emphasizes the computational aspect of these problems and attempts to exploit the underlying geometric properties possible, e.g., the metric space, to derive efficient algorithmic solutions. The classical theorem, for instance, that a set S is convex if and only if for any $0 \le \alpha \le 1$ the convex combination $\alpha p + (1 - \dots$ ...
Randomized Data Structures for the Dynamic Closest-Pair Problem
, 1993
Cited by 10 (2 self)
We describe a new randomized data structure, the sparse partition, for solving the dynamic closest-pair problem. Using this data structure the closest pair of a set of n points in D-dimensional space, for any fixed D, can be found in constant time. If a frame containing all the points is known in advance, and if the floor function is available at unit cost, then the data structure supports insertions into and deletions from the set in expected $O(\log n)$ time and requires expected $O(n)$ space. Here, it is assumed that the updates are chosen by an adversary who does not know the random choices made by the data structure. This method is more efficient than any deterministic algorithm for solving the problem in dimension $D > 1$. The data structure can be modified to run in $O(\log^2 n)$ expected time per update in the algebraic computation tree model of computation. Even this version is more efficient than the currently best known deterministic algorithm for $D > 2$. 1 Introduction We ...