Dimensionality Reduction for Similarity Searching in Dynamic Databases
1998
Cited by 112 (6 self)
Databases are increasingly being used to store multimedia objects such as maps, images, audio and video. Storage and retrieval of these objects is accomplished using multidimensional index structures such as R*-trees and SS-trees. As dimensionality increases, query performance in these index structures degrades. This phenomenon, generally referred to as the dimensionality curse, can be circumvented by reducing the dimensionality of the data. Such a reduction is, however, accompanied by a loss of precision of query results. Current techniques such as QBIC use SVD transform-based dimensionality reduction to ensure high query precision. The drawback of this approach is that SVD is expensive to compute, and therefore not readily applicable to dynamic databases. In this paper, we propose novel techniques for performing SVD-based dimensionality reduction in dynamic databases. When the data distribution changes considerably so as to degrade query precision, we recompute the SVD transform a...
The A-tree: An Index Structure for High-Dimensional Spaces Using Relative Approximation
2000
Cited by 108 (0 self)
We propose a novel index structure, the A-tree (Approximation tree), for similarity search of high-dimensional data. The basic idea of the A-tree is the introduction of Virtual Bounding Rectangles (VBRs), which contain and approximate MBRs and data objects. VBRs can be represented rather compactly, and thus affect the tree configuration both quantitatively and qualitatively. Firstly, since tree nodes can hold a large number of VBR entries, the fanout of nodes becomes large, which leads to fast search. More importantly, we have a free hand in arranging MBRs and VBRs in tree nodes. In A-trees, nodes contain entries of an MBR and its child VBRs. Therefore, by fetching a node of an A-tree, we can obtain the exact position of a parent MBR and the approximate positions of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-File across the whole range of dimensionality up to 64 dimensions, which is the highest dimensionality in our experiments. The A-tree achieves 77.3% (77.7%, resp.) savings in page accesses compared to the SR-tree (the VA-File, resp.) for 64-dimensional real data.
Indexing Large Metric Spaces for Similarity Search Queries
1999
Cited by 93 (0 self)
In many database applications, one of the common queries is to find approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance-based index structures are proposed for applications where the distance computations between objects of the data domain are expensive (such as high-dimensional data), and the distance function used is metric. In this paper, we consider using distance-based index structures for similarity queries on large metric spaces. We elaborate on the approach of using reference points (vantage points) to partition the data space into spherical shell-like regions in a hierarchical manner. We introduce the multi-vantage point tree structure (mvp-tree) that uses more than one vantage point to partition the space into spherical cuts at each level. In answering similarity-based queries, the mvp-tree also utilizes the precomputed (at con...
iDistance: An Adaptive B+-tree Based Indexing Method for Nearest Neighbor Search
2005
Cited by 92 (10 self)
In this article, we present an efficient B+-tree based indexing method, called iDistance, for K-nearest neighbor (KNN) search in a high-dimensional metric space. iDistance partitions the data based on a space- or data-partitioning strategy, and selects a reference point for each partition. The data points in each partition are transformed into a single-dimensional value based on their similarity with respect to the reference point. This allows the points to be indexed using a B+-tree structure and KNN search to be performed using one-dimensional range search. The choice of partition and reference points adapts the index structure to the data distribution. We conducted extensive experiments to evaluate the iDistance technique, and report results demonstrating its effectiveness. We also present a cost model for iDistance KNN search, which can be exploited in query optimization.
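The mapping described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each point is assigned to its nearest reference point, that the one-dimensional key is `partition * c + distance` for a constant `c` larger than any distance in the data, and it uses a sorted Python list in place of a B+-tree. The function names are hypothetical.

```python
import bisect
import math

def idistance_key(point, refs, c):
    """Return (key, partition), with key = partition * c + distance to the reference point."""
    dists = [math.dist(point, r) for r in refs]
    i = min(range(len(refs)), key=dists.__getitem__)  # assumed: nearest reference point
    return i * c + dists[i], i

def build_index(points, refs, c):
    """Sorted list of (key, point) pairs; a stand-in for the B+-tree."""
    return sorted((idistance_key(p, refs, c)[0], p) for p in points)

def range_search(index, refs, c, query, radius):
    """Return all points within `radius` of `query`, scanning one key range per partition."""
    keys = [k for k, _ in index]
    hits = []
    for i, ref in enumerate(refs):
        d = math.dist(query, ref)
        # By the triangle inequality, a match in partition i has a
        # reference distance in [d - radius, d + radius].
        lo = i * c + max(0.0, d - radius)
        hi = i * c + d + radius
        start = bisect.bisect_left(keys, lo)
        partition_end = bisect.bisect_left(keys, (i + 1) * c)
        stop = min(bisect.bisect_right(keys, hi), partition_end)
        for _, p in index[start:stop]:
            if math.dist(query, p) <= radius:  # discard false positives
                hits.append(p)
    return hits
```

A KNN search can then be approximated by repeating `range_search` with a growing radius until at least K points are found; that loop is omitted here.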
A Road Network Embedding Technique for k-Nearest Neighbor Search in Moving Object Databases
GeoInformatica, 2002
Cited by 90 (5 self)
A very important class of queries in GIS applications is the class of K-nearest neighbor queries. Most of the current studies on K-nearest neighbor queries utilize spatial index structures and hence are based on the Euclidean distances between the points. In real-world road networks, however, the shortest distance between two points depends on the actual path connecting the points and cannot be computed accurately using one of the Minkowski metrics. Thus, the Euclidean distance may not properly approximate the real distance. In this paper, we apply an embedding technique to transform a road network to a high-dimensional space in order to utilize computationally simple Minkowski metrics for distance measurement. Subsequently, we extend our approach to dynamically transform new points into the embedding space. Finally, we propose an efficient technique that can find the actual shortest path between two points in the original road network using only the embedding space. Our empirical experiments indicate that the Chessboard distance metric (L∞) in the embedding space preserves the ordering of the distances between a point and its neighbors more precisely than the Euclidean distance in the original road network.
STRIPES: An Efficient Index for Predicted Trajectories
In SIGMOD, 2004
Cited by 84 (1 self)
Moving object databases are required to support queries on a large number of continuously moving objects. A key requirement for indexing methods in this domain is to efficiently support both update and query operations. Previous work on indexing such databases can be broadly divided into two categories: indexing the past positions and indexing the future predicted positions. In this paper we focus on efficiently indexing the future positions of moving objects. We propose an indexing method, called STRIPES, which indexes predicted trajectories in a dual transformed space. Trajectories for objects in d-dimensional space become points in a higher-dimensional 2d-space. This dual transformed space is then indexed using a regular hierarchical grid decomposition indexing structure. STRIPES can evaluate a range of queries including time-slice, window, and moving queries. We have carried out extensive experimental evaluation comparing the performance of STRIPES with the best known existing predicted trajectory index (the TPR*-tree), and show that our approach is significantly faster than the TPR*-tree for both updates and search queries.
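To make the dual transformation concrete, here is a minimal sketch of one common way to do it. It assumes linear motion and represents a trajectory by its velocity vector followed by its position extrapolated back to a reference time; the abstract does not spell out the exact layout STRIPES uses, and the grid index itself is not shown. All function names are hypothetical.

```python
def to_dual(position, velocity, t_obs, t_ref=0.0):
    """Map a linear trajectory in d dimensions to a single point in 2d dimensions.

    The dual point is (velocity..., reference position...), where the
    reference position is the object's extrapolated location at t_ref.
    """
    ref_pos = tuple(p - v * (t_obs - t_ref) for p, v in zip(position, velocity))
    return tuple(velocity) + ref_pos

def position_at(dual_point, t, t_ref=0.0):
    """Recover the predicted position at time t from a dual point."""
    d = len(dual_point) // 2
    vel, ref_pos = dual_point[:d], dual_point[d:]
    return tuple(r + v * (t - t_ref) for r, v in zip(ref_pos, vel))

def in_time_slice(dual_point, t, low, high, t_ref=0.0):
    """Time-slice query predicate: is the object inside the box [low, high] at time t?"""
    pos = position_at(dual_point, t, t_ref)
    return all(lo <= x <= hi for lo, x, hi in zip(low, pos, high))
```

For example, an object observed at (10.0, 4.0) at time 3.0 with velocity (2.0, -1.0) becomes the 4-dimensional point (2.0, -1.0, 4.0, 7.0), and an update to the object is just a delete and insert of one point in the dual space.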
Indexing the Distance: An Efficient Method to KNN Processing
2001
Cited by 83 (18 self)
In this paper, we present an efficient method, called iDistance, for K-nearest neighbor (KNN) search in a high-dimensional space. iDistance partitions the data and selects a reference point for each partition. The data in each cluster are transformed into a single-dimensional space based on their similarity with respect to a reference point. This allows the points to be indexed using a B+-tree structure and KNN search to be performed using one-dimensional range search. The choice of partition and reference point provides the iDistance technique with degrees of freedom most other techniques do not have. We describe how appropriate choices here can effectively adapt the index structure to the data distribution. We conducted extensive experiments to evaluate the iDistance technique, and report results demonstrating its effectiveness.
Similarity search over time series data using wavelets
In ICDE, 2002
Cited by 82 (0 self)
We consider the use of wavelet transformations as a dimensionality reduction technique to permit efficient similarity search over high-dimensional time-series data. While numerous transformations have been proposed and studied, the only wavelet that has been shown to be effective for this application is the Haar wavelet. In this work, we observe that a large class of wavelet transformations (not only orthonormal wavelets but also biorthonormal wavelets) can be used to support similarity search. This class includes the most popular and most effective wavelets being used in image compression. We present a detailed performance study of the effects of using different wavelets on the performance of similarity search for time-series data. We include several wavelets that outperform both the Haar wavelet and the best known non-wavelet transformations for this application. To ensure our results are usable by an application engineer, we also show how to configure an indexing strategy for the best-performing transformations. Finally, we identify classes of data that can be indexed efficiently using these wavelet transformations.
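As an illustration of the baseline this abstract mentions, the sketch below applies an orthonormal Haar transform to a time series and keeps only the first few (coarsest) coefficients. Because the transform is orthonormal, the Euclidean distance between truncated coefficient vectors never exceeds the true distance, so a search on the reduced vectors produces no false dismissals. This is a generic sketch assuming series whose length is a power of two; it does not cover the other wavelets studied in the paper, and the function names are hypothetical.

```python
import math

def haar_transform(series):
    """Orthonormal Haar transform; coefficients are ordered coarse to fine."""
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(x) for x in series]
    length = n
    root2 = math.sqrt(2.0)
    while length > 1:
        half = length // 2
        sums = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:length] = sums + diffs  # averages first, details after
        length = half
    return out

def reduce_series(series, k):
    """Keep the k coarsest Haar coefficients as the reduced representation."""
    return haar_transform(series)[:k]

def reduced_distance(a, b, k):
    """Euclidean distance in the reduced space; a lower bound on the true distance."""
    return math.dist(reduce_series(a, k), reduce_series(b, k))
```

In an index, `reduced_distance` would be used to filter candidates, and the full series would be fetched only for candidates that pass the filter.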