Results 1 - 10 of 112
Kernelized locality-sensitive hashing for scalable image search
IEEE International Conference on Computer Vision (ICCV), 2009
"... Fast retrieval methods are critical for large-scale and data-driven vision applications. Recent work has explored ways to embed high-dimensional features or complex distance functions into a low-dimensional Hamming space where items can be efficiently searched. However, existing methods do not apply ..."
Cited by 163 (5 self)
Abstract:
Fast retrieval methods are critical for large-scale and data-driven vision applications. Recent work has explored ways to embed high-dimensional features or complex distance functions into a low-dimensional Hamming space where items can be efficiently searched. However, existing methods do not apply to high-dimensional kernelized data when the underlying feature embedding for the kernel is unknown. We show how to generalize locality-sensitive hashing to accommodate arbitrary kernel functions, making it possible to preserve the algorithm’s sub-linear time similarity search guarantees for a wide class of useful similarity functions. Since a number of successful image-based kernels have unknown or incomputable embeddings, this is especially valuable for image retrieval tasks. We validate our technique on several large-scale datasets, and show that it enables accurate and fast performance for example-based object classification, feature matching, and content-based retrieval.
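A rough sketch of the kernelized hashing idea described above, in Python with NumPy (assumptions: an RBF kernel, a small set of landmark points, and a simplified weight construction; this is an illustration of the general approach, not the authors' exact formulation):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KernelizedLSHSketch:
    """Hash bits of the form sign(sum_i w_i * k(x, x_i)) over landmark
    points x_i, so codes can be computed from kernel values alone."""

    def __init__(self, landmarks, n_bits=32, t=10, kernel=rbf_kernel, seed=0):
        rng = np.random.default_rng(seed)
        self.landmarks, self.kernel = landmarks, kernel
        p = landmarks.shape[0]
        K = kernel(landmarks, landmarks)
        H = np.eye(p) - np.ones((p, p)) / p          # center the kernel matrix
        vals, vecs = np.linalg.eigh(H @ K @ H)
        inv_sqrt = vecs @ np.diag(np.clip(vals, 1e-8, None) ** -0.5) @ vecs.T
        # One weight vector per bit: K^{-1/2} applied to a random subset
        # indicator, so the bit acts like a random hyperplane in kernel space.
        self.W = np.zeros((n_bits, p))
        for b in range(n_bits):
            e_s = np.zeros(p)
            e_s[rng.choice(p, size=t, replace=False)] = 1.0
            self.W[b] = inv_sqrt @ e_s

    def hash(self, X):
        """Return an (n, n_bits) array of {0, 1} codes for the rows of X."""
        Kx = self.kernel(X, self.landmarks)          # kernel values to landmarks
        return (Kx @ self.W.T > 0).astype(np.uint8)
```

Hamming distance between codes then stands in for kernel similarity when shortlisting candidates for exact comparison.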
Fast Image Search for Learned Metrics
"... We introduce a method that enables scalable image search for learned metrics. Given pairwise similarity and dissimilarity constraints between some images, we learn a Mahalanobis distance function that captures the images’ underlying relationships well. To allow sub-linear time similarity search unde ..."
Cited by 103 (11 self)
Abstract:
We introduce a method that enables scalable image search for learned metrics. Given pairwise similarity and dissimilarity constraints between some images, we learn a Mahalanobis distance function that captures the images’ underlying relationships well. To allow sub-linear time similarity search under the learned metric, we show how to encode the learned metric parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit weighting over the feature dimensions. We demonstrate the approach applied to a variety of image datasets. Our learned metrics improve accuracy relative to commonly-used metric baselines, while our hashing construction enables efficient indexing with learned distances over very large databases.
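One way to read "encode the learned metric parameterization into randomized locality-sensitive hash functions" is the standard construction for a Mahalanobis metric d(x, y) = (x - y)^T A (x - y): factor A = G^T G and take bits sign(r^T G x) for Gaussian random r. A hedged sketch follows (the metric A, the data, and the bit count below are placeholders):

```python
import numpy as np

def mahalanobis_lsh(A, n_bits, seed=0):
    """Hash functions sensitive to the learned metric A = G^T G:
    each bit is sign(r^T G x) for a random Gaussian direction r."""
    rng = np.random.default_rng(seed)
    vals, vecs = np.linalg.eigh(A)                     # A assumed symmetric PSD
    G = np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T
    P = rng.standard_normal((n_bits, G.shape[0])) @ G  # rows are r^T G
    return lambda X: (X @ P.T > 0).astype(np.uint8)

# Usage: hash a database and a query, then rank by Hamming distance.
A = np.eye(5)                                    # placeholder learned metric
hasher = mahalanobis_lsh(A, n_bits=64)
db_codes = hasher(np.random.randn(1000, 5))
q_code = hasher(np.random.randn(1, 5))
ranking = np.argsort((db_codes != q_code).sum(axis=1))
```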
Fast contour matching using approximate earth mover’s distance
2004
"... Weighted graph matching is a good way to align a pair of shapes represented by a set of descriptive local features; the set of correspondences produced by the minimum cost matching between two shapes ’ features often reveals how similar the shapes are. However, due to the complexity of computing the ..."
Cited by 90 (9 self)
Abstract:
Weighted graph matching is a good way to align a pair of shapes represented by a set of descriptive local features; the set of correspondences produced by the minimum cost matching between two shapes’ features often reveals how similar the shapes are. However, due to the complexity of computing the exact minimum cost matching, previous algorithms could only run efficiently when using a limited number of features per shape, and could not scale to perform retrievals from large databases. We present a contour matching algorithm that quickly computes the minimum weight matching between sets of descriptive local features using a recently introduced low-distortion embedding of the Earth Mover’s Distance (EMD) into a normed space. Given a novel embedded contour, the nearest neighbors in a database of embedded contours are retrieved in sublinear time via approximate nearest neighbors search with Locality-Sensitive Hashing (LSH). We demonstrate our shape matching method on a database of 136,500 images of human figures. Our method achieves a speedup of four orders of magnitude over the exact method, at the cost of only a 4% reduction in accuracy.
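The low-distortion EMD embedding can be illustrated with a grid pyramid: count points in cells of decreasing size, weight each level's counts by its cell side length, and concatenate, after which L1 distance between the vectors approximates the EMD. The sketch below is for 2-D point sets with illustrative levels and shifts in the Indyk-Thaper style (the paper embeds higher-dimensional contour features, and the exact constants differ):

```python
import numpy as np

def embed_point_set(points, n_levels=5, extent=1.0, seed=0):
    """Grid-pyramid embedding of a 2-D point set in [0, extent)^2:
    histograms over grids of side extent/2, extent/4, ..., each weighted
    by its cell side length, concatenated into one vector."""
    shift = np.random.default_rng(seed).uniform(0, 1, size=2)  # shared random shift
    feats = []
    for level in range(n_levels):
        cells = 2 ** (level + 1)
        side = extent / cells
        idx = np.clip(np.floor((points + shift * side) / side).astype(int), 0, cells - 1)
        hist = np.zeros((cells, cells))
        np.add.at(hist, (idx[:, 0], idx[:, 1]), 1)     # points per grid cell
        feats.append(side * hist.ravel())              # weight by cell side
    return np.concatenate(feats)

# L1 distance between embedded contours approximates their EMD, and the
# embedded vectors can then be indexed with LSH for sublinear-time search.
a, b = np.random.rand(60, 2), np.random.rand(60, 2)
approx_emd = np.abs(embed_point_set(a) - embed_point_set(b)).sum()
```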
Local discriminant embedding and its variants
In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005
"... We present a new approach, called local discriminant embedding (LDE), to manifold learning and pattern classification. In our framework, the neighbor and class relations of data are used to construct the embedding for classification problems. The proposed algorithm learns the embedding for the subma ..."
Cited by 85 (1 self)
Abstract:
We present a new approach, called local discriminant embedding (LDE), to manifold learning and pattern classification. In our framework, the neighbor and class relations of data are used to construct the embedding for classification problems. The proposed algorithm learns the embedding for the submanifold of each class by solving an optimization problem. After being embedded into a low-dimensional subspace, data points of the same class maintain their intrinsic neighbor relations, whereas neighboring points of different classes no longer stick to one another. Via embedding, new test data are thus more reliably classified by the nearest neighbor rule, owing to the locally discriminating nature of the embedding. We also describe two useful variants: two-dimensional LDE and kernel LDE. Comprehensive comparisons and extensive experiments on face recognition are included to demonstrate the effectiveness of our method.
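The optimization the abstract refers to can be sketched as a generalized eigenproblem over two neighbor graphs: one linking same-class neighbors (to keep close after projection) and one linking nearby points of different classes (to push apart). Affinity weights and neighbor rules below are simplified relative to the paper:

```python
import numpy as np
from scipy.linalg import eigh

def local_discriminant_embedding(X, y, n_dims=2, k=5):
    """Return a projection matrix V (d x n_dims); project with X @ V."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W_same, W_diff = np.zeros((n, n)), np.zeros((n, n))
    for i in range(n):
        same = np.flatnonzero(y == y[i])
        same = same[same != i]
        diff = np.flatnonzero(y != y[i])
        for j in same[np.argsort(d2[i, same])[:k]]:
            W_same[i, j] = W_same[j, i] = 1.0          # same-class neighbors
        for j in diff[np.argsort(d2[i, diff])[:k]]:
            W_diff[i, j] = W_diff[j, i] = 1.0          # nearby other-class points
    L_same = np.diag(W_same.sum(1)) - W_same           # graph Laplacians
    L_diff = np.diag(W_diff.sum(1)) - W_diff
    # Separate other-class neighbors while preserving same-class locality:
    # X^T L_diff X v = lambda * X^T L_same X v, keep the largest eigenvalues.
    A = X.T @ L_diff @ X
    B = X.T @ L_same @ X + 1e-6 * np.eye(X.shape[1])   # regularize for stability
    vals, vecs = eigh(A, B)
    return vecs[:, np.argsort(vals)[::-1][:n_dims]]
```

New test points are then projected and classified with the nearest neighbor rule in the embedded space.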
LDAHash: Improved matching with smaller descriptors
2010
"... SIFT-like local feature descriptors are ubiquitously employed in such computer vision applications as content-based retrieval, video analysis, copy detection, object recognition, photo-tourism and 3D reconstruction. Feature descriptors can be designed to be invariant to certain classes of photometri ..."
Cited by 80 (10 self)
Abstract:
SIFT-like local feature descriptors are ubiquitously employed in such computer vision applications as content-based retrieval, video analysis, copy detection, object recognition, photo-tourism and 3D reconstruction. Feature descriptors can be designed to be invariant to certain classes of photometric and geometric transformations, in particular, affine and intensity scale transformations. However, real transformations that an image can undergo can only be approximately modeled in this way, and thus most descriptors are only approximately invariant in practice. Secondly, descriptors are usually high-dimensional (e.g. SIFT is represented as a 128-dimensional vector). In large-scale retrieval and matching problems, this can pose challenges in storing and retrieving descriptor data. We map the descriptor vectors into the Hamming space, in which the Hamming metric is used to compare the resulting representations. This way, we reduce the size of the descriptors by representing them as short binary strings and learn descriptor invariance from examples. We show extensive experimental validation, demonstrating the advantage of the proposed approach.
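The mapping into the Hamming space can be illustrated as a projection followed by thresholding; the projection P and thresholds t below are random placeholders, since learning them from positive and negative descriptor pairs is the paper's contribution and is not reproduced here:

```python
import numpy as np

def binarize_descriptors(descriptors, P, t):
    """Map descriptors to short binary strings: sign(P x + t) per descriptor."""
    return (descriptors @ P.T + t > 0).astype(np.uint8)

def hamming_distances(codes_a, codes_b):
    """Pairwise Hamming distances between two sets of binary codes."""
    return (codes_a[:, None, :] != codes_b[None, :, :]).sum(-1)

rng = np.random.default_rng(0)
sift = rng.random((100, 128))            # 100 SIFT-like 128-D descriptors
P = rng.standard_normal((32, 128))       # placeholder for a learned projection
t = np.zeros(32)                         # placeholder thresholds
codes = binarize_descriptors(sift, P, t)
D = hamming_distances(codes[:5], codes)  # compare 5 queries against all codes
```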
Fast similarity search for learned metrics.
IEEE Trans. Pattern Anal. Mach. Intell., 2009
"... ..."
(Show Context)
Object Recognition as Many-to-Many Feature Matching
2006
"... Object recognition can be formulated as matching image features to model features. When recognition is exemplar-based, feature correspondence is one-to-one. However, segmentation errors, articulation, scale difference, and within-class deformation can yield image and model features which don’t matc ..."
Cited by 48 (4 self)
Abstract:
Object recognition can be formulated as matching image features to model features. When recognition is exemplar-based, feature correspondence is one-to-one. However, segmentation errors, articulation, scale difference, and within-class deformation can yield image and model features which don’t match one-to-one but rather many-to-many. Adopting a graph-based representation of a set of features, we present a matching algorithm that establishes many-to-many correspondences between the nodes of two noisy, vertex-labeled weighted graphs. Our approach reduces the problem of many-to-many matching of weighted graphs to that of many-to-many matching of weighted point sets in a normed vector space. This is accomplished by embedding the initial weighted graphs into a normed vector space with low distortion using a novel embedding technique based on a spherical encoding of graph structure. Many-to-many vector correspondences established by the Earth Mover’s Distance framework are mapped back into many-to-many correspondences between graph nodes. Empirical evaluation of the algorithm on an extensive set of recognition trials, including a comparison with two competing graph matching approaches, demonstrates both the robustness and efficacy of the overall approach.
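The final step, turning the Earth Mover's Distance between two embedded, weighted point sets into many-to-many correspondences, can be sketched as a small transportation linear program (equal total mass on both sides is assumed; the spherical graph embedding itself is not reproduced here):

```python
import numpy as np
from scipy.optimize import linprog

def emd_correspondences(X, a, Y, b):
    """Solve the transportation problem between points X (weights a) and
    Y (weights b).  Positive entries of the returned flow matrix are the
    many-to-many correspondences; the objective value is the EMD."""
    n, m = len(X), len(Y)
    cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1).ravel()
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0      # source i ships exactly a[i]
    for j in range(m):
        A_eq[n + j, j::m] = 1.0               # sink j receives exactly b[j]
    res = linprog(cost, A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.x.reshape(n, m), res.fun
```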
Learning Context-Sensitive Shape Similarity by Graph Transduction
2010
"... Shape similarity and shape retrieval are very important topics in computer vision. The recent progress in this domain has been mostly driven by designing smart shape descriptors for providing better similarity measure between pairs of shapes. In this paper, we provide a new perspective to this probl ..."
Cited by 42 (7 self)
Abstract:
Shape similarity and shape retrieval are very important topics in computer vision. The recent progress in this domain has been mostly driven by designing smart shape descriptors that provide a better similarity measure between pairs of shapes. In this paper, we provide a new perspective to this problem by considering the existing shapes as a group, and study their similarity measures to the query shape in a graph structure. Our method is general and can be built on top of any existing shape similarity measure. For a given similarity measure, a new similarity is learned through graph transduction. The new similarity is learned iteratively so that the neighbors of a given shape influence its final similarity to the query. The basic idea here is related to PageRank ranking, which forms a foundation of Google Web search. The presented experimental results demonstrate that the proposed approach yields significant improvements over state-of-the-art shape matching algorithms. We obtained a retrieval rate of 91.61 percent on the MPEG-7 data set, which is the highest ever reported in the literature. Moreover, the learned similarity by the proposed method also achieves promising improvements on both shape classification and shape clustering.
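A toy sketch of the propagation idea (assuming the query is one of the indexed shapes and that a pairwise distance matrix is available; the affinity kernel, update rule, and iteration count below are illustrative rather than the paper's exact choices):

```python
import numpy as np

def transduced_similarity(dist, query_idx, n_iter=50, alpha=0.25):
    """Smooth similarities to the query over the shape graph, so that a
    shape's final score is influenced by its neighbors' scores."""
    W = np.exp(-dist / (alpha * dist.mean()))    # affinities from distances
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)         # row-stochastic transition matrix
    f = W[:, query_idx].copy()                   # initial similarity to the query
    for _ in range(n_iter):
        f = P @ f                                # neighbors influence the score
        f[query_idx] = 1.0                       # keep the query clamped
    return f                                     # larger = more similar to query
```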
Pyramid match hashing: Sub-linear time indexing over partial correspondences
In CVPR, 2007
"... Matching local features across images is often useful when comparing or recognizing objects or scenes, and efficient techniques for obtaining image-to-image correspondences have been developed [6, 4, 11]. However, given a query image, searching a very large image database with such measures remains ..."
Cited by 38 (6 self)
Abstract:
Matching local features across images is often useful when comparing or recognizing objects or scenes, and efficient techniques for obtaining image-to-image correspondences have been developed [6, 4, 11]. However, given a query image, searching a very large image database with such measures remains impractical. We introduce a sublinear time randomized hashing algorithm for indexing sets of feature vectors under their partial correspondences. We develop an efficient embedding function for the normalized partial matching similarity between sets, and show how to exploit random hyperplane properties to construct hash functions that satisfy locality-sensitive constraints. The result is a bounded approximate similarity search algorithm that finds (1 + ε)-approximate nearest neighbor images in O(N^(1/(1+ε))) time for a database containing N images represented by (varying numbers of) local features. By design the indexing is robust to outlier features, as it favors strong one-to-one matchings but does not penalize for additional distant features. We demonstrate our approach applied to image retrieval for images represented by sets of local appearance features, and show that searching over correspondences is now scalable to large image databases.
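The hashing layer can be sketched with plain random-hyperplane bits, assuming each image's feature set has already been embedded into a single vector (that partial-match embedding is the paper's contribution and is not reproduced here); images landing in the query's bucket become candidates for exact re-ranking:

```python
import numpy as np
from collections import defaultdict

class RandomHyperplaneLSH:
    """Locality-sensitive codes from random hyperplanes over embedded vectors."""

    def __init__(self, dim, n_bits=16, seed=0):
        self.hyperplanes = np.random.default_rng(seed).standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)

    def _key(self, v):
        return tuple((self.hyperplanes @ v > 0).astype(int))

    def index(self, vectors):
        for i, v in enumerate(vectors):
            self.buckets[self._key(v)].append(i)   # bucket database items by code

    def query(self, v):
        """Return database indices whose code matches the query's code."""
        return self.buckets.get(self._key(v), [])
```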
Learning Sparse Metrics via Linear Programming
In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006
"... ABSTRACT Calculation of object similarity, for example through a distance function, is a common part of data mining and machine learning algorithms. This calculation is crucial for efficiency since it is often repeated a large number of times, the classical example being query-by-example (find obje ..."
Cited by 35 (3 self)
Abstract:
Calculation of object similarity, for example through a distance function, is a common part of data mining and machine learning algorithms. This calculation is crucial for efficiency since it is often repeated a large number of times, the classical example being query-by-example (find objects that are similar to a given query object). Moreover, the performance of these algorithms depends critically on choosing a good distance function. However, it is often the case that (1) the correct distance is unknown or heuristically chosen, and (2) its calculation is computationally expensive (e.g., for high-dimensional objects). In this paper, we propose a method for constructing relative-distance preserving low-dimensional mappings (sparse mappings) to allow learning unknown distance functions or approximating known functions, with the additional property of reducing distance computation time. We present an algorithm that, given examples of proximity comparisons among triples of objects (e.g., object a is closer to b than to c), learns a distance function, in as few dimensions as possible, that preserves these distance relationships. The formulation is based on solving a linear programming optimization problem that finds an optimal mapping for the given dataset and distance relationships. Unlike other popular embedding algorithms, the method can easily generalize to new points, does not have local minima, and explicitly models computational efficiency by finding a mapping that is sparse, i.e., one that depends on a small subset of features or dimensions. Experimental evaluation shows that the method compares favorably with a state-of-the-art method on several publicly available datasets.
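The optimization can be sketched as a small linear program over triplet constraints: from examples (a, b, c) meaning "a is closer to b than to c", learn nonnegative per-dimension weights for a weighted L1 distance, minimizing the sum of the weights (plus slack penalties) so the mapping stays sparse. The exact objective and constraints in the paper differ in detail:

```python
import numpy as np
from scipy.optimize import linprog

def learn_sparse_weights(triplets, C=1.0):
    """Triplets are (a, b, c) arrays meaning 'a is closer to b than to c'.
    Returns weights w >= 0 for the distance d(x, y) = sum_j w_j |x_j - y_j|;
    dimensions with w[j] == 0 can be dropped, giving a sparse mapping."""
    D = np.array([np.abs(a - c) - np.abs(a - b) for a, b, c in triplets])
    T, d = D.shape
    # Variables z = [w (d dims), slack (T dims)]; minimize sum(w) + C * sum(slack).
    cost = np.concatenate([np.ones(d), C * np.ones(T)])
    # Each triplet must hold with margin 1:  D_t . w + slack_t >= 1.
    A_ub = np.hstack([-D, -np.eye(T)])
    b_ub = -np.ones(T)
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    return res.x[:d]
```

The learned distance is then d(x, y) = np.dot(w, np.abs(x - y)) with the returned weights.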