Hamming embedding and weak geometric consistency for large scale image search (2008)

by H. Jegou, M. Douze, C. Schmid
Venue: In ECCV, 2008
Results 1 - 10 of 330, sorted by citation count

Aggregating local descriptors into a compact image representation

by Herve Jegou, Matthijs Douze, Cordelia Schmid, Patrick Perez
"... We address the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory usage of the representation. We first propose a simple yet efficient way of aggregating local image descriptors into a vecto ..."
Abstract - Cited by 226 (19 self) - Add to MetaCart
We address the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory usage of the representation. We first propose a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation. We then show how to jointly optimize the dimension reduction and the indexing algorithm, so that it best preserves the quality of vector comparison. The evaluation shows that our approach significantly outperforms the state of the art: the search accuracy is comparable to the bag-of-features approach for an image representation that fits in 20 bytes. Searching a 10 million image dataset takes about 50ms.
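The aggregation the abstract describes can be sketched as a VLAD-style encoding: assign each local descriptor to its nearest visual word and accumulate the residuals. The centroid count, descriptor dimension, and normalization below are illustrative assumptions, not the paper's exact configuration (the 20-byte codes come from a further dimension-reduction and indexing stage not shown here).

```python
import numpy as np

def vlad(descriptors, centroids):
    """VLAD-style aggregation: for each visual word (centroid), sum the
    residuals of the descriptors assigned to it, then L2-normalize the
    concatenation into a single fixed-length vector."""
    k, d = centroids.shape
    v = np.zeros((k, d))
    # hard-assign each descriptor to its nearest centroid
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    nearest = dists.argmin(axis=1)
    for i, x in zip(nearest, descriptors):
        v[i] += x - centroids[i]  # accumulate the residual
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

rng = np.random.default_rng(0)
desc = rng.normal(size=(100, 8))  # 100 local descriptors, dimension 8 (toy sizes)
cent = rng.normal(size=(4, 8))    # k = 4 visual words
v = vlad(desc, cent)              # fixed-length vector of dimension k * d = 32
```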

Product quantization for nearest neighbor search

by Hervé Jégou, Matthijs Douze, Cordelia Schmid, 2010
"... This paper introduces a product quantization based approach for approximate nearest neighbor search. The idea is to decomposes the space into a Cartesian product of low dimensional subspaces and to quantize each subspace separately. A vector is represented by a short code composed of its subspace q ..."
Abstract - Cited by 222 (31 self) - Add to MetaCart
This paper introduces a product quantization based approach for approximate nearest neighbor search. The idea is to decompose the space into a Cartesian product of low dimensional subspaces and to quantize each subspace separately. A vector is represented by a short code composed of its subspace quantization indices. The Euclidean distance between two vectors can be efficiently estimated from their codes. An asymmetric version increases precision, as it computes the approximate distance between a vector and a code. Experimental results show that our approach searches for nearest neighbors efficiently, in particular in combination with an inverted file system. Results for SIFT and GIST image descriptors show excellent search accuracy outperforming three state-of-the-art approaches. The scalability of our approach is validated on a dataset of two billion vectors.
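A minimal sketch of the encoding and asymmetric distance described above. The codebooks here are random stand-ins for the k-means codebooks the method learns, and the sizes (m = 4 subspaces, 256 centroids each) are illustrative assumptions:

```python
import numpy as np

def pq_encode(x, codebooks):
    """Split x into m subvectors and quantize each against its own codebook
    (learned with k-means per subspace in the actual method)."""
    subs = np.split(x, len(codebooks))
    return [int(((cb - s) ** 2).sum(axis=1).argmin())
            for s, cb in zip(subs, codebooks)]

def adc(query, code, codebooks):
    """Asymmetric distance computation: squared Euclidean distance between
    the uncompressed query and a database vector known only by its code."""
    subs = np.split(query, len(codebooks))
    return float(sum(((s - cb[c]) ** 2).sum()
                     for s, c, cb in zip(subs, code, codebooks)))

# toy setup: 16-dim vectors, m = 4 subspaces, 256 centroids per subspace,
# so each database vector is stored in just 4 bytes
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(256, 4)) for _ in range(4)]
x = rng.normal(size=16)
code = pq_encode(x, codebooks)
```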

Citation Context

...ructed using publicly available data and software. For the SIFT descriptors, the learning set is extracted from Flickr images and the database and query descriptors are from the INRIA Holidays images [20]. For GIST, the learning set consists of the first 100k images extracted from the tiny image set of [16]. The database set is the Holidays image set combined with Flickr1M used in [20]. The query vect...

Iterative quantization: A procrustean approach to learning binary codes

by Yunchao Gong, Svetlana Lazebnik - In Proc. of the IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR), 2011
"... This paper addresses the problem of learning similaritypreserving binary codes for efficient retrieval in large-scale image collections. We propose a simple and efficient alternating minimization scheme for finding a rotation of zerocentered data so as to minimize the quantization error of mapping t ..."
Abstract - Cited by 157 (6 self) - Add to MetaCart
This paper addresses the problem of learning similarity-preserving binary codes for efficient retrieval in large-scale image collections. We propose a simple and efficient alternating minimization scheme for finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube. This method, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). Our experiments show that the resulting binary coding schemes decisively outperform several other state-of-the-art methods.
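The alternating minimization can be sketched as follows: fix the rotation and take signs to get the binary codes, then fix the codes and solve the orthogonal Procrustes problem via an SVD. Data sizes and the iteration count are illustrative assumptions, and the PCA (or CCA) projection step is assumed to have been applied beforehand:

```python
import numpy as np

def itq(V, n_iter=20, seed=0):
    """ITQ-style alternating minimization: B = sign(V R) updates the binary
    codes; an SVD of V^T B solves the orthogonal Procrustes problem for the
    rotation R that best maps the data onto the hypercube vertices."""
    rng = np.random.default_rng(seed)
    d = V.shape[1]
    R, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal init
    for _ in range(n_iter):
        B = np.sign(V @ R)                 # fix R, update the codes
        U, _, Wt = np.linalg.svd(V.T @ B)  # fix B, Procrustes step
        R = U @ Wt                         # optimal orthogonal rotation
    return np.sign(V @ R), R

rng = np.random.default_rng(1)
V = rng.normal(size=(200, 8))
V -= V.mean(axis=0)   # zero-centered, as the method assumes
B, R = itq(V)
```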

Aggregating local image descriptors into compact codes

by Herve Jegou, Florent Perronnin, Matthijs Douze, Jorge Sanchez, Patrick Pérez, Cordelia Schmid, 2011
"... ..."
Abstract - Cited by 127 (14 self) - Add to MetaCart
Abstract not found

Efficient Object Category Recognition Using

by Lorenzo Torresani, Martin Szummer, Andrew Fitzgibbon
"... Abstract. We introduce a new descriptor for images which allows the construction of efficient and compact classifiers with good accuracy on object category recognition. The descriptor is the output of a large number of weakly trained object category classifiers on the image. The trained categories a ..."
Abstract - Cited by 122 (9 self) - Add to MetaCart
Abstract. We introduce a new descriptor for images which allows the construction of efficient and compact classifiers with good accuracy on object category recognition. The descriptor is the output of a large number of weakly trained object category classifiers on the image. The trained categories are selected from an ontology of visual concepts, but the intention is not to encode an explicit decomposition of the scene. Rather, we accept that existing object category classifiers often encode not the category per se but ancillary image characteristics; and that these ancillary characteristics can combine to represent visual classes unrelated to the constituent categories' semantic meanings. The advantage of this descriptor is that it allows object-category queries to be made against image databases using efficient classifiers (efficient at test time) such as linear support vector machines, and allows these queries to be for novel categories. Even when the representation is reduced to 200 bytes per image, classification accuracy on object category recognition is comparable with the state of the art (36% versus 42%), but at orders of magnitude lower computational cost.

Citation Context

...near SVMs, decision trees, or tf-idf, as these can be implemented to run efficiently on large databases. Although a number of systems satisfy these desiderata for object instance or place recognition [18,9] or for whole scene recognition [26], we argue that no existing system has addressed these requirements in the context of object category recognition. The system we propose is a form of classifier com...

Compressed Histogram of Gradients: A Low-Bitrate Descriptor

by Vijay Chandrasekhar, Gabriel Takacs, David M. Chen, Sam S. Tsai, Yuriy Reznik, Radek Grzeszczuk, Bernd Girod - INT J COMPUT VIS, 2011
"... Establishing visual correspondences is an essential component of many computer vision problems, which is often done with local feature-descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile visual search applications. We propose a framework f ..."
Abstract - Cited by 101 (24 self) - Add to MetaCart
Establishing visual correspondences is an essential component of many computer vision problems, which is often done with local feature-descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile visual search applications. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate compared to state-of-the-art descriptors. The framework offers low complexity and has significant speed-up in the matching stage. We show how to efficiently compute distances between descriptors in the compressed domain, eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, BRIEF, MPEG-7 image signatures and other low bit-rate descriptors and show that our proposed CHoG descriptor outperforms existing schemes significantly over a wide range of bitrates. We implement the descriptor in a mobile image retrieval system and for a database of 1 million CD, DVD and book covers, we achieve 96% retrieval accuracy using only 4 KB of data per query image.

Large-scale image retrieval with compressed Fisher vectors

by Florent Perronnin, Yan Liu, Jorge Sánchez, Hervé Poirier - In: CVPR, 2010
"... The problem of large-scale image search has been traditionally addressed with the bag-of-visual-words (BOV). In this article, we propose to use as an alternative the Fisher kernel framework. We first show why the Fisher representation is well-suited to the retrieval problem: it describes an image by ..."
Abstract - Cited by 100 (8 self) - Add to MetaCart
The problem of large-scale image search has been traditionally addressed with the bag-of-visual-words (BOV). In this article, we propose to use as an alternative the Fisher kernel framework. We first show why the Fisher representation is well-suited to the retrieval problem: it describes an image by what makes it different from other images. One drawback of the Fisher vector is that it is high-dimensional and, as opposed to the BOV, it is dense. The resulting memory and computational costs do not make Fisher vectors directly amenable to large-scale retrieval. Therefore, we compress Fisher vectors to reduce their memory footprint and speed up the retrieval. We compare three binarization approaches: a simple approach devised for this representation and two standard compression techniques. We show on two publicly available datasets that compressed Fisher vectors perform very well using as little as a few hundred bits per image, and significantly better than a very recent compressed BOV approach.
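As a toy illustration of why binarization helps retrieval scale (this is an assumed, simplest-possible scheme, not a reproduction of the specific approaches the paper compares), a sign-based code keeps one bit per Fisher-vector dimension, and image comparison reduces to a Hamming distance:

```python
import numpy as np

def binarize(fv):
    """Keep only the sign of each Fisher-vector component:
    one bit per dimension instead of one float."""
    return fv > 0

def hamming(a, b):
    """With binary codes, comparing two images is a Hamming distance:
    the number of positions where the codes differ."""
    return int(np.count_nonzero(a != b))
```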

Geometric min-Hashing: Finding a (Thick) Needle in a Haystack

by Ondrej Chum, Michal Perd'och, Jiri Matas - CVPR, 2009
"... We propose a novel hashing scheme for image retrieval, clustering and automatic object discovery. Unlike commonly used bag-of-words approaches, the spatial extent of image features is exploited in our method. The geometric information is used both to construct repeatable hash keys and to increase th ..."
Abstract - Cited by 93 (1 self) - Add to MetaCart
We propose a novel hashing scheme for image retrieval, clustering and automatic object discovery. Unlike commonly used bag-of-words approaches, the spatial extent of image features is exploited in our method. The geometric information is used both to construct repeatable hash keys and to increase the discriminability of the description. Each hash key combines visual appearance (visual words) with semi-local geometric information. Compared with the state-of-the-art min-Hash, the proposed method has both higher recall (probability of collision for hashes on the same object) and lower false positive rates (random collisions). The advantages of the Geometric min-Hashing approach are most pronounced in the presence of viewpoint and scale change, significant occlusion or small physical overlap of the viewing fields. We demonstrate the power of the proposed method on small object discovery in a large unordered collection of images and on a large scale image clustering problem.
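For background, the plain min-Hash baseline the paper improves on (without the semi-local geometry it folds into its hash keys) can be sketched as follows; the probability that two signatures agree on any one hash equals the Jaccard similarity of the images' visual-word sets. The salted-hash construction and signature length here are illustrative assumptions:

```python
import random

def minhash_signature(word_set, n_hashes=64, seed=0):
    """min-Hash signature of a set of visual-word ids: for each of n_hashes
    salted hash functions, keep the minimum hash value over the set."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(32) for _ in range(n_hashes)]
    return [min(hash((salt, w)) for w in word_set) for salt in salts]

a = minhash_signature({1, 2, 3, 4})
b = minhash_signature({1, 2, 3, 5})   # Jaccard similarity with a: 3/6 = 0.5
# fraction of agreeing hashes estimates the Jaccard similarity
overlap = sum(x == y for x, y in zip(a, b)) / len(a)
```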

On the burstiness of visual elements

by Hervé Jégou, Matthijs Douze, Cordelia Schmid - in CVPR
"... Figure 1. Illustration of burstiness. Features assigned to the most “bursty ” visual word of each image are displayed. Burstiness, a phenomenon initially observed in text retrieval, is the property that a given visual element appears more times in an image than a statistically independent model woul ..."
Abstract - Cited by 86 (16 self) - Add to MetaCart
Figure 1. Illustration of burstiness. Features assigned to the most "bursty" visual word of each image are displayed. Burstiness, a phenomenon initially observed in text retrieval, is the property that a given visual element appears more times in an image than a statistically independent model would predict. In the context of image search, burstiness corrupts the visual similarity measure, i.e., the scores used to rank the images. In this paper, we propose a strategy to handle visual bursts for bag-of-features based image search systems. Experimental results on three reference datasets show that our method significantly and consistently outperforms the state of the art.

Citation Context

...tizing the local descriptors into the visual vocabulary, resulting in frequency vectors. This representation can be refined with a binary signature per visual word and partial geometrical information [4]. Given that some visual words are more frequent than others, most of the existing approaches use an inverse document frequency (idf) word weighting scheme, similar to text retrieval [17]. It consist...
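The idf weighting mentioned in this context can be sketched as follows (a textbook formulation, not necessarily the exact variant of [17]): a visual word occurring in n_w of the N database images gets weight log(N / n_w), so ubiquitous, less discriminative words contribute less to the ranking score.

```python
import numpy as np

def tfidf_bow(counts_per_image):
    """Weight bag-of-visual-words count vectors by inverse document
    frequency: idf(w) = log(N / n_w), where n_w is the number of the N
    database images containing visual word w."""
    counts = np.asarray(counts_per_image, dtype=float)  # shape (N, vocab_size)
    N = counts.shape[0]
    n_w = (counts > 0).sum(axis=0)        # document frequency per visual word
    idf = np.log(N / np.maximum(n_w, 1))  # guard against unused words
    return counts * idf                   # tf-idf weighted vectors

weights = tfidf_bow([[2, 0], [1, 1]])  # toy case: 2 images, vocabulary of 2 words
```

Here the first word occurs in both images, so its idf is log(2/2) = 0 and it drops out of the score entirely; the second occurs in one image and keeps weight log(2).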

Efficient Representation of Local Geometry for Large Scale Object Retrieval

by Michal Perd'och, et al., 2009
"... State of the art methods for image and object retrieval exploit both appearance (via visual words) and local geometry (spatial extent, relative pose). In large scale problems, memory becomes a limiting factor – local geometry is stored for each feature detected in each image and requires storage lar ..."
Abstract - Cited by 82 (7 self) - Add to MetaCart
State of the art methods for image and object retrieval exploit both appearance (via visual words) and local geometry (spatial extent, relative pose). In large scale problems, memory becomes a limiting factor: local geometry is stored for each feature detected in each image and requires storage larger than the inverted file and the term frequency and inverse document frequency weights together. We propose a novel method for learning a discretized local geometry representation based on minimization of average reprojection error in the space of ellipses. The representation requires only 24 bits per feature without a drop in performance. Additionally, we show that if the gravity vector assumption is used consistently from the feature description to spatial verification, it improves retrieval performance and decreases the memory footprint. The proposed method outperforms state of the art retrieval algorithms in a standard image retrieval benchmark.

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University