CiteSeerX
Iterative quantization: A procrustean approach to learning binary codes (2011)

by Y. Gong, S. Lazebnik
Venue: In CVPR
Results 1 - 10 of 157

Supervised hashing with kernels

by Wei Liu, Jun Wang, Rongrong Ji, Yu-Gang Jiang, Shih-Fu Chang - in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2012
Abstract - Cited by 84 (24 self)
Recent years have witnessed the growing popularity of hashing in large-scale vision problems. It has been shown that the hashing quality could be boosted by leveraging supervised information into hash function learning. However, the existing supervised methods either lack adequate performance or often incur cumbersome model training. In this paper, we propose a novel kernel-based supervised hashing model which requires a limited amount of supervised information, i.e., similar and dissimilar data pairs, and a feasible training cost in achieving high quality hashing. The idea is to map the data to compact binary codes whose Hamming distances are minimized on similar pairs and simultaneously maximized on dissimilar pairs. Our approach is distinct from prior works in utilizing the equivalence between optimizing the code inner products and the Hamming distances. This enables us to sequentially and efficiently train the hash functions one bit at a time, yielding very short yet discriminative codes. We carry out extensive experiments on two image benchmarks with up to one million samples, demonstrating that our approach significantly outperforms the state of the art in searching both metric distance neighbors and semantically similar neighbors, with accuracy gains ranging from 13% to 46%.
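The equivalence this abstract leans on is a one-line identity: for r-bit codes with bits in {-1, +1}, the code inner product and the Hamming distance determine each other. A quick numerical check (illustrative only, not the authors' code):

```python
import numpy as np

# Two random r-bit codes with bits in {-1, +1} (toy data).
rng = np.random.default_rng(0)
r = 48
a = rng.choice([-1, 1], size=r)
b = rng.choice([-1, 1], size=r)

hamming = int(np.sum(a != b))  # number of disagreeing bits
inner = int(a @ b)             # code inner product

# Affine equivalence: <a, b> = r - 2 * d_H(a, b).
assert inner == r - 2 * hamming
print(inner, hamming)
```

Because of this affine relation, minimizing Hamming distance on similar pairs is the same optimization as maximizing code inner products on them, which is what makes the bit-by-bit training tractable.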

Citation Context

...pace. In addition, compact codes are particularly useful for saving storage in gigantic databases. To design effective compact hashing, a number of methods such as projection learning for hashing [17][2][14], Spectral Hashing (SH) [18], Anchor Graph Hashing (AGH) [8], Semi-Supervised Hashing (SSH) [17], Restricted Boltzmann Machines (RBMs) (or semantic hashing) [13], Binary Reconstruction Embeddings ...

What Makes Paris Look Like Paris?

by Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, Alexei Efros, 2012
Abstract - Cited by 49 (8 self)
Given a large repository of geotagged imagery, we seek to automatically find visual elements, e.g. windows, balconies, and street signs, that are most distinctive for a certain geo-spatial area, for example the city of Paris. This is a tremendously difficult task as the visual features distinguishing architectural elements of different places can be very subtle. In addition, we face a hard search problem: given all possible patches in all images, which of them are both frequently occurring and geographically informative? To address these issues, we propose to use a discriminative clustering

Attribute Discovery via Predictable Discriminative Binary Codes

by Mohammad Rastegari, Ali Farhadi, David Forsyth - In ECCV
Abstract - Cited by 39 (7 self)
We present images with binary codes in a way that balances discrimination and learnability of the codes. In our method, each image claims its own code in a way that maintains discrimination while being predictable from visual data. Category memberships are usually good proxies for visual similarity but should not be enforced as a hard constraint. Our method learns codes that maximize separability of categories unless there is strong visual evidence against it. Simple linear SVMs can achieve state-of-the-art results with our short codes. In fact, our method produces state-of-the-art results on Caltech256 with only 128-dimensional bit vectors and outperforms the state of the art by using longer codes. We also evaluate our method on ImageNet and show that our method outperforms state-of-the-art binary code methods on this large-scale dataset. Lastly, our codes can discover a discriminative set of attributes.

Citation Context

...inal feature space; doing so requires long codes. Jégou et al. [15] jointly optimize for the search accuracy, search efficiency and memory requirements to obtain their binary codes. Gong and Lazebnik [16] iteratively minimize (ITQ) the quantization error of projecting examples from the original feature space to vertices of a binary hypercube. This method is capable of incorporating supervision by usin...
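The citation context above summarizes ITQ in one sentence; the alternating minimization it refers to can be sketched in a few lines of numpy. This is a paraphrase of that description (random orthogonal start, sign step, orthogonal Procrustes step via SVD), not the authors' released code:

```python
import numpy as np

def itq_rotation(V, n_iter=50, seed=0):
    """Sketch of the alternating minimization described above: given
    zero-centered, PCA-reduced data V (n x c), find an orthogonal
    rotation R minimizing the quantization error ||B - V R||_F with
    binary codes B = sign(V R)."""
    rng = np.random.default_rng(seed)
    # random orthogonal initialization
    R, _ = np.linalg.qr(rng.standard_normal((V.shape[1], V.shape[1])))
    for _ in range(n_iter):
        # fix R, update codes: the nearest hypercube vertex is the sign
        B = np.sign(V @ R)
        B[B == 0] = 1
        # fix B, update R: an orthogonal Procrustes problem, solved by SVD
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    B = np.sign(V @ R)
    B[B == 0] = 1
    return R, B

rng = np.random.default_rng(1)
V = rng.standard_normal((500, 16))
V -= V.mean(axis=0)
R, B = itq_rotation(V)
print(np.allclose(R.T @ R, np.eye(16)))  # the rotation stays orthogonal: True
```

Each half-step minimizes the quantization error with the other variable fixed, so the objective is non-increasing across iterations.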

Hamming Distance Metric Learning

by Mohammad Norouzi, David J. Fleet, Ruslan Salakhutdinov
Abstract - Cited by 36 (3 self)
Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity. Binary codes are well suited to large-scale applications as they are storage efficient and permit exact sub-linear kNN search. The framework is applicable to broad families of mappings, and uses a flexible form of triplet ranking loss. We overcome discontinuous optimization of the discrete mappings by minimizing a piecewise-smooth upper bound on empirical loss, inspired by latent structural SVMs. We develop a new loss-augmented inference algorithm that is quadratic in the code length. We show strong retrieval performance on CIFAR-10 and MNIST, with promising classification results using no more than kNN on the binary codes.
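The storage and kNN-search claims in the abstract can be illustrated with a brute-force Hamming kNN over bit-packed codes. This is a linear scan for clarity, not the sub-linear search the abstract mentions (which would layer an index structure on top), and the toy codes are made up:

```python
import numpy as np

def hamming_knn(query, db, k):
    """Brute-force k-nearest-neighbor search in Hamming space over
    bit-packed uint8 codes (query: (n_bytes,), db: (n, n_bytes))."""
    # XOR exposes disagreeing bits; unpack and count them per row
    dists = np.unpackbits(np.bitwise_xor(db, query), axis=1).sum(axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]

# three toy 32-bit codes, packed into 4 bytes each
codes = np.packbits(np.array([[0] * 32,
                              [1] * 32,
                              [0] * 16 + [1] * 16], dtype=np.uint8), axis=1)
query = np.packbits(np.zeros(32, dtype=np.uint8))
idx, d = hamming_knn(query, codes, k=2)
print(idx, d)  # code 0 at distance 0, then code 2 at distance 16
```

Packing 32 bits into 4 bytes is the storage efficiency the abstract refers to; the XOR-and-popcount distance is what makes Hamming kNN cheap per comparison.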

Citation Context

...ecent research has turned to machine learning techniques that optimize mappings for specific datasets (e.g., [20, 28, 29, 32, 3]). However, most such methods aim to preserve Euclidean structure (e.g. [13, 20, 35]). In metric learning, by comparison, the goal is to preserve semantic structure based on labeled attributes or parameters associated with training exemplars. There are papers on learning binary hash ...

Spherical hashing

by Jae-Pil Heo, Youngwoon Lee, Junfeng He, Shih-Fu Chang, Sung-Eui Yoon - In Proc. IEEE Conf., 2012
Abstract - Cited by 34 (3 self)
Many binary code encoding schemes based on hashing have been actively studied recently, since they can provide efficient similarity search, especially nearest neighbor search, and compact data representations suitable for handling large-scale image databases in many computer vision problems. Existing hashing techniques encode high-dimensional data points by using hyperplane-based hashing functions. In this paper we propose a novel hypersphere-based hashing function, spherical hashing, to map more spatially coherent data points into a binary code compared to hyperplane-based hashing functions. Furthermore, we propose a new binary code distance function, spherical Hamming distance, that is tailored to our hypersphere-based binary coding scheme, and design an efficient iterative optimization process to achieve balanced partitioning of data points for each hash function and independence between hashing functions. Our extensive experiments show that our spherical hashing technique significantly outperforms six state-of-the-art hashing techniques based on hyperplanes across various image benchmarks of sizes ranging from one to 75 million GIST descriptors. The performance gains are consistent and large, up to 100% improvements. The excellent results confirm the unique merits of the proposed idea in using hyperspheres to encode proximity regions in high-dimensional spaces. Finally, our method is intuitive and easy to implement.
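To make the hypersphere-based encoding concrete, here is a toy sketch: bit k of a point is 1 iff the point lies inside the k-th hypersphere. The distance function normalizes differing bits by the number of spheres containing both points; treat that exact form, and the hand-picked pivots and radii, as assumptions (the paper learns pivots and radii with the iterative optimization described above):

```python
import numpy as np

def spherical_encode(X, centers, radii):
    """Bit k of each point is 1 iff the point lies inside the
    k-th hypersphere (center centers[k], radius radii[k])."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return (d <= radii).astype(np.uint8)

def spherical_hamming(a, b):
    """Differing bits normalized by common 1-bits (assumed form of
    the spherical Hamming distance)."""
    common = int(np.sum((a & b) == 1))
    diff = int(np.sum(a != b))
    return diff / common if common else float("inf")

centers = np.array([[0.0, 0.0], [2.0, 0.0]])   # hand-picked pivots
radii = np.array([1.5, 1.5])
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
C = spherical_encode(X, centers, radii)
print(C)  # rows: [1 0], [1 1], [0 1]
```

Unlike a hyperplane bit, each hypersphere bit marks a bounded region, which is the "proximity region" intuition the abstract closes with.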

Citation Context

...endent techniques to consider the distribution of data points and design better hashing functions. Notable examples include spectral hashing [26], semi-supervised hashing [25], iterative quantization [7], joint optimization [10], and random maximum margin hashing [15]. In all of these existing hashing techniques, hyperplanes are used to partition the data points (located in the original data space or...

Isotropic hashing

by Weihao Kong, Wu-Jun Li - Advances in Neural Information Processing Systems, 2012
Abstract - Cited by 29 (2 self)
Most existing hashing methods adopt some projection functions to project the original data into several dimensions of real values, and then each of these projected dimensions is quantized into one bit (zero or one) by thresholding. Typically, the variances of different projected dimensions are different for existing projection functions such as principal component analysis (PCA). Using the same number of bits for different projected dimensions is unreasonable because larger-variance dimensions will carry more information. Although this viewpoint has been widely accepted by many researchers, it is still not verified by either theory or experiment because no methods have been proposed to find a projection with equal variances for different dimensions. In this paper, we propose a novel method, called isotropic hashing (IsoHash), to learn projection functions which can produce projected dimensions with isotropic variances (equal variances). Experimental results on real data sets show that IsoHash can outperform its counterpart with different variances for different dimensions, which verifies the viewpoint that projections with isotropic variances will be better than those with anisotropic variances.
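The abstract's premise, that PCA dimensions carry very unequal variances, is easy to reproduce on toy data. This only demonstrates the motivation; IsoHash's learned projection with equalized variances is not implemented here, and the data scales are made up:

```python
import numpy as np

# Toy data with deliberately unequal per-axis scales.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
X -= X.mean(axis=0)

# PCA via eigendecomposition of the covariance matrix.
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
P = X @ eigvecs[:, ::-1]                 # components, largest variance first

# One-bit-per-dimension quantization would treat all four of these
# dimensions identically despite the huge variance gap.
print(P.var(axis=0))  # strongly anisotropic, roughly 25 : 4 : 1 : 0.25
```

The variance gap is exactly what makes "one bit per PCA dimension" a questionable allocation, which is the observation IsoHash starts from.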

Citation Context

...sentative data-dependent methods include spectral hashing (SH) [31], anchor graph hashing (AGH) [21], sequential projection learning (SPL) [29], principal component analysis [13] based hashing (PCAH) [7], and iterative quantization (ITQ) [7, 8]. SH learns the hashing functions based on spectral graph partitioning. AGH adopts anchor graphs to speed up the computation of graph Laplacian eigenvectors, b...

Multidimensional Spectral Hashing

by Yair Weiss, Rob Fergus, Antonio Torralba
Abstract - Cited by 29 (0 self)
With the growing availability of very large image databases, there has been a surge of interest in methods based on “semantic hashing”, i.e. compact binary codes of datapoints so that the Hamming distance between codewords correlates with similarity. In reviewing and comparing existing methods, we show that their relative performance can change drastically depending on the definition of ground-truth neighbors. Motivated by this finding, we propose a new formulation for learning binary codes which seeks to reconstruct the affinity between datapoints, rather than their distances. We show that this criterion is intractable to solve exactly, but a spectral relaxation gives an algorithm where the bits correspond to thresholded eigenvectors of the affinity matrix, and as the number of datapoints goes to infinity these eigenvectors converge to eigenfunctions of Laplace-Beltrami operators, similar to the recently proposed Spectral Hashing (SH) method. Unlike SH, whose performance may degrade as the number of bits increases, the optimal code using our formulation is guaranteed to faithfully reproduce the affinities as the number of bits increases. We show that the number of eigenfunctions needed may increase exponentially with dimension, but introduce a “kernel trick” to allow us to compute with an exponentially large number of bits while using only memory and computation that grow linearly with dimension. Experiments show that MDSH outperforms the state of the art, especially in the challenging regime of small distance thresholds.

Citation Context

...crementally adds bits to increase the Hamming distance between dissimilar objects while keeping the Hamming distance between similar objects small. Gong and Lazebnik [12] also suggested looking at the difference between Hamming distances and Euclidean distances, and suggested an algorithm that finds a rotation of the PCA vectors so that after rotation the thresholding...

Segmentation Propagation in ImageNet

by Daniel Kuettel, Matthieu Guillaumin, Vittorio Ferrari
Abstract - Cited by 26 (0 self)
ImageNet is a large-scale hierarchical database of object classes. We propose to automatically populate it with pixelwise segmentations, by leveraging existing manual annotations in the form of class labels and bounding-boxes. The key idea is to recursively exploit images segmented so far to guide the segmentation of new images. At each stage this propagation process expands into the images which are easiest to segment at that point in time, e.g. by moving to the classes semantically most related to those segmented so far. The propagation of segmentation occurs both (a) at the image level, by transferring existing segmentations to estimate the probability of a pixel being foreground, and (b) at the class level, by jointly segmenting images of the same class and by importing the appearance models of classes that are already segmented. Through an experiment on 577 classes and 500k images we show that our technique (i) annotates a wide range of classes with accurate segmentations; (ii) effectively exploits the hierarchical structure of ImageNet; (iii) scales efficiently; (iv) outperforms a baseline GrabCut [1] initialized on the image center, as well as our recent segmentation transfer technique [2] on which this paper is based. Moreover, our method also delivers state-of-the-art results on the recent iCoseg dataset for co-segmentation.

Citation Context

...in the number of training images and the number of classes. On the other hand, we have witnessed the advent of very large scale datasets for other computer vision applications, including image search [8] and object classification [9]. In this paper, we want to bridge the gap between these domains by automatically populating the large-scale ImageNet [10] database with foreground segmentations (fig. 1)...

Designing Category-Level Attributes for Discriminative Visual Recognition

by Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-fu Chang
Abstract - Cited by 25 (1 self)
Attribute-based representation has shown great promise for visual recognition due to its intuitive interpretation and cross-category generalization property. However, human effort is usually involved in the attribute designing process, making the representation costly to obtain. In this paper, we propose a novel formulation to automatically design discriminative “category-level attributes”, which can be efficiently encoded by a compact category-attribute matrix. The formulation allows us to achieve intuitive and critical design criteria (category-separability, learnability) in a principled way. The designed attributes can be used for tasks of cross-category knowledge transfer, achieving superior performance on the well-known attribute dataset Animals with Attributes (AwA) and a large-scale ILSVRC2010 dataset (1.2M images). This approach also leads to state-of-the-art performance on the zero-shot learning task on AwA.

Citation Context

...Table 2. Category-level image retrieval results on 50 classes from ILSVRC2010 (the numbers in brackets are # attributes). We closely follow the settings of [9]. Figure 4. Multi-class classification accuracy on novel categories. The 64.6% accuracy with a one-vs-all classifier using a 50/50 (HALF) split is similar to the performance (65.9%) reported in [13]. ...

A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics

by Yunchao Gong, Qifa Ke, Michael Isard, Svetlana Lazebnik - IJCV
Abstract - Cited by 20 (1 self)
This paper investigates the problem of modeling Internet images and associated text or tags for tasks such as image-to-image search, tag-to-image search, and image-to-tag search (image annotation). We start with canonical correlation analysis (CCA), a popular and successful approach for mapping visual and textual features to the same latent space, and incorporate a third view capturing high-level image semantics, represented either by a single category or multiple non-mutually-exclusive concepts. We present two ways to train the three-view embedding: supervised, with the third view coming from ground-truth labels or search keywords; and unsupervised, with semantic themes automatically obtained by clustering the tags. To ensure high accuracy for retrieval tasks while keeping the learning process scalable, we combine multiple strong visual features
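The two-view CCA this abstract builds on can be sketched directly from covariance matrices. This is a textbook whitened-cross-covariance formulation, not the paper's three-view extension, and the toy "views" below are made-up stand-ins for visual and textual features:

```python
import numpy as np

def cca(X, Y, k, reg=1e-6):
    """Plain two-view CCA: whiten each view's covariance, SVD the
    whitened cross-covariance. Returns projection matrices for the
    two views and the top-k canonical correlations."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = len(X)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])  # small ridge for stability
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    U, s, Vt = np.linalg.svd(inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy))
    return inv_sqrt(Cxx) @ U[:, :k], inv_sqrt(Cyy) @ Vt.T[:, :k], s[:k]

# two noisy views of one latent signal become highly correlated
# in the shared latent space
rng = np.random.default_rng(0)
z = rng.standard_normal(500)
X = np.c_[z + 0.1 * rng.standard_normal(500), rng.standard_normal(500)]
Y = np.c_[rng.standard_normal(500), z + 0.1 * rng.standard_normal(500)]
A, B, corr = cca(X, Y, k=1)
print(corr)  # top canonical correlation close to 1
```

The singular values of the whitened cross-covariance are exactly the canonical correlations, which is why the shared signal surfaces as a value near 1 despite each view also containing pure noise dimensions.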

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2016 The Pennsylvania State University