Results 1 - 10 of 81
Adapting visual category models to new domains. In: ECCV, 2010
"... Abstract. Domain adaptation is an important emerging topic in computer vision. In this paper, we present one of the first studies of domain shift in the context of object recognition. We introduce a method that adapts object models acquired in a particular visual domain to new imaging conditions by ..."
Abstract
-
Cited by 163 (20 self)
- Add to MetaCart
(Show Context)
Abstract. Domain adaptation is an important emerging topic in computer vision. In this paper, we present one of the first studies of domain shift in the context of object recognition. We introduce a method that adapts object models acquired in a particular visual domain to new imaging conditions by learning a transformation that minimizes the effect of domain-induced changes in the feature distribution. The transformation is learned in a supervised manner and can be applied to categories for which there are no labeled examples in the new domain. While we focus our evaluation on object recognition tasks, the transform-based adaptation technique we develop is general and could be applied to non-image data. Another contribution is a new multi-domain object database, freely available for download. We experimentally demonstrate the ability of our method to improve recognition on categories with few or no target domain labels and moderate to large changes in the imaging conditions.
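The core idea, learning a transformation that pulls corresponding cross-domain pairs together while pushing mismatched pairs apart, can be illustrated with a small gradient-descent sketch. This is a simplified illustration under an ad-hoc loss and toy data, not the paper's actual metric-learning formulation; the dimensions, margin, and learning rate are arbitrary choices.

```python
import numpy as np

def learn_domain_transform(Xs, Xt, same, diff, dim=20, lr=0.01, margin=1.0, iters=200):
    """Learn a linear map L so that ||L xs - L xt|| is small for matching
    cross-domain pairs and at least `margin` for mismatched pairs.
    A simplified sketch of transform-based domain adaptation."""
    rng = np.random.default_rng(0)
    L = rng.normal(scale=0.1, size=(dim, Xs.shape[1]))
    for _ in range(iters):
        grad = np.zeros_like(L)
        for i, j in same:                      # pull matching pairs together
            d = L @ (Xs[i] - Xt[j])
            grad += 2 * np.outer(d, Xs[i] - Xt[j])
        for i, j in diff:                      # push mismatched pairs apart (hinge)
            d = L @ (Xs[i] - Xt[j])
            if d @ d < margin:
                grad -= 2 * np.outer(d, Xs[i] - Xt[j])
        L -= lr * grad / (len(same) + len(diff))
    return L

# toy usage: source/target features with a known correspondence
rng = np.random.default_rng(1)
Xs = rng.normal(size=(10, 50))                              # source-domain features
Xt = Xs + 0.5 * rng.normal(size=(10, 50))                   # shifted target-domain features
same = [(i, i) for i in range(10)]                          # matching pairs across domains
diff = [(i, (i + 1) % 10) for i in range(10)]               # mismatched pairs
L = learn_domain_transform(Xs, Xt, same, diff)
```

Once learned, the same transformation can be applied to categories with no labeled target-domain examples, which is the point of the transform-based approach described above.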
Adaptive Regularization of Weight Vectors. In: Advances in Neural Information Processing Systems 22, 2009
"... We present AROW, a new online learning algorithm that combines several useful properties: large margin training, confidence weighting, and the capacity to handle non-separable data. AROW performs adaptive regularization of the prediction function upon seeing each new instance, allowing it to perform ..."
Abstract
-
Cited by 71 (17 self)
- Add to MetaCart
We present AROW, a new online learning algorithm that combines several useful properties: large margin training, confidence weighting, and the capacity to handle non-separable data. AROW performs adaptive regularization of the prediction function upon seeing each new instance, allowing it to perform especially well in the presence of label noise. We derive a mistake bound, similar in form to the second order perceptron bound, that does not assume separability. We also relate our algorithm to recent confidence-weighted online learning techniques and show empirically that AROW achieves state-of-the-art performance and notable robustness in the case of non-separable data.
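The AROW update itself is compact: the learner keeps a Gaussian over weight vectors (mean and covariance) and, on each example that violates the margin, moves the mean and shrinks the covariance along the example direction. The sketch below follows the standard full-covariance update; the class structure, variable names, and toy data are my own.

```python
import numpy as np

class AROW:
    """Adaptive Regularization of Weight Vectors (full-covariance sketch)."""
    def __init__(self, dim, r=1.0):
        self.mu = np.zeros(dim)       # mean weight vector
        self.sigma = np.eye(dim)      # covariance (confidence) matrix
        self.r = r                    # regularization parameter

    def predict(self, x):
        return np.sign(self.mu @ x)

    def update(self, x, y):
        margin = y * (self.mu @ x)
        if margin >= 1.0:             # no margin violation: no update
            return
        v = x @ self.sigma @ x        # confidence along direction x
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - margin) * beta
        self.mu += alpha * y * (self.sigma @ x)
        self.sigma -= beta * np.outer(self.sigma @ x, self.sigma @ x)

# toy usage on a noisy, linearly separable stream
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
model = AROW(dim=5)
for _ in range(500):
    x = rng.normal(size=5)
    y = np.sign(w_true @ x)
    if rng.random() < 0.05:           # 5% label noise
        y = -y
    model.update(x, y)
```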
PCCA: A new approach for distance learning from sparse pairwise constraints. In: CVPR, 2012
"... This paper introduces Pairwise Constrained Component Analysis (PCCA), a new algorithm for learning distance metrics from sparse pairwise similarity/dissimilarity constraints in high dimensional input space, problem for which most existing distance metric learning approaches are not adapted. PCCA lea ..."
Abstract
-
Cited by 51 (0 self)
- Add to MetaCart
(Show Context)
This paper introduces Pairwise Constrained Component Analysis (PCCA), a new algorithm for learning distance metrics from sparse pairwise similarity/dissimilarity constraints in high-dimensional input space, a problem for which most existing distance metric learning approaches are not suited. PCCA learns a projection into a low-dimensional space where the distance between pairs of data points respects the desired constraints, exhibiting good generalization properties in the presence of high-dimensional data. The paper also shows how to efficiently kernelize the approach. PCCA is experimentally validated on two challenging vision tasks, face verification and person re-identification, for which we obtain state-of-the-art results.
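A rough sketch of the idea: learn a projection from pairwise constraints so that similar pairs fall inside a unit ball in the projected space and dissimilar pairs fall outside it, using a smooth logistic-type loss. This is a simplified reimplementation in the spirit of PCCA, not the authors' exact formulation or optimizer; the beta parameter, optimizer, and toy data are assumptions.

```python
import numpy as np

def pcca_like(X, pairs, labels, out_dim=10, lr=0.05, iters=300, beta=3.0):
    """Learn a projection L from pairwise constraints so that
    ||L(x_i - x_j)||^2 < 1 for similar pairs (label +1) and > 1 for
    dissimilar pairs (label -1), with a smooth logistic-type loss
    (1/beta) * log(1 + exp(beta * y * (d^2 - 1)))."""
    rng = np.random.default_rng(0)
    L = rng.normal(scale=0.1, size=(out_dim, X.shape[1]))
    for _ in range(iters):
        grad = np.zeros_like(L)
        for (i, j), y in zip(pairs, labels):
            diff = X[i] - X[j]
            d = L @ diff
            z = y * (d @ d - 1.0)
            sig = 1.0 / (1.0 + np.exp(-beta * z))   # derivative of the loss w.r.t. z
            grad += sig * y * 2.0 * np.outer(d, diff)
        L -= lr * grad / len(pairs)
    return L

# toy usage: pairs drawn from the same Gaussian blob are "similar"
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(2.0, 1.0, (20, 100)), rng.normal(-2.0, 1.0, (20, 100))])
pairs = [(0, 1), (20, 21), (0, 20), (1, 21)]
labels = [+1, +1, -1, -1]
L = pcca_like(X, pairs, labels)
```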
Hierarchical semantic indexing for large scale image retrieval. In: CVPR, 2011
"... This paper addresses the problem of similar image retrieval, especially in the setting of large-scale datasets with millions to billions of images. The core novel contribution is an approach that can exploit prior knowledge of a semantic hierarchy. When semantic labels and a hierarchy relating them ..."
Abstract
-
Cited by 44 (3 self)
- Add to MetaCart
(Show Context)
This paper addresses the problem of similar image retrieval, especially in the setting of large-scale datasets with millions to billions of images. The core novel contribution is an approach that can exploit prior knowledge of a semantic hierarchy. When semantic labels and a hierarchy relating them are available during training, significant improvements over the state of the art in similar image retrieval are attained. While some of this advantage comes from the ability to use additional information, experiments exploring a special case where no additional data is provided show that the new approach can still outperform OASIS [6], the current state of the art for similarity learning. Exploiting hierarchical relationships is most important for larger-scale problems, where scalability becomes crucial. The proposed learning approach is fundamentally parallelizable and as a result scales more easily than previous work. An additional contribution is a novel hashing scheme (for bilinear similarity on vectors of probabilities, optionally taking the hierarchy into account) that reduces the computational cost of retrieval. Experiments are performed on Caltech256 and the larger ImageNet dataset.
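Two ingredients from the abstract can be illustrated in a few lines: turning a class hierarchy into graded relevance between labels, and scoring items with a bilinear similarity. The relevance definition below (fraction of shared ancestors) is one simple choice for illustration, not necessarily the one used in the paper, and the bilinear form is shown untrained.

```python
import numpy as np

def hierarchy_relevance(path_a, path_b):
    """Graded relevance between two leaf labels: the fraction of shared
    ancestors along their root-to-leaf paths (an illustrative choice)."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    return shared / max(len(path_a), len(path_b))

def bilinear_similarity(x, y, W):
    """Similarity of query x and database item y under a bilinear form x^T W y."""
    return x @ W @ y

# toy hierarchy: root -> animal -> {dog, cat}, root -> vehicle -> car
paths = {
    "dog": ["root", "animal", "dog"],
    "cat": ["root", "animal", "cat"],
    "car": ["root", "vehicle", "car"],
}
print(hierarchy_relevance(paths["dog"], paths["cat"]))   # 2/3: share root and animal
print(hierarchy_relevance(paths["dog"], paths["car"]))   # 1/3: share only root
```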
Hamming Distance Metric Learning
"... Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity. Binary codes are well suited to large-scale applications as they are storage efficient and permit exact sub-linear kNN search. The framework is ..."
Abstract
-
Cited by 36 (3 self)
- Add to MetaCart
(Show Context)
Motivated by large-scale multimedia applications, we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity. Binary codes are well suited to large-scale applications as they are storage efficient and permit exact sub-linear kNN search. The framework is applicable to broad families of mappings, and uses a flexible form of triplet ranking loss. We overcome discontinuous optimization of the discrete mappings by minimizing a piecewise-smooth upper bound on empirical loss, inspired by latent structural SVMs. We develop a new loss-augmented inference algorithm that is quadratic in the code length. We show strong retrieval performance on CIFAR-10 and MNIST, with promising classification results using no more than kNN on the binary codes.
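The quantities involved, a thresholded linear hash, Hamming distances, and a triplet ranking hinge, are easy to write down; the sketch below shows only these forward computations with an untrained random projection. The paper's actual contribution, optimizing a piecewise-smooth upper bound on this loss via loss-augmented inference, is not reproduced here.

```python
import numpy as np

def binary_code(W, x):
    """Map a real vector to a {0,1} code via a thresholded linear projection."""
    return (W @ x > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def triplet_ranking_loss(W, anchor, pos, neg, margin=1):
    """Hinge loss asking the anchor to be closer (in Hamming distance) to the
    similar item than to the dissimilar one by at least `margin` bits."""
    d_pos = hamming(binary_code(W, anchor), binary_code(W, pos))
    d_neg = hamming(binary_code(W, anchor), binary_code(W, neg))
    return max(0, d_pos - d_neg + margin)

# toy usage with a random (untrained) projection producing 32-bit codes
rng = np.random.default_rng(0)
W = rng.normal(size=(32, 128))
x, x_pos, x_neg = rng.normal(size=(3, 128))
print(triplet_ranking_loss(W, x, x_pos, x_neg))
```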
A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Trans. Pattern Anal. Mach. Intell., 2012
"... Abstract—We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In L ..."
Abstract
-
Cited by 30 (9 self)
- Add to MetaCart
(Show Context)
Abstract—We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in the feature space and the historical RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets demonstrate the advantages of the proposed framework in precision, robustness, scalability, and computational efficiency. Index Terms—Content-based multimedia retrieval, semi-supervised learning, ranking algorithm, relevance feedback, cross-media retrieval, image retrieval, 3D motion data retrieval.
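Once a Laplacian matrix is in hand (LRGA's contribution is to learn it from local regression models; below a generic Gaussian kNN-graph Laplacian stands in as a placeholder), graph-based ranking reduces to solving a regularized linear system. The sketch shows that generic ranking step, not the LRGA construction itself; the graph parameters and data are made up.

```python
import numpy as np

def knn_laplacian(X, k=5, sigma=1.0):
    """Placeholder Laplacian from a Gaussian kNN graph (LRGA instead *learns*
    this matrix from local regression models)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    drop_idx = np.argsort(-W, axis=1)[:, k:]     # keep only k strongest edges per node
    for i, drop in enumerate(drop_idx):
        W[i, drop] = 0.0
    W = np.maximum(W, W.T)                       # symmetrize
    return np.diag(W.sum(1)) - W

def rank_with_laplacian(L, query_idx, lam=1.0):
    """Ranking scores minimizing f^T L f + lam * ||f - y||^2, where y marks
    the query items; smoothness on the graph spreads relevance to neighbors."""
    n = L.shape[0]
    y = np.zeros(n)
    y[query_idx] = 1.0
    f = np.linalg.solve(L + lam * np.eye(n), lam * y)
    return np.argsort(-f)                        # indices sorted by ranking score

# toy usage: rank 50 items against item 0 as the query
X = np.random.default_rng(0).normal(size=(50, 16))
order = rank_with_laplacian(knn_laplacian(X), query_idx=[0])
```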
Metric learning for large scale image classification: Generalizing to new classes at near-zero cost. In: ECCV, 2012
"... Abstract. We are interested in large-scale image classification and especially in the setting where images corresponding to new or existing classes are con-tinuously added to the training set. Our goal is to devise classifiers which can incorporate such images and classes on-the-fly at (near) zero c ..."
Abstract
-
Cited by 23 (4 self)
- Add to MetaCart
(Show Context)
Abstract. We are interested in large-scale image classification and especially in the setting where images corresponding to new or existing classes are continuously added to the training set. Our goal is to devise classifiers which can incorporate such images and classes on-the-fly at (near) zero cost. We cast this problem as one of learning a metric which is shared across all classes and explore k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers. We learn metrics on the ImageNet 2010 challenge data set, which contains more than 1.2M training images of 1K classes. Surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier, and has comparable performance to linear SVMs. We also study generalization performance, among other things by using the learned metric on the ImageNet-10K dataset, where we obtain competitive performance. Finally, we explore zero-shot classification, and show how the zero-shot model can be combined very effectively with small training datasets.
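The near-zero-cost property is easy to see in code: once a shared projection has been metric-learned, adding a class only requires computing and projecting its mean. The sketch below uses a random projection in place of a trained one, so it illustrates the mechanics rather than the learned metric itself.

```python
import numpy as np

class NCMClassifier:
    """Nearest class mean classifier under a shared linear projection W.
    Adding a class only requires computing its mean, so new classes come
    at (near) zero cost once W is fixed."""
    def __init__(self, W):
        self.W = W                    # shared projection (here random, normally metric-learned)
        self.means = {}               # class label -> projected class mean

    def add_class(self, label, X):
        self.means[label] = self.W @ X.mean(axis=0)

    def predict(self, x):
        z = self.W @ x
        return min(self.means, key=lambda c: np.sum((z - self.means[c]) ** 2))

# toy usage: two initial classes, then one added on-the-fly
rng = np.random.default_rng(0)
clf = NCMClassifier(W=rng.normal(size=(32, 128)))
clf.add_class("dog", rng.normal(loc=1.0, size=(20, 128)))
clf.add_class("cat", rng.normal(loc=-1.0, size=(20, 128)))
clf.add_class("car", rng.normal(loc=3.0, size=(20, 128)))   # new class, no retraining
print(clf.predict(rng.normal(loc=1.0, size=128)))
```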
IntentSearch: Capturing User Intention for One-Click Internet Image Search, 2012
"... Abstract—Web-scale image search engines (e.g. Google Image Search, Bing Image Search) mostly rely on surrounding text features. It is difficult for them to interpret users ’ search intention only by query keywords and this leads to ambiguous and noisy search results which are far from satisfactory. ..."
Abstract
-
Cited by 22 (7 self)
- Add to MetaCart
(Show Context)
Abstract—Web-scale image search engines (e.g. Google Image Search, Bing Image Search) mostly rely on surrounding text features. It is difficult for them to interpret users' search intention from query keywords alone, and this leads to ambiguous and noisy search results which are far from satisfactory. It is important to use visual information in order to resolve the ambiguity in text-based image retrieval. In this paper, we propose a novel Internet image search approach. It only requires the user to click on one query image with minimal effort, and images from a pool retrieved by text-based search are re-ranked based on both visual and textual content. Our key contribution is to capture the user's search intention from this one-click query image in four steps. (1) The query image is categorized into one of the predefined adaptive weight categories, which reflect users' search intention at a coarse level. Inside each category, a specific weight schema is used to combine visual features adapted to this kind of image to better re-rank the text-based search result. (2) Based on the visual content of the query image selected by the user and through image clustering, query keywords are expanded to capture user intention. (3) Expanded keywords are used to enlarge the image pool so that it contains more relevant images. (4) Expanded keywords are also used to expand the query image to multiple positive visual examples, from which new query-specific visual and textual similarity metrics are learned to further improve content-based image re-ranking. All these steps are automatic, without extra effort from the user. This is critically important for any commercial web-based image search engine, where the user interface has to be extremely simple. Besides this key contribution, we design a set of visual features which are both effective and efficient for Internet image search. Experimental evaluation shows that our approach significantly improves the precision of top-ranked images and also the user experience.
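Step (1) of the pipeline, category-adaptive weighting of visual features for re-ranking, can be sketched in a few lines. The category names, feature types, and weights below are invented for illustration and do not match the paper's actual schema; the sketch just shows how a coarse category selects a weighting of per-feature visual similarities that is then blended with the original text-based scores.

```python
import numpy as np

# hypothetical adaptive-weight categories with per-feature weights
# (the actual categories and feature set in the paper differ)
WEIGHTS = {
    "scenery": {"color": 0.6, "texture": 0.3, "shape": 0.1},
    "object":  {"color": 0.2, "texture": 0.3, "shape": 0.5},
}

def rerank(query_feats, pool_feats, category, text_scores, alpha=0.5):
    """Re-rank a text-retrieved pool: per-feature visual similarities are
    combined with weights chosen by the query image's coarse category, then
    blended with the original text-based scores."""
    w = WEIGHTS[category]
    combined = []
    for feats, t in zip(pool_feats, text_scores):
        visual = sum(w[n] * float(query_feats[n] @ feats[n]) for n in w)
        combined.append(alpha * visual + (1 - alpha) * t)
    return np.argsort(-np.asarray(combined))      # pool indices, best first

# toy usage with random feature vectors
rng = np.random.default_rng(0)
def feat():
    return {n: rng.normal(size=16) for n in ("color", "texture", "shape")}
order = rerank(feat(), [feat() for _ in range(5)], "object", rng.random(5))
```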
Polynomial Semantic Indexing
"... We present a class of nonlinear (polynomial) models that are discriminatively trained to directly map from the word content in a query-document or documentdocument pair to a ranking score. Dealing with polynomial models on word features is computationally challenging. We propose a low-rank (but diag ..."
Abstract
-
Cited by 16 (8 self)
- Add to MetaCart
(Show Context)
We present a class of nonlinear (polynomial) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Dealing with polynomial models on word features is computationally challenging. We propose a low-rank (but diagonal-preserving) representation of our polynomial models to keep memory and computation requirements feasible. We provide an empirical study on retrieval tasks based on Wikipedia documents, where we obtain state-of-the-art performance while providing realistically scalable methods.
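One natural low-rank, diagonal-preserving parameterization of the degree-2 model is W = U^T V + I, which gives the score f(q, d) = (Uq) . (Vd) + q . d: a low-rank interaction term plus the preserved diagonal (the plain dot product). Whether this is exactly the paper's parameterization is an assumption here; the sketch below just shows scoring under that factorization with random, untrained factors.

```python
import numpy as np

def psi_score(q, d, U, V):
    """Degree-2 ranking score q^T (U^T V + I) d, computed without ever forming
    the full vocab x vocab matrix W: a low-rank interaction term plus the
    preserved diagonal (the plain dot product)."""
    return (U @ q) @ (V @ d) + q @ d

# toy usage: bag-of-words vectors with random (untrained) low-rank factors
rng = np.random.default_rng(0)
vocab, rank = 10000, 50
U = rng.normal(scale=0.01, size=(rank, vocab))
V = rng.normal(scale=0.01, size=(rank, vocab))
q = np.zeros(vocab); q[[3, 17, 256]] = 1.0       # sparse query
d = np.zeros(vocab); d[[17, 256, 999]] = 1.0     # sparse document
print(psi_score(q, d, U, V))
```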