Results 1–10 of 31
Nonlinear metric learning
 In NIPS, 2012
Abstract

Cited by 20 (4 self)
In this paper, we introduce two novel metric learning algorithms, χ²-LMNN and GB-LMNN, which are explicitly designed to be nonlinear and easy to use. The two approaches achieve this goal in fundamentally different ways: χ²-LMNN inherits the computational benefits of a linear mapping from linear metric learning, but uses a nonlinear χ² distance to explicitly capture similarities within histogram data sets; GB-LMNN applies gradient boosting to learn nonlinear mappings directly in function space and takes advantage of this approach's robustness, speed, parallelizability, and insensitivity to its single additional hyperparameter. On various benchmark data sets, we demonstrate that these methods not only match the current state-of-the-art in terms of kNN classification error, but, in the case of χ²-LMNN, obtain the best results in 19 out of 20 learning settings.
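The nonlinear χ² distance on histograms mentioned in the abstract above can be sketched as follows. This is an illustrative snippet, not the authors' code; the function name `chi2_distance` and the smoothing constant `eps` are our own choices.

```python
import numpy as np

def chi2_distance(p, q, eps=1e-12):
    """Symmetric chi-squared distance between two histograms p and q.

    eps guards against division by zero in empty bins (our choice,
    not from the paper)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

# Two normalized 3-bin histograms
h1 = np.array([0.5, 0.3, 0.2])
h2 = np.array([0.4, 0.4, 0.2])
d = chi2_distance(h1, h2)
```

In χ²-LMNN this distance would be applied after a learned linear map of the histograms; the snippet only shows the distance itself.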
Parametric local metric learning for nearest neighbor classification
 In NIPS, 2012
Abstract

Cited by 16 (1 self)
We study the problem of learning local metrics for nearest neighbor classification. Most previous works on local metric learning learn a number of unrelated local metrics. While this "independence" approach delivers increased flexibility, its downside is a considerable risk of overfitting. We present a new parametric local metric learning method in which we learn a smooth metric matrix function over the data manifold. Using an approximation error bound of the metric matrix function, we learn local metrics as linear combinations of basis metrics defined on anchor points over different regions of the instance space. We constrain the metric matrix function by imposing manifold regularization on the linear combinations, which makes the learned metric matrix function vary smoothly along the geodesics of the data manifold. Our metric learning method has excellent performance both in terms of predictive power and scalability. We experimented with several large-scale classification problems of tens of thousands of instances, and compared it with several state-of-the-art metric learning methods, both global and local, as well as with SVM with automatic kernel selection, all of which it outperforms by a significant margin.
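The idea of a local metric expressed as a combination of basis metrics at anchor points can be sketched roughly as below. The Gaussian proximity weighting is our simplification; the paper learns the combination weights under manifold regularization rather than fixing them by a kernel.

```python
import numpy as np

def local_mahalanobis(x, y, anchors, basis_metrics, bandwidth=1.0):
    """Squared distance under a smoothly varying local metric: M(x) is a
    convex combination of basis metrics defined at anchor points,
    weighted here (as a stand-in) by x's Gaussian proximity to each anchor."""
    w = np.exp(-np.sum((anchors - x) ** 2, axis=1) / (2 * bandwidth ** 2))
    w /= w.sum()                     # convex combination weights
    M = sum(wi * Mi for wi, Mi in zip(w, basis_metrics))
    d = x - y
    return float(d @ M @ d)

# Toy setup: two anchors, both carrying the identity metric
anchors = np.array([[0.0, 0.0], [5.0, 5.0]])
basis_metrics = [np.eye(2), np.eye(2)]
d = local_mahalanobis(np.array([1.0, 0.0]), np.array([0.0, 0.0]),
                      anchors, basis_metrics)
```

With identical basis metrics the local metric reduces to the global one, which gives a quick sanity check.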
Learning Image Descriptors with the Boosting-Trick
Abstract

Cited by 14 (1 self)
In this paper we apply boosting to learn complex nonlinear local visual feature representations, drawing inspiration from its successful application to visual object detection. The main goal of local feature descriptors is to distinctively represent a salient image region while remaining invariant to viewpoint and illumination changes. This representation can be improved using machine learning; however, past approaches have been mostly limited to learning linear feature mappings in either the original input or a kernelized input feature space. While kernelized methods have proven somewhat effective for learning nonlinear local feature descriptors, they rely heavily on the choice of an appropriate kernel function, whose selection is often difficult and non-intuitive. We propose to use the boosting-trick to obtain a nonlinear mapping of the input to a high-dimensional feature space. The nonlinear feature mapping obtained with the boosting-trick is highly intuitive. We employ gradient-based weak learners, resulting in a learned descriptor that closely resembles the well-known SIFT descriptor. As demonstrated in our experiments, the resulting descriptor can be learned directly from intensity patches, achieving state-of-the-art performance.
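A boosting-trick descriptor of the kind described above can be sketched, very loosely, as a stack of weighted weak-learner responses on a patch. The particular weak learner here (thresholded gradient energy in a sub-region) and all names are our illustrative choices, not the paper's.

```python
import numpy as np

def boosted_descriptor(patch, weak_learners, alphas):
    """Boosting-trick mapping: the descriptor stacks the weighted outputs
    of simple weak learners evaluated on the patch."""
    return np.array([a * h(patch) for h, a in zip(weak_learners, alphas)])

def region_energy_learner(r0, r1, c0, c1, thresh):
    """Hypothetical gradient-based weak learner: +1 if the mean gradient
    magnitude in rows r0:r1, cols c0:c1 exceeds thresh, else -1."""
    def h(patch):
        gy, gx = np.gradient(patch.astype(float))
        mag = np.sqrt(gx ** 2 + gy ** 2)
        return 1.0 if mag[r0:r1, c0:c1].mean() > thresh else -1.0
    return h

# Toy patch with a vertical step edge at column 4
patch = np.zeros((8, 8))
patch[:, 4:] = 1.0
weak_learners = [region_energy_learner(0, 8, 2, 6, 0.1),   # covers the edge
                 region_energy_learner(0, 8, 0, 2, 0.1)]   # flat region
alphas = [1.0, 0.5]
desc = boosted_descriptor(patch, weak_learners, alphas)
```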
Similarity-based Learning via Data Driven
Abstract

Cited by 11 (1 self)
We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task, and propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [1] and [2]. An attractive feature of our framework is its adaptability to data: we do not promote a fixed notion of goodness but rather let the data dictate it. We show, with theoretical guarantees, that the goodness criterion best suited to a problem can itself be learned, which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity-based heuristic to perform task-driven selection of landmark points instead of random selection. We demonstrate the effectiveness of our goodness criteria learning method, as well as the landmark selection heuristic, on a variety of similarity-based learning datasets and benchmark UCI datasets, on which our method consistently outperforms existing approaches by a significant margin.
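The landmarking step above — mapping each point to its similarities to a set of landmarks and then training a linear classifier on that vector — can be sketched as follows. The Gaussian similarity and all names are our own illustrative choices.

```python
import numpy as np

def landmark_features(x, landmarks, sim):
    """Map a point to the vector of its similarities to landmark points;
    a standard linear classifier can then be trained in this space."""
    return np.array([sim(x, l) for l in landmarks])

# Hypothetical similarity function (Gaussian in squared distance)
sim = lambda a, b: float(np.exp(-np.linalg.norm(a - b) ** 2))

landmarks = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
f = landmark_features(np.array([0.0, 0.0]), landmarks, sim)
```

The paper's contribution is in choosing the landmarks by a diversity heuristic rather than at random; the mapping itself is this simple.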
Online multimodal distance learning for scalable multimedia retrieval
 In WSDM
, 2013
Abstract

Cited by 3 (1 self)
In many real-world scenarios, e.g., multimedia applications, data often originates from multiple heterogeneous sources or is represented by diverse types of representation, which is often referred to as "multimodal data". The definition of distance between any two objects/items in multimodal data is a key challenge encountered by many real-world applications, including multimedia retrieval. In this paper, we present a novel online learning framework for learning distance functions on multimodal data through the combination of multiple kernels. To address large-scale multimedia applications, we propose Online Multimodal Distance Learning (OMDL) algorithms, which are significantly more efficient and scalable than the state-of-the-art techniques. We conducted an extensive set of experiments on multimodal image retrieval applications, in which encouraging results validate the efficacy of the proposed technique.
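A multiple-kernel distance of the kind described above can be sketched as a weighted sum of per-modality kernel-induced distances, with the weights adjusted online. The Hedge-style multiplicative update shown is a common online-learning device, used here only as an illustration of the mechanism, not as the paper's exact OMDL update.

```python
import numpy as np

def combined_distance(x, y, kernels, weights):
    """Distance as a weighted sum of kernel-induced distances, one per
    modality: d_k(x, y) = k(x, x) + k(y, y) - 2 k(x, y)."""
    ds = np.array([k(x, x) + k(y, y) - 2 * k(x, y) for k in kernels])
    return float(weights @ ds), ds

def hedge_update(weights, per_kernel_losses, beta=0.9):
    """Multiplicative (Hedge-style) update: kernels that incurred loss on
    the last pairwise label are down-weighted, then weights renormalize."""
    w = weights * beta ** per_kernel_losses
    return w / w.sum()

# Two hypothetical modality kernels: linear and Gaussian
kernels = [lambda a, b: float(a @ b),
           lambda a, b: float(np.exp(-np.sum((a - b) ** 2)))]
weights = np.array([0.5, 0.5])
x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
d, _ = combined_distance(x, y, kernels, weights)
w_new = hedge_update(weights, np.array([1.0, 0.0]))  # kernel 0 erred
```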
Learning Graphs from Signal Observations under Smoothness Prior
, 2014
Abstract

Cited by 3 (0 self)
The construction of a meaningful graph plays a crucial role in the success of many graph-based data representations and algorithms, especially in the emerging field of signal processing on graphs. However, a meaningful graph is not always readily available from the data, nor easy to define depending on the application domain. In this paper, we address the problem of graph learning, where we are interested in learning graph topologies, namely the relationships between data entities, that well explain the signal observations. In particular, we want to infer a graph such that the input data forms graph signals with smooth variations on the resulting topology. To this end, we adopt a factor analysis model for the graph signals and impose a Gaussian probabilistic prior on the latent variables that control these graph signals. We show that the Gaussian prior leads to an efficient representation that favors the smoothness property of the graph signals. We then propose an algorithm for learning graphs that enforces this smoothness property on the signal observations by minimizing the variation of the signals on the learned graph. Experiments on both synthetic and real-world data demonstrate that the proposed graph learning framework can efficiently infer meaningful graph topologies from only the signal observations.
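The smoothness criterion being minimized above is the standard graph Laplacian quadratic form, tr(XᵀLX): it sums W_ij (x_i − x_j)² over all edges, so signals that vary little across strongly connected nodes score low. A minimal sketch of that quantity (not the paper's full learning algorithm, which optimizes over the graph itself):

```python
import numpy as np

def smoothness(X, W):
    """Total variation of signals X (columns = graph signals) on a graph
    with symmetric adjacency W: tr(X^T L X), where L = D - W is the
    combinatorial Laplacian."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(X.T @ L @ X))

# Path graph on 3 nodes: 0 - 1 - 2
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
s_const = smoothness(np.ones((3, 1)), W)           # constant signal
s_ramp = smoothness(np.array([[0.0], [1.0], [2.0]]), W)  # linear ramp
```

A constant signal has zero variation on any graph, which is a handy sanity check for the Laplacian construction.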
Riemannian Similarity Learning
Abstract

Cited by 3 (0 self)
We consider a similarity-score-based paradigm to address scenarios where either the class labels are only partially revealed during learning, or the training and testing data are drawn from heterogeneous sources. The learning problem is subsequently formulated as optimization over a bilinear form of fixed rank. Our paradigm bears similarity to metric learning, with the major difference lying in its aim of learning a rectangular similarity matrix instead of a proper metric. We tackle this problem in a Riemannian optimization framework. In particular, we consider its applications in pairwise-based action recognition and cross-domain image-based object recognition. In both applications, the proposed algorithm produces competitive performance on the respective benchmark datasets.
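The fixed-rank bilinear form above can be sketched as a score xᵀWy with W = UVᵀ of rank r; because W may be rectangular, x and y can live in spaces of different dimension, which is what distinguishes this from a proper metric. The factorization and names are our illustration of the parameterization, not the paper's optimization.

```python
import numpy as np

def bilinear_similarity(x, y, U, V):
    """Similarity score x^T W y with W = U V^T of fixed rank r.
    U is (d1, r) and V is (d2, r), so W is a rectangular (d1, d2)
    similarity matrix rather than a square metric."""
    return float(x @ U @ V.T @ y)

# Rank-1 example across spaces of dimension 2 and 3
U = np.array([[1.0], [0.0]])
V = np.array([[0.0], [1.0], [0.0]])
s = bilinear_similarity(np.array([2.0, 3.0]), np.array([1.0, 4.0, 5.0]), U, V)
```

The Riemannian aspect of the paper comes from optimizing U and V over the manifold of fixed-rank matrices; the score itself is this bilinear form.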
Kernel Density Metric Learning
Abstract

Cited by 2 (0 self)
This paper introduces a supervised metric learning algorithm, called kernel density metric learning (KDML), which is easy to use and provides nonlinear, probability-based distance measures. KDML constructs a direct nonlinear mapping from the original input space into a feature space based on kernel density estimation. The nonlinear mapping in KDML embodies established distance measures between probability density functions, and leads to correct classification on datasets for which linear metric learning methods would fail. Existing metric learning algorithms, such as large margin nearest neighbors (LMNN), can then be applied to the KDML features to learn a Mahalanobis distance. We also propose an integrated optimization algorithm that learns not only the Mahalanobis matrix but also the kernel bandwidths, the only hyperparameters in the nonlinear mapping. KDML can naturally handle not only numerical features but also categorical ones, a capability rarely found in previous metric learning algorithms. Extensive experimental results on various benchmark datasets show that KDML significantly improves existing metric learning algorithms in terms of kNN classification accuracy.
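The kernel-density mapping described above can be sketched as turning each point into a vector of class-conditional density estimates, on which a downstream metric (e.g. LMNN) is then learned. This is our simplified reading, with Gaussian kernels and a fixed bandwidth; KDML learns the bandwidths jointly.

```python
import numpy as np

def kde_features(x, class_samples, bandwidth=1.0):
    """Map x to a normalized vector of Gaussian kernel density estimates,
    one entry per class; distances between such probability-like vectors
    give a nonlinear, probability-based measure."""
    feats = []
    for samples in class_samples:           # one array of points per class
        d2 = np.sum((samples - x) ** 2, axis=1)
        feats.append(np.mean(np.exp(-d2 / (2 * bandwidth ** 2))))
    f = np.array(feats)
    return f / f.sum()

# Two 1-D classes: one near 0, one near 5
class_samples = [np.array([[0.0], [0.1]]), np.array([[5.0], [5.1]])]
f = kde_features(np.array([0.05]), class_samples)
```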
Cross-domain object recognition via input-output kernel analysis
 In IEEE Transactions on Image Processing
Abstract

Cited by 2 (1 self)
It is of great importance to investigate the domain adaptation problem of image object recognition, since image data is now available from a variety of source domains. To understand the changes in data distributions across domains, we study both the input and output kernel spaces for cross-domain learning situations, where most of the labeled training images are from a source domain and the testing images are from a different target domain. To address the feature distribution change issue in the Reproducing Kernel Hilbert Space induced by vector-valued functions, we propose a Domain Adaptive Input-Output Kernel Learning (DAIOKL) algorithm, which simultaneously learns both the input and output kernels with a discriminative vector-valued decision function by reducing the data mismatch and minimizing the structural error. We also extend the proposed method to the case of multiple source domains. We demonstrate the ability of the proposed model to adapt across domains by examining two cross-domain object recognition benchmark data sets. The proposed method consistently outperforms the state-of-the-art domain adaptation and multiple kernel learning methods.
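A vector-valued decision function of the kind described above is commonly parameterized with a separable (input × output) kernel: f(x) = Σᵢ k(x, xᵢ) · (B cᵢ), where k is the input kernel and B the output kernel over labels. The sketch below shows only that prediction form, under our own names and a fixed B; DAIOKL's contribution is learning k and B jointly, which is not shown here.

```python
import numpy as np

def vv_predict(x, X_train, C, k_in, B):
    """Vector-valued prediction under a separable kernel:
    f(x) = sum_i k_in(x, x_i) * (B @ c_i), with B the output kernel
    (label-space structure) and c_i learned coefficient vectors."""
    return sum(k_in(x, xi) * (B @ ci) for xi, ci in zip(X_train, C))

# Toy example: Gaussian input kernel, identity output kernel
k_in = lambda a, b: float(np.exp(-np.sum((a - b) ** 2)))
B = np.eye(2)
X_train = [np.array([0.0]), np.array([1.0])]
C = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
f = vv_predict(np.array([0.0]), X_train, C, k_in, B)
```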