Results 1–10 of 29
Nonlinear metric learning
 In NIPS, 2012
Abstract

Cited by 18 (3 self)
In this paper, we introduce two novel metric learning algorithms, χ²-LMNN and GB-LMNN, which are explicitly designed to be nonlinear and easy to use. The two approaches achieve this goal in fundamentally different ways: χ²-LMNN inherits the computational benefits of a linear mapping from linear metric learning, but uses a nonlinear χ² distance to explicitly capture similarities within histogram data sets; GB-LMNN applies gradient boosting to learn nonlinear mappings directly in function space and takes advantage of this approach's robustness, speed, parallelizability, and insensitivity towards its single additional hyperparameter. On various benchmark data sets, we demonstrate that these methods not only match the current state of the art in terms of kNN classification error, but, in the case of χ²-LMNN, obtain the best results in 19 out of 20 learning settings.
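The χ² histogram distance at the heart of χ²-LMNN can be sketched in a few lines; this is a minimal illustration of the distance itself, not the authors' learned method (the linear map and the example histograms are illustrative):

```python
def chi2_distance(p, q, eps=1e-12):
    """Symmetric chi-squared distance between two histograms p and q."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))

# Example: identical histograms have zero distance; differing ones do not.
h1 = [0.2, 0.5, 0.3]
h2 = [0.3, 0.4, 0.3]
d_same = chi2_distance(h1, h1)
d_diff = chi2_distance(h1, h2)
```

In χ²-LMNN this distance is computed after a learned linear map that keeps its outputs on the probability simplex, which is what makes it suitable for histogram features.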
Parametric local metric learning for nearest neighbor classification
 In NIPS, 2012
Abstract

Cited by 15 (1 self)
We study the problem of learning local metrics for nearest neighbor classification. Most previous work on local metric learning learns a number of unrelated local metrics. While this "independence" approach delivers increased flexibility, its downside is a considerable risk of overfitting. We present a new parametric local metric learning method in which we learn a smooth metric matrix function over the data manifold. Using an approximation error bound of the metric matrix function, we learn local metrics as linear combinations of basis metrics defined on anchor points over different regions of the instance space. We constrain the metric matrix function by imposing manifold regularization on the linear combinations, which makes the learned metric matrix function vary smoothly along the geodesics of the data manifold. Our metric learning method has excellent performance both in terms of predictive power and scalability. We experimented with several large-scale classification problems of tens of thousands of instances, and compared it with several state-of-the-art metric learning methods, both global and local, as well as with SVM with automatic kernel selection, all of which it outperforms by a significant margin.
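The core idea, a local metric M(x) expressed as a smooth, nonnegative combination of basis metrics anchored at different points, can be sketched as follows (a toy illustration with diagonal basis metrics and softmax weights; the paper learns the combination under a manifold-regularization constraint rather than fixing it like this):

```python
import math

def softmax_weights(x, anchors):
    """Smooth, nonnegative weights summing to 1, based on proximity to anchors."""
    scores = [-sum((xi - ai) ** 2 for xi, ai in zip(x, a)) for a in anchors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def local_sq_distance(x, y, anchors, basis_metrics):
    """Squared distance under M(x) = sum_k w_k(x) M_k (diagonal M_k here)."""
    w = softmax_weights(x, anchors)
    diag = [sum(wk * Mk[i] for wk, Mk in zip(w, basis_metrics))
            for i in range(len(x))]
    return sum(d * (xi - yi) ** 2 for d, xi, yi in zip(diag, x, y))

anchors = [[0.0, 0.0], [1.0, 1.0]]
basis = [[1.0, 4.0], [4.0, 1.0]]  # two diagonal PSD basis metrics
d = local_sq_distance([0.1, 0.2], [0.4, 0.1], anchors, basis)
```

Because the weights vary smoothly with x, nearby points see nearly the same metric, which is the property the manifold regularizer enforces.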
Learning Image Descriptors with the Boosting-Trick
Abstract

Cited by 14 (1 self)
In this paper we apply boosting to learn complex nonlinear local visual feature representations, drawing inspiration from its successful application to visual object detection. The main goal of local feature descriptors is to distinctively represent a salient image region while remaining invariant to viewpoint and illumination changes. This representation can be improved using machine learning; however, past approaches have been mostly limited to learning linear feature mappings, in either the original input space or a kernelized input feature space. While kernelized methods have proven somewhat effective for learning nonlinear local feature descriptors, they rely heavily on the choice of an appropriate kernel function, whose selection is often difficult and non-intuitive. We propose to use the boosting-trick to obtain a nonlinear mapping of the input to a high-dimensional feature space. The nonlinear feature mapping obtained with the boosting-trick is highly intuitive. We employ gradient-based weak learners, resulting in a learned descriptor that closely resembles the well-known SIFT. As demonstrated in our experiments, the resulting descriptor can be learned directly from intensity patches, achieving state-of-the-art performance.
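The boosting-trick replaces a kernel's implicit feature space with an explicit one spanned by weak-learner responses: the descriptor is φ(x) = (α₁h₁(x), …, α_M h_M(x)). A minimal sketch, with thresholded linear responses standing in for the paper's gradient-based weak learners (all weights and thresholds here are illustrative, not learned):

```python
def make_weak_learner(weights, threshold):
    """A thresholded linear response; stands in for a gradient-based weak learner."""
    def h(x):
        return 1.0 if sum(w * xi for w, xi in zip(weights, x)) > threshold else -1.0
    return h

def boosted_descriptor(x, learners, alphas):
    """Nonlinear mapping phi(x) = (alpha_1 h_1(x), ..., alpha_M h_M(x))."""
    return [a * h(x) for a, h in zip(alphas, learners)]

learners = [make_weak_learner([1.0, -1.0], 0.0), make_weak_learner([0.5, 0.5], 0.3)]
alphas = [0.7, 0.3]
phi = boosted_descriptor([0.9, 0.1], learners, alphas)
```

Unlike a kernel, each coordinate of φ is directly interpretable as one weak learner's vote, which is the intuitiveness the abstract points to.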
Similarity-based Learning via Data Driven
Abstract

Cited by 11 (1 self)
We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task, and propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [1] and [2]. An attractive feature of our framework is its adaptability to data: we do not promote a fixed notion of goodness but rather let the data dictate it. We show, by giving theoretical guarantees, that the goodness criterion best suited to a problem can itself be learned, which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity-based heuristic to perform task-driven selection of landmark points instead of random selection. We demonstrate the effectiveness of our goodness criteria learning method, as well as the landmark selection heuristic, on a variety of similarity-based learning datasets and benchmark UCI datasets, on which our method consistently outperforms existing approaches by a significant margin.
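The landmarking step maps each point to its similarities with a small set of landmark points, after which an ordinary linear classifier can be trained on the embedding. A minimal sketch, using a Gaussian similarity as a stand-in for whatever "good" similarity function the framework selects (the landmarks and bandwidth here are illustrative):

```python
import math

def similarity(x, y, gamma=1.0):
    """An example similarity function; the framework works with any 'good' S."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def landmark_embedding(x, landmarks):
    """Embed x as its vector of similarities to the chosen landmark points."""
    return [similarity(x, l) for l in landmarks]

landmarks = [[0.0, 0.0], [1.0, 1.0]]  # ideally chosen by a diversity heuristic
z = landmark_embedding([0.0, 0.0], landmarks)
```

The paper's diversity-based heuristic chooses the landmarks so that the coordinates of this embedding are informative rather than redundant.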
Online multimodal distance learning for scalable multimedia retrieval
 In WSDM, 2013
Abstract

Cited by 3 (1 self)
In many real-world scenarios, e.g., multimedia applications, data often originate from multiple heterogeneous sources or are represented by diverse types of representation, which is often referred to as "multimodal data". The definition of distance between any two objects/items on multimodal data is a key challenge encountered by many real-world applications, including multimedia retrieval. In this paper, we present a novel online learning framework for learning distance functions on multimodal data through the combination of multiple kernels. In order to attack large-scale multimedia applications, we propose Online Multimodal Distance Learning (OMDL) algorithms, which are significantly more efficient and scalable than the state-of-the-art techniques. We conducted an extensive set of experiments on multimodal image retrieval applications, in which encouraging results validate the efficacy of the proposed technique.
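The basic shape of such a framework is a convex combination of per-modality distances whose weights are updated online. The sketch below uses a generic Hedge-style multiplicative update on a similar/dissimilar pair; it illustrates the idea only and is not the paper's OMDL update rule (the modalities, distances, and learning rate are illustrative):

```python
def combined_distance(dists, weights):
    """Overall distance as a convex combination of per-modality distances."""
    return sum(w * d for w, d in zip(weights, dists))

def hedge_update(weights, d_pos, d_neg, eta=0.5):
    """Multiplicative update: penalize modalities where the similar pair
    ends up farther apart than the dissimilar pair."""
    losses = [1.0 if dp >= dn else 0.0 for dp, dn in zip(d_pos, d_neg)]
    new = [w * (eta ** l) for w, l in zip(weights, losses)]
    z = sum(new)
    return [w / z for w in new]

weights = [0.5, 0.5]   # two modalities, e.g. color and texture
d_pos = [0.9, 0.1]     # per-modality distances for a similar pair
d_neg = [0.2, 0.8]     # per-modality distances for a dissimilar pair
weights = hedge_update(weights, d_pos, d_neg)
```

After the update, the modality that ranked the similar pair closer than the dissimilar pair gains weight, which is the behavior an online combination scheme needs.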
Riemannian Similarity Learning
Abstract

Cited by 3 (0 self)
We consider a similarity-score-based paradigm to address scenarios where either the class labels are only partially revealed during learning, or the training and testing data are drawn from heterogeneous sources. The learning problem is subsequently formulated as optimization over a bilinear form of fixed rank. Our paradigm bears similarity to metric learning, where the major difference lies in its aim of learning a rectangular similarity matrix instead of a proper metric. We tackle this problem in a Riemannian optimization framework. In particular, we consider its applications in pairwise-based action recognition and cross-domain image-based object recognition. In both applications, the proposed algorithm produces competitive performance on the respective benchmark datasets.
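A fixed-rank bilinear form s(x, y) = xᵀWy with W = UVᵀ lets the two arguments live in spaces of different dimensions, which is what distinguishes the rectangular similarity matrix from a proper metric. A minimal sketch with rank-1 factors (the factor matrices here are illustrative, not optimized on a manifold as in the paper):

```python
def bilinear_similarity(x, y, U, V):
    """s(x, y) = x^T (U V^T) y with W = U V^T of rank <= r.
    x matches the rows of U; y matches the rows of V."""
    # Project each side into the shared r-dimensional space, then take the inner product.
    xu = [sum(x[i] * U[i][k] for i in range(len(x))) for k in range(len(U[0]))]
    yv = [sum(y[j] * V[j][k] for j in range(len(y))) for k in range(len(V[0]))]
    return sum(a * b for a, b in zip(xu, yv))

U = [[1.0], [0.0]]          # 2x1 factor for the first domain
V = [[1.0], [0.0], [0.0]]   # 3x1 factor for the second (different-dimensional) domain
s = bilinear_similarity([2.0, 5.0], [3.0, 1.0, 4.0], U, V)
```

The Riemannian approach optimizes U and V over the manifold of fixed-rank matrices rather than over W directly, keeping the rank constraint exact during training.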
Large-Margin Metric Learning for Constrained Partitioning Problems
Abstract

Cited by 2 (1 self)
We consider unsupervised partitioning problems based explicitly or implicitly on the minimization of Euclidean distortions, such as clustering, image or video segmentation, and other change-point detection problems. We emphasize cases with specific structure, which include many practical situations ranging from mean-based change-point detection to image segmentation problems. We aim at learning a Mahalanobis metric for these unsupervised problems, leading to feature weighting and/or selection. This is done in a supervised way by assuming the availability of several (partially) labeled datasets that share the same metric. We cast the metric learning problem as a large-margin structured prediction problem, with proper definitions of regularizers and losses, leading to a convex optimization problem which can be solved efficiently. Our experiments show how learning the metric can significantly improve performance on bioinformatics, video, and image segmentation problems.
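In the diagonal case, the learned Mahalanobis metric reduces to per-feature weights inside the distortion that the partitioning algorithm minimizes, so a zero weight drops a feature entirely. A minimal sketch of that connection (the weights and points are illustrative; the paper learns the weights by large-margin structured prediction):

```python
def weighted_sq_distance(x, y, w):
    """Diagonal Mahalanobis distance: nonnegative weights perform feature
    weighting; a weight of zero performs feature selection."""
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))

def distortion(points, center, w):
    """Within-segment distortion that k-means-style partitioning minimizes."""
    return sum(weighted_sq_distance(p, center, w) for p in points)

w = [1.0, 0.0]  # the second feature is selected away
d = weighted_sq_distance([1.0, 9.0], [2.0, -3.0], w)
seg = distortion([[1.0, 9.0], [1.5, -3.0]], [1.25, 0.0], w)
```

Because the distortion is linear in w, the resulting learning problem over w can be made convex, which is what makes the large-margin formulation tractable.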
Kernel Density Metric Learning
Abstract

Cited by 2 (0 self)
This paper introduces a supervised metric learning algorithm, called kernel density metric learning (KDML), which is easy to use and provides nonlinear, probability-based distance measures. KDML constructs a direct nonlinear mapping from the original input space into a feature space based on kernel density estimation. The nonlinear mapping in KDML embodies established distance measures between probability density functions, and leads to correct classification on datasets for which linear metric learning methods would fail. Existing metric learning algorithms, such as large margin nearest neighbors (LMNN), can then be applied to the KDML features to learn a Mahalanobis distance. We also propose an integrated optimization algorithm that learns not only the Mahalanobis matrix but also the kernel bandwidths, the only hyperparameters in the nonlinear mapping. KDML can naturally handle not only numerical features but also categorical ones, which is rare among previous metric learning algorithms. Extensive experimental results on various benchmark datasets show that KDML significantly improves existing metric learning algorithms in terms of kNN classification accuracy.
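The mapping step can be sketched as follows: each input is replaced by its (normalized) per-class kernel density estimates, and a metric is then learned on these probability-based features. This is a one-dimensional toy version of the idea, not the paper's exact construction (the class samples and bandwidth are illustrative):

```python
import math

def gaussian_kde(x, samples, h=0.5):
    """One-dimensional Gaussian kernel density estimate with bandwidth h."""
    z = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / z

def kdml_features(x, class_samples, h=0.5):
    """Map x to its normalized per-class densities; a Mahalanobis metric
    (e.g. via LMNN) can then be learned on these features."""
    dens = [gaussian_kde(x, s, h) for s in class_samples]
    z = sum(dens) or 1.0
    return [d / z for d in dens]

class_samples = [[0.0, 0.1, -0.1], [3.0, 3.2, 2.9]]  # samples from two classes
f = kdml_features(0.05, class_samples)
```

Because the features are densities, categorical attributes can be handled by swapping the Gaussian kernel for a discrete one, which is how KDML covers both feature types.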
MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching
Abstract

Cited by 1 (0 self)
Motivated by recent successes in learning feature representations and learning feature comparison functions, we propose a unified approach that combines both for training a patch matching system. Our system, dubbed MatchNet, consists of a deep convolutional network that extracts features from patches and a network of three fully connected layers that computes a similarity between the extracted features. To ensure experimental repeatability, we train MatchNet on standard datasets and employ an input sampler to augment the training set with synthetic exemplar pairs that reduce overfitting. Once trained, we achieve better computational efficiency during matching by disassembling MatchNet and separately applying the feature computation and similarity networks in two sequential stages. We perform a comprehensive set of experiments on standard datasets to carefully study the contributions of each aspect of MatchNet, with direct comparisons to established methods. Our results confirm that our unified approach improves accuracy over previous state-of-the-art results on patch matching datasets, while reducing the storage requirement for descriptors. We make pre-trained MatchNet publicly available.
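The "disassembly" point is an efficiency argument: compute each patch's feature once with the feature tower, then score every pair with the small similarity network, instead of running the full joint network per pair. A toy sketch with single-layer stand-ins for both networks (all weights here are illustrative placeholders, not MatchNet's trained parameters):

```python
def feature_tower(patch, W):
    """Stand-in for the convolutional feature tower: one ReLU layer."""
    return [max(0.0, sum(w * p for w, p in zip(row, patch))) for row in W]

def metric_network(f1, f2, W2):
    """Stand-in for the fully connected similarity network on concatenated features."""
    z = f1 + f2
    return sum(w * v for w, v in zip(W2, z))

# Disassembled matching: stage 1 runs once per patch, stage 2 once per pair.
W1 = [[0.5, -0.5], [1.0, 1.0]]
W2 = [0.3, 0.3, 0.3, 0.3]
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feats = [feature_tower(p, W1) for p in patches]           # stage 1
scores = {(i, j): metric_network(feats[i], feats[j], W2)  # stage 2
          for i in range(len(feats)) for j in range(i + 1, len(feats))}
```

For n patches this costs n feature passes plus cheap pairwise scoring, rather than on the order of n² full-network evaluations, and only the compact features need to be stored.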