Results 1–8 of 8
Metric and Kernel Learning Using a Linear Transformation
"... Metric and kernel learning arise in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over lowdimensional data, while existing kernel learning algorithms are often limited to the transductive setting and do not generalize to new ..."
Abstract

Cited by 31 (2 self)
 Add to MetaCart
(Show Context)
Metric and kernel learning arise in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over low-dimensional data, while existing kernel learning algorithms are often limited to the transductive setting and do not generalize to new data points. In this paper, we study the connections between metric learning and kernel learning that arise when studying metric learning as a linear transformation learning problem. In particular, we propose a general optimization framework for learning metrics via linear transformations, and analyze in detail a special case of our framework: that of minimizing the LogDet divergence subject to linear constraints. We then propose a general regularized framework for learning a kernel matrix, and show it to be equivalent to our metric learning framework. Our theoretical connections between metric and kernel learning have two main consequences: 1) the learned kernel matrix parameterizes a linear transformation kernel function and can be applied inductively to new data points; 2) our result yields a constructive method for kernelizing most existing Mahalanobis metric learning formulations. We demonstrate our learning approach by applying it to large-scale real-world problems in computer vision, text mining and semi-supervised kernel dimensionality reduction. Keywords: metric learning, kernel learning, linear transformation, matrix divergences, LogDet divergence
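The core equivalence behind "metric learning as a linear transformation learning problem" can be sketched numerically: a Mahalanobis metric d_A(x, y) = (x−y)ᵀA(x−y) with A = LᵀL is exactly the squared Euclidean distance after the linear map x → Lx. The matrices below are random placeholders, not a learned metric:

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))   # stand-in for a learned linear transformation
A = L.T @ L                       # induced positive semidefinite Mahalanobis matrix

x, y = rng.standard_normal(3), rng.standard_normal(3)
d_mahalanobis = (x - y) @ A @ (x - y)          # (x-y)^T A (x-y)
d_transformed = np.sum((L @ x - L @ y) ** 2)   # ||Lx - Ly||^2

# The two distances agree: learning A is equivalent to learning L.
assert np.isclose(d_mahalanobis, d_transformed)
```

This is why a learned kernel of the form k(x, y) = xᵀAy can be evaluated on new points: it is an ordinary inner product after the transformation L.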
Inductive regularized learning of kernel functions
"... In this paper we consider the fundamental problem of semisupervised kernel function learning. We first propose a general regularized framework for learning a kernel matrix, and then demonstrate an equivalence between our proposed kernel matrix learning framework and a general linear transformatio ..."
Abstract

Cited by 17 (1 self)
 Add to MetaCart
(Show Context)
In this paper we consider the fundamental problem of semi-supervised kernel function learning. We first propose a general regularized framework for learning a kernel matrix, and then demonstrate an equivalence between our proposed kernel matrix learning framework and a general linear transformation learning problem. Our result shows that the learned kernel matrices parameterize a linear transformation kernel function and can be applied inductively to new data points. Furthermore, our result gives a constructive method for kernelizing most existing Mahalanobis metric learning formulations. To make our results practical for large-scale data, we modify our framework to limit the number of parameters in the optimization process. We also consider the problem of kernelized inductive dimensionality reduction in the semi-supervised setting. To this end, we introduce a novel method for this problem by considering a special case of our general kernel learning framework where we select the trace norm function as the regularizer. We empirically demonstrate that our framework learns useful kernel functions, improving the kNN classification accuracy significantly in a variety of domains. Furthermore, our kernelized dimensionality reduction technique significantly reduces the dimensionality of the feature space while achieving competitive classification accuracies.
Convex perturbations for scalable semidefinite programming
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2009
"... Abstract Many important machine learning problems are modeled and solved via semidefinite programs; examples include metric learning, nonlinear embedding, and certain clustering problems. Often, offtheshelf software is invoked for the associated optimization, which can be inappropriate due to exc ..."
Abstract

Cited by 5 (2 self)
 Add to MetaCart
(Show Context)
Many important machine learning problems are modeled and solved via semidefinite programs; examples include metric learning, nonlinear embedding, and certain clustering problems. Often, off-the-shelf software is invoked for the associated optimization, which can be inappropriate due to excessive computational and storage requirements. In this paper, we introduce the use of convex perturbations for solving semidefinite programs (SDPs), and for a specific perturbation we derive an algorithm that has several advantages over existing techniques: a) it is simple, requiring only a few lines of MATLAB; b) it is a first-order method, and thereby scalable; and c) it can easily exploit the structure of a given SDP (e.g., when the constraint matrices are low-rank, a situation common to several machine learning SDPs). A pleasant byproduct of our method is a fast, kernelized version of the large-margin nearest neighbor metric learning algorithm.
Mirror Descent for Metric Learning: A Unified Approach
"... Abstract. Most metric learning methods are characterized by diverse loss functions andprojection methods, whichnaturallybegsthequestion: is there a wider framework that can generalize many of these methods? In addition, ever persistent issues are those of scalability to large data sets and the quest ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
(Show Context)
Most metric learning methods are characterized by diverse loss functions and projection methods, which naturally begs the question: is there a wider framework that can generalize many of these methods? In addition, ever-persistent issues are those of scalability to large data sets and the question of kernelizability. We propose a unified approach to Mahalanobis metric learning: an online regularized metric learning algorithm based on the ideas of composite objective mirror descent (COMID). The metric learning problem is formulated as a regularized positive semidefinite matrix learning problem, whose update rules can be derived using the COMID framework. This approach aims to be scalable, kernelizable, and admissible to many different types of Bregman divergences and loss functions, which allows for the tailoring of several different classes of algorithms. The most novel contribution is the use of the trace norm, which yields a sparse metric in its eigenspectrum, thus simultaneously performing feature selection along with metric learning.
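As a hedged illustration of the trace-norm idea in the abstract above: the proximal (soft-thresholding) step associated with a trace-norm regularizer on a PSD matrix shrinks eigenvalues toward zero, which is what produces a metric that is sparse in its eigenspectrum. The matrix size and threshold `lam` below are arbitrary choices for the sketch, not values from the paper:

```python
import numpy as np

def trace_norm_prox(M, lam):
    """Proximal operator of lam * trace norm restricted to the PSD cone:
    shrink each eigenvalue by lam and clip negatives to zero."""
    w, V = np.linalg.eigh(M)
    w = np.maximum(w - lam, 0.0)   # eigenvalues below lam become exactly zero
    return (V * w) @ V.T

rng = np.random.default_rng(1)
G = rng.standard_normal((4, 4))
M = G @ G.T                        # a PSD "metric" before the prox step
M_sparse = trace_norm_prox(M, lam=1.0)
```

In a COMID-style iteration this prox would be applied after each gradient step on the loss; zeroed eigenvalues discard the corresponding directions, acting as implicit feature selection.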
MIRROR DESCENT FOR METRIC LEARNING
"... findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the view of the ..."
Abstract
 Add to MetaCart
(Show Context)
findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the view of the
Formulating the
"... Abstract. We propose a unified approach to Mahalanobis metric learning: an online, regularized, positive semidefinite matrix learning problem, whose update rules can be derived using the composite objective mirror descent (COMID) framework. This approach admits different Bregman and loss functions, ..."
Abstract
 Add to MetaCart
(Show Context)
We propose a unified approach to Mahalanobis metric learning: an online, regularized, positive semidefinite matrix learning problem, whose update rules can be derived using the composite objective mirror descent (COMID) framework. This approach admits different Bregman and loss functions, which yields several different classes of algorithms. The most novel contribution is the trace norm regularization, which yields a metric sparse in its eigenspectrum, thus performing feature selection. The regularized update rules are parallelizable and can be computed efficiently. The proposed approach is also kernelizable, which allows for metric learning in nonlinear domains.
Posterior regularization and attribute assessment of underdetermined linear mappings
"... Abstract. Linear mappings are omnipresent in data processing analysis ranging from regression to distance metric learning. The interpretation of coefficients from underdetermined mappings raises an unexpected challenge when the original modeling goal does not impose regularization. Therefore, a g ..."
Abstract
 Add to MetaCart
(Show Context)
Linear mappings are omnipresent in data processing and analysis, ranging from regression to distance metric learning. The interpretation of coefficients from underdetermined mappings raises an unexpected challenge when the original modeling goal does not impose regularization. Therefore, a general posterior regularization strategy is presented for inducing unique results, and additional sensitivity analysis enables attribute assessment for facilitating model interpretation. An application to infrared spectra reflects data smoothness and indicates improved generalization.
An Overview of Unsupervised and Semi-Supervised Fuzzy Kernel Clustering
, 2013
"... For realworld clustering tasks, the input data is typically not easily separable due to the highly complex data structure or when clusters vary in size, density and shape. Kernelbased clustering has proven to be an effective approach to partition such data. In this paper, we provide an overview of ..."
Abstract
 Add to MetaCart
For real-world clustering tasks, the input data is typically not easily separable due to the highly complex data structure or when clusters vary in size, density and shape. Kernel-based clustering has proven to be an effective approach to partition such data. In this paper, we provide an overview of several fuzzy kernel clustering algorithms. We focus on methods that optimize a fuzzy C-means-type objective function. We highlight the advantages and disadvantages of each method. In addition to the completely unsupervised algorithms, we also provide an overview of some semi-supervised fuzzy kernel clustering algorithms. These algorithms use partial supervision information to guide the optimization process and avoid local minima. We also provide an overview of the different approaches that have been used to extend kernel clustering to handle very large data sets.
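To make the fuzzy kernel C-means idea concrete, here is a minimal, self-contained sketch (illustrative, not any specific algorithm from the survey): fuzzy memberships combined with an RBF kernel, where the squared distance of each point to each cluster prototype is computed entirely through the kernel matrix (the kernel trick), so the prototypes never need to be represented explicitly. All parameter values (`gamma`, fuzzifier `m`, iteration count) are arbitrary choices for the sketch:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

def kernel_fcm(K, c=2, m=2.0, iters=50, seed=0):
    """Fuzzy C-means in the kernel-induced feature space.

    Squared feature-space distance of point i to prototype k (a weighted
    mean of mapped points) expands via the kernel as:
      K_ii - 2 (K W)_ik / s_k + (W^T K W)_kk / s_k^2,  where W = U^m,
      s_k = sum_i W_ik.
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)      # random memberships, rows sum to 1
    for _ in range(iters):
        W = U ** m
        s = W.sum(axis=0)
        d2 = (np.diag(K)[:, None]
              - 2 * (K @ W) / s
              + np.einsum('ik,ij,jk->k', W, K, W) / s ** 2)
        d2 = np.maximum(d2, 1e-12)         # guard against tiny negatives
        # Standard FCM membership update: u_ik ∝ d_ik^(-2/(m-1))
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U

# Two well-separated Gaussian blobs as toy data.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(3.0, 0.3, (20, 2))])
U = kernel_fcm(rbf_kernel(X, gamma=0.5))
labels = U.argmax(axis=1)
```

The semi-supervised variants surveyed above would add a penalty tying the memberships `U` of labeled points to their known clusters; the distance computation through `K` stays the same.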