Results 1–10 of 51
Bundle Methods for Regularized Risk Minimization
Abstract

Cited by 78 (4 self)
A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different regularizers. Examples include linear Support Vector Machines (SVMs), Gaussian Processes, Logistic Regression, Conditional Random Fields (CRFs), and the Lasso, among others. This paper describes the theory and implementation of a scalable and modular convex solver which solves all these estimation problems. It can be parallelized on a cluster of workstations, allows for data locality, and can deal with regularizers such as L1 and L2 penalties. In addition to the unified framework, we present tight convergence bounds, which show that our algorithm converges in O(1/ε) steps to ε precision for general convex problems and in O(log(1/ε)) steps for continuously differentiable problems. We demonstrate the performance of our general-purpose solver on a variety of publicly available datasets.
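As a concrete instance of the regularized risk functional this abstract describes, here is a minimal sketch specialized to the linear SVM (hinge loss plus L2 penalty). Plain subgradient descent stands in for the paper's bundle method, and all names are ours, not the paper's:

```python
import numpy as np

def regularized_risk(w, X, y, lam):
    # J(w) = (lam/2) * ||w||^2 + mean_i max(0, 1 - y_i <w, x_i>)
    hinge = np.maximum(0.0, 1.0 - y * (X @ w)).mean()
    return 0.5 * lam * np.dot(w, w) + hinge

def subgradient(w, X, y, lam):
    # One subgradient of J at w; a bundle method would collect such
    # (J(w), g) pairs to build a piecewise-linear lower model of J.
    active = (1.0 - y * (X @ w) > 0).astype(float)
    return lam * w - (X * (active * y)[:, None]).mean(axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]))
w = np.zeros(5)
for t in range(1, 501):
    w -= (1.0 / (0.1 * t)) * subgradient(w, X, y, lam=0.1)  # step 1/(lam*t)
```

The nonsmooth hinge term is what makes bundle-style lower models attractive over plain gradient methods here.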
A scalable modular convex solver for regularized risk minimization
 In KDD. ACM
, 2007
Abstract

Cited by 78 (15 self)
A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different regularizers. Examples include linear Support Vector Machines (SVMs), Logistic Regression, Conditional Random Fields (CRFs), and the Lasso, among others. This paper describes the theory and implementation of a highly scalable and modular convex solver which solves all these estimation problems. It can be parallelized on a cluster of workstations, allows for data locality, and can deal with regularizers such as ℓ1 and ℓ2 penalties. At present, our solver implements 20 different estimation problems, can be easily extended, scales to millions of observations, and is up to 10 times faster than specialized solvers for many applications. The open source code is freely available as part of the ELEFANT toolbox.
A kernel method for the two-sample problem
 ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 19
, 2007
Abstract

Cited by 72 (19 self)
We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS). We present two tests based on large deviation bounds for the test statistic, while a third is based on the asymptotic distribution of this statistic. The test statistic can be computed in quadratic time, although efficient linear time approximations are available. Several classical metrics on distributions are recovered when the function space used to compute the difference in expectations is allowed to be more general (e.g., a Banach space). We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
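The quadratic-time statistic the abstract mentions can be sketched as the (biased) maximum mean discrepancy with a Gaussian RBF kernel; the bandwidth choice and variable names here are ours, not the paper's:

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    # Gaussian RBF kernel matrix between row-sample arrays A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2_biased(X, Y, sigma=1.0):
    # ||mean embedding of X - mean embedding of Y||^2 in the RKHS;
    # nonnegative by construction, O(n^2) kernel evaluations
    return rbf(X, X, sigma).mean() + rbf(Y, Y, sigma).mean() \
        - 2.0 * rbf(X, Y, sigma).mean()

rng = np.random.default_rng(1)
same = mmd2_biased(rng.normal(size=(100, 2)), rng.normal(size=(100, 2)))
diff = mmd2_biased(rng.normal(size=(100, 2)),
                   rng.normal(loc=2.0, size=(100, 2)))
```

On samples from the same distribution the statistic is near zero; a mean shift makes it clearly larger, which is what the paper's tests threshold against.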
Nonparametric conditional density estimation using piecewise-linear solution path of kernel quantile regression
 Neural Computation
, 2009
Abstract

Cited by 13 (6 self)
The goal of regression analysis is to describe the stochastic relationship between a vector of inputs x and a scalar output y. This can be achieved by estimating the entire conditional density p(y|x). In this paper we present a new approach for nonparametric conditional density estimation. We develop a piecewise-linear path-following method for kernel-based quantile regression. It enables us to estimate the cumulative distribution function (CDF) of the conditional density p(y|x) in piecewise-linear form. After smoothing the estimated piecewise-linear CDF, we obtain a nonparametric conditional density estimate p̂(y|x) for all x in the input domain. Theoretical analyses and numerical experiments are presented to show the effectiveness of the approach.
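The core idea of the abstract, that estimated conditional quantiles at a grid of levels τ form a piecewise-linear conditional CDF whose slope is a density estimate, can be sketched as follows. Empirical quantiles at a fixed input x₀ stand in for the paper's kernel quantile regression fits, and the smoothing step is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)
y_samples = rng.normal(loc=1.0, scale=0.5, size=5000)  # simulated y | x0

taus = np.linspace(0.05, 0.95, 19)
q_hat = np.quantile(y_samples, taus)   # stand-in for quantile-regression fits

def cdf_hat(y):
    # piecewise-linear conditional CDF (clamped outside the estimated range)
    return np.interp(y, q_hat, taus, left=0.0, right=1.0)

dens = np.diff(taus) / np.diff(q_hat)  # piecewise-constant density estimate
```

The paper's contribution is computing the quantile functions for all x at once via the solution path, rather than at a single x₀ as here.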
Bouligand Derivatives and Robustness of Support Vector Machines for Regression
 Journal of Machine Learning Research
, 2008
Abstract

Cited by 12 (7 self)
We investigate robustness properties of a broad class of support vector machines with non-smooth loss functions. These kernel methods are inspired by convex risk minimization in infinite-dimensional Hilbert spaces. Leading examples are the support vector machine based on the ε-insensitive loss function, and kernel-based quantile regression based on the pinball loss function. First, we propose the Bouligand influence function (BIF), a modification of F.R. Hampel's influence function. The BIF has the advantage of being positive homogeneous, which is in general not true for Hampel's influence function. Second, we show that many support vector machines based on a Lipschitz continuous loss function and a bounded kernel have a bounded BIF and are thus robust in the sense of robust statistics based on influence functions.
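For reference, the classical influence function of F.R. Hampel that the BIF modifies is the one-sided limit

```latex
\operatorname{IF}(z; T, P)
  \;=\; \lim_{\varepsilon \downarrow 0}
  \frac{T\bigl((1-\varepsilon)\,P + \varepsilon\,\delta_z\bigr) - T(P)}{\varepsilon},
```

where T is the statistical functional (here, the SVM estimator) and δ_z is the point mass at z. Roughly, the BIF replaces this limit with a Bouligand directional derivative, which by definition scales linearly along rays t(δ_z − P), t > 0; this is the positive homogeneity the abstract highlights.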
A Density-Ratio Framework for Statistical Data Processing
 IPSJ TRANSACTIONS ON COMPUTER VISION AND APPLICATIONS, VOL.1, PP.183–208
, 2009
Abstract

Cited by 10 (6 self)
In statistical pattern recognition, it is important to avoid density estimation, since density estimation is often more difficult than pattern recognition itself. Following this idea, known as Vapnik's principle, a statistical data processing framework that employs the ratio of two probability density functions has been developed recently and is attracting considerable attention in the machine learning and data mining communities. The purpose of this paper is to introduce to the computer vision community recent advances in density-ratio estimation methods and their use in various statistical data processing tasks such as non-stationarity adaptation, outlier detection, feature selection, and independent component analysis.
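One use of the density ratio surveyed in the paper, importance weighting under covariate shift, can be sketched as follows. For illustration the two densities are known Gaussians, so the ratio is exact; the surveyed methods instead estimate w(x) = p_test(x)/p_train(x) directly from samples, without estimating either density:

```python
import numpy as np

def gauss_pdf(x, mu, s):
    # density of N(mu, s^2) evaluated at x
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(4)
x_tr = rng.normal(0.0, 1.0, size=200000)   # training inputs ~ N(0, 1)
f = lambda x: x ** 2                        # quantity of interest

w = gauss_pdf(x_tr, 1.0, 1.0) / gauss_pdf(x_tr, 0.0, 1.0)  # exact ratio
naive = f(x_tr).mean()                      # estimates E_train[f] = 1
shifted = (w * f(x_tr)).mean()              # estimates E_test[f] = 2, test ~ N(1, 1)
```

Reweighting training samples by the ratio recovers an expectation under the shifted test distribution, which is the non-stationarity adaptation task the abstract lists.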
Machine Learning Techniques—Reductions Between Prediction Quality Metrics
Abstract

Cited by 9 (0 self)
Machine learning involves optimizing a loss function on unlabeled data points given examples of labeled data points, where the loss function measures the performance of a learning algorithm. We give an overview of techniques, called reductions, for converting a problem of minimizing one loss function into a problem of minimizing another, simpler loss function. This tutorial discusses how to create robust reductions that perform well in practice. The reductions discussed here can be used to solve any supervised learning problem with a standard binary classification or regression algorithm available in any machine learning toolkit. We also discuss common design flaws in folklore reductions.
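A minimal sketch of one well-known reduction in the spirit of this abstract: importance-weighted classification reduced to ordinary classification by rejection sampling ("costing"-style). The setup and names are ours, not the tutorial's:

```python
import numpy as np

def costing_resample(X, y, w, rng):
    # keep example i with probability w_i / max(w); an unweighted learner
    # trained on the kept set targets the weighted objective in expectation
    keep = rng.random(len(w)) < w / w.max()
    return X[keep], y[keep]

rng = np.random.default_rng(5)
X = rng.normal(size=(10000, 3))
y = (X[:, 0] > 0).astype(int)
w = np.where(y == 1, 3.0, 1.0)    # errors on positives cost 3x more
Xs, ys = costing_resample(X, y, w, rng)
```

The resampled set over-represents costly examples in proportion to their weights, so any stock binary classifier applied to it addresses the weighted problem.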
How SVMs can estimate quantiles and the median
 Advances in Neural Information Processing Systems 20
Abstract

Cited by 8 (8 self)
We investigate quantile regression based on the pinball loss and the ε-insensitive loss. For the pinball loss, a condition on the data-generating distribution P is given that ensures that the conditional quantiles are approximated with respect to ‖·‖1. This result is then used to derive an oracle inequality for an SVM based on the pinball loss. Moreover, we show that SVMs based on the ε-insensitive loss estimate the conditional median only under certain conditions on P.
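A sketch of the pinball loss from this abstract and of the fact that motivates it: minimizing the expected pinball loss at level τ over constant predictors recovers the τ-quantile. The grid search is ours, purely for illustration:

```python
import numpy as np

def pinball(residual, tau):
    # tau * r if r >= 0, (tau - 1) * r otherwise (the "check" loss)
    return np.where(residual >= 0, tau * residual, (tau - 1.0) * residual)

rng = np.random.default_rng(3)
y = rng.exponential(scale=1.0, size=4000)
grid = np.linspace(0.0, 5.0, 2001)
losses = [pinball(y - c, 0.9).mean() for c in grid]
c_star = grid[int(np.argmin(losses))]   # close to the empirical 0.9-quantile
```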
Estimating conditional quantiles with the help of the pinball loss
 Bernoulli, accepted with minor revision
Creating Safety Nets Through Semiparametric Index-Based Insurance: A Simulation for Northern Ghana. Agricultural Finance Review 68(1)
, 2008
Abstract

Cited by 5 (1 self)
In West Africa, farm income is highly exposed to risks from crop failure in the drier, inland areas, and from fluctuations in (world market) prices in the wetter coastal areas. As individuals and even extended families are poorly equipped to deal with these, provision of social safety nets is required. Our paper reviews the situation in Ghana and the way in which the new financial instrument of index-based insurance might contribute to improving it, focusing on the estimation of a crop indemnification scheme for farmers in Northern Ghana. It recalls that in a poor rural area like Northern Ghana, provision of social safety nets almost coincides with food security management, and must therefore distinguish three basic subtasks: distributing income entitlements (possibly indemnification payments from insurance) to the poor, ensuring collection of taxes (possibly insurance premiums) to fund the arrangement, and assuring delivery of staple goods, such as food, to all households, including the poor. We point out that crop insurance, in any form, can at best entitle the poor and, with adequate premiums, become adequately funded, albeit that current experience suggests that farmers tend to be reluctant and to find it difficult to fulfill their obligations. Our main remark is, however, that unless the actual availability of goods is assured, the indemnification from crop insurance will, under droughts, only cause prices to rise and channel scarce food away from the uninsured to the insured. In short, in poor areas such as Northern Ghana, coordinated food