No-Regret Reductions for Imitation Learning and Structured Prediction
 In AISTATS
, 2011
Abstract

Cited by 65 (13 self)
Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. This leads to poor performance in theory and often in practice. Some recent approaches (Daumé III et al., 2009; Ross and Bagnell, 2010) provide stronger guarantees in this setting, but remain somewhat unsatisfactory as they train either non-stationary or stochastic policies and require a large number of iterations. In this paper, we propose a new iterative algorithm, which trains a stationary deterministic policy, that can be seen as a no-regret algorithm in an online learning setting. We show that any such no-regret algorithm, combined with additional reduction assumptions, must find a policy with good performance under the distribution of observations it induces in such sequential settings. We demonstrate that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
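The abstract does not spell out the iterative algorithm. As a rough illustration only, the sketch below assumes a loop that rolls out the current policy, asks the expert to label the states that were visited, and retrains on all data gathered so far. The toy chain environment and the names `expert`, `rollout`, `fit`, and `iterate_with_aggregation` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def expert(state, goal=5):
    """Hypothetical expert: step toward the goal, then stay there."""
    if state < goal:
        return 1
    if state > goal:
        return -1
    return 0

def rollout(policy, start=0, horizon=8):
    """Run a policy on a 1-D chain and return the states it visits."""
    states, s = [], start
    for _ in range(horizon):
        states.append(s)
        s += policy(s)
    return states

def fit(dataset):
    """Train a stationary deterministic policy: majority expert label per state."""
    votes = defaultdict(Counter)
    for s, a in dataset:
        votes[s][a] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # Assumption: in states never seen during training, the policy stays put.
    return lambda s: table.get(s, 0)

def iterate_with_aggregation(n_iters=3):
    """Assumed loop: roll out, label visited states with the expert, retrain."""
    dataset, policy = [], expert  # the first rollout follows the expert
    for _ in range(n_iters):
        visited = rollout(policy)
        dataset += [(s, expert(s)) for s in visited]  # expert labels visited states
        policy = fit(dataset)  # retrain on everything collected so far
    return policy
```

The point of the loop is that training states come from the learner's own rollouts, so the policy is evaluated on the same distribution of observations it induces.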
Unsupervised Search-based Structured Prediction
, 2009
Abstract

Cited by 51 (1 self)
We describe an adaptation and application of a search-based structured prediction algorithm “Searn” to unsupervised learning problems. We show that it is possible to reduce unsupervised learning to supervised learning and demonstrate a high-quality unsupervised shift-reduce parsing model. We additionally show a close connection between unsupervised Searn and expectation maximization. Finally, we demonstrate the efficacy of a semi-supervised extension. The key idea that enables this is an application of the predict-self idea for unsupervised learning.
Good Practice in Large-Scale Learning for Image Classification
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (TPAMI)
, 2013
Abstract

Cited by 51 (7 self)
We benchmark several SVM objective functions for large-scale image classification. We consider one-vs-rest, multiclass, ranking, and weighted approximate ranking SVMs. A comparison of online and batch methods for optimizing the objectives shows that online methods perform as well as batch methods in terms of classification accuracy, but with a significant gain in training speed. Using stochastic gradient descent, we can scale the training to millions of images and thousands of classes. Our experimental evaluation shows that ranking-based algorithms do not outperform the one-vs-rest strategy when a large number of training examples are used. Furthermore, the gap in accuracy between the different algorithms shrinks as the dimension of the features increases. We also show that learning through cross-validation the optimal rebalancing of positive and negative examples can result in a significant improvement for the one-vs-rest strategy. Finally, early stopping can be used as an effective regularization strategy when training with online algorithms. Following these “good practices”, we were able to improve the state-of-the-art on a large subset of 10K classes and 9M images of ImageNet from 16.7% Top-1 accuracy to 19.1%.
Efficient reduction for imitation learning
 In AISTATS
, 2010
Abstract

Cited by 42 (8 self)
Imitation Learning, while applied successfully on many large real-world problems, is typically addressed as a standard supervised learning problem, where it is assumed the training and testing data are i.i.d. This is not true in imitation learning as the learned policy influences the future test inputs (states) upon which it will be tested. We show that this leads to compounding errors and a regret bound that grows quadratically in the time horizon of the task. We propose two alternative algorithms for imitation learning where training occurs over several episodes of interaction. These two approaches share in common that the learner’s policy is slowly modified from executing the expert’s policy to the learned policy. We show that this leads to stronger performance guarantees and demonstrate the improved performance on two challenging problems: training a learner to play 1) a 3D racing game (Super Tux Kart) and 2) Mario Bros.; given input images from the games and corresponding actions taken by a human expert and near-optimal planner respectively.
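The quadratic growth mentioned in the abstract is commonly written as follows. The notation is an assumption and does not appear in the abstract: per-step costs lie in [0, 1], T is the time horizon, J(π) is the expected total cost of policy π over T steps, π* is the expert's policy, and ε is the probability that the learned policy π̂ disagrees with the expert on states drawn from the expert's own state distribution.

```latex
J(\hat{\pi}) \;\le\; J(\pi^{*}) + T^{2}\,\epsilon
```

Informally, a first mistake at any of the T steps happens with probability at most ε and can cost up to T over the remainder of the episode, which gives the T²ε term.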
Information, Divergence and Risk for Binary Experiments
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2009
Abstract

Cited by 37 (8 self)
We unify f-divergences, Bregman divergences, surrogate regret bounds, proper scoring rules, cost curves, ROC curves and statistical information. We do this by systematically studying integral and variational representations of these various objects and in so doing identify their primitives which all are related to cost-sensitive binary classification. As well as developing relationships between generative and discriminative views of learning, the new machinery leads to tight and more general surrogate regret bounds and generalised Pinsker inequalities relating f-divergences to variational divergence. The new viewpoint also illuminates existing algorithms: it provides a new derivation of Support Vector Machines in terms of divergences and relates Maximum Mean Discrepancy to Fisher Linear Discriminants.
Analysis of a Classification-based Policy Iteration Algorithm
Abstract

Cited by 31 (9 self)
We present a classification-based policy iteration algorithm, called Direct Policy Iteration, and provide its finite-sample analysis. Our results state a performance bound in terms of the number of policy improvement steps, the number of rollouts used in each iteration, the capacity of the considered policy space, and a new capacity measure which indicates how well the policy space can approximate policies that are greedy w.r.t. any of its members. The analysis reveals a trade-off between the estimation and approximation errors in this classification-based policy iteration setting. We also study the consistency of the method when there exists a sequence of policy spaces with increasing capacity.
Semi-Supervised Novelty Detection
, 2010
Abstract

Cited by 26 (1 self)
A common setting for novelty detection assumes that labeled examples from the nominal class are available, but that labeled examples of novelties are unavailable. The standard (inductive) approach is to declare novelties where the nominal density is low, which reduces the problem to density level set estimation. In this paper, we consider the setting where an unlabeled and possibly contaminated sample is also available at learning time. We argue that novelty detection in this semi-supervised setting is naturally solved by a general reduction to a binary classification problem. In particular, a detector with a desired false positive rate can be achieved through a reduction to Neyman-Pearson classification. Unlike the inductive approach, semi-supervised novelty detection (SSND) yields detectors that are optimal (e.g., statistically consistent) regardless of the distribution on novelties. Therefore, in novelty detection, unlabeled data have a substantial impact on the theoretical properties of the decision rule. We validate the practical utility of SSND with an extensive experimental study. We also show that SSND provides distribution-free, learning-theoretic solutions to two well-known problems in hypothesis testing. First, our results provide a general solution to the general two-sample problem, that is, the problem of determining whether two random samples arise from the same distribution. Second, a specialization of SSND coincides with the standard p-value approach to multiple testing under the so-called random effects model. Unlike standard rejection regions based on thresholded p-values, the general SSND framework allows for adaptation to arbitrary alternative distributions in multiple dimensions.
Regression level set estimation via cost-sensitive classification
 IEEE Trans. Signal Process
, 2007
Abstract

Cited by 14 (4 self)
Regression level set estimation is an important yet understudied learning task. It lies somewhere between regression function estimation and traditional binary classification, and in many cases is a more appropriate setting for questions posed in these more common frameworks. This note explains how estimating the level set of a regression function from training examples can be reduced to cost-sensitive classification. We discuss the theoretical and algorithmic benefits of this learning reduction, demonstrate several desirable properties of the associated risk, and report experimental results for histograms, support vector machines, and nearest neighbor rules on synthetic and real data. Index Terms: Cost-sensitive classification, learning reduction, regression level set estimation, supervised learning.
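To make the reduction concrete, here is a minimal sketch under stated assumptions; the abstract does not give the exact construction. It assumes each training pair (x, y) is turned into a binary label sign(y - gamma) with example-dependent cost |y - gamma|, and that a histogram rule on 1-D inputs takes a cost-weighted majority vote in each bin. The function name `histogram_level_set` and its parameters are hypothetical.

```python
from collections import defaultdict

def histogram_level_set(xs, ys, gamma, n_bins, lo=0.0, hi=1.0):
    """Estimate which bins of [lo, hi] lie in {x : E[Y | X = x] >= gamma}."""
    width = (hi - lo) / n_bins
    score = defaultdict(float)
    for x, y in zip(xs, ys):
        b = min(int((x - lo) / width), n_bins - 1)  # bin index, clamped at the top edge
        label = 1 if y >= gamma else -1             # assumed binary label
        cost = abs(y - gamma)                       # assumed example-dependent cost
        score[b] += label * cost                    # cost-weighted vote, equal to y - gamma
    # A bin joins the estimated level set when its cost-weighted vote is positive.
    return {b for b, s in score.items() if s > 0}
```

Because label times cost equals y - gamma, the vote in a bin is positive exactly when the bin's average response exceeds gamma, which is why the cost-weighted classifier recovers a level set of the regression function in this toy setting.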