Active learning literature survey
, 2010
Cited by 311 (1 self)
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which it learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion
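As a hedged illustration of how letting the learner choose its queries can save labels, the sketch below assumes a noise-free one-dimensional threshold classifier and a hypothetical `oracle` callable that returns 0 or 1. Neither assumption comes from the survey itself; the function name `active_threshold` is also illustrative.

```python
def active_threshold(pool, oracle):
    """Locate the label boundary in a pool of 1-D points by binary search.

    Assumes (hypothetically) that labels are noise-free and monotone:
    every point below some unknown threshold is 0, every other point is 1.
    Returns (index of the first positive point in the sorted pool,
             number of label queries sent to the oracle).
    """
    points = sorted(pool)
    lo, hi = 0, len(points)  # invariant: points[:lo] are 0, points[hi:] are 1
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1                  # one label requested from the oracle
        if oracle(points[mid]) == 1:
            hi = mid                  # boundary is at mid or to its left
        else:
            lo = mid + 1              # boundary is strictly to the right
    return lo, queries
```

With a pool of 1,000 points this loop issues at most 10 label queries, whereas labeling the whole pool passively would take 1,000.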
A general agnostic active learning algorithm
, 2007
Cited by 113 (12 self)
We present a simple, agnostic active learning algorithm that works for any hypothesis class of bounded VC dimension, and any data distribution. Our algorithm extends a scheme of Cohn, Atlas, and Ladner to the agnostic setting, by (1) reformulating it using a reduction to supervised learning and (2) showing how to apply generalization bounds even for the non-i.i.d. samples that result from selective sampling. We provide a general characterization of the label complexity of our algorithm. This quantity is never more than the usual PAC sample complexity of supervised learning, and is exponentially smaller for some hypothesis classes and distributions. We also demonstrate improvements experimentally.
A bound on the label complexity of agnostic active learning
 In Proc. of the 24th international conference on Machine learning
, 2007
Cited by 97 (11 self)
We study the label complexity of pool-based active learning in the agnostic PAC model. Specifically, we derive general bounds on the number of label requests made by the A² algorithm proposed by Balcan, Beygelzimer & Langford (Balcan et al., 2006). This represents the first nontrivial general-purpose upper bound on label complexity in the agnostic PAC model.
Importance Weighted Active Learning
Cited by 94 (9 self)
We present a practical and statistically consistent scheme for actively learning binary classifiers under general loss functions. Our algorithm uses importance weighting to correct sampling bias, and by controlling the variance, we are able to give rigorous label complexity bounds for the learning process.
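The bias-correction idea can be sketched as follows, under stated assumptions: each example is queried independently with a known probability p > 0, and a queried example's loss is weighted by 1/p. How the paper chooses those probabilities is not shown here; `iw_loss` and `expected_iw_loss` are hypothetical helper names used only for illustration.

```python
from itertools import product


def iw_loss(losses, probs, queried):
    """Importance-weighted average loss.

    losses[i]  : loss on example i (only used if it was queried)
    probs[i]   : probability with which example i's label was requested
    queried[i] : 1 if the label was actually requested, else 0
    A queried example counts with weight 1 / probs[i]; others count zero.
    """
    n = len(losses)
    return sum(l / p for l, p, q in zip(losses, probs, queried) if q) / n


def expected_iw_loss(losses, probs):
    """Exact expectation of iw_loss over all 2^n query outcomes."""
    total = 0.0
    for queried in product([0, 1], repeat=len(losses)):
        outcome_prob = 1.0
        for p, q in zip(probs, queried):
            outcome_prob *= p if q else (1.0 - p)
        total += outcome_prob * iw_loss(losses, probs, queried)
    return total
```

Enumerating every query outcome shows that the weighted estimate matches the plain average loss in expectation, whatever the query probabilities are. Small probabilities inflate the weights 1/p, which is why the variance has to be controlled.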
Minimax bounds for active learning
 In COLT
, 2007
Cited by 87 (10 self)
This paper aims to shed light on achievable limits in active learning. Using minimax analysis techniques, we study the achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions. The results clearly indicate the conditions under which one can expect significant gains through active learning. Furthermore we show that the learning rates derived are tight for “boundary fragment” classes in d-dimensional feature spaces when the feature marginal density is bounded from above and below.
Analysis of perceptron-based active learning
 In COLT
, 2005
Cited by 82 (10 self)
We start by showing that in an active learning setting, the Perceptron algorithm needs Ω(1/ε²) labels to learn linear separators within generalization error ε. We then present a simple selective sampling algorithm for this problem, which combines a modification of the perceptron update with an adaptive filtering rule for deciding which points to query. For data distributed uniformly over the unit sphere, we show that our algorithm reaches generalization error ε after asking for just Õ(d log(1/ε)) labels. This exponential improvement over the usual sample complexity of supervised learning has previously been demonstrated only for the computationally more complex query-by-committee algorithm.
1 Introduction
In many machine learning applications, unlabeled data is abundant but labeling is expensive. This distinction is not captured in the standard PAC or online models of supervised learning, and has motivated the field of active learning, in which the labels of data points are initially hidden, and the learner must pay for each label it wishes revealed. If query points are chosen randomly, the number of labels needed to reach a target generalization error ε, at a target confidence level 1 − δ, is similar to the sample complexity of supervised learning. The hope is that there are alternative querying strategies which require significantly fewer
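A minimal sketch of margin-filtered selective sampling with a perceptron-style learner follows. It rests on assumptions that the abstract does not spell out: unit-norm inputs, a reflection-style update on mistakes, and a fixed query threshold standing in for the adaptive filtering rule. All function names are illustrative, so this should not be read as the paper's exact algorithm.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def selective_perceptron(stream, oracle, v, threshold):
    """Request a label only when |v . x| <= threshold (the filter).

    On a mistake, reflect v: v <- v - 2 (v . x) x. For unit-norm x this
    keeps ||v|| = 1. The fixed threshold is a simplifying assumption.
    Returns (final hypothesis, number of labels requested).
    """
    labels_used = 0
    for x in stream:
        margin = dot(v, x)
        if abs(margin) > threshold:
            continue                      # confident prediction: skip the label
        labels_used += 1
        y = oracle(x)                     # label in {-1, +1}
        prediction = 1 if margin >= 0 else -1
        if prediction != y:
            v = [vi - 2.0 * margin * xi for vi, xi in zip(v, x)]
    return v, labels_used
```

With labels given by the sign of u . x for some target u, each reflection on a mistake cannot decrease v . u, so the hypothesis never moves away from the target, while points far from the current decision boundary cost no labels.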
Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)
Cited by 75 (15 self)
The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence including autonomous development, science, art, music, and humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but non-optimal implementations (1991, 1995, 1997–2002) are reviewed, as well as several recent variants by others (2005). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
Hierarchical sampling for active learning
 Proceedings of the 25th International Conference on Machine learning
, 2008
Cited by 71 (5 self)
We present an active learning scheme that exploits cluster structure in data.
The True Sample Complexity of Active Learning
Cited by 64 (17 self)
We describe and explore a new perspective on the sample complexity of active learning. In many situations where it was generally believed that active learning does not help, we find that active learning does help in the limit, often with exponential improvements in sample complexity. This contrasts with the traditional analysis of active learning problems such as non-homogeneous linear separators or depth-limited decision trees, in which Ω(1/ε) lower bounds are common; we point out that such results must be interpreted carefully, and that finding an ε-good classifier can always be accomplished with a number of samples asymptotically smaller than any such bound. These new insights arise from a subtle variation on the traditional definition of sample complexity, not previously recognized in the active learning literature.
Margin based active learning
 Proc. of the 20th Conference on Learning Theory
, 2007
Cited by 56 (10 self)
We present a framework for margin based active learning of linear separators. We instantiate it for a few important cases, some of which have been previously considered in the literature. We analyze the effectiveness of our framework both in the realizable case and in a specific noisy setting related to the Tsybakov small noise condition.