Results 1 - 10 of 326
Active learning of inverse models with intrinsically motivated goal exploration in robots
- Robotics and Autonomous Systems, 2013
"... ..."
(Show Context)
Active learning by labeling features
- In Proc. of EMNLP, 2009
"... Methods that learn from prior information about input features such as generalized expectation (GE) have been used to train accurate models with very little effort. In this paper, we propose an active learning approach in which the machine solicits “labels ” on features rather than instances. In bot ..."
Cited by 43 (11 self)
Methods that learn from prior information about input features, such as generalized expectation (GE), have been used to train accurate models with very little effort. In this paper, we propose an active learning approach in which the machine solicits “labels” on features rather than instances. In both simulated and real user experiments on two sequence labeling tasks we show that our active learning method outperforms passive learning with features as well as traditional active learning with instances. Preliminary experiments suggest that novel interfaces which intelligently solicit labels on multiple features facilitate more efficient annotation.
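To make the feature-querying idea concrete, the following is a minimal sketch of how a system might rank features to solicit labels on, preferring frequent features whose covering instances receive uncertain predictions. This is an illustrative heuristic, not the paper's GE-based criterion; the function names and the uncertainty-times-coverage score are assumptions.

```python
import numpy as np

def select_feature_queries(X, proba, k=5, min_count=10):
    """Rank features for user labeling: prefer frequent features whose
    covering instances get uncertain (high-entropy) predictions.
    X: (n, d) binary feature matrix; proba: (n, c) model probabilities."""
    counts = (X > 0).sum(axis=0)
    scores = np.full(X.shape[1], -np.inf)
    for j in range(X.shape[1]):
        if counts[j] < min_count:
            continue                              # too rare to be worth a query
        p = proba[X[:, j] > 0].mean(axis=0)       # avg class dist. on covered rows
        entropy = -(p * np.log(p + 1e-12)).sum()
        scores[j] = entropy * np.log(counts[j])   # uncertainty x coverage
    return np.argsort(scores)[::-1][:k]           # top-k features to show the user
```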
Active Learning for Reward Estimation in Inverse Reinforcement Learning
2009
"... Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, we introduce active learning for inverse reinforcement learning. We propose an algorithm that allows the agent to query the demonst ..."
Cited by 41 (14 self)
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, we introduce active learning for inverse reinforcement learning. We propose an algorithm that allows the agent to query the demonstrator for samples at specific states, instead of relying only on samples provided at “arbitrary” states. The purpose of our algorithm is to estimate the reward function with accuracy similar to that of other methods from the literature while reducing the number of policy samples required from the expert. We also discuss the use of our algorithm in higher-dimensional problems, using both Monte Carlo and gradient methods. We present illustrative results of our algorithm in several simulated examples of varying complexity.
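As a rough illustration of where such queries pay off, the sketch below scores states by how much an ensemble of reward hypotheses disagrees about the optimal action there, and queries the demonstrator at the most contested state. The `solve_policy` solver is a hypothetical placeholder, and the disagreement-entropy criterion is an illustrative stand-in for the paper's estimation procedure.

```python
import numpy as np

def pick_query_state(reward_samples, solve_policy, n_states):
    """Query the demonstrator where sampled reward functions disagree most.
    reward_samples: reward vectors drawn from the current posterior;
    solve_policy(r): hypothetical MDP solver returning one action per state."""
    policies = np.array([solve_policy(r) for r in reward_samples])
    disagreement = np.zeros(n_states)
    for s in range(n_states):
        _, cnt = np.unique(policies[:, s], return_counts=True)
        p = cnt / cnt.sum()
        disagreement[s] = -(p * np.log(p)).sum()  # entropy over chosen actions
    return int(np.argmax(disagreement))           # most informative state
```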
Active Learning for Networked Data
"... We introduce a novel active learning algorithm for classification of network data. In this setting, training instances are connected by a set of links to form a network, the labels of linked nodes are correlated, and the goal is to exploit these dependencies and accurately label the nodes. This prob ..."
Cited by 40 (3 self)
We introduce a novel active learning algorithm for classification of network data. In this setting, training instances are connected by a set of links to form a network, the labels of linked nodes are correlated, and the goal is to exploit these dependencies to accurately label the nodes. This problem arises in many domains, including social and biological network analysis and document classification, and there has been much recent interest in methods that collectively classify the nodes in the network. While labeled examples are often expensive, network information is usually available. We show how an active learning algorithm can take advantage of network structure. Our algorithm effectively exploits the links between instances and the interaction between the local and collective aspects of a classifier to improve the accuracy of learning from fewer labeled examples. We experiment with two real-world benchmark collective classification domains, and show that we are able to achieve extremely accurate results even when only a small fraction of the data is labeled.
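The following sketch shows one simple way network structure can drive query selection: propagate beliefs over the graph, then query the unlabeled node whose belief is most uncertain, weighted by degree. This is a generic label-propagation heuristic for illustration, not the paper's algorithm.

```python
import numpy as np

def propagate_labels(A, y, labeled, n_iter=50):
    """Plain label propagation: unlabeled nodes repeatedly average their
    neighbors' beliefs; labeled nodes stay clamped.
    A: (n, n) adjacency; y: (n, c) one-hot labels; labeled: boolean mask."""
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)
    F = y.astype(float).copy()
    for _ in range(n_iter):
        F = P @ F
        F[labeled] = y[labeled]                   # clamp known labels
    return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)

def next_node_to_label(A, y, labeled):
    """Query the unlabeled node with the most uncertain propagated belief,
    degree-weighted as a crude proxy for influence on the network."""
    F = propagate_labels(A, y, labeled)
    entropy = -(F * np.log(F + 1e-12)).sum(axis=1)
    score = entropy * np.log1p(A.sum(axis=1))
    score[labeled] = -np.inf
    return int(np.argmax(score))
```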
Co-Training for Domain Adaptation
"... Domain adaptation algorithms seek to generalize a model trained in a source domain to a new target domain. In many practical cases, the source and target distributions can differ substantially, and in some cases crucial target features may not have support in the source domain. In this paper we intr ..."
Cited by 34 (4 self)
Domain adaptation algorithms seek to generalize a model trained in a source domain to a new target domain. In many practical cases, the source and target distributions can differ substantially, and in some cases crucial target features may not have support in the source domain. In this paper we introduce an algorithm that bridges the gap between source and target domains by slowly adding to the training set both the target features and instances in which the current algorithm is the most confident. Our algorithm is a variant of co-training [7], and we name it CODA (Co-training for Domain Adaptation). Unlike the original co-training work, we do not assume a particular feature split. Instead, for each iteration of co-training, we formulate a single optimization problem which simultaneously learns a target predictor, a split of the feature space into views, and a subset of source and target features to include in the predictor. CODA significantly outperforms the state of the art on the 12-domain benchmark data set of Blitzer et al. [4]. Indeed, over a wide range of target supervision (65 of 84 comparisons) CODA achieves the best performance.
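The core loop is easiest to see as self-training: at each round, fit on the current training set, then absorb the target instances the model is most confident about. The sketch below shows only that skeleton; CODA's distinguishing pieces, learning the split into views and selecting which features to include, are omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def coda_like_self_training(Xs, ys, Xt, rounds=10, per_round=50):
    """Simplified CODA-style loop: gradually add the most confidently
    predicted target instances (with their predicted labels) to training."""
    X, y = Xs.copy(), ys.copy()
    pool = np.arange(len(Xt))                     # unlabeled target indices
    clf = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        if len(pool) == 0:
            break
        clf.fit(X, y)
        proba = clf.predict_proba(Xt[pool])
        take = np.argsort(proba.max(axis=1))[::-1][:per_round]
        X = np.vstack([X, Xt[pool[take]]])        # absorb confident targets
        y = np.concatenate([y, proba[take].argmax(axis=1)])
        pool = np.delete(pool, take)
    return clf
```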
Active co-analysis of a set of shapes
- ACM Trans. on Graphics (SIGGRAPH Asia), 2012
"... Figure 1: Overview of our active co-analysis: (a) We start with an initial unsupervised co-segmentation of the input set. (b) During active learning, the system automatically suggests constraints which would refine results and the user interactively adds constraints as appropriate. In this example, ..."
Cited by 33 (9 self)
Figure 1 (overview of the active co-analysis pipeline): (a) we start with an initial unsupervised co-segmentation of the input set; (b) during active learning, the system automatically suggests constraints which would refine the results, and the user interactively adds constraints as appropriate; in this example, the user adds a cannot-link constraint (in red) and a must-link constraint (in blue) between segments; (c) the constraints are propagated to the set and the co-segmentation is refined; the process from (b) to (c) is repeated until the desired result is obtained.

Unsupervised co-analysis of a set of shapes is a difficult problem since the geometry of the shapes alone cannot always fully describe the semantics of the shape parts. In this paper, we propose a semi-supervised learning method where the user actively assists in the co-analysis by iteratively providing inputs that progressively constrain the system. We introduce a novel constrained clustering method based on a spring system which embeds elements to better respect their inter-distances in feature space together with the user-given set of constraints. We also present an active learning method that suggests to the user where his input is likely to be the most effective in refining the results. We show that each single pair of constraints affects many relations across the set. Thus, the method requires only a sparse set of constraints to quickly converge toward a consistent and error-free semantic labeling of the set.
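For context, must-link and cannot-link constraints are typically enforced in clustering along the lines sketched below (a COP-KMeans-style feasibility check, not the paper's spring-system embedding):

```python
import numpy as np

def violates(i, cluster, assign, must_link, cannot_link):
    """True if placing item i in `cluster` breaks a pairwise constraint,
    given the current partial assignment (-1 = unassigned)."""
    for a, b in must_link:
        j = b if a == i else (a if b == i else None)
        if j is not None and assign[j] not in (-1, cluster):
            return True
    for a, b in cannot_link:
        j = b if a == i else (a if b == i else None)
        if j is not None and assign[j] == cluster:
            return True
    return False

def constrained_assign(dists, must_link, cannot_link):
    """Greedy assignment step: each item takes its nearest cluster center
    that respects all constraints. dists: (n, k) item-to-center distances."""
    assign = np.full(dists.shape[0], -1)
    for i in np.argsort(dists.min(axis=1)):      # most confident items first
        for c in np.argsort(dists[i]):
            if not violates(i, c, assign, must_link, cannot_link):
                assign[i] = c
                break
    return assign
```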
Pairwise Ranking Aggregation in a Crowdsourced Setting
"... Inferring rankings over elements of a set of objects, such as documents or images, is a key learning problem for such important applications as Web search and recommender systems. Crowdsourcing services provide an inexpensive and efficient means to acquire preferences over objects via labeling by se ..."
Cited by 28 (1 self)
Inferring rankings over elements of a set of objects, such as documents or images, is a key learning problem for important applications such as Web search and recommender systems. Crowdsourcing services provide an inexpensive and efficient means to acquire preferences over objects via labeling by sets of annotators. We propose a new model to predict a gold-standard ranking that hinges on combining pairwise comparisons via crowdsourcing. In contrast to traditional ranking aggregation methods, the approach learns and accounts for the quality of each annotator's contributions. In addition, we minimize the cost of assessment by introducing a generalization of the traditional active learning scenario that jointly selects the annotator and pair to assess, taking into account the annotator quality, the uncertainty over the ordering of the pair, and the current model uncertainty. We formalize this as an active learning strategy that incorporates an exploration-exploitation tradeoff and implement it using an efficient online Bayesian updating scheme. Using simulated and real-world data, we demonstrate that the active learning strategy achieves significant reductions in labeling cost while maintaining accuracy.
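A stripped-down version of the modeling idea is a Bradley-Terry model with a per-annotator reliability term. The sketch below does maximum-likelihood online updates rather than the paper's Bayesian scheme; the noisy-vote likelihood and the learning rates are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def crowd_bt_update(theta, q, i, j, worker, y, lr=0.05):
    """One online update for Bradley-Terry scores with annotator quality.
    theta: item scores; q: per-annotator reliability in (0.5, 1);
    y = 1 if `worker` reported item i beating item j, else 0.
    Noisy-vote likelihood: P(y=1) = q * sigma(theta_i - theta_j) + (1-q)/2."""
    s = sigmoid(theta[i] - theta[j])
    p = q[worker] * s + (1.0 - q[worker]) * 0.5
    dl_dp = (y - p) / (p * (1.0 - p) + 1e-12)    # d log-likelihood / d p
    g = dl_dp * q[worker] * s * (1.0 - s)         # chain rule to theta_i
    theta[i] += lr * g
    theta[j] -= lr * g                            # antisymmetric in i, j
    q[worker] = np.clip(q[worker] + lr * dl_dp * (s - 0.5), 0.51, 0.99)
    return theta, q
```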
Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning
"... We consider the problem of retrieving the database points nearest to a given hyperplane query without exhaustively scanning the database. We propose two hashingbased solutions. Our first approach maps the data to two-bit binary keys that are locality-sensitive for the angle between the hyperplane no ..."
Cited by 27 (3 self)
We consider the problem of retrieving the database points nearest to a given hyperplane query without exhaustively scanning the database. We propose two hashing-based solutions. Our first approach maps the data to two-bit binary keys that are locality-sensitive for the angle between the hyperplane normal and a database point. Our second approach embeds the data into a vector space where the Euclidean norm reflects the desired distance between the original points and the hyperplane query. Both use hashing to retrieve near points in sub-linear time. Our first method’s preprocessing stage is more efficient, while the second has stronger accuracy guarantees. We apply both to pool-based active learning: taking the current hyperplane classifier as a query, our algorithm identifies those points (approximately) satisfying the well-known minimal distance-to-hyperplane selection criterion. We empirically demonstrate our methods’ tradeoffs, and show that they make it practical to perform active selection with millions of unlabeled points.
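The first scheme's two-bit keys can be sketched as follows; the sign-flip on the query side is what makes points near the hyperplane (roughly orthogonal to its normal) likely to collide with the query. Details such as how bit pairs are combined into hash tables are simplified assumptions here.

```python
import numpy as np

def make_directions(dim, n_pairs, seed=0):
    """Random Gaussian directions for the two-bit angle-sensitive keys."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_pairs, dim)), rng.standard_normal((n_pairs, dim))

def key_for_point(x, U, V):
    # database point x: bits [sign(u.x), sign(v.x)] for each pair (u, v)
    return np.concatenate([U @ x >= 0, V @ x >= 0])

def key_for_hyperplane(w, U, V):
    # hyperplane query with normal w: the second half is sign-flipped, so
    # points nearly orthogonal to w tend to land in the query's bucket
    return np.concatenate([U @ w >= 0, V @ w < 0])
```

At index time, database points are bucketed by `key_for_point`; at query time, candidates from the bucket of `key_for_hyperplane(w)` (and nearby codes) are re-ranked by their exact distance |w·x|.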
Defining (Human) Computation
"... Human computation is a term that has been used synonymously with other related concepts, including ―crowdsourcing, ‖ ―social computing, ‖ and ―collective intelligence. ‖ Defining more precisely what human computation means will help to distinguish its research focus from other subfields, and isolate ..."
Cited by 23 (0 self)
Human computation is a term that has been used synonymously with other related concepts, including “crowdsourcing,” “social computing,” and “collective intelligence.” Defining more precisely what human computation means will help to distinguish its research focus from other subfields, and isolate a set of fundamental research questions to pursue. In this position paper, we propose a definition of human computation that is grounded in familiar computer science concepts, such as computation and algorithm. Based on our proposed definition, we then outline the three main aspects of a human computation system and the key research questions associated with each aspect.
A unifying framework for computational reinforcement learning theory
2009
"... Computational learning theory studies mathematical models that allow one to formally analyze and compare the performance of supervised-learning algorithms such as their sample complexity. While existing models such as PAC (Probably Approximately Correct) have played an influential role in understand ..."
Cited by 23 (7 self)
Computational learning theory studies mathematical models that allow one to formally analyze and compare the performance of supervised-learning algorithms, for example in terms of their sample complexity. While existing models such as PAC (Probably Approximately Correct) learning have played an influential role in understanding the nature of supervised learning, they have not been as successful in reinforcement learning (RL). Here, the fundamental barrier is the need for active exploration in sequential decision problems. An RL agent tries to maximize long-term utility by exploiting its knowledge about the problem, but this knowledge has to be acquired by the agent itself through exploration that may reduce short-term utility. The need for active exploration is common in many problems in daily life, engineering, and science. For example, a Backgammon program strives to take good moves to maximize the probability of winning a game, but sometimes it may try novel and possibly harmful moves to discover how the opponent reacts, in the hope of discovering a better game-playing strategy. It has been known since the early days of RL that a good tradeoff between exploration and exploitation is critical for the agent to learn fast (i.e., to reach near-optimal strategies …
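The simplest concrete instance of the exploration-exploitation tradeoff described here is the stochastic multi-armed bandit, for which UCB1 is the textbook strategy; the sketch below is standard material, not the paper's framework.

```python
import numpy as np

def ucb1(pull, n_arms, horizon):
    """UCB1: optimism in the face of uncertainty drives exploration.
    `pull(a)` returns a stochastic reward in [0, 1] for arm a."""
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        if t <= n_arms:
            a = t - 1                              # play each arm once
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts)
            a = int(np.argmax(means + bonus))      # exploit + explore bonus
        r = pull(a)
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]     # running average
    return means, counts
```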