Results 1 - 10 of 342
Active learning literature survey
, 2010
Cited by 326 (1 self)
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which it learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion
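As an illustration of what a query strategy can look like, the sketch below implements uncertainty sampling for a binary classifier in a pool-based setting: the learner asks the oracle to label the instance whose predicted probability is closest to 0.5. The names `least_confident`, `toy_model`, and `unlabeled_pool` are hypothetical and chosen only for this example; the abstract itself does not prescribe any particular strategy.

```python
def least_confident(pool, predict_proba):
    """Return the pool item whose positive-class probability is closest to 0.5."""
    return min(pool, key=lambda x: abs(predict_proba(x) - 0.5))


def toy_model(x):
    """Hypothetical 1-D model: positive-class probability grows linearly with x."""
    return min(1.0, max(0.0, x / 10.0))


# Unlabeled instances the learner may choose from (made-up values).
unlabeled_pool = [1.0, 4.5, 9.0, 6.5]

# The instance the learner would send to the oracle for labeling.
query = least_confident(unlabeled_pool, toy_model)
```

In a full loop, the queried label would be added to the training set and the model refit before the next query is chosen.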
Near-optimal sensor placements: Maximizing information while minimizing communication cost
 In IPSN
, 2006
Cited by 152 (19 self)
When monitoring spatial phenomena with wireless sensor networks, selecting the best sensor placements is a fundamental task. Not only should the sensors be informative, but they should also be able to communicate efficiently. In this paper, we present a data-driven approach that addresses the three central aspects of this problem: measuring the predictive quality of a set of sensor locations (regardless of whether sensors were ever placed at these locations), predicting the communication cost involved with these placements, and designing an algorithm with provable quality guarantees that optimizes the NP-hard tradeoff. Specifically, we use data from a pilot deployment to build nonparametric probabilistic models called Gaussian Processes (GPs) both for the spatial phenomena of interest and for the spatial variability of link qualities, which allows us to estimate predictive power and communication cost of unsensed locations. Surprisingly, uncertainty in the representation of link qualities plays an important role in estimating communication costs. Using these models, we present a novel, polynomial-time, data-driven algorithm, pSPIEL, which selects Sensor Placements at Informative and cost-Effective Locations. Our approach exploits two important properties of this problem: submodularity, formalizing the intuition that adding a node to a small deployment can help more than adding a node to a large deployment; and locality, under which nodes that are far from each other provide almost independent information. Exploiting these properties, we prove strong approximation guarantees for our pSPIEL approach. We also provide extensive experimental validation of this practical approach on several real-world placement problems, and build a complete system implementation on 46 Tmote Sky motes, demonstrating significant advantages over existing methods.
Near-optimal nonmyopic value of information in graphical models
 IN ANNUAL CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE
, 2005
Cited by 146 (25 self)
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant factor (1 − 1/e − ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial-time algorithm can provide a constant factor approximation better than (1 − 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
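The (1 − 1/e) factor quoted above is the guarantee of the classic greedy rule for maximizing a monotone submodular function under a cardinality constraint. The sketch below shows that greedy rule, not the randomized algorithm of the paper, and it uses a toy set-coverage objective as a stand-in for information gain. The names `coverage`, `greedy_select`, and `regions` are hypothetical.

```python
def coverage(selected, regions):
    """Number of distinct elements covered by the selected sensors."""
    covered = set()
    for s in selected:
        covered |= regions[s]
    return len(covered)


def greedy_select(regions, k):
    """Pick up to k sensors, each time adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        candidates = [s for s in regions if s not in selected]
        if not candidates:
            break
        base = coverage(selected, regions)
        best = max(candidates, key=lambda s: coverage(selected + [s], regions) - base)
        selected.append(best)
    return selected


# Made-up example: each sensor observes a set of numbered locations.
regions = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 6},
}
chosen = greedy_select(regions, 2)
```

Here the first pick is the sensor covering the most locations, and the second pick is the one that adds the most locations not yet covered.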
Maximizing non-monotone submodular functions
 IN PROCEEDINGS OF 48TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS)
, 2007
Cited by 146 (18 self)
Submodular maximization generalizes many important problems including Max Cut in directed/undirected graphs and hypergraphs, certain constraint satisfaction problems and maximum facility location problems. Unlike the problem of minimizing submodular functions, the problem of maximizing submodular functions is NP-hard. In this paper, we design the first constant-factor approximation algorithms for maximizing non-negative submodular functions. In particular, we give a deterministic local search 1 2approximation and a randomizedapproximation algo
Sensor Selection via Convex Optimization
 IEEE Transactions on Signal Processing
, 2009
Cited by 96 (2 self)
We consider the problem of choosing a set of k sensor measurements, from a set of m possible or potential sensor measurements, that minimizes the error in estimating some parameters. Solving this problem by evaluating the performance for each of the $\binom{m}{k}$ possible choices of sensor measurements is not practical unless m and k are small. In this paper we describe a heuristic, based on convex optimization, for approximately solving this problem. Our heuristic gives a subset selection as well as a bound on the best performance that can be achieved by any selection of k sensor measurements. There is no guarantee that the gap between the performance of the chosen subset and the performance bound is always small; but numerical experiments suggest that the gap is small in many cases. Our heuristic method requires on the order of $m^3$ operations; for m = 1000 possible sensors, we can carry out sensor selection in a few seconds on a 2 GHz personal computer.
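To see why exhaustive evaluation is impractical, one can simply count the candidate subsets. The sketch below does this with the standard-library `math.comb`. The value m = 1000 comes from the abstract, while k = 10 and the helper name `num_subsets` are assumptions made only for illustration.

```python
import math


def num_subsets(m, k):
    """Number of ways to choose k sensor measurements out of m candidates."""
    return math.comb(m, k)


small = num_subsets(10, 3)     # small enough to enumerate directly
large = num_subsets(1000, 10)  # m from the abstract, k assumed for illustration

# Cubic operation count the abstract quotes for the heuristic at m = 1000.
heuristic_ops = 1000 ** 3
```

Comparing `large` with `heuristic_ops` shows the gap between enumerating every subset and running a method whose cost grows as the cube of m.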
A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning
, 2010
Cited by 91 (11 self)
We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments (active user modelling with preferences, and hierarchical reinforcement learning), and a discussion of the pros and cons of Bayesian optimization based on our experiences.
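One common utility for choosing the next observation is expected improvement, which can be computed in closed form when the posterior at a point is Gaussian with mean `mu` and standard deviation `sigma`. The sketch below is a minimal version for maximization. The abstract does not say which utility is used, so the choice of expected improvement, the function names, and the candidate values are all assumptions made for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Expected amount by which a Gaussian posterior exceeds the current best value."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical candidates: name -> (posterior mean, posterior standard deviation).
candidates = {
    "exploit": (1.05, 0.05),  # slightly better mean, little uncertainty
    "explore": (0.80, 0.60),  # worse mean, large uncertainty
    "poor": (0.20, 0.05),     # clearly worse and well understood
}
best_so_far = 1.0

scores = {
    name: expected_improvement(mu, sigma, best_so_far)
    for name, (mu, sigma) in candidates.items()
}
next_point = max(scores, key=scores.get)
```

With these made-up numbers the uncertain candidate scores highest, which shows how the utility trades off exploration against exploitation.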
Near-optimal observation selection using submodular functions
 In AAAI Nectar
, 2007
Cited by 90 (13 self)
AI problems such as autonomous robotic exploration, automatic diagnosis and activity recognition have in common the need for choosing among a set of informative but possibly expensive observations. When monitoring spatial phenomena with sensor networks or mobile robots, for example, we need to decide which locations to observe in order to most effectively decrease the uncertainty, at minimum cost. These problems usually are NP-hard. Many observation selection objectives satisfy submodularity, an intuitive diminishing returns property – adding a sensor to a small deployment helps more than adding it to a large deployment. In this paper, we survey recent advances in systematically exploiting this submodularity property to efficiently achieve near-optimal observation selections, under complex constraints. We illustrate the effectiveness of our approaches on problems of monitoring environmental phenomena and water distribution networks.
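The diminishing returns property can be checked by brute force on a small ground set: for every pair of deployments A ⊆ B and every sensor x outside B, the gain from adding x to A must be at least the gain from adding it to B. The sketch below runs this check on a toy coverage objective and on a deliberately non-submodular one. All names and data are hypothetical, and the exhaustive check is only feasible for very small sets.

```python
from itertools import combinations


def marginal_gain(f, subset, item):
    """Increase in f when item is added to subset."""
    return f(subset | {item}) - f(subset)


def is_submodular(f, ground):
    """Exhaustively test the diminishing returns inequality on all nested pairs."""
    items = list(ground)
    subsets = [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]
    for small in subsets:
        for large in subsets:
            if small <= large:
                for x in ground - large:
                    if marginal_gain(f, small, x) < marginal_gain(f, large, x):
                        return False
    return True


# Toy deployment: each sensor observes a set of numbered locations.
regions = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5}}


def covered_locations(selected):
    """Coverage objective: number of distinct locations observed."""
    return len(set().union(*(regions[s] for s in selected)))


def squared_size(selected):
    """A counterexample objective whose gains grow with deployment size."""
    return len(selected) ** 2


ground = set(regions)
coverage_ok = is_submodular(covered_locations, ground)
squared_ok = is_submodular(squared_size, ground)
```

The coverage objective passes the check, while the squared-size objective fails it because each added sensor is worth more in a larger deployment.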
Efficient planning of informative paths for multiple robots
 In IJCAI
, 2007
Cited by 62 (15 self)
In many sensing applications, including environmental monitoring, measurement systems must cover a large space with only limited sensing resources. One approach to achieve required sensing coverage is to use robots to convey sensors within this space. Planning the motion of these robots – coordinating their paths in order to maximize the amount of information collected while placing bounds on their resources (e.g., path length or energy capacity) – is an NP-hard problem. In this paper, we present an efficient path planning algorithm that coordinates multiple robots, each having a resource constraint, to maximize the “informativeness” of their visited locations. In particular, we use a Gaussian Process to model the underlying phenomenon, and use the mutual information between the visited locations and remainder of the space to characterize the amount of information collected. We provide strong theoretical approximation guarantees for our algorithm by exploiting the submodularity property of mutual information. In addition, we improve the efficiency of our approach by extending the algorithm using branch and bound and a region-based decomposition of the space. We provide an extensive empirical analysis of our algorithm, comparing with existing heuristics on datasets from several real-world sensing applications.
Active learning via transductive experimental design
 In Machine Learning, Proceedings of the Twenty-Third International Conference (ICML)
, 2006
Cited by 59 (2 self)
This paper considers the problem of selecting the most informative experiments x to get measurements y for learning a regression model y = f(x). We propose a novel and simple concept for active learning, transductive experimental design, that explores available unmeasured experiments (i.e., unlabeled data) and has a better scalability in comparison with classic experimental design methods. Our in-depth analysis shows that the new method tends to favor experiments that are on the one side hard-to-predict and on the other side representative for the rest of the experiments. Efficient optimization of the new design problem is achieved through alternating optimization and sequential greedy search. Extensive experimental results on synthetic problems and three real-world tasks, including questionnaire design for preference learning, active learning for text categorization, and spatial sensor placement, highlight the advantages of the proposed approaches.