Adaptive submodularity: Theory and applications in active learning and stochastic optimization
J. Artificial Intelligence Research, 2011
Cited by 70 (15 self)
Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications including management of sensing resources, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees and handle natural generalizations.
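The lazy-evaluation idea mentioned in this abstract can be illustrated on the simpler, non-adaptive case of a submodular set function. The sketch below is not the paper's adaptive algorithm: it assumes a plain set-coverage objective, and the names `coverage`, `lazy_greedy`, and the example sets are hypothetical. It relies only on the fact that, for a submodular function, a marginal gain computed earlier is an upper bound on the current gain, so most candidates never need re-evaluation.

```python
import heapq


def coverage(selected, sets):
    """Number of distinct elements covered by the chosen sets (monotone submodular)."""
    covered = set()
    for name in selected:
        covered |= sets[name]
    return len(covered)


def lazy_greedy(sets, k):
    """Greedily pick up to k sets, re-evaluating stale marginal gains only when needed.

    Illustrative sketch for a non-adaptive coverage objective (an assumption,
    not the adaptive setting of the paper).
    """
    selected = []
    current = 0
    evaluations = 0
    # Max-heap via negated gains. Each entry: (-gain, name, round in which gain was computed).
    heap = [(-len(items), name, 0) for name, items in sets.items()]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        neg_gain, name, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is up to date and tops the heap, so by submodularity
            # no other candidate can beat it.
            if -neg_gain <= 0:
                break
            selected.append(name)
            current += -neg_gain
        else:
            # Stale upper bound: recompute the true marginal gain and push it back.
            gain = coverage(selected + [name], sets) - current
            evaluations += 1
            heapq.heappush(heap, (-gain, name, len(selected)))
    return selected, current, evaluations


if __name__ == "__main__":
    example = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1}}
    print(lazy_greedy(example, 2))
```

In this toy run the candidate `d` is never re-evaluated, because its stale upper bound already rules it out; that saving is what lazy evaluation buys on larger ground sets.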
Active classification based on value of classifier
In NIPS, 2011
Cited by 26 (0 self)
Modern classification tasks usually involve many class labels and can be informed by a broad range of features. Many of these tasks are tackled by constructing a set of classifiers, which are then applied at test time and then pieced together in a fixed procedure determined in advance or at training time. We present an active classification process at test time, where each classifier in a large ensemble is viewed as a potential observation that might inform our classification process. Observations are then selected dynamically based on previous observations, using a value-theoretic computation that balances an estimate of the expected classification gain from each observation as well as its computational cost. The expected classification gain is computed using a probabilistic model that uses the outcome from previous observations. This active classification process is applied at test time for each individual test instance, resulting in an efficient instance-specific decision path. We demonstrate the benefit of the active scheme on various real-world datasets, and show that it can achieve comparable or even higher classification accuracy at a fraction of the computational costs of traditional methods.
A Utility-theoretic Approach to Privacy in Online Services
2010
Cited by 14 (2 self)
Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user’s demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy by users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can find a provably near-optimal optimization of the utility-privacy tradeoff in an efficient manner. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey, aimed at eliciting preferences about people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.
Robust Sensor Placements at Informative and Communication-Efficient Locations
2010
Cited by 10 (0 self)
When monitoring spatial phenomena with wireless sensor networks, selecting the best sensor placements is a fundamental task. Not only should the sensors be informative, but they should also be able to communicate efficiently. In this paper, we present a data-driven approach that addresses the three central aspects of this problem: measuring the predictive quality of a set of hypothetical sensor locations, predicting the communication cost involved with these placements, and designing an algorithm with provable quality guarantees that optimizes the NP-hard tradeoff. Specifically, we use data from a pilot deployment to build nonparametric probabilistic models called Gaussian Processes (GPs) both for the spatial phenomena of interest and for the spatial variability of link qualities, which allows us to estimate predictive power and communication cost of unsensed locations. Surprisingly, uncertainty in the representation of link qualities plays an important role in estimating communication costs. Using these models, we present a novel, polynomial-time, data-driven algorithm, PSPIEL, which selects Sensor Placements at Informative and communication-Efficient Locations. Our approach exploits two important properties of this problem: submodularity, formalizing the intuition that adding a node to a small deployment can help more than adding it to a large deployment; and locality, under which nodes that are far from each other provide almost independent information. Exploiting these properties, we prove strong approximation guarantees for our approach. We also show how our placements can be made robust against changes in the environment,
Multi-Task Active Learning with Output Constraints
Cited by 10 (1 self)
Many problems in information extraction, text mining, natural language processing and other fields exhibit the same property: multiple prediction tasks are related in the sense that their outputs (labels) satisfy certain constraints. In this paper, we propose an active learning framework exploiting such relations among tasks. Intuitively, with task outputs coupled by constraints, active learning can utilize not only the uncertainty of the prediction in a single task but also the inconsistency of predictions across tasks. We formalize this idea as a cross-task value-of-information criterion, in which the reward of a labeling assignment is propagated and measured over all relevant tasks reachable through constraints. A specific example of our framework leads to the cross entropy measure on the predictions of coupled tasks, which generalizes the entropy in the classical single-task uncertainty sampling. We conduct experiments on two real-world problems: web information extraction and document classification. Empirical results demonstrate the effectiveness of our framework in actively collecting labeled examples for multiple related tasks.
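For orientation, the classical single-task baseline that this abstract says it generalizes, entropy-based uncertainty sampling, can be sketched as follows. This is only the textbook baseline, not the paper's cross-task criterion; the function names and the example probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain(predictions):
    """Return the key of the unlabeled example whose prediction has maximum entropy.

    `predictions` maps an example id to its predicted class probabilities
    (a hypothetical input format chosen for this sketch).
    """
    return max(predictions, key=lambda example_id: entropy(predictions[example_id]))


if __name__ == "__main__":
    pool = {"x1": [0.9, 0.1], "x2": [0.5, 0.5], "x3": [0.7, 0.3]}
    print(most_uncertain(pool))
```

The learner would request a label for the returned example, retrain, and repeat; the multi-task setting described above replaces this per-task score with one that also accounts for constraints between tasks.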
Dynamic Processing Allocation in Video
Cited by 3 (0 self)
Large stores of digital video pose severe computational challenges to existing video analysis algorithms. In applying these algorithms, users must often trade off processing speed for accuracy, as many sophisticated and effective algorithms require large computational resources that make it impractical to apply them throughout long videos. One can save considerable effort by applying these expensive algorithms sparingly, directing their application using the results of more limited processing. We show how to do this for retrospective video analysis by modeling a video using a chain graphical model and performing inference both to analyze the video and to direct processing. To accomplish this, we develop a new algorithm to direct processing. This algorithm approximates the optimal solution efficiently. We apply our algorithm to problems in background subtraction and face detection and show in experiments that this leads to significant improvements over baseline algorithms.
Learning adaptive value of information for structured prediction
In NIPS, 2013
Cited by 3 (0 self)
Discriminative methods for learning structured models have enabled widespread use of very rich feature representations. However, the computational cost of feature extraction is prohibitive for large-scale or time-sensitive applications, often dominating the cost of inference in the models. Significant efforts have been devoted to sparsity-based model selection to decrease this cost. Such feature selection methods control computation statically and miss the opportunity to fine-tune feature extraction to each input at runtime. We address the key challenge of learning to control fine-grained feature extraction adaptively, exploiting non-homogeneity of the data. We propose an architecture that uses a rich feedback loop between extraction and prediction. The runtime control policy is learned using efficient value-function approximation, which adaptively determines the value of information of features at the level of individual variables for each input. We demonstrate significant speedups over state-of-the-art methods on two challenging datasets. For articulated pose estimation in video, we achieve a more accurate state-of-the-art model that is also faster, with similar results on an OCR task.
Adaptive Informative Path Planning in Metric Spaces
Cited by 3 (0 self)
In contrast to classic robot motion planning, informative path planning (IPP) seeks a path for a robot to sense the world and gain information. In adaptive IPP, the robot chooses the next location on the path using all information acquired so far. The goal is to minimize the robot’s travel cost required to identify a true hypothesis. Adaptive IPP is NP-hard. This paper presents Recursive Adaptive Identification (RAId), a new polynomial-time approximation algorithm for adaptive IPP. We prove a polylogarithmic approximation bound when the robot travels in a metric space. Furthermore, our experiments suggest that RAId is efficient in practice and provides good approximate solutions for several distinct robot planning tasks. Although RAId is designed primarily for noiseless observations, a simple extension allows it to handle some tasks with noisy observations.
Sensor Selection in High-Dimensional Gaussian Trees with Nuisances
Cited by 2 (2 self)
We consider the sensor selection problem on multivariate Gaussian distributions where only a subset of latent variables is of inferential interest. For pairs of vertices connected by a unique path in the graph, we show that there exist decompositions of nonlocal mutual information into local information measures that can be computed efficiently from the output of message passing algorithms. We integrate these decompositions into a computationally efficient greedy selector where the computational expense of quantification can be distributed across nodes in the network. Experimental results demonstrate the comparative efficiency of our algorithms for sensor selection in high-dimensional distributions. We additionally derive an online-computable performance bound based on augmentations of the relevant latent variable set that, when such a valid augmentation exists, is applicable for any distribution with nuisances.
Submodular Surrogates for Value of Information
Cited by 2 (2 self)
How should we gather information to make effective decisions? A classical answer to this fundamental problem is given by the decision-theoretic value of information. Unfortunately, optimizing this objective is intractable, and myopic (greedy) approximations are known to perform poorly. In this paper, we introduce DIRECT, an efficient yet near-optimal algorithm for nonmyopically optimizing value of information. Crucially, DIRECT uses a novel surrogate objective that is: (1) aligned with the value of information problem, (2) efficient to evaluate, and (3) adaptive submodular. This latter property enables us to utilize an efficient greedy optimization while providing strong approximation guarantees. We demonstrate the utility of our approach on four diverse case studies: touch-based robotic localization, comparison-based preference learning, wildlife conservation management, and preference elicitation in behavioral economics. In the first application, we demonstrate DIRECT in closed-loop on an actual robotic platform.