Results 1–10 of 9,860
Adaptive submodular maximization in bandit setting
In Advances in Neural Information Processing Systems 26, 2013
Cited by 6 (3 self)
"... Maximization of submodular functions has wide applications in machine learning and artificial intelligence. Adaptive submodular maximization has been traditionally studied under the assumption that the model of the world, the expected gain of choosing an item given previously selected items and their states, is known. In this paper, we study the setting where the expected gain is initially unknown, and it is learned by interacting repeatedly with the optimized function. We propose an efficient algorithm for solving our problem and prove that its expected cumulative regret increases logarithmically ..."
The Nonstochastic Multiarmed Bandit Problem
SIAM Journal on Computing, 2002
Cited by 490 (34 self)
"... In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the tradeoff between exploration (trying out ..."
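The adversarial (nonstochastic) setting this paper studies is commonly handled by the Exp3 family of algorithms, which weight arms by exponentiated importance-weighted reward estimates. A minimal sketch, not the paper's exact variant, with hypothetical parameter choices:

```python
import math
import random

def exp3(K, T, gamma, reward_fn):
    """Exp3 for the adversarial K-armed bandit.

    reward_fn(t, arm) -> reward in [0, 1]; only the pulled arm's
    reward is observed each round. Returns the cumulative reward.
    """
    weights = [1.0] * K
    total = 0.0
    for t in range(T):
        wsum = sum(weights)
        # mix the weight distribution with uniform exploration
        probs = [(1 - gamma) * w / wsum + gamma / K for w in weights]
        arm = random.choices(range(K), weights=probs)[0]
        reward = reward_fn(t, arm)
        total += reward
        # importance-weighted estimate keeps the update unbiased
        est = reward / probs[arm]
        weights[arm] *= math.exp(gamma * est / K)
    return total
```

The uniform-mixing term `gamma / K` bounds how large the importance weights can get, which is what makes the exponential update stable.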
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Cited by 125 (13 self)
"... Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multiarmed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret ..."
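The GP-UCB rule this abstract alludes to picks, at each step, the candidate maximizing the posterior mean plus a scaled posterior standard deviation. A minimal sketch under assumed simplifications (finite candidate set, scalar inputs, RBF kernel, unit prior variance), not the authors' implementation:

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # squared-exponential kernel on 1-D inputs
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_ucb(f, candidates, T=20, beta=2.0, jitter=1e-6):
    """Minimal GP-UCB loop: query argmax of mean + sqrt(beta) * std."""
    X, y = [], []
    for t in range(T):
        if not X:
            idx = len(candidates) // 2  # arbitrary first query
        else:
            Xa = np.array(X)
            K = rbf(Xa, Xa) + jitter * np.eye(len(X))
            Ks = rbf(np.asarray(candidates), Xa)
            mu = Ks @ np.linalg.solve(K, np.array(y))
            # posterior variance: prior variance minus explained part
            v = np.linalg.solve(K, Ks.T)
            var = 1.0 - np.sum(Ks * v.T, axis=1)
            ucb = mu + np.sqrt(beta) * np.sqrt(np.maximum(var, 0.0))
            idx = int(np.argmax(ucb))
        X.append(candidates[idx])
        y.append(f(candidates[idx]))
    return X, y
```

High posterior variance inflates the score of unexplored regions, so the loop first spreads queries out and then concentrates near the posterior maximum.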
Online Geometric Optimization in the Bandit Setting against an Adaptive Adversary
2004
Cited by 71 (7 self)
"... We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this problem we are given a bounded set S ⊆ ℝⁿ of feasible points. At each time step t, the online algorithm must select a point ..."
Cheap but clever: Human active learning in a bandit setting
In Proceedings of the Cognitive Science Society Conference, 2013
Cited by 7 (1 self)
"... How people achieve long-term goals in an imperfectly known environment, via repeated tries and noisy outcomes, is an important problem in cognitive science. There are two interrelated questions: how humans represent information, both what has been learned and what can still be learned, and how they choose actions, in particular how they negotiate the tension between exploration and exploitation. In this work, we examine human behavioral data in a multiarmed bandit setting, in which the subject chooses one of four “arms” to pull on each trial and receives a binary outcome (win/lose). We im ..."
Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting
In Advances in Neural Information Processing Systems, 2013
Cited by 1 (0 self)
"... How humans achieve long-term goals in an uncertain environment, via repeated trials and noisy observations, is an important problem in cognitive science. We investigate this behavior in the context of a multiarmed bandit task. We compare human behavior to a variety of models that vary in their rep ..."
Beyond Banditron: A conservative and efficient reduction for online multiclass prediction with bandit setting model
In Proc. of the 9th IEEE ICDM
Cited by 2 (1 self)
"... In this paper, we consider a recently proposed supervised learning problem, called online multiclass prediction with bandit setting model. Aiming at learning from partial feedback of online classification results, i.e. “true” when the predicted label is right or “false” when the predicti ..."
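The partial-feedback protocol this entry builds on was introduced with the Banditron (Kakade et al.): predict with a linear model, explore over labels, and observe only whether the sampled label was correct. A minimal sketch of that baseline protocol, our own simplification rather than the reduction this paper proposes:

```python
import numpy as np

def banditron(X, Y, K, gamma=0.05, seed=0):
    """Banditron-style learner for multiclass bandit feedback.

    For each example, predict yhat = argmax(W @ x), sample a label
    ytilde (mostly yhat, sometimes uniform), and observe only the
    single bit ytilde == y. Returns (W, number of mistakes).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = np.zeros((K, d))
    mistakes = 0
    for x, y in zip(X, Y):
        yhat = int(np.argmax(W @ x))
        # exploration: play yhat with prob. 1 - gamma, else uniform
        probs = np.full(K, gamma / K)
        probs[yhat] += 1 - gamma
        ytilde = int(rng.choice(K, p=probs))
        if ytilde != y:
            mistakes += 1
        # importance-weighted perceptron-style update: only the
        # observed bit (ytilde == y) is used, never y itself
        U = np.zeros((K, d))
        U[yhat] -= x
        if ytilde == y:
            U[ytilde] += x / probs[ytilde]
        W += U
    return W, mistakes
```

The `1 / probs[ytilde]` importance weight makes the update an unbiased estimate of the full-information perceptron update, which is what lets learning proceed from one bit of feedback per round.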
Information-theoretic regret bounds for Gaussian process optimization in the bandit setting
IEEE Transactions on Information Theory, 2012
Cited by 25 (3 self)
"... Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multiarmed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low norm in a reproducing kernel Hilbert space. We resolve th ..."