Results 1 - 10 of 96
A simple distribution-free approach to the max k-armed bandit problem
In Proceedings of the Twelfth International Conference on Principles and Practice of Constraint Programming, 2006
"... The max k-armed bandit problem is a recently-introduced online optimization problem with practical applications to heuristic search. Given a set of k slot machines, each yielding payoff from a fixed (but unknown) distribution, we wish to allocate trials to the machines so as to maximize th ..."
Cited by 14 (3 self)
An asymptotically optimal algorithm for the max k-armed bandit problem
In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI), 2006
"... We present an asymptotically optimal algorithm for the max variant of the k-armed bandit problem. Given a set of k slot machines, each yielding payoff from a fixed (but unknown) distribution, we wish to allocate trials to the machines so as to maximize the expected maximum payoff received over a se ..."
Cited by 18 (4 self)
Choosing Multi-Issue Negotiating Object Based on Trust and K-Armed Bandit Problem
"... Wang LM, Huang HK, Chai YM. Choosing multi-issue negotiating object based on trust and K-armed bandit ..."
Selecting Among Heuristics by Solving Thresholded k-Armed Bandit Problems
"... Suppose we are given k randomized heuristics to use in solving a combinatorial problem. Each heuristic, when run, produces a solution with an associated quality or value. Given a budget of n runs, our goal is to allocate runs to the heuristics so as to maximize the number of sampled solutions whose value exceeds a specified threshold. For this special case of the classical k-armed bandit problem, we present a strategy with $O(\sqrt{n p^* k \ln n})$ additive regret, where $p^*$ is the probability of sampling an above-threshold solution using the best single heuristic. We demonstrate the usefulness of our ..."
Cited by 1 (0 self)
PAC Lower Bounds and Efficient Algorithms for The Max K-Armed Bandit Problem
"... We consider the Max K-Armed Bandit problem, where a learning agent is faced with several stochastic arms, each a source of i.i.d. rewards of unknown distribution. At each time step the agent chooses an arm, and observes the reward of the obtained sample. Each sample is considered here as a ..."
An Asymptotically Optimal Algorithm for the Max k-Armed Bandit Problem
"... We present an asymptotically optimal algorithm for the max variant of the k-armed bandit problem. Given a set of k slot machines, each yielding payoff from a fixed (but unknown) distribution, we wish to allocate trials to the machines so as to maximize the expected maximum payoff received over a ser ..."
The max k-armed bandit: A new model of exploration applied to search heuristic selection
In AAAI, 2005
"... The multi-armed bandit is often used as an analogy for the tradeoff between exploration and exploitation in search problems. The classic problem involves allocating trials to the arms of a multi-armed slot machine to maximize the expected sum of rewards. We pose a new variation of the multi-armed bandit, the Max K-Armed Bandit, in which trials must be allocated among the arms to maximize the expected best single sample reward of the series of trials. Motivation for the Max K-Armed Bandit is the allocation of restarts among a set of multistart stochastic search algorithms. We present an analysis ..."
Cited by 34 (3 self)
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
"... This paper proposes a new method for the K-armed dueling bandit problem, a variation on the regular K-armed bandit problem that offers only relative feedback about pairs of arms. Our approach extends the Upper Confidence Bound algorithm to the relative setting by using estimates of the pairwise pro ..."
Relative upper confidence bound for the k-armed dueling bandit problem. arXiv preprint arXiv:1312.3393, 2013
"... This paper proposes a new method for the K-armed dueling bandit problem, a variation on the regular K-armed bandit problem that offers only relative feedback about pairs of arms. Our approach extends the Upper Confidence Bound algorithm to the relative setting by using estimates of the pairwise pr ..."
Cited by 9 (5 self)
The K-armed Dueling Bandits Problem, 2009
"... We study a partial-information online-learning problem where actions are restricted to noisy comparisons between pairs of strategies (also known as bandits). In contrast to conventional approaches that require the absolute reward of the chosen strategy to be quantifiable and observable, our setting ..."
Cited by 30 (7 self)
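
Several of the abstracts listed above contrast the classic k-armed bandit objective (maximize the expected sum of rewards) with the max k-armed variant (maximize the expected best single reward over a series of trials). The following is a minimal sketch of that contrast only, not an algorithm from any of the listed papers. It assumes Gaussian arms described by a hypothetical `(mean, std)` pair and spends every trial on a single arm; all function and variable names are illustrative.

```python
import random


def pull(arm, rng):
    """Draw one payoff from an arm given as (mean, std) of a Gaussian (an assumption)."""
    mean, std = arm
    return rng.gauss(mean, std)


def play_single_arm(arm, n_trials, rng):
    """Spend all n_trials on one arm; return (sum of payoffs, best single payoff)."""
    payoffs = [pull(arm, rng) for _ in range(n_trials)]
    return sum(payoffs), max(payoffs)


def estimate_objectives(arm, n_trials, n_runs, seed=0):
    """Monte Carlo estimate of (expected sum, expected max) for one arm."""
    rng = random.Random(seed)
    total_sum = 0.0
    total_max = 0.0
    for _ in range(n_runs):
        run_sum, run_max = play_single_arm(arm, n_trials, rng)
        total_sum += run_sum
        total_max += run_max
    return total_sum / n_runs, total_max / n_runs


# Two hypothetical arms: one with a higher mean, one with a wider spread.
steady = (1.0, 0.1)
heavy = (0.5, 2.0)

sum_steady, max_steady = estimate_objectives(steady, n_trials=100, n_runs=500)
sum_heavy, max_heavy = estimate_objectives(heavy, n_trials=100, n_runs=500)

print("steady arm: expected sum %.1f, expected max %.2f" % (sum_steady, max_steady))
print("heavy arm:  expected sum %.1f, expected max %.2f" % (sum_heavy, max_heavy))
```

Under these assumed distributions the higher-mean arm should win on the sum objective while the wider-spread arm should win on the max objective, which is why the two problems can call for different allocations of trials.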