Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness (2011)

by Rémi Munos
Venue: In Neural Information Processing Systems, 2011

Results 1 - 10 of 20

Exponential regret bounds for Gaussian process bandits with deterministic observations

by Nando De Freitas, Alex J. Smola, Masrour Zoghi - In ICML, 2012
"... This paper analyzes the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al., 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, (Sriniv ..."
Abstract - Cited by 17 (9 self) - Add to MetaCart
This paper analyzes the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al., 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, (Srinivas et al., 2010) proved that the regret vanishes at the approximate rate of O(1/√t), where t is the number of observations. To complement their result, we attack the deterministic case and attain a much faster exponential convergence rate. Under some regularity assumptions, we show that the regret decreases asymptotically according to O(e^(−τt/(ln t)^(d/4))) with high probability. Here, d is the dimension of the search space and τ is a constant that depends on the behaviour of the objective function near its global maximum.

Stochastic simultaneous optimistic optimization

by Michal Valko, Alexandra Carpentier, Rémi Munos - In International Conference on Machine Learning, 2013
"... We study the problem of global maximization of a function f given a finite number of evaluations perturbed by noise. We consider a very weak assumption on the function, namely that it is locally smooth (in some precise sense) with respect to some semi-metric, around one of its global maxima. Compare ..."
Abstract - Cited by 13 (7 self) - Add to MetaCart
We study the problem of global maximization of a function f given a finite number of evaluations perturbed by noise. We consider a very weak assumption on the function, namely that it is locally smooth (in some precise sense) with respect to some semi-metric, around one of its global maxima. Compared to previous works on bandits in general spaces (Kleinberg et al., 2008; Bubeck et al., 2011a), our algorithm does not require the knowledge of this semi-metric. Our algorithm, StoSOO, follows an optimistic strategy to iteratively construct upper confidence bounds over the hierarchical partitions of the function domain to decide which point to sample next. A finite-time analysis of StoSOO shows that it performs almost as well as the best specifically-tuned algorithms even though the local smoothness of the function is not known.
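
As a concrete illustration of the deterministic tree search that StoSOO extends (the SOO procedure of the indexed Munos 2011 paper), here is a minimal sketch: the domain is recursively split into cells, and in each sweep over depths the best cell of a depth is expanded only if its centre value beats every cell already expanded at shallower depths in that sweep. The 1-D interval splitting, the fixed maximum depth, and all names are simplifications of mine, not the authors' code.

```python
from math import inf

def soo_maximize(f, lo, hi, budget=200, max_depth=20):
    """Sketch of an SOO-style optimistic tree search on the interval [lo, hi].

    Cells are stored per depth as (value, left, right), where value is f at the
    cell centre; no smoothness parameter is required anywhere.
    """
    centre = (lo + hi) / 2.0
    root_val = f(centre)
    leaves = {0: [(root_val, lo, hi)]}            # depth -> unexpanded cells
    evals = 1
    best_x, best_val = centre, root_val

    while evals < budget:
        v_max = -inf                              # best value expanded in this sweep
        expanded = False
        for depth in range(max_depth + 1):
            cells = leaves.get(depth)
            if not cells:
                continue
            # Optimistic choice: the best unexpanded cell at this depth ...
            i_best = max(range(len(cells)), key=lambda i: cells[i][0])
            value, a, b = cells[i_best]
            # ... expanded only if it beats every shallower cell expanded so far.
            if value <= v_max:
                continue
            v_max = value
            expanded = True
            cells.pop(i_best)
            third = (b - a) / 3.0
            for k in range(3):                    # split into three children
                ca, cb = a + k * third, a + (k + 1) * third
                cx = (ca + cb) / 2.0
                if k == 1:
                    cv = value                    # middle child reuses the parent's centre value
                else:
                    cv = f(cx)
                    evals += 1
                leaves.setdefault(depth + 1, []).append((cv, ca, cb))
                if cv > best_val:
                    best_x, best_val = cx, cv
            if evals >= budget:
                break
        if not expanded:
            break                                 # nothing left to expand below max_depth
    return best_x, best_val
```

A call such as soo_maximize(lambda x: 1 - abs(x - 0.3), 0.0, 1.0) should home in on x ≈ 0.3 without any smoothness parameter; StoSOO replaces the exact centre values with averaged noisy evaluations and confidence bonuses, as the abstract describes.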

Citation Context

...aier, 2008). There has also been an interest in designing sample-efficient strategies, only requiring local smoothness around (one) of the global maxima (Kleinberg et al., 2008; Bubeck et al., 2011a; Munos, 2011). However, these approaches still assume the knowledge of this smoothness, i.e., the metric under which the function is smooth, which may not be available to the optimizer. Recently, Munos (2011) pro...

On correlation and budget constraints in model-based bandit optimization

by Matthew W. Hoffman, Bobak Shahriari, Nando de Freitas
"... with application to automatic machine learning ..."
Abstract - Cited by 8 (5 self) - Add to MetaCart
with application to automatic machine learning

Citation Context

...[Srinivas et al., 2010, Hoffman et al., 2011]. In the realm of optimizing deterministic functions, a few works have proven exponential rates of convergence for simple regret [de Freitas et al., 2012, Munos, 2011]. A stochastic variant of the work of Munos has been recently proposed by Valko et al. [2013]; this approach takes a tree-based structure for expanding areas of the optimization problem in question, ...

Bayesian Multi-Scale Optimistic Optimization

by Ziyu Wang, Babak Shakibi, Lin Jin, Nando de Freitas
"... Bayesian optimization is a powerful global op-timization technique for expensive black-box functions. One of its shortcomings is that it re-quires auxiliary optimization of an acquisition function at each iteration. This auxiliary opti-mization can be costly and very hard to carry out in practice. M ..."
Abstract - Cited by 6 (3 self) - Add to MetaCart
Bayesian optimization is a powerful global optimization technique for expensive black-box functions. One of its shortcomings is that it requires auxiliary optimization of an acquisition function at each iteration. This auxiliary optimization can be costly and very hard to carry out in practice. Moreover, it creates serious theoretical concerns, as most of the convergence results assume that the exact optimum of the acquisition function can be found. In this paper, we introduce a new technique for efficient global optimization that combines Gaussian process confidence bounds and treed simultaneous optimistic optimization to eliminate the need for auxiliary optimization of acquisition functions. The experiments with global optimization benchmarks and a novel application to automatic information extraction demonstrate that the resulting technique is more efficient than the two approaches from which it draws inspiration. Unlike most theoretical analyses of Bayesian optimization with Gaussian processes, our finite-time convergence rate proofs do not require exact optimization of an acquisition function. That is, our approach eliminates the unsatisfactory assumption that a difficult, potentially NP-hard, problem has to be solved in order to obtain vanishing regret rates.
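
To make the "GP confidence bounds inside a space-partitioning tree" idea concrete, here is a rough sketch of the gating step the abstract alludes to: before paying for a real function evaluation at a new cell centre, a GP fitted to past evaluations is queried, and the evaluation is skipped when its upper confidence bound cannot beat the best value seen so far. This is only an illustration of the idea, not the BaMSOO algorithm itself; the scikit-learn GP, the beta confidence width, the 1-D domain, and the helper name are assumptions of mine.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def gated_value(f, x, X_seen, y_seen, best_so_far, beta=2.0):
    """Return a value for centre x, evaluating f only when it looks promising.

    Assumes at least one previous evaluation in (X_seen, y_seen). If the GP
    upper confidence bound at x is below the incumbent best value, a cheaper
    surrogate (the lower confidence bound) is used as the node value instead
    of spending a real evaluation -- the gating idea from the abstract.
    """
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(np.asarray(X_seen, dtype=float).reshape(-1, 1), np.asarray(y_seen, dtype=float))
    mu, sigma = gp.predict(np.array([[x]]), return_std=True)
    ucb = mu[0] + beta * sigma[0]
    if ucb < best_so_far:
        return mu[0] - beta * sigma[0], False     # surrogate value, no evaluation spent
    return f(x), True                             # promising: evaluate for real
```

The gated node values would then drive an SOO-style leaf-expansion rule like the one sketched earlier, which is consistent with the combination the abstract describes.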

Citation Context

...s the necessity of re-starting the auxiliary optimization at each iteration. Recent optimistic optimization methods provide a viable alternative to BO (Kocsis & Szepesvári, 2006; Bubeck et al., 2011; Munos, 2011). Instead of estimating a posterior distribution over the unknown objective function, these methods build space partitioning trees by expanding leaves with high function values or upper-bounds. The t...

Relative confidence sampling for efficient on-line ranker evaluation

by Masrour Zoghi, Shimon Whiteson, Maarten De Rijke, Rémi Munos - In WSDM ’14, 2014
"... A key challenge in information retrieval is that of on-line ranker evaluation: determining which one of a finite set of rankers performs the best in expectation on the basis of user clicks on presented document lists. When the presented lists are constructed using interleaved comparison methods, whi ..."
Abstract - Cited by 3 (3 self) - Add to MetaCart
A key challenge in information retrieval is that of on-line ranker evaluation: determining which one of a finite set of rankers performs the best in expectation on the basis of user clicks on presented document lists. When the presented lists are constructed using interleaved comparison methods, which interleave lists proposed by two different candidate rankers, then the problem of minimizing the total regret accumulated while evaluating the rankers can be formalized as a K-armed dueling bandits problem. In this paper, we propose a new method called relative confidence sampling (RCS) that aims to reduce cumulative regret by being less conservative than existing methods in eliminating rankers from contention. In addition, we present an empirical comparison between RCS and two state-of-the-art methods, relative upper confidence bound and SAVAGE. The results demonstrate that RCS can substantially outperform these alternatives on several large learning to rank datasets.

Citation Context

...lated in practice, as shown in Section 6.1. One potential remedy for dealing with this difficulty is to use correlation information between rankers and make use of continuous armed bandits algorithms [3, 10, 25, 30, 33], although both the algorithm and the theoretical analysis would be considerably more intricate in this setting, since the goal is not simply to find the maximum of a continuous function. As explained...

Optimistic planning for belief-augmented Markov decision processes

by Raphael Fonteneau - In IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2013
"... Abstract—This paper presents the Bayesian Optimistic Plan-ning (BOP) algorithm, a novel model-based Bayesian reinforce-ment learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to contexts where the transition mode ..."
Abstract - Cited by 3 (2 self) - Add to MetaCart
Abstract—This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to contexts where the transition model of the MDP is initially unknown and progressively learned through interactions within the environment. The knowledge about the unknown MDP is represented with a probability distribution over all possible transition models using Dirichlet distributions, and the BOP algorithm plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal when the budget parameter increases to infinity. Preliminary empirical validations show promising performance.
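
As a reading aid for the belief-augmented construction described above, here is a minimal sketch of how a Dirichlet belief over transition models can be kept as visit counts alongside the state; the class layout and names are illustrative assumptions of mine, not the BOP implementation.

```python
from collections import defaultdict

class BeliefAugmentedState:
    """State of the original MDP plus Dirichlet counts over transitions.

    For each (state, action) pair we keep a Dirichlet parameter vector,
    represented as pseudo-counts over next states; the posterior mean
    transition probabilities follow by normalisation.
    """
    def __init__(self, state, n_states, prior=1.0):
        self.state = state
        self.n_states = n_states
        # counts[(s, a)][s'] = prior + number of observed transitions s, a -> s'
        self.counts = defaultdict(lambda: [prior] * n_states)

    def observe(self, s, a, s_next):
        """Bayesian update: one observed transition increments one count."""
        self.counts[(s, a)][s_next] += 1.0

    def posterior_mean(self, s, a):
        """Posterior mean of P(. | s, a) under the Dirichlet belief."""
        c = self.counts[(s, a)]
        total = sum(c)
        return [ci / total for ci in c]
```

A planner in the belief-augmented space then treats the pair (state, counts) jointly as its search state, so planned actions can account for how future observations will change the belief.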

Citation Context

...2] and stochastic systems [25], [39], [7], [3], [10], [9], [40] when the system dynamics / transition model is known, and also (iii) optimization of unknown functions only accessible through sampling [27]. The optimistic principle has also been used for addressing the E/E dilemma for MDPs when the transition model is unknown and progressively learned through interactions with the environment. For inst...

Optimistic planning for continuous-action deterministic systems

by Lucian Buşoniu, Alexander Daniels, Rémi Munos, Robert Babuška - In 2013 IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-13), 2013
"... Abstract : We consider the optimal control of systems with deterministic dynamics, continuous, possibly large-scale state spaces, and continuous, low-dimensional action spaces. We describe an online planning algorithm called SOOP, which like other algorithms in its class has no direct dependence on ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Abstract: We consider the optimal control of systems with deterministic dynamics, continuous, possibly large-scale state spaces, and continuous, low-dimensional action spaces. We describe an online planning algorithm called SOOP, which like other algorithms in its class has no direct dependence on the state space structure. Unlike previous algorithms, SOOP explores the true solution space, consisting of infinite sequences of continuous actions, without requiring knowledge about the smoothness of the system. To this end, it borrows the principle of the simultaneous optimistic optimization method, and develops a nontrivial adaptation of this principle to the planning problem. Experiments on four problems show SOOP reliably ranks among the best algorithms, fully dominating competing methods when the problem requires both long horizons and fine discretization.

Bayesian Optimization with Exponential Convergence

by Kenji Kawaguchi, Leslie Pack Kaelbling, Tomás Lozano-Pérez
"... Abstract This paper presents a Bayesian optimization method with exponential convergence without the need of auxiliary optimization and without the δ-cover sampling. Most Bayesian optimization methods require auxiliary optimization: an additional non-convex global optimization problem, which can be ..."
Abstract - Add to MetaCart
Abstract This paper presents a Bayesian optimization method with exponential convergence without the need of auxiliary optimization and without the δ-cover sampling. Most Bayesian optimization methods require auxiliary optimization: an additional non-convex global optimization problem, which can be time-consuming and hard to implement in practice. Also, the existing Bayesian optimization method with exponential convergence [1] requires access to the δ-cover sampling, which was considered to be impractical

Citation Context

...on for introducing a new optimization problem lies in the assumption that the cost of evaluating the objective function f dominates that of solving the additional optimization problem. For deterministic functions, de Freitas et al. [1] recently presented a theoretical procedure that maintains an exponential convergence rate. However, their own paper and the follow-up research [1, 2] point out that this result relies on an impractical sampling procedure, the δ-cover sampling. To overcome this issue, Wang et al. [2] combined GP-UCB with a hierarchical partitioning optimization method, the SOO algorithm [18], providing a regret bound with polynomial dependence on the number of function evaluations. They concluded that creating a GP-based algorithm with an exponential convergence rate without the impractical sampling procedure remained an open problem. [Section 3, Infinite-Metric GP Optimization; 3.1, Overview:] The GP-UCB algorithm can be seen as a member of the class of bound-based search methods, which includes Lipschitz optimization, A* search, and PAC-MDP algorithms with optimism in the face of uncertainty. Bound-based search methods have a common property: the tightness of the bound determines its effec...

Black-box optimization of noisy functions with unknown smoothness

by Jean-Bastien Grill, Michal Valko, Rémi Munos (Google DeepMind, UK)
"... Abstract We study the problem of black-box optimization of a function f of any dimension, given function evaluations perturbed by noise. The function is assumed to be locally smooth around one of its global optima, but this smoothness is unknown. Our contribution is an adaptive optimization algorit ..."
Abstract - Add to MetaCart
Abstract We study the problem of black-box optimization of a function f of any dimension, given function evaluations perturbed by noise. The function is assumed to be locally smooth around one of its global optima, but this smoothness is unknown. Our contribution is an adaptive optimization algorithm, POO, or parallel optimistic optimization, that is able to deal with this setting. POO performs almost as well as the best known algorithms requiring the knowledge of the smoothness. Furthermore, POO works for a larger class of functions than what was previously considered, especially for functions that are difficult to optimize, in a very precise sense. We provide a finite-time analysis of POO's performance, which shows that its error after n evaluations is at most a factor of √(ln n) away from the error of the best known optimization algorithms using the knowledge of the smoothness.

Citation Context

[Figure 1: Difficult function f: x → s(log₂|x − 0.5|) · (√|x − 0.5| − (x − 0.5)²) − √|x − 0.5|, where s(x) = 1 if the fractional part of x, that is, x − ⌊x⌋, is in [0, 0.5] and s(x) = 0 if it is in (0.5, 1). Left: oscillation between two envelopes of different smoothness leading to a nonzero d for a standard partitioning. Right: regret of HOO after 5000 evaluations for different values of ρ.]

Another direction has been followed by Munos [11], where in the deterministic case (the function evaluations are not perturbed by noise), their SOO algorithm performs almost as well as the best known algorithms without the knowledge of the function smoothness. SOO was later extended to StoSOO [15] for the stochastic case. However, StoSOO only extends SOO for a limited case of easy instances of functions for which there exists a semi-metric under which d = 0. Also, Bull [6] provided a similar regret bound for the ATB algorithm for a class of functions, called zooming continuous functions, which is related to the class of functions for which th...
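
Since the caption's formula is easy to misread after the flattening, here is a small plain-Python transcription of the difficult function, offered purely as a reading aid (the function names and the value returned exactly at x = 0.5 are my own choices):

```python
import math

def s(x):
    """Switch used in the caption: 1 if the fractional part of x lies in [0, 0.5], else 0."""
    frac = x - math.floor(x)
    return 1.0 if frac <= 0.5 else 0.0

def f(x):
    """Difficult function from the Figure 1 caption quoted above."""
    u = abs(x - 0.5)
    if u == 0.0:
        return 0.0          # limit value at the global maximiser x = 0.5 (assumption)
    return s(math.log2(u)) * (math.sqrt(u) - (x - 0.5) ** 2) - math.sqrt(u)
```

When s(·) = 1 the expression collapses to −(x − 0.5)², and when s(·) = 0 it collapses to −√|x − 0.5|, which are the two envelopes of different smoothness that the caption refers to.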

(2013)" Consensus for Agents with General Dynamics Using Optimistic Optimization

by Irinel-Constantin Morărescu, 2013
"... Abstract—An important challenge in multiagent systems is consensus, in which the agents must agree on certain controlled variables of interest. So far, most consensus algorithms for agents with nonlinear dynamics exploit the specific form of the nonlinearity. Here, we propose an approach that only r ..."
Abstract - Add to MetaCart
Abstract—An important challenge in multiagent systems is consensus, in which the agents must agree on certain controlled variables of interest. So far, most consensus algorithms for agents with nonlinear dynamics exploit the specific form of the nonlinearity. Here, we propose an approach that only requires a black-box simulation model of the dynamics, and is therefore applicable to a wide class of nonlinearities. This approach works for agents communicating on a fixed, connected network. It designs a reference behavior with a classical consensus protocol, and then finds control actions that drive the nonlinear agents towards the reference states, using a recent optimistic optimization algorithm. By exploiting the guarantees of optimistic optimization, we prove that the agents achieve practical consensus. A representative example is further analyzed, and simulation results on nonlinear robotic arms are provided.
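
For reference, a "classical consensus protocol" of the kind the abstract mentions is typically the standard linear averaging update on the communication graph; the sketch below shows only that reference-generating step, as a hedged illustration (the step size, graph, and names are assumptions of mine, and the optimistic-optimization tracking step from the abstract is not shown):

```python
import numpy as np

def consensus_reference(x0, adjacency, eps=0.1, steps=50):
    """Standard linear consensus iteration x_i <- x_i + eps * sum_j a_ij (x_j - x_i).

    Returns the trajectory of reference states; for a connected undirected
    graph and a small enough eps, all agents converge to the average of x0.
    """
    A = np.asarray(adjacency, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    trajectory = [x.copy()]
    for _ in range(steps):
        x = x + eps * (A @ x - A.sum(axis=1) * x)   # eps * sum_j a_ij (x_j - x_i)
        trajectory.append(x.copy())
    return np.array(trajectory)

# Example: three agents on a line graph drift toward the average of their initial states.
# consensus_reference([0.0, 1.0, 5.0], [[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```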

Citation Context

...e pour des systèmes interconnectés”. The main ingredient lending this approach its generality is the global optimization algorithm used at the control stage, called sequential optimistic optimization [13]. This algorithm is selected since it guarantees closeness to the reference states for very general dynamics, and crucially, it only uses a black-box simulation model of the dynamics rather than their...
