Results 11 – 20 of 44
Convergence of trust-region methods based on probabilistic models
, 2013
Abstract

Cited by 3 (1 self)
In this paper we consider the use of probabilistic or random models within a classical trust-region framework for optimization of deterministic smooth general nonlinear functions. Our method and setting differ from many stochastic optimization approaches in two principal ways. Firstly, we assume that the value of the function itself can be computed without noise; in other words, the function is deterministic. Secondly, we use random models of higher quality than those produced by the usual stochastic gradient methods. In particular, a first-order model based on a random approximation of the gradient is required to provide sufficient quality of approximation with probability greater than or equal to 1/2. This is in contrast with stochastic gradient approaches, where the model is assumed to be “correct” only in expectation. As a result of this particular setting, we are able to prove convergence, with probability one, of a trust-region method which is almost identical to the classical method. Moreover, the new method is simpler than its deterministic counterpart as it does not require a criticality step. Hence we show that a standard optimization framework can be used in cases when …
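The setting the abstract describes — a trust-region loop whose model gradient is only a sufficiently accurate approximation with probability at least 1/2 — can be sketched in a toy one-dimensional form. Everything below (the acceptance threshold, the radius updates, the corruption model for the gradient) is an illustrative assumption, not the paper's actual algorithm:

```python
import random

def trust_region_random_model(f, grad, x0, delta0=1.0, eta=0.1, iters=200, seed=0):
    """Toy 1-D trust-region loop with a random first-order model: with
    probability 1/2 the model gradient is exact, otherwise it is a crude
    multiplicative corruption of the true gradient. Hypothetical sketch."""
    rng = random.Random(seed)
    x, delta = x0, delta0
    for _ in range(iters):
        # Random model gradient: accurate at least half the time.
        g = grad(x) if rng.random() < 0.5 else grad(x) * (1 + rng.uniform(-0.5, 0.5))
        if g == 0:
            break
        s = -delta if g > 0 else delta      # minimize the linear model on [-delta, delta]
        pred = -g * s                       # predicted decrease of the model
        ared = f(x) - f(x + s)              # actual decrease (f is noise-free)
        if pred > 0 and ared / pred >= eta: # sufficient decrease: accept, expand
            x, delta = x + s, 2 * delta
        else:                               # reject the step, shrink the region
            delta = delta / 2
    return x

xmin = trust_region_random_model(lambda x: (x - 3.0) ** 2,
                                 lambda x: 2 * (x - 3.0), x0=0.0)
```

Note how the acceptance test compares actual versus predicted reduction using exact function values — this is exactly where the paper's assumption of a noise-free (deterministic) objective enters.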
Information-theoretic lower bounds for convex optimization with erroneous oracles
Abstract

Cited by 2 (1 self)
We consider the problem of optimizing convex and concave functions with access to an erroneous zeroth-order oracle. In particular, for a given function x ↦ f(x) we consider optimization when one is given access to absolute error oracles that return values in [f(x) − ε, f(x) + ε], or relative error oracles that return values in [(1 − ε)f(x), (1 + ε)f(x)], for some ε > 0. We show stark information-theoretic impossibility results for minimizing convex functions and maximizing concave functions over polytopes in this model.
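The two oracle models can be written down directly. A minimal sketch — the uniform noise distribution is an illustrative choice; the lower bounds in the abstract hold for adversarial values anywhere in these intervals:

```python
import random

def absolute_error_oracle(f, eps, seed=0):
    """Wrap f so each query returns a value in [f(x) - eps, f(x) + eps]."""
    rng = random.Random(seed)
    return lambda x: f(x) + rng.uniform(-eps, eps)

def relative_error_oracle(f, eps, seed=0):
    """Wrap f so each query returns a value in [(1 - eps) f(x), (1 + eps) f(x)]."""
    rng = random.Random(seed)
    return lambda x: f(x) * (1 + rng.uniform(-eps, eps))

f = lambda x: x * x
g = absolute_error_oracle(f, eps=0.1)   # absolute-error queries of f
h = relative_error_oracle(f, eps=0.1)   # relative-error queries of f
```

An optimizer in this model sees only the outputs of `g` or `h`, never `f` itself.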
Penalty Methods with Stochastic Approximation for Stochastic Nonlinear Programming
, 2013
Trust-region methods without using derivatives: Worst case complexity and the nonsmooth case
, 2015
Abstract

Cited by 2 (1 self)
Trust-region methods are a broad class of methods for continuous optimization that have found application in a variety of problems and contexts. In particular, they have been studied and applied to problems without using derivatives. The analysis of trust-region derivative-free methods has focused on global convergence: they have been proved to generate a sequence of iterates converging to stationarity independently of the starting point. Most of this analysis is carried out in the smooth case, and, moreover, little is known about the complexity or global rate of these methods. In this paper, we start by analyzing the worst-case complexity of general trust-region derivative-free methods for smooth functions. For the nonsmooth case, we propose a smoothing approach, for which we prove global convergence and bound the worst-case complexity effort. For the special case of nonsmooth functions that result from the composition of smooth and nonsmooth/convex components, we show how to improve the existing results in the literature and make them applicable to the general methodology.
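A smoothing approach replaces a nonsmooth f by a nearby smooth surrogate; one standard construction is randomized smoothing, f_mu(x) = E[f(x + mu·u)] with u drawn from a fixed distribution. The Monte-Carlo sketch below is a generic 1-D illustration of that idea (the sample count, the uniform perturbation, and the smoothing parameter mu are assumptions — the paper's smoothing technique may differ in detail):

```python
import random

def smoothed(f, mu, samples=2000, seed=0):
    """Monte-Carlo approximation of the smoothed function
    f_mu(x) = E[f(x + mu*u)], u ~ Uniform[-1, 1].  Fixing the sample of
    u's makes the surrogate a deterministic, smooth-in-expectation proxy."""
    rng = random.Random(seed)
    us = [rng.uniform(-1.0, 1.0) for _ in range(samples)]
    return lambda x: sum(f(x + mu * u) for u in us) / samples

f = abs                      # nonsmooth at 0; f_mu is differentiable there
f_mu = smoothed(f, mu=0.1)   # f_mu(0) ~ mu * E|u| = 0.05
```

Away from the kink, f_mu tracks f closely; near it, the kink is rounded off at a scale controlled by mu, which is what lets smooth worst-case complexity bounds be transferred.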
ON PROVING LINEAR CONVERGENCE OF COMPARISON-BASED STEP-SIZE ADAPTIVE RANDOMIZED SEARCH ON SCALING-INVARIANT FUNCTIONS VIA STABILITY OF MARKOV CHAINS
Abstract

Cited by 2 (1 self)
In the context of numerical optimization, this paper develops a methodology to analyze the linear convergence of comparison-based step-size adaptive randomized search (CB-SARS), a class of probabilistic derivative-free optimization algorithms where the function is solely used through comparisons of candidate solutions. Various algorithms are included in the class of CB-SARS algorithms: on the one hand, a few methods introduced as early as the 1960s, namely the step-size adaptive random search by Schumer and Steiglitz, the compound random search by Devroye, and simplified versions of Matyas' random optimization algorithm or Kjellstrom and Taxen's Gaussian adaptation; on the other hand, simplified versions of several recent algorithms, namely the covariance-matrix-adaptation evolution strategy (CMA-ES), the exponential natural evolution strategy (xNES), and the cross-entropy method. CB-SARS algorithms typically exhibit several invariances. First of all, invariance to composing the objective function with a strictly monotonic transformation, which is a direct consequence of the fact that the algorithms only use comparisons. Second, scale invariance, which translates the fact that the algorithm has no intrinsic absolute notion of scale. The algorithms are investigated on scaling-invariant functions, defined as functions that preserve …
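A classical instance of this class is a (1+1) random search with a 1/5-style success rule: the candidate is kept or discarded based only on a comparison, and the step size grows on success and shrinks on failure. A minimal 1-D sketch (the expansion/contraction constants and iteration budget are illustrative assumptions):

```python
import random

def one_plus_one_search(f, x0, sigma0=1.0, iters=500, seed=0):
    """(1+1) comparison-based step-size adaptive random search.
    f enters only through the comparison f(y) <= f(x), so the algorithm is
    invariant to strictly monotonic transformations of the objective."""
    rng = random.Random(seed)
    x, fx, sigma = x0, f(x0), sigma0
    for _ in range(iters):
        y = x + sigma * rng.gauss(0.0, 1.0)   # Gaussian candidate
        fy = f(y)
        if fy <= fx:                          # success: accept, expand step
            x, fx = y, fy
            sigma *= 1.5
        else:                                 # failure: contract step
            sigma *= 1.5 ** (-0.25)           # balances at ~1/5 success rate
        # The step size sigma self-adapts to the local scale of the problem,
        # which is the source of the scale invariance discussed above.
    return x

xbest = one_plus_one_search(lambda x: (x - 2.0) ** 2, x0=10.0)
```

The pair (distance to optimum, step size) evolves as a Markov chain, and stability of a suitably normalized chain is precisely the route the paper takes to linear convergence.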
Finite sample convergence rates of zero-order stochastic optimization methods
 In Advances in Neural Information Processing Systems 25
, 2012
Abstract

Cited by 1 (0 self)
We consider derivative-free algorithms for stochastic optimization problems that use only noisy function values rather than gradients, analyzing their finite-sample convergence rates. We show that if pairs of function values are available, algorithms that use gradient estimates based on random perturbations suffer a factor of at most √d in convergence rate over traditional stochastic gradient methods, where d is the problem dimension. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, which show that our bounds are sharp with respect to all problem-dependent quantities: they cannot be improved by more than constant factors.
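The "pairs of function values" device is a two-point gradient estimate: query f at x ± δu for a random direction u and rescale the finite difference by u. A minimal sketch (Gaussian directions and the step δ are illustrative choices; the paper analyzes a broader family of perturbation distributions):

```python
import random

def two_point_gradient(f, x, delta=1e-5, seed=0):
    """Two-point random-perturbation gradient estimate:
        g_hat = ((f(x + delta*u) - f(x - delta*u)) / (2*delta)) * u
    with u standard Gaussian, so E[u u^T] = I and E[g_hat] ~ grad f(x)."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in x]
    xp = [xi + delta * ui for xi, ui in zip(x, u)]
    xm = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = (f(xp) - f(xm)) / (2 * delta)   # directional derivative along u
    return [scale * ui for ui in u]

f = lambda z: sum(zi * zi for zi in z)      # grad f(x) = 2x
g = two_point_gradient(f, [1.0, 2.0])       # single noisy estimate of [2, 4]
```

A single estimate is noisy (its variance carries the extra dimension dependence behind the √d factor), but averaged over many directions it recovers the true gradient.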
On the Information-Adaptive Variants of the ADMM: an Iteration Complexity Perspective
, 2014
Abstract

Cited by 1 (1 self)
Designing algorithms for an optimization model often amounts to maintaining a balance between the degree of information requested from the model on the one hand, and the computational speed to expect on the other. Naturally, the more information is available, the faster one can expect the algorithm to converge. The popular ADMM algorithm demands that the objective function be easy to optimize once the coupled constraints are shifted to the objective with multipliers. However, in many applications this assumption does not hold; instead, only some noisy estimations of the gradient of the objective – or even only the objective itself – are available. This paper aims to bridge this gap. We present a suite of variants of the ADMM, where the trade-offs between the required information on the objective and the computational complexity are explicitly given. The new variants make the method applicable to a much broader class of problems where only noisy estimations of the gradient or the function values are accessible, yet this flexibility is achieved without sacrificing the computational complexity bounds.
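For context, the full-information baseline the abstract refers to looks as follows on a tiny consensus problem: min (x−a)² + (z−b)² subject to x = z, whose solution is x = z = (a+b)/2. The exact subproblem solves below are precisely what the paper's variants relax into noisy-gradient or function-value-only steps (the problem instance and penalty ρ are illustrative):

```python
def admm_consensus(a, b, rho=1.0, iters=100):
    """Vanilla ADMM for  min (x-a)^2 + (z-b)^2  s.t.  x = z.
    Each primal update solves its subproblem exactly in closed form; the
    dual variable u accumulates the constraint violation x - z."""
    x = z = u = 0.0
    for _ in range(iters):
        x = (2 * a + rho * (z - u)) / (2 + rho)   # exact x-minimization
        z = (2 * b + rho * (x + u)) / (2 + rho)   # exact z-minimization
        u = u + x - z                             # multiplier (dual) update
    return x, z

x, z = admm_consensus(1.0, 3.0)   # converges to x = z = 2.0
```

The closed-form x- and z-updates are where "easy to optimize" enters; when only a noisy gradient of the objective is available, each would be replaced by a stochastic-approximation step.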
On Zeroth-Order Stochastic Convex Optimization via Random Walks
, 2014
Abstract

Cited by 1 (1 self)
We propose a method for zeroth-order stochastic convex optimization that attains the suboptimality rate of Õ(n^7 T^(−1/2)) after T queries for a convex bounded function f: R^n → R. The method is based on a random walk (the Ball Walk) on the epigraph of the function. The randomized approach circumvents the problem of gradient estimation, and appears to be less sensitive to noisy function evaluations compared to noiseless zeroth-order methods.
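The random-walk primitive is a walk over the epigraph {(x, t) : f(x) ≤ t}, using only membership tests (i.e., function comparisons). A deliberately simplified 1-D sketch — the square proposal (in place of a true ball), the truncation bounds, and the step size are all illustrative assumptions, and the full method layers an optimization scheme on top of this primitive:

```python
import random

def ball_walk_epigraph(f, bound, radius, x0, t0, step=0.2, iters=1000, seed=0):
    """Random walk over the truncated epigraph
        {(x, t) : f(x) <= t <= bound, |x| <= radius}
    of a 1-D convex f: propose a nearby point (square proposal for
    simplicity) and accept only if it stays inside the body.  Only the
    membership test touches f -- no gradient estimation anywhere."""
    rng = random.Random(seed)
    x, t = x0, t0
    best = (x, t)
    for _ in range(iters):
        xn = x + rng.uniform(-step, step)
        tn = t + rng.uniform(-step, step)
        if f(xn) <= tn <= bound and abs(xn) <= radius:   # membership test
            x, t = xn, tn
            if tn < best[1]:            # lower t means lower function value
                best = (xn, tn)
    return best

bx, bt = ball_walk_epigraph(lambda x: x * x, bound=4.0, radius=2.0, x0=1.0, t0=2.0)
```

Low values of t correspond to low values of f, so driving the walk toward the bottom of the epigraph performs the optimization.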