Results 1–10 of 14
The complexity of large-scale convex programming under a linear optimization oracle
, 2013
Abstract
Cited by 11 (1 self)
This paper considers a general class of iterative optimization algorithms, referred to as linear-optimization-based convex programming (LCP) methods, for solving large-scale convex programming (CP) problems. The LCP methods, covering the classic conditional gradient (CG) method (a.k.a. the Frank–Wolfe method) as a special case, can only solve a linear optimization subproblem at each iteration. In this paper, we first establish a series of lower complexity bounds for the LCP methods to solve different classes of CP problems, including smooth, nonsmooth and certain saddle-point problems. We then formally establish the theoretical optimality, or near optimality in the large-scale case, of the CG method and its variants for solving different classes of CP problems. We also introduce several new optimal LCP methods, obtained by properly modifying Nesterov's accelerated gradient method, and demonstrate their possible advantages over the classic CG method for solving certain classes of large-scale CP problems.
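The linear-optimization oracle that distinguishes LCP methods from projection-based schemes is easiest to see in code. Below is a minimal sketch of the classic conditional gradient (Frank–Wolfe) method over the probability simplex, where the linear subproblem reduces to picking the vertex with the smallest gradient entry; the quadratic objective, the target point, and the step rule γ_k = 2/(k+2) are illustrative choices, not the paper's setup:

```python
# Minimal conditional gradient (Frank-Wolfe) sketch over the probability
# simplex for f(x) = 0.5*||x - b||^2.  The feasible set is touched only
# through a linear oracle, which on the simplex just returns the vertex
# with the smallest gradient coordinate.
def frank_wolfe_simplex(b, iters=2000):
    n = len(b)
    x = [1.0 / n] * n                             # start at the barycenter
    for k in range(iters):
        grad = [x[i] - b[i] for i in range(n)]    # gradient of the quadratic
        s = min(range(n), key=lambda i: grad[i])  # linear oracle: vertex e_s
        gamma = 2.0 / (k + 2)                     # classic open-loop step rule
        x = [(1.0 - gamma) * xi for xi in x]      # convex combination with e_s
        x[s] += gamma
    return x

# b lies in the simplex, so the minimizer is b itself.
x = frank_wolfe_simplex([0.2, 0.3, 0.5])
```

Each iteration touches the feasible set only through the linear oracle (a `min` over vertices), never through a projection, which is the defining feature of the LCP class discussed above.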
Stochastic block mirror descent methods for nonsmooth and stochastic optimization
, 2013
BLOCK STOCHASTIC GRADIENT ITERATION FOR CONVEX AND NONCONVEX OPTIMIZATION
, 2015
Abstract
Cited by 5 (0 self)
The stochastic gradient (SG) method can quickly solve a problem with a large number of components in the objective, or a stochastic optimization problem, to moderate accuracy. The block coordinate descent/update (BCD) method, on the other hand, can quickly solve problems with multiple (blocks of) variables. This paper introduces a block SG (BSG) method for both convex and nonconvex programs, combining the strengths of SG and BCD for problems with many components in the objective and with multiple (blocks of) variables. BSG generalizes SG by updating all the blocks of variables in a Gauss–Seidel fashion (the update of the current block depends on the previously updated blocks), in either a fixed or randomly shuffled order. Although BSG does slightly more work at each iteration, it typically outperforms SG because of its Gauss–Seidel updates and larger step sizes, the latter determined by the smaller per-block Lipschitz constants. The convergence of BSG is established for both convex and nonconvex cases. In the convex case, BSG has the same order of convergence rate as SG. In the nonconvex case, its convergence is established in terms of the expected violation of a first-order optimality condition. In both cases the analysis is nontrivial, since the typical unbiasedness assumption no longer holds. BSG is numerically evaluated on the following problems: stochastic least squares and logistic regression, which are convex, and low-rank tensor recovery and bilinear logistic regression, which are nonconvex. On the convex problems, BSG performed significantly better than SG. On the nonconvex problems, BSG significantly outperformed the deterministic BCD method, because the latter tends to stagnate early near local minimizers. Overall, BSG inherits the benefits of both SG approximation and block coordinate updates, and is especially useful for solving large-scale nonconvex problems.
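A minimal sketch of the Gauss–Seidel block update described above, for a small stochastic least-squares problem; the data, the 2+2 block split, and the constant step size are illustrative assumptions, not the paper's experimental setup:

```python
import random

# Block stochastic gradient (BSG) sketch for stochastic least squares with
# two blocks of variables updated Gauss-Seidel style: block 2's step uses
# the already-updated block 1.
random.seed(0)
true_w = [1.0, -2.0, 0.5, 3.0]
data = []
for _ in range(200):
    z = [random.gauss(0.0, 1.0) for _ in range(4)]
    data.append((z, sum(zi * wi for zi, wi in zip(z, true_w))))

def loss(w):
    return sum((sum(zi * wi for zi, wi in zip(z, w)) - t) ** 2
               for z, t in data) / len(data)

w = [0.0] * 4
blocks = [(0, 2), (2, 4)]                 # coordinate ranges of the two blocks
for _ in range(2000):
    z, t = random.choice(data)            # one stochastic sample per iteration
    for lo, hi in blocks:                 # Gauss-Seidel pass over the blocks
        r = sum(zi * wi for zi, wi in zip(z, w)) - t   # fresh residual
        for j in range(lo, hi):
            w[j] -= 0.05 * 2.0 * r * z[j]              # per-block SG step
```

The key detail is that the residual is recomputed before each block's step, so block 2 sees the freshly updated block 1, which is the Gauss–Seidel dependence distinguishing BSG from a plain (Jacobi-style) SG update.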
Convergence of trust-region methods based on probabilistic models
, 2013
Abstract
Cited by 3 (1 self)
In this paper we consider the use of probabilistic or random models within a classical trust-region framework for the optimization of deterministic smooth general nonlinear functions. Our method and setting differ from many stochastic optimization approaches in two principal ways. Firstly, we assume that the value of the function itself can be computed without noise; in other words, the function is deterministic. Secondly, we use random models of higher quality than those produced by the usual stochastic gradient methods. In particular, a first-order model based on a random approximation of the gradient is required to provide sufficient quality of approximation with probability greater than or equal to 1/2. This is in contrast with stochastic gradient approaches, where the model is assumed to be “correct” only in expectation. As a result of this particular setting, we are able to prove convergence, with probability one, of a trust-region method which is almost identical to the classical method. Moreover, the new method is simpler than its deterministic counterpart, as it does not require a criticality step. Hence we show that a standard optimization framework can be used in cases when
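The setting above, deterministic function values combined with a gradient model that is only good with probability at least 1/2, can be sketched as follows; the objective, the noise model, and the acceptance/radius constants are illustrative assumptions:

```python
import random

# Trust-region sketch with a random first-order model: the model gradient
# equals the true gradient only with probability 0.6 (> 1/2), while the
# deterministic function value is used to accept or reject steps.
random.seed(1)

def f(x):                                  # deterministic smooth objective
    return sum(xi * xi for xi in x)

def model_grad(x):                         # random model of the gradient
    g = [2.0 * xi for xi in x]
    if random.random() < 0.4:              # "bad model" with probability < 1/2
        return [-gi for gi in g]
    return g

x, delta = [2.0, -1.5], 1.0
for _ in range(200):
    g = model_grad(x)
    gnorm = max(1e-12, sum(gi * gi for gi in g) ** 0.5)
    trial = [xi - delta * gi / gnorm for xi, gi in zip(x, g)]  # step to TR boundary
    rho = (f(x) - f(trial)) / (delta * gnorm)  # actual vs. predicted reduction
    if rho > 0.1:
        x, delta = trial, min(2.0 * delta, 10.0)   # accept and expand radius
    else:
        delta *= 0.5                               # reject and shrink radius
```

Bad models are filtered out for free: a step computed from a flipped gradient increases f, fails the ratio test, and only shrinks the radius.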
Penalty Methods with Stochastic Approximation for Stochastic Nonlinear Programming
, 2013
Iteration bounds for finding stationary points of structured nonconvex optimization. Working Paper
, 2014
Abstract
Cited by 1 (0 self)
In this paper we study proximal conditional-gradient (CG) and proximal gradient-projection type algorithms for a block-structured constrained nonconvex optimization model, which arises naturally from tensor data analysis. First, we introduce a new notion of stationarity, which is suitable for the structured problem under consideration. We then propose two types of first-order algorithms for the model, based on the proximal conditional-gradient (CG) method and the proximal gradient-projection method, respectively. If the nonconvex objective function is in the form of a mathematical expectation, we then discuss how to incorporate randomized sampling to avoid computing the expectations exactly. For the general block optimization model, the proximal subroutines are performed for each block according to either the block-coordinate-descent (BCD) or the maximum-block-improvement (MBI) updating rule. If the gradient of the nonconvex part of the objective f satisfies ‖∇f(x) − ∇f(y)‖_q ≤ M‖x − y‖_p^δ, where δ = p/q with 1/p + 1/q = 1, then we prove that the new algorithms have an overall iteration complexity bound of O(1/ε^q) for finding an ε-stationary solution. If f is concave, then the iteration complexity reduces to O(1/ε). Our numerical experiments for tensor approximation problems show promising performance of the new algorithms.
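The maximum-block-improvement (MBI) updating rule mentioned above can be sketched on a toy separable quadratic with singleton blocks; the objective, step size, and block structure are illustrative, and the proximal terms of the paper's algorithms are omitted:

```python
# Maximum-block-improvement (MBI) sketch: tentatively update every block,
# but commit only the block whose gradient step decreases the objective
# the most (a BCD rule would instead cycle through the blocks).
def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 2.0) ** 2 + 0.5 * (x[2] - 3.0) ** 2

def grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0), 1.0 * (x[2] - 3.0)]

x, step = [0.0, 0.0, 0.0], 0.2
for _ in range(100):
    g = grad(x)
    best = None
    for j in range(3):                 # one trial update per (singleton) block
        cand = list(x)
        cand[j] -= step * g[j]
        if best is None or f(cand) < f(best):
            best = cand
    x = best                           # commit only the best block's update
```

In contrast, a BCD rule would simply cycle j = 0, 1, 2 and commit every update; MBI spends three trial evaluations per iteration to commit only the most profitable one.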
PSEUDOMONOTONE STOCHASTIC VARIATIONAL INEQUALITY PROBLEMS: ANALYSIS AND OPTIMAL STOCHASTIC APPROXIMATION SCHEMES
Abstract
Cited by 1 (0 self)
The variational inequality problem represents an effective tool for capturing a range of phenomena arising in engineering, economics, and the applied sciences. Prompted by the role of uncertainty, recent efforts have considered both the analysis and the solution of the associated stochastic variational inequality problem, where the map is expectation-valued. Yet a majority of the studies have been restricted to regimes where the map is monotone, a relatively restrictive assumption. The present work is motivated by the growing interest in pseudomonotone stochastic variational inequality problems (PSVIs); such problems emerge from product pricing, fractional optimization problems, and subclasses of economic equilibrium problems arising in uncertain regimes. Succinctly, we make two sets of contributions to the study of PSVIs. In the first part of the paper, we observe that a direct application of standard existence/uniqueness theory requires a tractable expression for the integrals arising from the expectation, a relative rarity when faced with general distributions. Instead, we develop integration-free sufficiency conditions for the existence and uniqueness of solutions to PSVIs. Refinements of these statements are provided for stochastic complementarity problems with pseudomonotone maps. In the second part of the paper, we consider the solution of PSVIs via stochastic approximation (SA) schemes, motivated by the observation that almost all of the prior SA schemes accommodate only monotone SVIs. In this context, we make several contributions: (i) under various forms of pseudomonotonicity, we prove that the solution iterates produced by extragradient SA schemes converge to the solution set in an almost sure sense. This result is further extended to mirror-prox regimes, and an analogous statement is provided for monotone regimes under a weak-sharpness requirement, where prior results have only shown convergence in terms of the gap function through the use of averaging; (ii) under strong pseudomonotonicity, we derive the optimal initial step length and show that the mean-squared error in the solution iterates produced by the extragradient SA scheme converges at the optimal rate of O(1/k).
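A minimal sketch of the extragradient SA iteration analyzed above: a half step using one sample of the map, then the actual step using a fresh sample taken at the intermediate point. The map below is strongly monotone (hence pseudomonotone) and linear, so the solution is known; the noise level and the step sequence γ_k = 1/k are illustrative assumptions:

```python
import random

# Extragradient stochastic approximation sketch for VI(F, R^2) with a
# noisy oracle for the map F(x) = x - x_star, whose solution is x_star.
random.seed(2)
x_star = [1.0, -1.0]

def F_sample(x):                           # one noisy sample of F at x
    return [xi - si + random.gauss(0.0, 0.1) for xi, si in zip(x, x_star)]

x = [5.0, 5.0]
for k in range(1, 5001):
    gamma = 1.0 / k                        # diminishing SA step length
    y = [xi - gamma * gi for xi, gi in zip(x, F_sample(x))]   # extrapolation step
    x = [xi - gamma * gi for xi, gi in zip(x, F_sample(y))]   # update with a fresh sample at y
```

The two independent samples per iteration (one at x, one at the extrapolated point y) are what distinguish this from a plain projection-type SA scheme.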
STOCHASTIC GRADIENT METHODS FOR UNCONSTRAINED OPTIMIZATION
, 2014
Abstract
This paper presents an overview of gradient-based methods for the minimization of noisy functions. It is assumed that the objective function is either given with error terms of a stochastic nature or given as a mathematical expectation. Such problems arise in the context of simulation-based optimization. The focus of this presentation is on gradient-based Stochastic Approximation and Sample Average Approximation methods. The concept of a stochastic gradient approximation of the true gradient can also be successfully extended to deterministic problems; methods of this kind are presented for data fitting and machine learning problems.
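The two approaches surveyed above can be contrasted on the scalar problem min_x E[(x − ξ)²] with ξ ~ N(1, 0.5²): Stochastic Approximation takes a gradient step on one fresh sample per iteration, while Sample Average Approximation fixes a batch of samples once and minimizes the empirical average (here available in closed form as the sample mean). The distribution, step sizes, and sample counts are illustrative assumptions:

```python
import random

# SA vs. SAA on min_x E[(x - xi)^2] with xi ~ N(1, 0.5^2); the true
# minimizer is x = 1 (the mean of xi).
random.seed(3)

# Stochastic Approximation: one fresh sample and one gradient step per
# iteration, with diminishing steps 1/k (the gradient is 2*(x - xi)).
x_sa = 0.0
for k in range(1, 2001):
    xi = random.gauss(1.0, 0.5)
    x_sa -= (1.0 / k) * 2.0 * (x_sa - xi)

# Sample Average Approximation: fix a batch once and minimize the
# empirical average exactly -- here in closed form, the sample mean.
batch = [random.gauss(1.0, 0.5) for _ in range(2000)]
x_saa = sum(batch) / len(batch)
```

Both estimates recover the true minimizer x = 1 up to sampling error; the practical trade-off is that SA streams samples one at a time, while SAA requires storing and solving over the whole batch.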