Results 1–10 of 11
A linearly convergent conditional gradient algorithm with applications to online and stochastic optimization
, 2013
Abstract

Cited by 11 (2 self)
Linear optimization is often algorithmically simpler than nonlinear convex optimization. Linear optimization over matroid polytopes, matching polytopes, and path polytopes are examples of problems for which we have simple and efficient combinatorial algorithms, but whose nonlinear convex counterparts are harder and admit significantly less efficient algorithms. This motivates the computational model of convex optimization, including the offline, online, and stochastic settings, using a linear optimization oracle. In this computational model we give several new results that improve over the previous state-of-the-art. Our main result is a novel conditional gradient algorithm for smooth and strongly convex optimization over polyhedral sets that performs only a single linear optimization step over the domain on each iteration and enjoys a linear convergence rate. This gives an exponential improvement in convergence rate over previous results. Based on this new conditional gradient algorithm we give the first algorithms for online convex optimization over polyhedral sets that perform only a single linear optimization step over the domain while having optimal regret guarantees, answering an open question of Kalai and Vempala, and of Hazan and Kale. Our online algorithms also imply conditional gradient algorithms for non-smooth and stochastic convex optimization with the same convergence rates as projected (sub)gradient methods. Key words: Frank-Wolfe algorithm; conditional gradient methods; linear programming; first-order methods; online convex optimization; online learning; stochastic optimization. AMS subject classifications: 65K05; 90C05; 90C06; 90C25; 90C30; 90C27; 90C15.
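The single-LO-call-per-iteration template this abstract builds on is the classic conditional gradient step. A minimal sketch, assuming the probability simplex as the polytope and the standard 2/(k+2) step size (the paper's linearly convergent variant modifies this template; the quadratic objective is purely illustrative):

```python
import numpy as np

def linear_oracle_simplex(grad):
    """LO oracle for the probability simplex: argmin over the simplex of
    <grad, v> is the vertex (basis vector) at the smallest coordinate."""
    v = np.zeros_like(grad)
    v[np.argmin(grad)] = 1.0
    return v

def conditional_gradient(grad_f, x0, steps=100):
    """Classic Frank-Wolfe: one linear optimization call per iteration."""
    x = x0.copy()
    for k in range(steps):
        v = linear_oracle_simplex(grad_f(x))  # single LO call
        gamma = 2.0 / (k + 2)                 # standard open-loop step size
        x = (1 - gamma) * x + gamma * v       # convex combination stays feasible
    return x

# Illustrative run: minimize f(x) = ||x - c||^2 over the simplex.
c = np.array([0.1, 0.7, 0.2])
x_star = conditional_gradient(lambda x: 2 * (x - c), np.array([1.0, 0.0, 0.0]))
```

Each iteration touches the domain only through the LO oracle, which is why the method needs no projections.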
New Analysis and Results for the Conditional Gradient Method
, 2013
Abstract

Cited by 9 (0 self)
We present new results for the conditional gradient method (also known as the Frank-Wolfe method). We derive computational guarantees for arbitrary step-size sequences, which are then applied to various step-size rules, including simple averaging and constant step-sizes. We also develop step-size rules and computational guarantees that depend naturally on the warm-start quality of the initial (and subsequent) iterates. Our results include computational guarantees for both duality/bound gaps and the so-called Wolfe gaps. Lastly, we present complexity bounds in the presence of approximate computation of gradients and/or linear optimization subproblem solutions.
On Lower Complexity Bounds for LargeScale Smooth Convex Optimization. ArXiv eprints,
, 2014
"... Abstract In this note we present tight lower bounds on the informationbased complexity of largescale smooth convex minimization problems. We demonstrate, in particular, that the kstep Conditional Gradient (a.k.a. FrankWolfe) algorithm as applied to minimizing smooth convex functions over the n ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
Abstract In this note we present tight lower bounds on the informationbased complexity of largescale smooth convex minimization problems. We demonstrate, in particular, that the kstep Conditional Gradient (a.k.a. FrankWolfe) algorithm as applied to minimizing smooth convex functions over the ndimensional box with n ≥ k is optimal, up to an O(ln n)factor, in terms of informationbased complexity.
Conditional gradient sliding for convex optimization
, 2014
"... Abstract In this paper, we present a new conditional gradient type method for convex optimization by utilizing a linear optimization (LO) oracle to minimize a series of linear functions over the feasible set. Different from the classic conditional gradient method, the conditional gradient sliding ( ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
(Show Context)
Abstract In this paper, we present a new conditional gradient type method for convex optimization by utilizing a linear optimization (LO) oracle to minimize a series of linear functions over the feasible set. Different from the classic conditional gradient method, the conditional gradient sliding (CGS) algorithm developed herein can skip the computation of gradients from time to time, and as a result, can achieve the optimal complexity bounds in terms of not only the number of calls to the LO oracle, but also the number of gradient evaluations. More specifically, we show that the CGS method requires O(1/ √ ) and O(log(1/ )) gradient evaluations, respectively, for solving smooth and strongly convex problems, while still maintaining the optimal O(1/ ) bound on the number of calls to the LO oracle. We also develop variants of the CGS method which can achieve the optimal complexity bounds for solving stochastic optimization problems and an important class of saddle point optimization problems. To the best of our knowledge, this is the first time that these types of projectionfree optimal firstorder methods have been developed in the literature. Some preliminary numerical results have also been provided to demonstrate the advantages of the CGS method.
S.: Iteration bounds for finding stationary points of structured nonconvex optimization. Working Paper
, 2014
"... In this paper we study proximal conditionalgradient (CG) and proximal gradientprojection type algorithms for a blockstructured constrained nonconvex optimization model, which arises naturally from tensor data analysis. First, we introduce a new notion of stationarity, which is suitable for the s ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
(Show Context)
In this paper we study proximal conditionalgradient (CG) and proximal gradientprojection type algorithms for a blockstructured constrained nonconvex optimization model, which arises naturally from tensor data analysis. First, we introduce a new notion of stationarity, which is suitable for the structured problem under consideration. We then propose two types of firstorder algorithms for the model based on the proximal conditionalgradient (CG) method and the proximal gradientprojection method respectively. If the nonconvex objective function is in the form of mathematical expectation, we then discuss how to incorporate randomized sampling to avoid computing the expectations exactly. For the general block optimization model, the proximal subroutines are performed for each block according to either the blockcoordinatedescent (BCD) or the maximumblockimprovement (MBI) updating rule. If the gradient of the nonconvex part of the objective f satisfies ‖∇f(x) − ∇f(y)‖q ≤ M‖x − y‖δp where δ = p/q with 1/p + 1/q = 1, then we prove that the new algorithms have an overall iteration complexity bound of O(1/q) in finding an stationary solution. If f is concave then the iteration complexity reduces to O(1/). Our numerical experiments for tensor approximation problems show promising performances of the new solution algorithms.
Unifying lower bounds on the oracle complexity of nonsmooth convex optimization via information theory
, 2014
"... ..."
Suykens, “Hybrid conditional gradientsmoothing algorithms with applications to sparse and low rank regularization
 Regularization, Optimization, Kernels, and Support Vector Machines
, 2014
"... Conditional gradient methods are old and well studied optimization algorithms. Their origin dates at least to the 50’s and the FrankWolfe algorithm for quadratic programming [18] but they apply to much more general optimization problems. General formulations of conditional gradient algorithms have ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Conditional gradient methods are old and well studied optimization algorithms. Their origin dates at least to the 50’s and the FrankWolfe algorithm for quadratic programming [18] but they apply to much more general optimization problems. General formulations of conditional gradient algorithms have been studied in the
SemiProximal Mirror Prox Background Key Components SemiMP Algorithm Experiments
"... Firstorder methods for composite minimization minx∈X f (x) + h(x) f and h are convex, f is smooth, h is simple. (Acc)Proximal gradient methods (when h proximalfriendly) Proximal operator: proxh(η) = argminx∈X { 12‖x − η‖22 + h(x)} For example, when h(x) = ‖x‖1, reduces to soft thresholding. Wors ..."
Abstract
 Add to MetaCart
Firstorder methods for composite minimization minx∈X f (x) + h(x) f and h are convex, f is smooth, h is simple. (Acc)Proximal gradient methods (when h proximalfriendly) Proximal operator: proxh(η) = argminx∈X { 12‖x − η‖22 + h(x)} For example, when h(x) = ‖x‖1, reduces to soft thresholding. Worst complexity bound for firstorder oracles is O(1/√). Conditional gradient methods (when h is LMOfriendly) (Composite) linear minimization oracles(LMO): LMOh(η) = argminx∈X{〈η, x〉+ h(x)} For example, when h(x) = ‖x‖nuc or δ‖x‖nuc≤1(x), reduces to computing top pair of singular vectors. Worst (also optimal) complexity bound for LMOs is O(1/).