Results 1–10 of 10
A primal-dual algorithmic framework for constrained convex minimization, 2014
Cited by 3 (2 self)
Abstract: We present a primal-dual algorithmic framework for obtaining approximate solutions to a prototypical constrained convex optimization problem, and rigorously characterize how common structural assumptions affect numerical efficiency. Our main analysis technique provides a fresh perspective on Nesterov's excessive gap technique in a structured fashion and unifies it with smoothing and primal-dual methods. For instance, through the choice of a dual smoothing strategy and a center point, our framework subsumes decomposition algorithms, the augmented Lagrangian method, and the alternating direction method of multipliers as special cases, and provides optimal convergence rates on the primal objective residual as well as the primal feasibility gap of the iterates for all of them.
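Since the framework subsumes ADMM as a special case, a minimal toy instance may help fix ideas. The sketch below is my own illustration, not the paper's algorithm: scaled-form ADMM on the consensus problem min ||x - a||^2 + ||z - b||^2 subject to x = z, whose optimum is x = z = (a + b)/2.

```python
import numpy as np

def admm_consensus(a, b, rho=1.0, iters=200):
    """Scaled-form ADMM on min ||x-a||^2 + ||z-b||^2 s.t. x = z.

    A toy instance of the constrained problems such frameworks cover;
    the optimum is x = z = (a + b) / 2.
    """
    x = np.zeros_like(a)
    z = np.zeros_like(b)
    u = np.zeros_like(a)  # scaled dual variable for the constraint x = z
    for _ in range(iters):
        # x-update: argmin ||x-a||^2 + (rho/2)||x - z + u||^2
        x = (2 * a + rho * (z - u)) / (2 + rho)
        # z-update: argmin ||z-b||^2 + (rho/2)||x - z + u||^2
        z = (2 * b + rho * (x + u)) / (2 + rho)
        # dual ascent on the consensus residual x - z
        u = u + x - z
    return x, z

a = np.array([1.0, 0.0])
b = np.array([0.0, 2.0])
x, z = admm_consensus(a, b)
```

The scaled dual variable u accumulates the running constraint residual; at convergence it recovers the Lagrange multiplier for x = z.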
Constrained convex minimization via model-based excessive gap
In Proceedings of Neural Information Processing Systems (NIPS), 2014
Cited by 3 (2 self)
We introduce a model-based excessive gap technique to analyze first-order primal-dual methods for constrained convex minimization. As a result, we construct first-order primal-dual methods with optimal convergence rates on the primal objective residual and the primal feasibility gap of their iterates separately. Through a dual smoothing and prox-center selection strategy, our framework subsumes the augmented Lagrangian, alternating direction, and dual fast-gradient methods as special cases, where our rates apply.
Parallel Direction Method of Multipliers
Cited by 2 (0 self)
We consider the problem of minimizing block-separable convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been studied intensively both theoretically and empirically, effective generalizations of ADMM to multiple blocks remain unclear despite some preliminary work. In this paper, we propose a randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. PDMM randomly updates some primal blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity of PDMM with constant step size. We also show that PDMM can perform randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM outperforms state-of-the-art methods in two applications, robust principal component analysis and overlapping group lasso.
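The recipe described above (randomly pick primal blocks, update them against the augmented Lagrangian, then take a damped dual step) can be sketched on a toy problem. The code below is my own simplified stand-in with assumed parameters rho and tau, not the paper's exact PDMM: it minimizes sum_i (x_i - c_i)^2 subject to sum_i x_i = d, whose solution is x_i = c_i + (d - sum_j c_j)/n, updating one random block per round.

```python
import numpy as np

def pdmm_sketch(c, d, rho=1.0, tau=0.1, iters=10000, seed=0):
    """Randomized-block sketch in the spirit of PDMM (simplified, not the
    paper's exact algorithm): minimize sum_i (x_i - c_i)^2 subject to
    sum_i x_i = d. Each round updates one randomly chosen primal block
    exactly against the augmented Lagrangian, then takes a damped dual
    ascent step on the feasibility residual."""
    rng = np.random.default_rng(seed)
    n = len(c)
    x = np.zeros(n)
    y = 0.0  # multiplier for the coupling constraint sum_i x_i = d
    for _ in range(iters):
        i = rng.integers(n)          # random block selection
        s_rest = x.sum() - x[i]      # contribution of the other blocks
        # exact minimizer of (x_i - c_i)^2 + y*x_i + (rho/2)(x_i + s_rest - d)^2
        x[i] = (2 * c[i] - y + rho * (d - s_rest)) / (2 + rho)
        # damped dual step; a small tau keeps the randomized scheme stable
        y += tau * rho * (x.sum() - d)
    return x, y

c = np.array([1.0, 2.0, 3.0])
d = 3.0
x, y = pdmm_sketch(c, d)
```

With c = [1, 2, 3] and d = 3, the optimum is x = [0, 1, 2]; the damping factor tau here plays the role of the conservative dual step sizes such randomized schemes need.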
Splitting method for nonconvex composite optimization, 2014. arXiv preprint: http://arxiv.org/pdf/1407.0753.pdf
Parameter selection and preconditioning for a graph form solver, 2015
Cited by 1 (0 self)
Abstract: In a recent paper, Parikh and Boyd describe a method for solving a convex optimization problem in which each iteration involves evaluating a proximal operator and projecting onto a subspace. In this paper we address the critical practical issues of how to select the proximal parameter in each iteration, and how to scale the original problem variables, so as to achieve reliable practical performance. The resulting method has been implemented as an open-source software package called POGS (Proximal Graph Solver) that targets multicore and GPU-based systems, and has been tested on a wide variety of practical problems. Numerical results show that POGS can solve very large problems (with, say, more than a billion coefficients in the data) to modest accuracy in a few tens of seconds. As just one example, a radiation treatment planning problem with around 100 million coefficients in the data can be solved in a few seconds, compared to around one hour with an interior-point method.
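The prox-plus-projection iteration underlying this solver can be sketched on the simplest graph-form instance, least squares: f(y) = (1/2)||y - b||^2, g = 0, with the graph constraint y = Ax. This is a hedged illustration with a fixed, assumed proximal parameter rho (exactly the parameter whose selection the paper addresses), not the POGS implementation.

```python
import numpy as np

def graph_form_admm(A, b, rho=1.0, iters=1000):
    """Graph-form splitting sketch: minimize (1/2)||y - b||^2 subject to
    y = Ax (i.e. least squares), alternating proximal steps with a
    Euclidean projection onto the graph subspace {(x, y) : y = Ax}."""
    m, n = A.shape
    # Normal-equations matrix for the graph projection; a production
    # solver would factor this once and reuse the factorization.
    K = np.eye(n) + A.T @ A
    x = np.zeros(n)
    y = np.zeros(m)
    xt = np.zeros(n)  # scaled dual variable for x
    yt = np.zeros(m)  # scaled dual variable for y
    for _ in range(iters):
        # proximal steps: g(x) = 0 gives the identity; for f the prox is affine
        xh = x - xt
        yh = (b + rho * (y - yt)) / (1 + rho)
        # projection onto the subspace y = Ax
        x = np.linalg.solve(K, (xh + xt) + A.T @ (yh + yt))
        y = A @ x
        # dual updates accumulate the running residuals
        xt = xt + xh - x
        yt = yt + yh - y
    return x

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
b = np.array([1.0, 2.0, 3.0, 0.0])
x = graph_form_admm(A, b)
```

Because the projection reduces to one linear solve with a fixed matrix, each iteration is cheap once K is factored, which is what makes the approach attractive at the problem sizes quoted above.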
Global Convergence of Splitting Methods for Nonconvex Composite Optimization
arXiv preprint, http://arxiv.org/abs/1407.0753, 2014
Cited by 1 (0 self)
Abstract: We consider the problem of minimizing the sum of a smooth function h with a bounded Hessian and a nonsmooth function. We assume that the latter is a composition of a proper closed function P and a surjective linear map M, with the proximal mappings of τP, τ > 0, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we examine two types of splitting methods for solving this nonconvex optimization problem: the alternating direction method of multipliers and the proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that if the penalty parameter is chosen sufficiently large and the generated sequence has a cluster point, then that cluster point is a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under the additional assumption that the functions h and P are semi-algebraic. Furthermore, we give simple sufficient conditions that guarantee boundedness of the generated sequence; these conditions are satisfied for a wide range of applications, including the least squares problem with ℓ1/2 regularization. Finally, when M is the identity, so that the proximal gradient algorithm can be applied efficiently, we show that any cluster point is stationary under a slightly more flexible constant step-size rule than what is known in the literature for a nonconvex h.
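For the M = identity case, the constant-step-size proximal gradient iteration takes a gradient step on h and a prox step on P. Below is a hedged sketch on a convex toy instance of my own choosing (h a least-squares term and P = λ||·||_1, so the prox is soft-thresholding); the paper's interest is nonconvex h, but the iteration is identical.

```python
import numpy as np

def prox_gradient_l1(A, b, lam, iters=2000):
    """Proximal gradient for min h(x) + P(x), sketched with
    h(x) = (1/2)||Ax - b||^2 and P = lam * ||.||_1, whose proximal
    mapping is soft-thresholding. Uses the constant step 1/L with
    L = ||A||_2^2, a Lipschitz constant of the gradient of h."""
    L = np.linalg.norm(A, 2) ** 2
    t = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)  # gradient of the smooth part h
        v = x - t * g
        # prox of t*P: componentwise soft-thresholding at t*lam
        x = np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
    return x, t

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
lam = 0.1
x, t = prox_gradient_l1(A, b, lam)
```

A natural stationarity check is the fixed-point residual ||x - prox(x - t∇h(x))||, which vanishes exactly at the stationary points the paper's cluster-point results concern.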
Kullback-Leibler Proximal Variational Inference
Abstract: We propose a new variational inference method based on a proximal framework that uses the Kullback-Leibler (KL) divergence as the proximal term. We make two contributions towards exploiting the geometry and structure of the variational bound. First, we propose a KL proximal-point algorithm and show its equivalence to variational inference with natural gradients (e.g., stochastic variational inference). Second, we use the proximal framework to derive efficient variational algorithms for non-conjugate models. We propose a splitting procedure to separate non-conjugate terms from conjugate ones. We linearize the non-conjugate terms to obtain subproblems that admit a closed-form solution. Overall, our approach converts inference in a non-conjugate model into subproblems that involve inference in well-known conjugate models. We show that our method is applicable to a wide variety of models and can result in computationally efficient algorithms. Applications to real-world datasets show performance comparable to existing methods.
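The key mechanism (a proximal step penalized by KL divergence that admits a closed form) can be illustrated on a toy problem of my own, not the paper's variational-inference setting: maximize E_q[ell] over the probability simplex. The step argmax_q E_q[ell] - (1/beta) KL(q || q_t) works out to the multiplicative update q_{t+1}(i) ∝ q_t(i) exp(beta * ell(i)).

```python
import numpy as np

def kl_proximal_point(ell, beta=1.0, iters=100):
    """Toy KL proximal-point iteration on the simplex: each step solves
    argmax_q E_q[ell] - (1/beta) * KL(q || q_prev) in closed form,
    q_new(i) proportional to q_prev(i) * exp(beta * ell(i)).
    For a linear objective the iterates concentrate on argmax ell."""
    q = np.full(len(ell), 1.0 / len(ell))  # uniform initialization
    for _ in range(iters):
        q = q * np.exp(beta * ell)  # closed-form KL-proximal step
        q /= q.sum()                # renormalize onto the simplex
    return q

ell = np.array([0.1, 0.5, 0.3])
q = kl_proximal_point(ell)
```

The closed-form subproblem is the point of the exercise: penalizing movement in KL rather than Euclidean distance keeps each step inside the same (here, categorical) family, which is the structure the proximal framework exploits.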
Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition
Abstract: In this paper, we consider a multi-step version of the stochastic ADMM method with efficient guarantees for high-dimensional problems. We first analyze the simple setting in which the optimization problem consists of a loss function and a single regularizer (e.g., sparse optimization), and then extend to the multi-block setting with multiple regularizers and multiple variables (e.g., matrix decomposition into sparse and low-rank components). For the sparse optimization problem, our method achieves the minimax rate of O(s log d / T) for s-sparse problems in d dimensions in T steps, and is thus unimprovable by any method up to constant factors. For the matrix decomposition problem with a general loss function, we analyze the multi-step ADMM with multiple blocks. We establish an O(1/T) rate and efficient scaling as the size of the matrix grows. For natural noise models (e.g., independent noise), our convergence rate is minimax-optimal. Thus, we establish tight convergence guarantees for multi-block ADMM in high dimensions. Experiments show that for both sparse optimization and matrix decomposition problems, our algorithm outperforms state-of-the-art methods.
An optimal first-order primal-dual gap reduction framework for constrained convex optimization
Ecole Polytechnique Fédérale de Lausanne
We propose a new variational inference method based on a proximal framework that uses the Kullback-Leibler (KL) divergence as the proximal term. We make two contributions towards exploiting the geometry and structure of the variational bound. Firstly, we propose a KL proximal-point algorithm and show its equivalence to variational inference with natural gradients (e.g., stochastic variational inference). Secondly, we use the proximal framework to derive efficient variational algorithms for non-conjugate models. We propose a splitting procedure to separate non-conjugate terms from conjugate ones. We linearize the non-conjugate terms to obtain subproblems that admit a closed-form solution. Overall, our approach converts inference in a non-conjugate model into subproblems that involve inference in well-known conjugate models. We show that our method is applicable to a wide variety of models and can result in computationally efficient algorithms. Applications to real-world datasets show performance comparable to existing methods.