Results 1-10 of 969
On the linear convergence of the alternating direction method of multipliers
2013
On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers
2012
Cited by 88 (8 self)
Abstract. Recently, a worst-case O(1/t) convergence rate was established for the Douglas-Rachford alternating direction method of multipliers in an ergodic sense. This note proposes a novel approach to derive the same convergence rate while in a non-ergodic sense.
A reliable effective terascale linear learning system
2011
Cited by 72 (6 self)
We present a system and a set of techniques for learning linear predictors with convex losses on terascale data sets, with trillions of features, billions of training examples and millions of parameters in an hour using a cluster of 1000 machines. Individually none of the component techniques are new, but the careful synthesis required to obtain an efficient implementation is. The result is, to our knowledge, the most scalable and efficient linear learning system reported in the literature. We describe and thoroughly evaluate the components of the system, showing the importance of the various design choices.
Linearized Alternating Direction Method with Adaptive Penalty for Low-Rank Representation
Cited by 55 (8 self)
Many machine learning and signal processing problems can be formulated as linearly constrained convex programs, which could be efficiently solved by the alternating direction method (ADM). However, usually the subproblems in ADM are easily solvable only when the linear mappings in the constraints are identities. To address this issue, we propose a linearized ADM (LADM) method by linearizing the quadratic penalty term and adding a proximal term when solving the subproblems. For fast convergence, we also allow the penalty to change adaptively according to a novel update rule. We prove the global convergence of LADM with adaptive penalty (LADMAP). As an example, we apply LADMAP to solve low-rank representation (LRR), which is an important subspace clustering technique yet suffers from high computation cost. By combining LADMAP with a skinny SVD representation technique, we are able to reduce the complexity O(n^3) of the original ADM-based method to O(rn^2), where r and n are the rank and size of the representation matrix, respectively, hence making LRR possible for large-scale applications. Numerical experiments verify that for LRR our LADMAP-based methods are much faster than state-of-the-art algorithms.
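The linearization step mentioned in this abstract can be sketched as follows. This is a generic illustration under assumed notation, not the paper's exact update: f, A, β, η, x_k and v_k are hypothetical symbols, with v_k standing for everything held fixed during the x-update and η chosen larger than the squared spectral norm of A.

```latex
% Quadratic penalty term in the x-subproblem (assumed notation):
%   \min_x \; f(x) + \tfrac{\beta}{2}\,\|Ax - v_k\|^2
% Linearize the quadratic at x_k and add a proximal term:
\tfrac{\beta}{2}\|Ax - v_k\|^2 \;\approx\;
  \tfrac{\beta}{2}\|Ax_k - v_k\|^2
  + \beta \,\langle A^{\top}(Ax_k - v_k),\; x - x_k \rangle
  + \tfrac{\beta\eta}{2}\,\|x - x_k\|^2,
  \qquad \eta > \|A\|^2 .
% Completing the square gives a subproblem that no longer involves A inside the norm:
x_{k+1} = \arg\min_x \; f(x)
  + \tfrac{\beta\eta}{2}\,\bigl\| x - x_k + \tfrac{1}{\eta} A^{\top}(Ax_k - v_k) \bigr\|^2
  = \operatorname{prox}_{f/(\beta\eta)}\!\bigl( x_k - \tfrac{1}{\eta} A^{\top}(Ax_k - v_k) \bigr).
```

Under these assumptions the x-update reduces to a single proximal step on f, which is cheap whenever the proximal operator of f has a closed form.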
Linearized Alternating Direction Method with Gaussian Back Substitution for Separable Convex Programming
2011
Cited by 38 (4 self)
Abstract. Recently, we have proposed to combine the alternating direction method (ADM) with a Gaussian back substitution procedure for solving the convex minimization model with linear constraints and a general separable objective function, i.e., the objective function is the sum of many functions without coupled variables. In this paper, we further study this topic and show that the decomposed subproblems in the ADM procedure can be substantially alleviated by linearizing the involved quadratic terms arising from the augmented Lagrangian penalty on the model’s linear constraints. When the resolvent operators of the separable functions in the objective have closed-form representations, embedding the linearization into the ADM subproblems becomes necessary to yield easy subproblems with closed-form solutions. We thus show theoretically that the blend of ADM, Gaussian back substitution and linearization works effectively for the separable convex minimization model under consideration.
Online Alternating Direction Method
In ICML, 2012
Cited by 37 (9 self)
Online optimization has emerged as a powerful tool in large-scale optimization. In this paper, we introduce efficient online algorithms based on the alternating directions method (ADM). We introduce a new proof technique for ADM in the batch setting, which yields the O(1/T) convergence rate of ADM and forms the basis of regret analysis in the online setting. We consider two scenarios in the online setting, based on whether the solution needs to lie in the feasible set or not. In both settings, we establish regret bounds for both the objective function and the constraint violation, for general and strongly convex functions. Preliminary results are presented to illustrate the performance of the proposed algorithms.
Incremental Gradient on the Grassmannian for Online Foreground and Background Separation in Subsampled Video
In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012
Cited by 36 (1 self)
It has recently been shown that only a small number of samples from a low-rank matrix are necessary to reconstruct the entire matrix. We bring this to bear on computer vision problems that utilize low-dimensional subspaces, demonstrating that subsampling can improve computation speed while still allowing for accurate subspace learning. We present GRASTA, Grassmannian Robust Adaptive Subspace Tracking Algorithm, an online algorithm for robust subspace estimation from randomly subsampled data. We consider the specific application of background and foreground separation in video, and we assess GRASTA on separation accuracy and computation time. In one benchmark video example [16], GRASTA achieves a separation rate of 46.3 frames per second, even when run in MATLAB on a personal laptop.
Convex and network flow optimization for structured sparsity
JMLR, 2011
Cited by 35 (8 self)
We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of ℓ2- or ℓ∞-norms over groups of variables. Whereas much effort has been put into developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: On the one hand, we show that the proximal operator associated with a sum of ℓ∞-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques, and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.
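For orientation, the easy case that this abstract contrasts against (disjoint groups with ℓ2-norms) has a closed-form proximal operator, often called group soft-thresholding. The sketch below covers only that simple case; it is not the overlapping-group ℓ∞ algorithm of the paper, and the function name `prox_group_l2` and its arguments are hypothetical.

```python
import math


def prox_group_l2(v, groups, t):
    """Proximal operator of t * sum_g ||v_g||_2 for DISJOINT index groups.

    v      -- list of floats
    groups -- list of index lists that do not overlap (assumption)
    t      -- nonnegative threshold
    """
    out = list(v)
    for g in groups:
        norm = math.sqrt(sum(v[i] ** 2 for i in g))
        # Shrink the whole group toward zero; zero it out if its norm is small.
        scale = 0.0 if norm <= t else 1.0 - t / norm
        for i in g:
            out[i] = scale * v[i]
    return out


shrunk = prox_group_l2([3.0, 4.0, 0.3, 0.4], [[0, 1], [2, 3]], 1.0)
```

Each group is either scaled down as a block or set to zero entirely, which is what makes the norm promote sparsity at the group level. With overlapping groups this per-group formula no longer applies, which is the situation the abstract addresses.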
Design of optimal sparse feedback gains via the alternating direction method of multipliers
 IEEE Trans. Automat. Control
Cited by 33 (8 self)
Abstract. We design sparse and block sparse feedback gains that minimize the variance amplification (i.e., the H2 norm) of distributed systems. Our approach consists of two steps. First, we identify sparsity patterns of feedback gains by incorporating sparsity-promoting penalty functions into the optimal control problem, where the added terms penalize the number of communication links in the distributed controller. Second, we optimize feedback gains subject to structural constraints determined by the identified sparsity patterns. In the first step, the sparsity structure of feedback gains is identified using the alternating direction method of multipliers, which is a powerful algorithm well-suited to large optimization problems. This method alternates between promoting the sparsity of the controller and optimizing the closed-loop performance, which allows us to exploit the structure of the corresponding objective functions. In particular, we take advantage of the separability of the sparsity-promoting penalty functions to decompose the minimization problem into subproblems that can be solved analytically. Several examples are provided to illustrate the effectiveness of the developed approach. Index Terms: Alternating direction method of multipliers (ADMM), communication architectures, continuation methods, minimization, optimization, separable penalty functions, sparsity-promoting optimal control, structured distributed design.
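The alternation described in this abstract, a performance step followed by a separable sparsity step solved analytically, can be illustrated on a much simpler toy problem. The sketch below is not the paper's optimal control formulation: it assumes a plain quadratic in place of the closed-loop performance term, an ℓ1 penalty as the sparsity-promoting function, and hypothetical names (`admm_sparse_denoise`, `soft_threshold`, `rho`).

```python
def soft_threshold(v, kappa):
    """Closed-form minimizer of kappa*|z| + 0.5*(z - v)**2 for a scalar v."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_sparse_denoise(b, lam, rho=1.0, iters=100):
    """Toy ADMM for: minimize 0.5*||x - b||^2 + lam*||z||_1 subject to x = z."""
    n = len(b)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: smooth quadratic term, closed form per coordinate.
        x = [(b[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: the l1 penalty is separable, so each entry is solved analytically.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Dual update on the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


z = admm_sparse_denoise([3.0, 0.5, -2.0], lam=1.0)
```

For this toy problem the exact minimizer is the soft-thresholding of `b` at `lam`, so the iterates can be checked against it. The point of the sketch is only that separability turns the sparsity step into independent scalar problems.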