Results 1–10 of 945
On the linear convergence of the alternating direction method of multipliers
arXiv preprint arXiv:1208.3922, 2012; submitted to Mathematical Programming
"... ar ..."
(Show Context)
"... A rigid interval graph is an interval graph which has only one clique tree. In 2009, Panda and Das show that all connected unit interval graphs are rigid interval graphs. Generalizing the two classic graph search algorithms, Lexicographic BreadthFirst Search (LBFS) and Maximum Cardinality Search (M ..."
Abstract

Cited by 88 (6 self)
 Add to MetaCart
A rigid interval graph is an interval graph which has only one clique tree. In 2009, Panda and Das showed that all connected unit interval graphs are rigid interval graphs. Generalizing the two classic graph search algorithms, Lexicographic Breadth-First Search (LBFS) and Maximum Cardinality Search (MCS), Corneil and Krueger proposed in 2008 the so-called Maximal Neighborhood Search (MNS) and showed that one sweep of MNS is enough to recognize chordal graphs. We develop the MNS properties of rigid interval graphs and characterize this graph class in several different ways. This allows us to obtain several linear-time multi-sweep MNS algorithms for recognizing rigid interval graphs and unit interval graphs, generalizing a corresponding 3-sweep LBFS algorithm for unit interval graph recognition designed by Corneil in 2004. For unit interval graphs, we even present a new linear-time 2-sweep MNS certifying recognition algorithm.
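The abstract names LBFS but gives no pseudocode; the following is a minimal sketch of Lexicographic Breadth-First Search in Python, using the simple label-based formulation (the function name and the O(n^2) structure are mine; linear-time recognition algorithms use partition refinement instead).

```python
def lbfs(adj, start):
    """Lexicographic Breadth-First Search, label-based formulation.

    adj: dict mapping each vertex to an iterable of its neighbors.
    Returns the LBFS visit order. This O(n^2) version is illustrative;
    linear-time implementations use partition refinement instead.
    """
    labels = {v: [] for v in adj}   # lexicographic labels, compared as lists
    labels[start] = [len(adj)]      # force `start` to be visited first
    order, visited = [], set()
    while len(order) < len(adj):
        # visit the unvisited vertex with the lexicographically largest label
        v = max((u for u in adj if u not in visited), key=lambda u: labels[u])
        visited.add(v)
        order.append(v)
        # append a strictly decreasing rank, so earlier visits dominate
        rank = len(adj) - len(order)
        for w in adj[v]:
            if w not in visited:
                labels[w].append(rank)
    return order
```

For the path 1-2-3, `lbfs({1: [2], 2: [1, 3], 3: [2]}, 1)` returns `[1, 2, 3]`; multi-sweep algorithms feed the order produced by one sweep into the tie-breaking rule of the next.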
On nonergodic convergence rate of Douglas-Rachford alternating direction method of multipliers
, 2012
"... Abstract. Recently, a worstcase O(1/t) convergence rate was established for the DouglasRachford alternating direction method of multipliers in an ergodic sense. This note proposes a novel approach to derive the same convergence rate while in a nonergodic sense. ..."
Abstract

Cited by 79 (7 self)
 Add to MetaCart
(Show Context)
Recently, a worst-case O(1/t) convergence rate was established for the Douglas-Rachford alternating direction method of multipliers in an ergodic sense. This note proposes a novel approach to derive the same convergence rate in a nonergodic sense.
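For readers outside the area, the distinction can be sketched as follows (standard notation of my choosing, with E(·) an abstract accuracy measure; the note's own criterion may differ):

```latex
\[
\text{ergodic:}\qquad \bar{w}_t = \frac{1}{t}\sum_{k=1}^{t} w_k,
\qquad E(\bar{w}_t) \le \frac{C}{t},
\]
\[
\text{nonergodic:}\qquad E(w_t) \le \frac{C'}{t}.
\]
```

The nonergodic statement is the stronger guarantee in practice, since it applies to the iterate the algorithm actually outputs rather than to a running average.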
A reliable effective terascale linear learning system
, 2011
"... We present a system and a set of techniques for learning linear predictors with convex losses on terascale data sets, with trillions of features,1 billions of training examples and millions of parameters in an hour using a cluster of 1000 machines. Individually none of the component techniques are n ..."
Abstract

Cited by 65 (6 self)
 Add to MetaCart
(Show Context)
We present a system and a set of techniques for learning linear predictors with convex losses on terascale data sets, with trillions of features, billions of training examples, and millions of parameters, in an hour using a cluster of 1000 machines. Individually, none of the component techniques is new, but the careful synthesis required to obtain an efficient implementation is. The result is, to our knowledge, the most scalable and efficient linear learning system reported in the literature. We describe and thoroughly evaluate the components of the system, showing the importance of the various design choices.
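The abstract does not name its components; purely as an illustrative sketch of one ingredient common to distributed linear learners of this kind (not a claim about this particular system), here is weight averaging of per-machine SGD models. Both function names are hypothetical.

```python
import numpy as np

def local_sgd(X, y, dim, lr=0.1, epochs=5):
    """One worker's pass: plain SGD on the squared loss (illustrative only)."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w -= lr * (xi @ w - yi) * xi   # gradient of 0.5 * (x.w - y)^2
    return w

def averaged_linear_learner(shards, dim):
    """Combine per-machine linear models by averaging their weights.

    `shards` is a list of (X, y) pairs, one per simulated machine; a real
    system would train the workers in parallel and merge the weights with
    an allreduce instead of this serial loop.
    """
    return np.mean([local_sgd(X, y, dim) for X, y in shards], axis=0)
```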
Linearized Alternating Direction Method with Adaptive Penalty for Low-Rank Representation
"... Many machine learning and signal processing problems can be formulated as linearly constrained convex programs, which could be efficiently solved by the alternating direction method (ADM). However, usually the subproblems in ADM are easily solvable only when the linear mappings in the constraints ar ..."
Abstract

Cited by 53 (8 self)
 Add to MetaCart
(Show Context)
Many machine learning and signal processing problems can be formulated as linearly constrained convex programs, which can be efficiently solved by the alternating direction method (ADM). However, the subproblems in ADM are usually easily solvable only when the linear mappings in the constraints are identities. To address this issue, we propose a linearized ADM (LADM) method, which linearizes the quadratic penalty term and adds a proximal term when solving the subproblems. For fast convergence, we also allow the penalty to change adaptively according to a novel update rule. We prove the global convergence of LADM with adaptive penalty (LADMAP). As an example, we apply LADMAP to solve low-rank representation (LRR), an important subspace clustering technique that suffers from high computational cost. By combining LADMAP with a skinny SVD representation technique, we are able to reduce the complexity of the original ADM-based method from O(n^3) to O(rn^2), where r and n are the rank and size of the representation matrix, respectively, hence making LRR practical for large-scale applications. Numerical experiments verify that for LRR our LADMAP-based methods are much faster than state-of-the-art algorithms.
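The general scheme is easiest to see on a small instance. Below is a hedged sketch (mine, not the paper's LADMAP code) of a linearized ADM iteration for min_x ||x||_1 s.t. Ax = b with a fixed penalty; the adaptive penalty rule and the LRR-specific skinny-SVD machinery are omitted.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ladm_l1(A, b, beta=1.0, iters=500):
    """Linearized ADM sketch for  min ||x||_1  s.t.  Ax = b.

    The quadratic penalty (beta/2)||Ax - b||^2 is linearized at x_k and a
    proximal term (beta*eta/2)||x - x_k||^2 is added, so the x-update is a
    single soft-thresholding; eta > ||A||^2 keeps the surrogate valid.
    """
    m, n = A.shape
    x, lam = np.zeros(n), np.zeros(m)
    eta = np.linalg.norm(A, 2) ** 2 * 1.02
    for _ in range(iters):
        grad = A.T @ (lam + beta * (A @ x - b))   # gradient of linearized part
        x = soft_threshold(x - grad / (beta * eta), 1.0 / (beta * eta))
        lam = lam + beta * (A @ x - b)            # dual ascent step
    return x
```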
Online Alternating Direction Method
 In ICML
, 2012
"... Online optimization has emerged as powerful tool in large scale optimization. In this paper, we introduce efficient online algorithms based on the alternating directions method (ADM). We introduce a new proof technique for ADM in the batch setting, which yields the O(1/T) convergence rate of ADM and ..."
Abstract

Cited by 39 (9 self)
 Add to MetaCart
(Show Context)
Online optimization has emerged as a powerful tool in large-scale optimization. In this paper, we introduce efficient online algorithms based on the alternating directions method (ADM). We introduce a new proof technique for ADM in the batch setting, which yields the O(1/T) convergence rate of ADM and forms the basis of the regret analysis in the online setting. We consider two scenarios in the online setting, based on whether the solution needs to lie in the feasible set or not. In both settings, we establish regret bounds on both the objective function and the constraint violation, for general as well as strongly convex functions. Preliminary results are presented to illustrate the performance of the proposed algorithms.
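As a sketch of the two quantities being bounded, in generic notation of my choosing for a constraint Ax + Bz = c (the paper's exact definitions may differ):

```latex
\[
R_T^{\mathrm{obj}} \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{Ax+Bz=c} \sum_{t=1}^{T} f_t(x),
\qquad
R_T^{\mathrm{con}} \;=\; \sum_{t=1}^{T} \bigl\| A x_t + B z_t - c \bigr\|.
\]
```

Bounding the constraint-violation term matters because an online iterate need not be feasible at every round.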
Incremental Gradient on the Grassmannian for Online Foreground and Background Separation in Subsampled Video
In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
, 2012
"... It has recently been shown that only a small number of samples from a lowrank matrix are necessary to reconstruct the entire matrix. We bring this to bear on computer vision problems that utilize lowdimensional subspaces, demonstrating that subsampling can improve computation speed while still al ..."
Abstract

Cited by 36 (1 self)
 Add to MetaCart
It has recently been shown that only a small number of samples from a low-rank matrix are necessary to reconstruct the entire matrix. We bring this to bear on computer vision problems that utilize low-dimensional subspaces, demonstrating that subsampling can improve computation speed while still allowing for accurate subspace learning. We present GRASTA, the Grassmannian Robust Adaptive Subspace Tracking Algorithm, an online algorithm for robust subspace estimation from randomly subsampled data. We consider the specific application of background and foreground separation in video, and we assess GRASTA on separation accuracy and computation time. In one benchmark video example [16], GRASTA achieves a separation rate of 46.3 frames per second, even when run in MATLAB on a personal laptop.
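Why subsampling helps is visible in the per-frame subproblem. The sketch below (my own minimal version, not GRASTA itself) fits a partially observed frame to a subspace basis using only the observed rows; GRASTA replaces the least-squares fit with a robust ℓ1 fit solved by ADMM and then takes an incremental gradient step on the Grassmannian, both omitted here.

```python
import numpy as np

def subsampled_fit(U, omega, v_obs):
    """Fit a partially observed vector to a subspace basis.

    U:      (n, d) orthonormal basis of the current subspace estimate.
    omega:  integer indices of the observed entries of the frame.
    v_obs:  observed values at those indices.

    Solves min_a ||U[omega] @ a - v_obs||_2, the cheap core subproblem of
    subsampled subspace tracking: cost scales with |omega|, not n.
    """
    U_om = U[omega, :]
    a, *_ = np.linalg.lstsq(U_om, v_obs, rcond=None)
    background = U @ a                       # low-rank part, on all entries
    residual = np.zeros(U.shape[0])
    residual[omega] = v_obs - U_om @ a       # foreground shows up here
    return background, residual
```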
Convex and network flow optimization for structured sparsity
 JMLR
, 2011
"... We consider a class of learning problems regularized by a structured sparsityinducing norm defined as the sum of ℓ2 or ℓ∞norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address ..."
Abstract

Cited by 35 (9 self)
 Add to MetaCart
We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of ℓ2- or ℓ∞-norms over groups of variables. Whereas much effort has been put into developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: on the one hand, we show that the proximal operator associated with a sum of ℓ∞-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques, and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.
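For contrast with the overlapping case the paper solves, the disjoint-group proximal operator is a one-line block soft-threshold; a minimal Python sketch (function name mine):

```python
import numpy as np

def prox_group_l2(v, groups, tau):
    """Proximal operator of tau * sum of ||v_g||_2 over disjoint groups.

    Each group is shrunk toward zero by block soft-thresholding. This is
    the easy, non-overlapping case; the paper's contribution is computing
    the prox for *overlapping* groups (and l_inf norms) via a quadratic
    min-cost flow problem, which this sketch does not attempt.
    """
    x = v.copy()
    for g in groups:                      # g: index array; groups disjoint
        norm = np.linalg.norm(v[g])
        x[g] = 0.0 if norm <= tau else (1.0 - tau / norm) * v[g]
    return x
```

With overlapping groups the blocks interact, which is why the paper resorts to a quadratic min-cost flow computation or to splitting into a higher-dimensional non-overlapping problem.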
Linearized Alternating Direction Method with Gaussian Back Substitution for Separable Convex Programming
, 2011
"... Abstract. Recently, we have proposed to combine the alternating direction method (ADM) with a Gaussian back substitution procedure for solving the convex minimization model with linear constraints and a general separable objective function, i.e., the objective function is the sum of many functions w ..."
Abstract

Cited by 35 (3 self)
 Add to MetaCart
(Show Context)
Recently, we have proposed to combine the alternating direction method (ADM) with a Gaussian back substitution procedure for solving the convex minimization model with linear constraints and a general separable objective function, i.e., an objective that is the sum of many functions without coupled variables. In this paper, we further study this topic and show that the decomposed subproblems in the ADM procedure can be substantially alleviated by linearizing the involved quadratic terms arising from the augmented Lagrangian penalty on the model's linear constraints. When the resolvent operators of the separable functions in the objective have closed-form representations, embedding the linearization into the ADM subproblems becomes necessary to yield easy subproblems with closed-form solutions. We thus show theoretically that the blend of ADM, Gaussian back substitution, and linearization works effectively for the separable convex minimization model under consideration.
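The linearization step can be sketched in generic notation (mine, not the paper's): the quadratic penalty in the i-th subproblem is replaced by its first-order expansion at the current iterate plus a proximal term, so that only the resolvent of θ_i is ever evaluated.

```latex
\[
\min_{x_i}\; \theta_i(x_i) + \frac{\beta}{2}\,\bigl\| A_i x_i - q^k \bigr\|^2
\;\;\longrightarrow\;\;
\min_{x_i}\; \theta_i(x_i)
  + \beta \,\bigl\langle A_i^{\top}\!\bigl(A_i x_i^k - q^k\bigr),\, x_i \bigr\rangle
  + \frac{\beta \tau}{2}\,\bigl\| x_i - x_i^k \bigr\|^2,
\]
whose minimizer is the resolvent (proximal) step
\[
x_i^{k+1}
  = \operatorname{prox}_{\theta_i / (\beta\tau)}
    \Bigl( x_i^k - \tfrac{1}{\tau}\, A_i^{\top}\bigl(A_i x_i^k - q^k\bigr) \Bigr),
\qquad \tau > \|A_i\|^2 .
\]
```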