Results 1–10 of 20
Accelerated and inexact forward-backward algorithms
Optimization Online, E-Print
Abstract
Cited by 18 (10 self)
We propose a convergence analysis of accelerated forward-backward splitting methods for composite function minimization, when the proximity operator is not available in closed form and can only be computed up to a certain precision. We prove that the 1/k^2 convergence rate for the function values can be achieved if the admissible errors are of a certain type and satisfy a sufficiently fast decay condition. Our analysis is based on the machinery of estimate sequences first introduced by Nesterov for the study of accelerated gradient descent algorithms. Furthermore, we give a global complexity analysis, taking into account the cost of computing admissible approximations of the proximal point. An experimental analysis is also presented.
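The accelerated forward-backward iteration analysed here can be illustrated by a minimal FISTA-style sketch on a 1-D lasso problem. The test problem, step size and function names below are illustrative assumptions, not the paper's; the prox is exact in this toy case, whereas the abstract's point is that an inexact prox with fast-decaying errors still preserves the 1/k^2 rate.

```python
# Minimal accelerated forward-backward (FISTA-style) sketch for
#   min_x 0.5*(x - 3)**2 + |x|
# The prox of t*|.| is soft-thresholding; here it is exact.

def grad_f(x):
    # gradient of the smooth part 0.5*(x - 3)^2
    return x - 3.0

def prox_g(v, t):
    # prox of t*|.|: soft-thresholding
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fista(x0, step=1.0, iters=100):
    x_prev, y, t_prev = x0, x0, 1.0
    for _ in range(iters):
        x = prox_g(y - step * grad_f(y), step)            # forward-backward step
        t = 0.5 * (1.0 + (1.0 + 4.0 * t_prev ** 2) ** 0.5)
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)       # Nesterov momentum
        x_prev, t_prev = x, t
    return x_prev

x_star = fista(0.0)   # minimizer of 0.5*(x-3)^2 + |x| is x = 2
```

Replacing `prox_g` by an approximate inner solver whose error decays fast enough (the abstract's "admissible errors") is what the paper analyses.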
Efficient first order methods for linear composite regularizers
, 2011
Abstract
Cited by 14 (5 self)
A wide class of regularization problems in machine learning and statistics employs a regularization term obtained by composing a simple convex function ω with a linear transformation. This setting includes Group Lasso methods, the Fused Lasso and other total variation methods, multi-task learning methods and many more. In this paper, we present a general approach for computing the proximity operator of this class of regularizers, under the assumption that the proximity operator of the function ω is known in advance. Our approach builds on a recent line of research on optimal first order optimization methods and uses fixed point iterations for numerically computing the proximity operator. It is more general than current approaches and, as we show with numerical simulations, computationally more efficient.
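The fixed-point idea (computing the prox of a composed regularizer numerically, using only a simple projection) can be sketched for a two-point fused penalty g(x) = lam*|x2 - x1|. The dual step size and the tiny problem below are illustrative assumptions, not the paper's exact algorithm; the iteration is a projected dual ascent whose fixed point recovers the prox.

```python
# Prox of g(x) = lam*|x2 - x1| via fixed-point iterations on a dual
# scalar u, with D = [-1, 1] the difference operator. Only the projection
# onto [-lam, lam] (the "simple" prox) is needed in closed form.

def prox_fused(v, lam, tau=0.25, iters=200):
    u = 0.0                                # dual variable for x2 - x1
    for _ in range(iters):
        r1, r2 = v[0] + u, v[1] - u        # residual v - D^T u
        u = u + tau * (r2 - r1)            # dual ascent step along D r
        u = max(-lam, min(lam, u))         # project onto [-lam, lam]
    return (v[0] + u, v[1] - u)            # primal recovery: v - D^T u

xp = prox_fused((0.0, 10.0), lam=2.0)      # each coordinate moves by lam
```

For this input the exact prox is (2, 8): the penalty pulls the two coordinates toward each other by exactly lam.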
Online Group-Structured Dictionary Learning
Abstract
Cited by 14 (3 self)
We develop a dictionary learning method which is (i) online, (ii) enables overlapping group structures with (iii) non-convex sparsity-inducing regularization and (iv) handles the partially observable case. Structured sparsity and the related group norms have recently gained widespread attention in group-sparsity regularized problems in the case when the dictionary is assumed to be known and fixed. However, when the dictionary also needs to be learned, the problem is much more difficult. Only a few methods have been proposed to solve this problem, and they can handle at most two of these four desirable properties. To the best of our knowledge, our proposed method is the first one that possesses all of these properties. We investigate several interesting special cases of our framework, such as online, structured, sparse non-negative matrix factorization, and demonstrate the efficiency of our algorithm with several numerical experiments.
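The overall shape of an online dictionary-learning loop, one sample at a time with a sparse-coding step followed by a dictionary update, can be sketched as follows. This is a toy sketch under our own assumptions (plain l1 sparse coding via ISTA, a single gradient step on the dictionary, invented step sizes), not the paper's group-structured, non-convex algorithm.

```python
import numpy as np

def soft(v, t):
    # elementwise soft-thresholding (prox of t*||.||_1)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(D, x, lam=0.1, iters=50):
    # ISTA for min_a 0.5*||D a - x||^2 + lam*||a||_1
    a = np.zeros(D.shape[1])
    L = np.linalg.norm(D, 2) ** 2 + 1e-12   # Lipschitz constant of the gradient
    for _ in range(iters):
        a = soft(a - (D.T @ (D @ a - x)) / L, lam / L)
    return a

def online_dl(samples, n_atoms, eta=0.1, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((samples.shape[1], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for x in samples:                                # one pass, one sample at a time
        a = sparse_code(D, x)                        # code the new sample
        D -= eta * np.outer(D @ a - x, a)            # one gradient step on D
        D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)  # keep atoms in the unit ball
    return D

X = np.random.default_rng(1).standard_normal((20, 8))
D = online_dl(X, n_atoms=5)
```

The paper replaces the l1 step with overlapping-group, non-convex regularization and additionally handles partially observed samples.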
Regularizers for structured sparsity
 Advances in Computational Mathematics
A general framework for structured sparsity via proximal optimization
In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS)
Abstract
Cited by 5 (3 self)
We study a generalized framework for structured sparsity. It extends the well-known methods of Lasso and Group Lasso by incorporating additional constraints on the variables as part of a convex optimization problem. This framework provides a straightforward way of favouring prescribed sparsity patterns, such as orderings, contiguous regions and overlapping groups, among others. Available optimization methods are limited to specific constraint sets and tend not to scale well with sample size and dimensionality. We propose a first order proximal method, which builds upon results on fixed points and successive approximations. The algorithm can be applied to a general class of conic and norm constraint sets and relies on a proximity operator subproblem which can be computed numerically. Experiments on different regression problems demonstrate state-of-the-art statistical performance, which improves over Lasso, Group Lasso and StructOMP. They also demonstrate the efficiency of the optimization algorithm and its scalability with the size of the problem.
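A proximal gradient step with a penalty plus an extra constraint can be sketched in one dimension: l1 penalty with the conic constraint x >= 0. The problem and step size are our own illustrative choices, not the paper's general setting; for this particular pair the combined prox happens to be available in closed form (soft-threshold, then project), whereas the paper targets the general case where the prox subproblem must be computed numerically.

```python
# Proximal gradient for  min 0.5*(x - b)^2 + lam*|x|  subject to  x >= 0.
# For l1 + nonnegativity the combined prox is max(0, v - t).

def prox_l1_nonneg(v, t):
    # exact prox of t*|.| restricted to x >= 0
    return max(0.0, v - t)

def prox_grad(b, lam=1.0, step=1.0, iters=50):
    x = 0.0
    for _ in range(iters):
        x = prox_l1_nonneg(x - step * (x - b), step * lam)  # gradient step, then prox
    return x

x_pos = prox_grad(3.0)    # shrunk toward 0 by lam: converges to 2
x_neg = prox_grad(-3.0)   # constraint active: clipped at 0
```

Swapping `prox_l1_nonneg` for an inner numerical solver is the paper's route to general conic and norm constraint sets.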
Convergence of stochastic proximal gradient algorithm. arXiv:1403.5074
, 2014
Inexact and accelerated proximal point algorithms
 J. Convex Anal
Abstract
Cited by 4 (0 self)
We present inexact accelerated proximal point algorithms for minimizing a proper lower semicontinuous and convex function. We carry out a convergence analysis under different types of errors in the evaluation of the proximity operator, and we provide corresponding convergence rates for the objective function values. The proof relies on a generalization of the strategy proposed in [14] for generating estimate sequences according to the definition of Nesterov, and is based on the concept of ε-subdifferential. We show that the 1/k^2 convergence rate of the exact accelerated algorithm can be recovered by constraining the errors to be of a certain type.
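The inexact proximal point idea, applying the prox up to an error that decays sufficiently fast, can be sketched on F(x) = |x - 1|. The error model 1/k^2 and the test function are our own illustrative assumptions standing in for an early-stopped inner solver; the sketch is the plain (non-accelerated) iteration.

```python
# Inexact proximal point iteration on F(x) = |x - 1|:
# x_{k+1} = prox_{lam*F}(x_k) + e_k  with  e_k = 1/k^2  (summable errors).

def prox_shift_abs(v, t, c=1.0):
    # exact prox of t*|. - c|
    if v > c + t:
        return v - t
    if v < c - t:
        return v + t
    return c

def inexact_ppa(x0, lam=0.5, iters=200):
    x = x0
    for k in range(1, iters + 1):
        x = prox_shift_abs(x, lam) + 1.0 / k ** 2   # prox step plus admissible error
    return x

x_final = inexact_ppa(5.0)   # drifts to the minimizer x = 1 despite the errors
```

With errors decaying only like 1/k the iterates would stall at a constant offset; the summable decay is what makes the error "admissible".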
Provably Correct Active Sampling Algorithms for Matrix Column Subset Selection with Missing Data
, 2015
Abstract
Cited by 1 (1 self)
We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to numerous real-world data applications such as population genetics summarization, electronic circuit testing and recommendation systems. In many applications the complete data matrix is unavailable and one needs to select representative columns by inspecting only a small portion of the input matrix. In this paper we propose the first provably correct column subset selection algorithms for partially observed data matrices. Our proposed algorithms exhibit different merits and drawbacks in terms of statistical accuracy, computational efficiency, sample complexity and sampling schemes, which provides a useful exploration of the trade-off between these desired properties for column subset selection. The proposed methods employ the idea of feedback-driven sampling and are inspired by several sampling schemes previously introduced for low-rank matrix approximation tasks [DMM08, FKV04, DV06, KS14]. Our analysis shows that two of the proposed algorithms enjoy a relative error bound, which is preferred for column subset selection and matrix approximation purposes. We also demonstrate, through both theoretical and empirical analysis, the power of feedback-driven sampling compared to uniform random sampling on input matrices with highly correlated columns.
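The feedback-driven flavour of column subset selection can be sketched with a fully observed matrix: each new column is chosen where the current approximation residual (the "feedback") is largest. This greedy residual rule and the synthetic example are our own simplified stand-ins; the paper's algorithms work from partial observations and come with sampling-complexity guarantees.

```python
import numpy as np

def select_columns(A, k):
    # Greedy residual-driven column selection: repeatedly pick the column
    # worst approximated by the span of the columns chosen so far.
    residual = A.copy()
    chosen = []
    for _ in range(k):
        j = int(np.argmax(np.linalg.norm(residual, axis=0)))  # largest residual
        chosen.append(j)
        C = A[:, chosen]
        proj = C @ np.linalg.pinv(C) @ A      # project A onto span(C)
        residual = A - proj                   # feedback for the next pick
    return chosen, residual

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))  # rank-2 matrix
cols, R = select_columns(A, 2)   # two well-chosen columns span all of A
```

On a rank-2 input, two greedily chosen columns reproduce the matrix exactly, which is the relative-error behaviour the abstract highlights against uniform random sampling.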