Results 1  10
of
26
Online learning for matrix factorization and sparse coding
, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to ad ..."
Abstract

Cited by 317 (31 self)
 Add to MetaCart
Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to adapt it to specific data. Variations of this problem include dictionary learning in signal processing, nonnegative matrix factorization and sparse principal component analysis. In this paper, we propose to address these tasks with a new online optimization algorithm, based on stochastic approximations, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems. A proof of convergence is presented, along with experiments with natural images and genomic data demonstrating that it leads to stateoftheart performance in terms of speed and optimization for both small and large data sets.
Revisiting frankwolfe: Projectionfree sparse convex optimization
 In ICML
, 2013
"... We provide stronger and more general primaldual convergence results for FrankWolfetype algorithms (a.k.a. conditional gradient) for constrained convex optimization, enabled by a simple framework of duality gap certificates. Our analysis also holds if the linear subproblems are only solved approxi ..."
Abstract

Cited by 76 (2 self)
 Add to MetaCart
(Show Context)
We provide stronger and more general primaldual convergence results for FrankWolfetype algorithms (a.k.a. conditional gradient) for constrained convex optimization, enabled by a simple framework of duality gap certificates. Our analysis also holds if the linear subproblems are only solved approximately (as well as if the gradients are inexact), and is proven to be worstcase optimal in the sparsity of the obtained solutions. On the application side, this allows us to unify a large variety of existing sparse greedy methods, in particular for optimization over convex hulls of an atomic set, even if those sets can only be approximated, including sparse (or structured sparse) vectors or matrices, lowrank matrices, permutation matrices, or maxnorm bounded matrices. We present a new general framework for convex optimization over matrix factorizations, where every FrankWolfe iteration will consist of a lowrank update, and discuss the broad application areas of this approach. 1.
Structured Sparse Principal Component Analysis
, 2009
"... We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This structured sparse PCA is based on a structured regularization recently introduced by [1]. While ..."
Abstract

Cited by 73 (16 self)
 Add to MetaCart
(Show Context)
We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This structured sparse PCA is based on a structured regularization recently introduced by [1]. While classical sparse priors only deal with cardinality, the regularization we use encodes higherorder information about the data. We propose an efficient and simple optimization procedure to solve this problem. Experiments with two practical tasks, face recognition and the study of the dynamics of a protein complex, demonstrate the benefits of the proposed structured approach over unstructured approaches. 1
Structured Sparsity through Convex Optimization
"... Abstract. Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through the regularization by the ℓ1norm. In this paper, we cons ..."
Abstract

Cited by 48 (7 self)
 Add to MetaCart
(Show Context)
Abstract. Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through the regularization by the ℓ1norm. In this paper, we consider situations where we are not only interested in sparsity, but where some structural prior knowledge is available as well. We show that the ℓ1norm can then be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures. We present applications to unsupervised learning, for structured sparse principal component analysis and hierarchical dictionary learning, and to supervised learning in the context of nonlinear variable selection. Key words and phrases: Sparsity, convex optimization. 1.
Exact Recovery of SparselyUsed Dictionaries
 25TH ANNUAL CONFERENCE ON LEARNING THEORY
, 2012
"... We consider the problem of learning sparsely used dictionaries with an arbitrary square dictionary and a random, sparse coefficient matrix. We prove that O(n log n) samples are sufficient to uniquely determine the coefficient matrix. Based on this proof, we design a polynomialtime algorithm, called ..."
Abstract

Cited by 37 (2 self)
 Add to MetaCart
We consider the problem of learning sparsely used dictionaries with an arbitrary square dictionary and a random, sparse coefficient matrix. We prove that O(n log n) samples are sufficient to uniquely determine the coefficient matrix. Based on this proof, we design a polynomialtime algorithm, called Exact Recovery of SparselyUsed Dictionaries (ERSpUD), and prove that it probably recovers the dictionary and coefficient matrix when the coefficient matrix is sufficiently sparse. Simulation results show that ERSpUD reveals the true dictionary as well as the coefficients with probability higher than many stateoftheart algorithms.
Convex Coding
"... Inspired by recent work on convex formulations of clustering (Lashkari & Golland, 2008; Nowozin & Bakir, 2008) we investigate a new formulation of the Sparse Coding Problem (Olshausen & Field, 1997). In sparse coding we attempt to simultaneously represent a sequence of datavectors spars ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
(Show Context)
Inspired by recent work on convex formulations of clustering (Lashkari & Golland, 2008; Nowozin & Bakir, 2008) we investigate a new formulation of the Sparse Coding Problem (Olshausen & Field, 1997). In sparse coding we attempt to simultaneously represent a sequence of datavectors sparsely (i.e. sparse approximation (Tropp et al., 2006)) in terms of a “code ” defined by a set of basis elements, while also finding a code that enables such an approximation. As existing alternating optimization procedures for sparse coding are theoretically prone to severe local minima problems, we propose a convex relaxation of the sparse coding problem and derive a boostingstyle algorithm, that (Nowozin & Bakir, 2008) serves as a convex “master problem” which calls a (potentially nonconvex) subproblem to identify the next code element to add. Finally, we demonstrate the properties of our boosted coding algorithm on an image denoising task. 1
Convex Collective Matrix Factorization
"... In many applications, multiple interlinked sources of data are available and they cannot be represented by a single adjacency matrix, to which large scale factorization method could be applied. Collective matrix factorization is a simple yet powerful approach to jointly factorize multiple matrices, ..."
Abstract

Cited by 8 (2 self)
 Add to MetaCart
(Show Context)
In many applications, multiple interlinked sources of data are available and they cannot be represented by a single adjacency matrix, to which large scale factorization method could be applied. Collective matrix factorization is a simple yet powerful approach to jointly factorize multiple matrices, each of which represents a relation between two entity types. Existing algorithms to estimate parameters of collective matrix factorization models are based on nonconvex formulations of the problem; in this paper, a convex formulation of this approach is proposed. This enables the derivation of large scale algorithms to estimate the parameters, including an iterative eigenvalue thresholding algorithm. Numerical experiments illustrate the benefits of this new approach. 1
The generalized tracenorm and its application to structurefrommotion problems
 In ICCV
, 2011
"... In geometric computer vision, the structure from motion (SfM) problem can be formulated as a optimization problem with a rank constraint. It is well known that the trace norm of a matrix can act as a convex proxy for a low rank constraint. Hence, in recent work [7], the tracenorm relaxation has b ..."
Abstract

Cited by 7 (0 self)
 Add to MetaCart
(Show Context)
In geometric computer vision, the structure from motion (SfM) problem can be formulated as a optimization problem with a rank constraint. It is well known that the trace norm of a matrix can act as a convex proxy for a low rank constraint. Hence, in recent work [7], the tracenorm relaxation has been applied to the SfM problem. However, SfM problems often exhibit a certain structure, for example a smooth camera path. Unfortunately, the trace norm relaxation can not make use of this additional structure. This observation motivates the main contribution of this paper. We present the socalled generalized trace norm which allows to encode prior knowledge about a specific problem into a convex regularization term which enforces a low rank solution while at the same time taking the problem structure into account. While deriving the generalized trace norm and stating its different formulations, we draw interesting connections to other fields, most importantly to the field of compressive sensing. Even though the generalized trace norm is a very general concept with a wide area of potential applications we are ultimately interested in applying it to SfM problems. Therefore, we also present an efficient algorithm to optimize the resulting generalized trace norm regularized optimization problems. Results show that the generalized trace norm indeed achieves its goals in providing a problemdependent regularization. 1.
Convex relaxations of structured matrix factorizations
, 2013
"... We consider the factorization of a rectangular matrix X into a positive linear combination of rankone factors of the form uv ⊤ , where u and v belongs to certain sets U and V, that may encode specific structures regarding the factors, such as positivity or sparsity. In this paper, we show that comp ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
(Show Context)
We consider the factorization of a rectangular matrix X into a positive linear combination of rankone factors of the form uv ⊤ , where u and v belongs to certain sets U and V, that may encode specific structures regarding the factors, such as positivity or sparsity. In this paper, we show that computing the optimal decomposition is equivalent to computing a certain gauge function of X and we provide a detailed analysis of these gauge functions and their polars. Since these gaugefunctions are typically hard to compute, we present semidefinite relaxations and several algorithms that may recover approximate decompositions with approximation guarantees. We illustrate our results with simulations on finding decompositions with elements in {0,1}. As side contributions, we present a detailed analysis of variational quadratic representations of norms as well as a new iterative basis pursuit algorithm that can deal with inexact firstorder oracles. 1
Learning in Modular Systems
, 2009
"... ception for navigation, providing terrain classification techniques that been incorporated into fielded industrial and government systems [20].Acknowledgements ..."
Abstract

Cited by 6 (3 self)
 Add to MetaCart
(Show Context)
ception for navigation, providing terrain classification techniques that been incorporated into fielded industrial and government systems [20].Acknowledgements